Introduction
The available literature shows that the law and its validity affect both economic progress (Chowdhury et al., 2019) and politics (Campos and Giovannoni, 2017). However, laws are not always effective, and some do not produce the expected results (Ben-Bassat and Dahan, 2008; Chilton and Versteeg, 2016, 2017). Particular attention in this context has been devoted to the issue of constitutional compliance, that is, the degree to which a constitution is complied with.
Several works have tried to better understand the potential determinants of this phenomenon (Law and Versteeg, 2013; Lewkowicz and Lewczuk, 2023; Metelska-Szaniawska, 2020). However, we are still far from satisfactorily explaining the divergence between constitutional text and constitutional reality (Voigt, 2021). This is likely because some factors have not received sufficient attention from researchers.
One aspect that has been largely ignored relates to the way constitutions have been written. After all, deciding whether or not political elites act in accordance with constitutional regulations hinges on the interpretation of the constitutional text and thus may depend on its precision, linguistic complexity or the extent to which it refers to facts and emotions (Alicea, 2022). While this has been acknowledged in the literature (Law and Versteeg, 2013; Voigt, 2020), existing empirical studies hardly use textual data, and if they do, their focus is typically only on quite basic text characteristics such as the number of words or the number of rights promised in a constitution.
The present paper tries to complement this research by taking a deeper look at the information encoded in constitutional text. The three specific aspects of constitutions which we look at are polarity (the degree to which the text is emotionally positive or negative, that is, whether it refers more to guarantees/safeguards or to restrictions), subjectivity (the degree to which the text leaves room for discretion for judges, politicians or other social actors) and readability (the degree to which the text is easy to understand). To the best of our knowledge, none of these text features has been analysed in the context of constitutional compliance to date. The paper closest to ours is that by Gutmann et al. (2021), yet their focus differs from ours in two important respects. First, their attention is solely on constitutional comprehensibility. Second, the authors concentrate on the impact of constitutional comprehensibility on mass protests. In our work we measure not only the readability of constitutions, but also two other linguistic dimensions. In addition, our focus is on the direct relationship between the text features and constitutional compliance.
That said, our analysis should not be seen through the lens of a standard enquiry aimed at uncovering causal relationships. After all, words, although they structure our behaviour, cannot be deemed the reasons for our actions. On top of that, our econometric strategy is likely to suffer from omitted variables and selection bias. Notwithstanding these caveats, taking a closer look at the extent to which text features correlate with certain behaviour may be very useful for our understanding of when and why constitutions may (or may not) be complied with.
In the analysis, we use data on constitutional texts which were in force in 2019 in a sample of 94 democratic countries (electoral democracies, as indicated by Bjørnskov and Rode, 2020). This is motivated by the fact that the role of constitutions in democracies is likely to differ from their role in autocracies (Elkins et al., 2014; Voigt, 2020). Our work raises the following points. First, it suggests that constitutions containing more words that are emotionally negative and usually straightforward about penalizing certain behaviours seem to be complied with more by the top representatives of the various government branches. Second, a similar observation can be made for constitutions that are more concise. Third, our econometric analysis does not suggest any systematic association between the level of constitutional compliance and the degree to which the text leaves room for different interpretations. A different perspective, however, is suggested by machine learning models, which indicate that this text feature should be considered a potentially important determinant of constitutional compliance. Fourth, while we find that constitutions which are easy to read have, on average, higher levels of compliance, these results remain rather inconclusive as they depend on the exact measure of readability that we use.
Constitution's structure and constitutional compliance
It has been argued that the precision of constitutional rules can positively affect constitutional compliance through collective action (Voigt, 2021). This is because the more precise the rules, the less room for their interpretation. Coordinated understanding of what constitutes a transgression, in turn, facilitates collective action to resist the government and sanction non-compliance, thus increasing its costs (Weingast, 1997). This view has led researchers to focus their research efforts on measuring constitutional comprehensibility/readability (Gutmann et al., 2021).
This idea, however, rests on the questionable assumption that citizens are engaged in constitutional affairs as fully as would be desirable (footnote 1). A potential solution, in the form of unambiguous interpretation of constitutional norms, might theoretically be provided by courts. However, as argued by Vanberg (2011), relying on constitutional review might not solve the problem, as judicial interpretation can vary substantially over time or across jurisdictions.
The above discussion has naturally directed researchers to look at another attribute of the constitutional text: its length. It has been stated that shorter constitutions are typically more entrenched (Versteeg and Zackin, 2016). Constitutional entrenchment, in turn, can be considered important for constitutional compliance if constitutions that are more difficult to amend are more likely to be ignored (Voigt, 2021). That being said, as mentioned by Versteeg and Zackin (2016) themselves, the majority of constitutions today are subject to frequent revisions, and thus their long-term entrenchment can be questioned. In addition, longer constitutions need not be less entrenched than shorter ones, because in longer constitutions some articles or principles can be given higher levels of entrenchment (Dixon and Landau, 2018). Relatedly, as argued by Voigt (2009), the length of the constitution is also not a good approximation of how much the text allows for discretionary interpretation, as longer legislation is more likely to be inherently conflicting than shorter legislation. This led researchers to pay more attention to the precision of constitutional text. Indeed, potential ambiguities in the interpretation of constitutional provisions can be exploited by politicians who would like to circumvent the existing constraints. As argued by Vanberg (2011), in terms of enforcing constitutional norms, procedural constraints offer significant advantages over regulations that aim at securing broader substantive values. This is because the former constraints often rely on specific language and may tap into focal understandings that emerge out of a shared political history. Thus, they leave little room for interpretation. Substantive constraints instead employ broad, aspirational language that provides room for interpretation.
Obviously, the issue is how to capture these features in empirical analysis. The nature of the language in which a rule is framed is unlikely to be captured by simple readability metrics, which focus on word or sentence length. Nor can it be captured by comprehensibility metrics, which typically refer to the percentage of words that a reader easily understands. Put differently, assessing the extent to which the constraints in question are susceptible to different interpretations requires capturing another linguistic dimension. This is what we want to capture with our subjectivity measure (for details see below).
Yet another point that may matter in this context relates to the fact that legal norms can be classified with respect to their semantic type, such as prohibitions, duties, permissions or guarantees (Waltl et al., 2019). Whether the law is full of sanctions bringing the fear of punishment or of simple guarantees and freedoms affects people's belief in the law and their obedience to it (Tyler, 2006). This, in turn, clearly corresponds to the distinction between the deterrence model and the one based on legitimacy or morality (Darley et al., 2003; Kahan, 1999). The literature suggests that some norms are enforced even by emotions (Alicea, 2022; Sajo, 2011), so that the tone of regulations matters. Indeed, ‘emotionally loaded words quickly attract attention, and bad words (war, crime) attract attention faster than do happy words (peace, love)’ (Kahneman, 2011: 301). This suggests that the paragraphs of constitutional texts which specify punishments for transgressions or define the list of things which are prohibited can be more thoroughly processed than the paragraphs referring to human dignity or human rights. We aim to capture this textual characteristic with our polarity measure (see below).
Empirical methodology
Data and empirical design
Data
Constitutional compliance: In order to measure constitutional compliance, we use a novel variable from the Comparative Constitutional Compliance Database (Gutmann et al., 2024). It is a continuous indicator measuring to what extent representatives of the state, such as cabinet members, legislators, and members of the country's top court(s), respect the constitution in four legal areas: (i) property rights and the rule of law, (ii) political rights, (iii) civil rights, and (iv) basic human rights. Higher values indicate higher constitutional compliance. The indicator is constructed by combining two data sources. Data on de jure rules in national constitutions come from the Comparative Constitutions Project (Elkins et al., 2009). Data on de facto compliance, in turn, come from the Varieties of Democracy dataset (Coppedge et al., 2022; see also the note below Table A1 in the online Appendix). Figure A1 in the online Appendix shows the geographical distribution of constitutional compliance across electoral democracies.
While the indicator offers several advantages over the other existing measures, it is still worth bearing in mind the following points (footnote 2). First, constitutional compliance has always been a contested idea. Consequently, there is scope for reasonable disagreement as to what counts as compliance, especially given that judgments about compliance are complicated by the time frame and often depend on formal procedures/decisions made by different institutions. In addition, there are some limitations to interpreting the measure of constitutional compliance in ‘cardinal’ terms. Furthermore, the comparability between countries is somewhat limited, as experts responsible for coding different countries might have had a different understanding of constitutional compliance or violations. These issues should be kept in mind while reading our results.
Structure of constitutional texts: In order to construct the variables capturing text features of the existing constitutions, we scraped the constitutional texts of 94 democratic countries (for the full list see online Table A2) whose constitutions were in effect in 2019. The source of these texts was The Constitute Project (footnote 3). We focus on one year, rather than investigating changes in constitutional texts, because in many countries constitutions are relatively stable and can remain unchanged for decades. In our analysis, we skipped the preamble and titles of the constitution and focused solely on the body of the constitutional text. Next, we ran a basic clean-up (e.g. removing special characters and excessive white space) of the raw text and calculated readability using five metrics: the Flesch reading-ease score (flesch_score), Flesch-Kincaid grade level (fleschkincaid_score), Gunning Fog index (gunningfog_score), Simple Measure of Gobbledygook (smog_score), and Dale-Chall readability index (dalechall_score). These measures refer to how difficult it is to read and understand a text, with higher scores indicating greater difficulty for the reader (footnote 4). Given the popularity of the Flesch score and Dale-Chall index, we focus on these two measures, alongside an aggregated measure of readability based on all five scores mentioned above.
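To make the readability metrics more concrete, the following minimal sketch computes two of the five scores using their standard published formulas. The syllable counter is a rough vowel-group heuristic and the function names are illustrative assumptions; they are not the tooling used in the analysis. Note that under the standard formula a higher Flesch reading-ease value indicates easier text, so any re-orientation needed to align it with the other indices is not shown here.

```python
import re

def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent 'e' as non-syllabic when other vowel groups exist.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def text_stats(text: str):
    """Return (number of sentences, number of words, number of syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables

def flesch_reading_ease(text: str) -> float:
    """Standard Flesch reading-ease formula."""
    n_sent, n_words, n_syll = text_stats(text)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)

def flesch_kincaid_grade(text: str) -> float:
    """Standard Flesch-Kincaid grade-level formula."""
    n_sent, n_words, n_syll = text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59
```

Both formulas depend only on average sentence length and average syllables per word, which is why such metrics cannot capture how precisely a rule is worded.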
In order to measure the subjectivity and polarity scores, a deeper cleaning was subsequently carried out, including tokenization, lemmatization, and stop-word removal. Subjectivity measures the degree to which a text leaves room for interpretation, while polarity indicates the degree to which a text is emotionally positive or negative. Both scores are computed using a trained language model, which contains a dictionary that stores the sentiment scores for each word. The overall score is the sum of all word scores divided by the number of words (footnote 5). The subjectivity score (subjectivity) ranges from 0 to 1, with 0 being very objective and 1 being very subjective (footnote 6), whilst the polarity score (polarity) ranges from −1 to 1, with −1 being very negative and 1 being very positive (footnote 7). Detailed examples illustrating the two measures are presented at the end of this sub-section.
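The dictionary-based scoring described above can be sketched as follows. The lexicon entries and their values are hypothetical and chosen only for illustration. Averaging over the words found in the lexicon (rather than over all tokens) is likewise an assumption, since the text does not specify the denominator.

```python
# Hypothetical toy lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
TOY_LEXICON = {
    "right": (0.3, 0.5),
    "protect": (0.4, 0.4),
    "criminal": (-0.4, 0.55),
    "damage": (-0.5, 0.6),
    "necessary": (0.0, 1.0),
}

def score_tokens(tokens, lexicon=TOY_LEXICON):
    """Average polarity and subjectivity over the tokens present in the lexicon.

    Tokens are assumed to be already lower-cased, lemmatized and free of stop words.
    Returns (0.0, 0.0) when no token carries a score.
    """
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity
```

Under this assumption, a passage with no scored words receives a score of exactly 0 on both dimensions, and a passage dominated by punitive vocabulary receives a negative polarity.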
While both measures have the potential to uncover new insights on the relationship between constitutional text and constitutional compliance, they both have some shortcomings. Regarding the subjectivity measure, it should be mentioned that lack of precision in constitutional provisions may not be controversial thanks to consistent and unambiguous judicature. Furthermore, provisions presenting unqualified rights (thus having a low subjectivity score) may be accompanied by general limitations (which can be very imprecise) found elsewhere in the constitution. Concerning our polarity measure, in turn, it should be emphasized that our focus is on constitutional text as a whole and we do not take into account that some constitutional provisions may be of greater importance than others. Consequently, our approach fails to take into account that some articles/paragraphs may have a larger effect on politicians' behaviour than others. Similarly, it may matter who the relevant commands are directed to. These shortcomings should be kept in mind when interpreting our results (footnote 8).
The geographical distribution of the polarity, subjectivity and aggregate readability scores, as well as their histograms, can be found in the online Appendix (Figures A2–A5). To take a first look at the potential relationship between constitutional compliance and the variables capturing the structure of constitutional texts, we start with Figure A6, which shows the relational plots between our dependent variable and each of the key independent variables. For both polarity and subjectivity, we observe an almost neutral relationship with constitutional compliance. The length of the constitutional text, in turn, seems to be negatively related to constitutional compliance. Likewise, we observe a negative relationship between constitutional compliance and various indices of readability. Overall, these plots seem to suggest that the sentiment of constitutions, at least as captured by our polarity and subjectivity variables, does not necessarily matter for constitutional compliance by the government.
Other potential determinants of constitutional compliance: One of the covariates that the literature suggests controlling for when investigating the level of constitutional compliance is the age of constitutions. The effects of constitutional longevity found in empirical studies, however, are rather mixed (see, e.g. Ben-Bassat and Dahan, 2008; Law and Versteeg, 2013; Metelska-Szaniawska, 2020; Metelska-Szaniawska and Lewkowicz, 2021). Interestingly, the existing literature considers only the longevity of the current constitution without taking into account constitutional amendments and historical constitutions. Importantly, constitutional amendments may be designed to adapt to current circumstances (Lutz, 1994; Novak, 2023) and reduce the de jure/de facto gap (Lewkowicz and Metelska-Szaniawska, 2021; see footnote 9). In addition, they may translate into the length and detail of constitutions (Dixon and Landau, 2018; Versteeg and Zackin, 2016). As such, they can affect constitutional compliance also through the entrenchment channel described above (Voigt, 2021).
In order to capture the effects of constitutional age and the number of constitutional amendments, we construct several new variables in addition to current_age, based on the chronological records of each country in the Comparative Constitutions Project (footnote 10). To get an idea of how often a particular constitution has been amended, we divide current_age by the number of amendments received by the current constitution (current_avg_amend); the higher the ratio, the more stable the corresponding constitution.
To capture the effect of historical constitutions and their amendments, in turn, we consider two types of weighted averages. The first one is a simple weighted average (hist_savg_amend), which takes the following form:
where $N$ is the total longevity of a country's constitution measured in years, $n_t$ is the longevity of the constitution at time $t$, and $m_t$ is the number of constitutional amendments at time $t$. Finally, we aggregate the results by summing them up.
The second one is the exponential weighted average (EWA), which takes the following form:
In addition to the terms of the simple weighted average, we assign a weight $w_t$ according to the year in which the constitution ceased to be in force. If the constitution ended before 1920, the weight $w_t$ is 0; it is 0.1 from 1921 to 1930, 0.2 from 1931 to 1940, etc., with $w_t$ increasing uniformly until it reaches 1. Our presumption is that the impact of historical constitutions gradually diminishes the further back in time they ended. Online Figure A7 presents distributions of the above-mentioned indicators.
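The decade-based weighting scheme can be illustrated with a short sketch. The function name is hypothetical, and assigning a weight of 0 to a constitution that ended in exactly 1920 is an assumption, as the text leaves that year unspecified.

```python
def decade_weight(end_year: int) -> float:
    """Weight w_t for a historical constitution, by the year it ceased to be in force.

    Assumption: a constitution ending in 1920 itself also gets weight 0.
    """
    if end_year <= 1920:
        return 0.0
    # 1921-1930 -> decade 1, 1931-1940 -> decade 2, ..., 2011 onwards -> decade 10.
    decade_index = (end_year - 1921) // 10 + 1
    return min(decade_index, 10) / 10
```

With steps of 0.1 per decade, the weight reaches 1 for constitutions that ended in 2011 or later.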
In our analysis, we also try to capture the fact that countries often borrow constitutional provisions from states that share similar cultures/traditions, and that legal origin is the dominant channel of constitutional diffusion (Goderis and Versteeg, 2013). To do so, we follow the methodology of La Porta et al. (2008). Out of 94 countries in our sample, we have 45 constitutions classified as of French origin, 28 of British origin, 16 of German origin and 5 of Scandinavian origin.
Another feature that we want to control for refers to the fact that, in the early 1990s, a group of countries in Eastern Europe and Western Asia declared their independence and rewrote their constitutions. To account for this, we introduce a dummy variable former_socialist to indicate whether the country has been under socialist rule. The basic statistics of the variables described above are presented in Table 1.
Source: Own study. Note that the word count has been scaled down by a thousand.
Table 2 shows the top five constitutions with the highest as well as the lowest comprehensibility reported by different readability tests. Croatia and Chile are regarded as the most difficult constitutions to comprehend by all five readability tests, followed by the USA (4), Australia (2) and France (2).
Source: Own calculation.
On the other hand, all five tests agree that the constitutions of Denmark, the Netherlands and Ghana are the easiest to comprehend, while four opt for Bhutan and two for Switzerland. To take into account potential differences between the different measures, we also introduce an aggregate index (agg_read) that takes the average of the five scores after normalization, using the formula (3) below. The results of the aggregate indicator are shown in the last column of Table 2.
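As an illustration of how an index of this kind could be built, the sketch below normalizes each readability metric across countries and then averages the normalized scores per country. Min-max normalization is an assumption made here for concreteness; the source's own formula may normalize differently. The sketch also assumes that all metrics have already been oriented in the same direction.

```python
def min_max(values):
    """Rescale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def aggregate_readability(scores_by_metric):
    """Average of normalized metric scores, country by country.

    scores_by_metric: dict mapping a metric name to a list of scores,
    one per country, with countries in the same order for every metric.
    """
    normalized = [min_max(vals) for vals in scores_by_metric.values()]
    n_metrics = len(normalized)
    return [sum(country_scores) / n_metrics for country_scores in zip(*normalized)]
```

Normalizing first prevents metrics with a wider numerical range from dominating the average.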
In order to better understand the potential interdependencies between the different features of constitutional text, we examine the relationship between the length of the constitution and the three aspects we propose looking at (online Figure A8). Interestingly, we find hardly any correlation between the length of the constitution and either our subjectivity measure or readability. This seems to be in contrast to the assumption typically made in the literature that longer constitutions are more detailed and provide less room for interpretation than shorter constitutional texts. Our findings seem to suggest that vague (or precise) wording can be found in both short and long constitutions.
We further illustrate this point with an example proposed by Williams (2017) and also elaborated by Young (2023). It focuses on two passages on freedom of expression, which are taken from Iceland's and Poland's constitutions, respectively. The description of freedom of expression in Iceland's constitution is both wordy and reliant on indeterminate language. In Poland's constitution, in turn, the protection of free speech is much more concise and leaves virtually no room for differences in interpretation, notwithstanding the fact that Poland's constitution is almost five times longer than Iceland's. The two passages in question are quoted below:
Iceland's constitution:
Art 73
‘Freedom of expression may only be restricted by law in the interests of public order or the security of the State, for the protection of health or morals, or for the protection of the rights or reputation of others, if such restrictions are deemed necessary and in agreement with democratic traditions.’
Poland's constitution:
Art 54
‘The freedom to express opinions, to acquire and to disseminate information shall be ensured to everyone.’
This example can also give a good indication of how our measure of subjectivity works. Given how different the two passages are (in terms of language use, precision of expression, etc.), we expect our subjectivity measure to take higher values for the passage from Iceland's constitution than for the passage from Poland's constitution. This is exactly what we observe. The relevant statistic for the passage from Iceland's constitution is 0.6417, whereas for the passage from Poland's constitution it is 0.
The only text feature that is somewhat (negatively) correlated with the length of the constitution is polarity. This finding seems to be in line with the proposition made by Tsebelis and Nardi (2014), who argue that longer constitutions are more restrictive. To get a better sense of our polarity measure, consider the following three consecutive passages from the Spanish constitution on environmental protection (Section 45).
Spain's constitution:
Section 45
1. ‘Everyone has the right to enjoy an environment suitable for the development of the person, as well as the duty to preserve it.’
2. ‘The public authorities shall watch over a rational use of all natural resources with a view to protecting and improving the quality of life and preserving and restoring the environment, by relying on an indispensable collective solidarity.’
3. ‘For those who break the provisions contained in the foregoing paragraph, criminal or, where applicable, administrative sanctions shall be imposed, under the terms established by the law, and they shall be obliged to repair the damage caused.’
Intuitively, while the first passage might be considered relatively positive, the second one is more neutral, and the last one is the most negative of the three considered here. Indeed, the third passage is the most restrictive in its wording. This is exactly what we capture with our polarity measure. The first passage receives a polarity score of 0.411, the second is already less positive (with a polarity score of 0.166), whereas the third, which is largely devoted to penalties for offenses committed, gets a polarity score of −0.400.
Empirical methodology
Econometric model: To verify the relationship between the structure of constitutional text and constitutional compliance, our econometric model takes the following form:
where const_compliance is the dependent variable measured by the degree to which a government respects the constitution in country i, polarity, subjectivity, readability and word_count are results obtained through text mining analysis of the constitutions, endurance_of_constitutions is a vector that contains all variables related to the longevity of constitutions, such as current average amendments, current year in force, historical average amendments, whereas legal_origin is a set of three dummy variables indicating constitutions of British, German and Scandinavian origin respectively. French origin acts as a reference group. former_socialist is a dummy variable that indicates whether the country i has been under a socialist regime. $\varepsilon _i$ is the error term that contains all latent variables.
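Read together with the variable definitions above, the specification can be written out roughly as follows. The coefficient labels ($\beta_0,\dots,\beta_4$, $\boldsymbol{\gamma}$, $\boldsymbol{\delta}$, $\theta$) are illustrative notation rather than the source's own, and the baseline models may enter the text-feature variables one at a time rather than jointly.

```latex
\begin{equation*}
\begin{aligned}
\mathit{const\_compliance}_i ={}& \beta_0 + \beta_1\,\mathit{polarity}_i + \beta_2\,\mathit{subjectivity}_i
  + \beta_3\,\mathit{readability}_i + \beta_4\,\mathit{word\_count}_i \\
&+ \boldsymbol{\gamma}'\,\mathit{endurance\_of\_constitutions}_i
  + \boldsymbol{\delta}'\,\mathit{legal\_origin}_i
  + \theta\,\mathit{former\_socialist}_i + \varepsilon_i
\end{aligned}
\end{equation*}
```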
Machine learning models: To complement the regression exercise, we use two tree-based machine learning models to reveal the potential influences of the independent variables on the dependent variable that a linear model is unable to capture. To this end, we intentionally overfit the machine learning models by training on the entire dataset instead of dividing it into a training set and a test set, the latter of which is normally used to assess generalization (Géron, 2019).
A regression decision tree progressively divides the dataset into smaller subsets based on a set of if-then-else rules, while minimizing the mean squared error, until the stopping criterion is satisfied, e.g. maximum depth is reached (Breiman et al., 2017). A random forest is a collection of decision trees, each trained on a random sample of the data drawn via bootstrap or other sampling methods. Despite its simplicity, a random forest can often achieve strong results, particularly when the dataset has high dimensionality (Géron, 2019). However, a random forest is considered a ‘black-box’ algorithm that has weak interpretability.
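The splitting step of a regression tree can be sketched for a single feature as follows. This is a simplified illustration of the principle (choosing the threshold that minimizes the weighted mean squared error of the two resulting subsets), not the implementation used in the analysis; the function names are hypothetical.

```python
def mse(values):
    """Mean squared error of values around their own mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(x, y):
    """Return (threshold, weighted_mse) for the best rule 'x <= threshold'.

    Returns (None, mse(y)) if no split improves on leaving the data undivided.
    """
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    best = (None, mse(list(y)))
    for i in range(1, len(pairs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate equal feature values
        threshold = (xs[i] + xs[i - 1]) / 2
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        weighted = (len(left) * mse(left) + len(right) * mse(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best
```

A full tree repeats this search over all features and then recursively within each resulting subset until the stopping criterion is met.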
For this reason, we use feature importance and Shapley (SHAP) values for model interpretation and comparison. Feature importance measures the relative influence of each independent variable in the dataset by removing the corresponding variable and looking at the increase in model errors. SHAP, in turn, is an additive explanatory model built on the Shapley value, which is able to measure and quantify the contribution of each observation and variable to the model output (Molnar, 2022).
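The Shapley value underlying SHAP can be illustrated with an exact computation on a toy example. The sketch below enumerates all feature subsets, which is feasible only for a handful of features; practical SHAP implementations rely on faster model-specific or approximate algorithms. The value function and its numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values for a set function value(frozenset_of_features) -> float."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            for subset in combinations(others, k):
                s = frozenset(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result

def toy_value(subset):
    """Hypothetical model output when only the features in `subset` are known."""
    out = 0.0
    if "word_count" in subset:
        out += 2.0
    if "polarity" in subset:
        out += 1.0
    if "polarity" in subset and "subjectivity" in subset:
        out += 1.0  # interaction term, shared equally by the two features
    return out
```

In this toy example the interaction term is split evenly between polarity and subjectivity, and the individual contributions sum to the difference between the output with all features and the output with none.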
Challenges in modelling
In addition to the caveats stated before, it is worth mentioning that a few challenges in this paper could lead to inaccuracies in estimation. First, we limit the source of our text data to one: the Constitute Project. Hence, our results rely heavily on the availability of the data. One obstacle is that not all revisions of constitutions are captured. Nevertheless, we assume that the amended texts do not alter the constitution's sentiment and readability by a significant amount. Second, each constitutional text has been translated into English. Although this allows for a fair comparison and analysis, it remains unclear whether any information is lost in translation. Third, constitutions in our sample were written at different points in time. One cannot exclude, therefore, that some changes in language might have occurred over the years and that some words are perceived differently than they used to be. Fourth, constitutions are specific documents written in technical language, which is likely to differ from the language on which the sentiment-analysis model was trained, as the algorithms which we use for analysing the subjectivity or polarity of constitutional texts were originally trained on non-legal texts.
The other issue that needs to be kept in mind is that the endogeneity problem (Voigt, 2013) cannot be fully eliminated. In the absence of a valid instrumental variable, some researchers tackle this issue by including a time lag (Ben-Bassat and Dahan, 2008; Goderis and Versteeg, 2013), and some use matching (Chilton and Versteeg, 2016); neither approach is feasible in our paper due to a lack of data.
Empirical results
Ordinary least squares
The results of the baseline econometric models are presented in Table 3. Each column uses a different variable capturing the structure of the constitution as the key variable of interest.
Source: Own calculation. French legal origin is the base category. Note that * p < 0.1; ** p < 0.05; ***p < 0.01.
As shown, our measure of polarity seems, first, to be negatively correlated with constitutional compliance, implying that constitutions with more negatively worded phrases tend to be complied with more. Second, as regards our subjectivity measure, our estimates do not suggest any statistically significant association with constitutional compliance. Third, a negative link can be identified between our readability scores and constitutional compliance, which suggests that comprehensibility is positively related to constitutional compliance. In addition, we consistently find that in countries with shorter constitutions the level of compliance with the constitution is relatively high. Furthermore, countries with constitutions of Scandinavian legal origin score higher on compliance than countries with constitutions of French origin (which serve as the reference group). Finally, with regard to the age of the constitution, we observe that countries with more stable constitutions (a higher ratio of age to amendments) show lower levels of compliance. According to the F statistics, all baseline models are significant and have passed model diagnostic tests. However, the explanatory power of these models remains moderate at most, with $R^2$ values ranging from 0.23 to 0.3.
In the next step, we check what happens to the previous results on our key covariates of interest once we include them all in one model. The relevant results are reported in Table 4.
Source: Own calculation. French legal origin is the base category. Note that * p < 0.1; ** p < 0.05; *** p < 0.01.
In columns (1) and (2), we get qualitatively the same results as before. Most importantly, polarity is negatively correlated with constitutional compliance (see footnote 11). The same holds for the length of the constitution and its stability. On the other hand, no significant impact of the subjectivity measure can be observed. In columns (3) and (4), however, in which we use Dale-Chall or the aggregated readability measure, the effect of polarity vanishes.
Machine learning models
In Figure 1, we present the feature importance reported by the decision tree model (on the left, with R² = 100%) as well as the random forest model (on the right, with R² = 87.0%). We see that word_count is ranked first in both models. Interestingly, variables concerning the structure of the constitutions, including subjectivity, are considered more influential than those capturing the impact of constitutional amendments or legal origin. In addition, polarity, which is significant in the econometric model, counts among the important variables in the machine learning models, but by this measure it is less influential than subjectivity or readability.
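How such R² values and importance rankings arise can be sketched on synthetic data. The example below uses scikit-learn's `DecisionTreeRegressor` and `RandomForestRegressor` with their impurity-based `feature_importances_`; the data, the feature names, and the choice of library are assumptions, since the text does not state which implementation or evaluation sample was used. It also shows that a fully grown tree typically reaches an in-sample R² of 1 by construction, so, if the reported figures are in-sample (an assumption), they describe fit rather than predictive performance.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 94  # sample size from the paper; the data are synthetic

# Hypothetical features (NOT the authors' data).
feature_names = ["polarity", "subjectivity", "readability", "word_count"]
X = np.column_stack([
    rng.uniform(0.0, 0.2, n),
    rng.uniform(0.3, 0.5, n),
    rng.normal(12.0, 2.0, n),
    rng.lognormal(9.5, 0.8, n),
])
# Assumed data-generating process with noise.
y = 2.0 - 5.0 * X[:, 0] - 0.4 * np.log(X[:, 3]) + rng.normal(0.0, 0.3, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# An unpruned tree keeps splitting until it reproduces the training data.
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

tree_r2_train = tree.score(X_train, y_train)
tree_r2_test = tree.score(X_test, y_test)
forest_r2_train = forest.score(X_train, y_train)
forest_r2_test = forest.score(X_test, y_test)

print(f"tree   R^2 train={tree_r2_train:.3f} test={tree_r2_test:.3f}")
print(f"forest R^2 train={forest_r2_train:.3f} test={forest_r2_test:.3f}")

# Impurity-based importances, normalised to sum to one.
for name, imp_tree, imp_forest in zip(
    feature_names, tree.feature_importances_, forest.feature_importances_
):
    print(f"{name:>12s}: tree={imp_tree:.3f} forest={imp_forest:.3f}")
```

Comparing the training and held-out scores is one simple way to judge how much of a high R² reflects memorisation of the sample.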
As can be seen from the partial dependence plot (Figure 2), there is a clear nonlinear effect of polarity on constitutional compliance. In the case of the decision tree model, most of the observations have little impact in terms of the overall Shapley values, with the exception of some observations for polarity ranging between 0.08 and 0.12. For our random forest model, in turn, a large share of observations positively influencing the dependent variable is clustered around a polarity of 0.08–0.14. However, there are also quite dispersed observations for polarity between 0.02 and 0.08, which seem to have a negative effect on constitutional compliance.
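The idea of a partial dependence curve can be made concrete with a short sketch: the feature of interest is set to each grid value for every observation, and the model's predictions are averaged. The piecewise-constant `toy_tree` function, its thresholds, and the four observations below are hypothetical stand-ins for a fitted tree and real data.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_tree(row):
    """Hypothetical piecewise-constant model standing in for a fitted tree."""
    polarity, log_words = row
    if polarity < 0.08:
        effect = -0.1
    elif polarity < 0.14:
        effect = 0.3
    else:
        effect = 0.0
    return 0.5 + effect - 0.02 * log_words


# Hypothetical observations: (polarity, log word count).
rows = [(0.05, 9.0), (0.10, 9.5), (0.12, 10.0), (0.16, 10.5)]
grid = [0.04, 0.10, 0.16]
pd_curve = partial_dependence(toy_tree, rows, feature_index=0, grid=grid)

for value, avg in zip(grid, pd_curve):
    print(f"polarity={value:.2f} -> average prediction {avg:.3f}")
```

Because the toy model jumps between regimes, the resulting curve rises and then falls, which is the kind of nonlinearity a linear regression coefficient cannot represent.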
When we look at Shapley values, as depicted in Figure 3 (decision tree model), the effect of the polarity variable on the model output is larger than that of the variables capturing legal origin (see the ranking of features along the y-axis). In addition, the impact is moderately negative, as shown by the cluster of red dots to the left of zero (x-axis). Compared to polarity, the distribution of subjectivity is slightly more dispersed. Its impact on constitutional compliance, however, also appears to be negative, as suggested by the red dots clustered to the left of zero. Similar conclusions can be drawn for the length of the constitution, which, according to this approach, seems to be a crucial contributor to the model. The age of the current constitution behaves similarly, but with a narrower range of impact, and seems to play a more important role than constitutional amendments. It is worth mentioning that legal origins do not have a clear effect on the target variable. Finally, having experienced socialist dictatorship in the past (former_socialist) does not provide a meaningful contribution.
When we look at the SHAP summary plot for a random forest model (Figure 4), we find that the observations for most of the variables tend to be more clustered around x = 0. Polarity, subjectivity, readability, and the length of constitutional text remain the most influential features in the random forest model. As such, these results seem to be very much in line with what we observed previously.
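To clarify what a single dot in such a summary plot represents, the Shapley value of one feature for one observation can be computed by brute force from its definition: the weighted average of the feature's marginal contribution over all subsets of the other features. The sketch below is only illustrative. The `toy_model` function, the observation, and the baseline are hypothetical, and SHAP implementations for tree models rely on faster algorithms than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline point."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` take the observation's values; the rest stay at baseline.
        mixed = [x[j] if j in subset else baseline[j] for j in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(row):
    """Hypothetical model: additive in polarity, with an interaction term."""
    polarity, subjectivity, log_words = row
    return 1.0 - 4.0 * polarity - 0.5 * subjectivity * log_words


x = [0.12, 0.40, 10.0]          # hypothetical observation
baseline = [0.08, 0.35, 9.5]    # hypothetical reference point (e.g. sample means)
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["polarity", "subjectivity", "log_words"], phi):
    print(f"{name:>12s}: {contribution:+.4f}")
```

The contributions add up to the difference between the prediction for the observation and the prediction at the baseline, and a negative value (a dot to the left of zero) means that the feature pushes the prediction for that observation downwards.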
Discussion and conclusions
The main aim of this paper was to analyse the relationship between the structure of constitutional text and constitutional compliance by the top representatives of the various government branches. The former was approximated by the following measures: text polarity, subjectivity, readability, and length. The empirical analysis was based on a sample of 94 democratic countries and their constitutions, which were valid until 2019.
Our econometric regressions provide some evidence that shorter constitutions seem to be more complied with. This finding seems to fit a popular premise that constitutions should be relatively concise. We thereby contribute to the discussion of the idea that the length of a constitution can be linked with constitutional entrenchment, which is regarded as relevant for constitutional compliance (Dixon and Landau, Reference Dixon and Landau2018; Versteeg and Zackin, Reference Versteeg and Zackin2016; Voigt, Reference Voigt2009, Reference Voigt2021). We also provide some evidence that more comprehensible constitutions may be more complied with. This finding, however, depends on which readability measure we use.
In addition, we show that countries with more negatively worded expressions (prohibitions) in their constitutions seem to show higher levels of constitutional compliance than countries with constitutions with more positive wording (guarantees or freedoms). This finding fits the deterrence model of legal compliance, relying mostly on restrictions or sanctions, the spectre of which prompts compliance with the law (Kahan, Reference Kahan1999; Kahneman, Reference Kahneman2011; Tyler, Reference Tyler2006). However, further work is needed to better explore this issue. Investigating potential interactions between text polarity and readability of the constitution would be one option. Taking into account that the positive or negative sentiment discussed here can be linked to various types of constitutional provisions would be another.
At the same time, the econometric models do not indicate that there is a statistically significant relationship between constitutional compliance and our subjectivity measure. This could be due to the fact that politicians cannot use interpretational ambiguities to their advantage and/or that, notwithstanding the vagueness of constitutional language, courts are able to provide a consistent and clear interpretation of the law. Future studies could check whether this explanation is correct.
Moreover, our results show some support for the claim that countries with a high ratio of age of constitution to the number of amendments are more likely to be characterized by low constitutional compliance. This partially endorses Madison and Voigt's (Reference Voigt2020, Reference Voigt2021) argument that a constitution that is frequently updated is more likely to be complied with. Please note, though, that we talk here only about correlation and not about causality. One interpretation could posit that the lack of changes in constitutional texts leads to lower compliance. An alternative, however, could posit that amendments are a response to low compliance.
Last but not least, Scandinavian legal origins appear to be positively associated with constitutional compliance. Other legal origins or socialist legacy do not appear to be of importance. Thus, our results do not provide strong support for the argument that legal origins, which are considered a channel of constitutional diffusion (Goderis and Versteeg, Reference Goderis and Versteeg2013), are related to constitutional compliance.
This picture is largely supported by the output emerging from machine learning models, which, instead of presenting the statistical relevance of particular factors in a ceteris paribus way, show the contribution of variables or observations of interest to the output when taken together. Both decision tree and random forest models suggest that the structure of constitutions can be considered potentially important for constitutional compliance. In this context, it might be noted that all our variables of interest (including subjectivity) are assigned a greater impact on the models' output than variables capturing legal origins and a dummy distinguishing former socialist countries. This is important, as the latter two variables are often considered important in determining constitutional compliance. Interestingly, simple word count as an approximation of the length of constitutions has an outstanding explanatory effect on constitutional compliance. That said, the results of machine learning models have to be taken with caution due to their limited interpretability.
These findings obviously have their limitations and suggest potential avenues for future research. One of them includes developing a language model that would be more fine-tuned for legal vocabulary. The other potential avenue involves research on a larger dataset that could draw on the future advancement of machine translation. Additional measures other than sentiment analysis could also be used when new text mining techniques emerge.
Supplementary Material
The supplementary material for this article can be found at https://doi.org/10.1017/S1744137424000201.
Acknowledgements
This research is part of a Beethoven project funded by the Polish National Science Centre (NCN, UMO-2016/23/G/HS4/04371) and the Deutsche Forschungsgemeinschaft (DFG, #381589259). The authors thank Niclas Berggren, Christian Bjørnskov, Jerg Gutmann, Bernd Hayo, Jarosław Kantorowicz, Anna Lewczuk, Katarzyna Metelska-Szaniawska, Stefan Voigt, as well as the participants of the 2022 North America Summer Meeting of the Econometric Society, the 2022 Scottish Economic Society Conference and anonymous reviewers for valuable comments. The views, thoughts, and opinions expressed in the text belong solely to the authors, and not necessarily to the authors' employer, organization, committee or other group or individual. Jacek Lewkowicz gratefully acknowledges the support of the Foundation for Polish Science (FNP).
Conflict(s) of Interest
None.