Sunday, February 14, 2010

The Spirit Level is junk science part deux (updated)

Obs: Don't forget to read my updated post about the Spirit Level, where the author responds to me.

Obs II: He responded again, and it is as weak as the last response. Go to the bottom of the post.

Obs III: Their claim about innovation cannot be replicated either (scroll down).

The “Evidence” presented in the Spirit Level falls apart very easily if you poke even a little at it.

I first redid the main argument, the relationship between life expectancy and inequality, now using OECD data (instead of UN data). The good thing about the OECD data is that it measures inequality after taxes and government subsidies are taken into account.

Again there is no statistically significant relationship between inequality among 28 OECD countries and life expectancy. The p value is 0.78. As a rule of thumb you need a p value no higher than 0.1, preferably 0.05, to be able to argue that something is statistically significant. The core argument of the book is not significant with the standard measure of inequality. This is a joke.
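For readers who want to check a claim like this on their own data, here is a minimal sketch of one way to get a p value for a two-variable relationship. It is not the calculation behind the numbers above: it uses a permutation test rather than the usual regression t-test, and the data are made up (28 invented "countries" whose life expectancy is generated independently of Gini). The function names are only illustrative.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(xs, ys, n_perm=5000, seed=0):
    """Two-sided p value: share of random shuffles of ys whose |r| is at
    least as large as the observed |r| (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up illustration data, NOT the OECD figures.
rng = random.Random(42)
gini = [rng.uniform(0.23, 0.40) for _ in range(28)]
life_exp = [rng.gauss(79.0, 1.5) for _ in range(28)]

r = pearson_r(gini, life_exp)
p = permutation_p_value(gini, life_exp)
print(f"r = {r:.3f}, permutation p value = {p:.3f}")
```

With a single explanatory variable, asking whether the regression slope differs from zero is the same question as asking whether the correlation differs from zero, which is why the sketch works with r directly.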

The country with the highest life expectancy, Japan, has the 9th highest level of inequality measured by Gini after taxes and subsidies. The Spirit Level gets around this “problem” by using a very selective measure of inequality, the ratio of two groups of people, instead of the standard measure in social sciences, Gini, which looks at every member of society, including the poor. Furthermore, they use an "index" instead of the straightforward measure: how long people live.

As every trained economist knows, correlation is not causation. The relationship between life expectancy and income equality is problematic, because of reverse causality, and because third factors (basically all social problems) cause income inequality AND lower life expectancy.

One way to at least mitigate the problem is to look not at levels, but at change. If the theory of the book is correct, as countries become more unequal life expectancy should fall, relative to countries that are becoming more equal.

Here is the change in Gini for 19 OECD countries where there was data between what the OECD defines as “mid 1980s” and “mid 2000s”. As you see the relationship is again not statistically significant (p 0.35), and again the opposite of what the Spirit Level claims.
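The idea of looking at changes instead of levels can be sketched in a few lines. If life expectancy in a country is a fixed country effect plus some effect of Gini, then subtracting the early period from the late period cancels the fixed country effect, and what remains is the relationship between the change in Gini and the change in life expectancy. The sketch below uses five invented countries and invented numbers (not the OECD figures); `ols_slope` and `panel` are illustrative names.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical panel: (gini_1985, gini_2005, life_exp_1985, life_exp_2005).
# Values are invented for illustration only.
panel = {
    "A": (0.25, 0.27, 75.0, 79.5),
    "B": (0.30, 0.30, 74.0, 78.0),
    "C": (0.34, 0.38, 74.5, 78.8),
    "D": (0.28, 0.26, 76.0, 79.9),
    "E": (0.22, 0.25, 76.5, 80.6),
}

# First differences: change in Gini and change in life expectancy per country.
d_gini = [g1 - g0 for g0, g1, _, _ in panel.values()]
d_life = [l1 - l0 for _, _, l0, l1 in panel.values()]

slope = ols_slope(d_gini, d_life)
print(f"change in life expectancy per unit change in Gini: {slope:.1f}")
```

Under the book's theory the slope should be clearly negative. Differencing only removes factors that are constant within a country over time, so it mitigates the problem rather than solving it.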

The countries whose income distribution became more unequal had faster growth in life expectancy!

Since we have higher standards than the authors of the Spirit Level, let us not pretend this means anything, dreaming up ad-hoc stories how more equality kills people. Probably just a coincidence or an artifact of the complex relationship between inequality and other factors that influence life expectancy (such as GDP growth, human capital accumulation).

Microeconomic studies can mitigate (but not solve) the causality problem, by controlling for variables. I did a quick search of the literature. It seems that microeconomic studies, including a thorough study in Sweden and one in Norway that tried to reduce the endogeneity problem, do not find a relationship between inequality and mortality. Same result in Finland and the U.K.

The Social Democrats in Sweden need to learn about requirements in empirical social sciences, such as statistical significance and robustness. Most importantly, they need to learn about causality. If the Social Democrats, the postmodernists and the Social liberals had not destroyed the school system, basic foundations of logic would be taught at least in high school.

Let me give another example of Social Democratic inability to understand causality: crime and punishment. The Swedish left, often including leftist libertarians, believe they have “proven” that long prison sentences do not reduce crime, because countries with longer sentences have more crime. In Swedish leftist logic, this means that long punishments CAUSE high crime! Just look at the U.S.!

But punishments are a costly countermeasure to crime: you only increase punishments when crime becomes a problem. Countries with high underlying crime are forced to respond by increasing punishments. If these countries had less strict punishment they would have even higher crime.

The analogy that I use is head-ache pills and head-aches. We don’t go around claiming that aspirin causes head-pain, just because people with head-ache are more likely to take aspirin. Why would you accept that the positive correlation between crime (the illness) and punishment (the medicine) implies that the medicine caused the illness?

Many leftists are smart, but they have not been taught how to think in a stringent way (worse, they have been ideologically indoctrinated against objectivity and reason).

The first thing someone should have asked in the seminar presenting The Spirit Level is “how did you get around the endogeneity problem?”. Instead the left turned it into a religious meeting, embracing the claims made in the book, even though they are not supported by real evidence.

What you need to get around complex relationships such as prison length and crime, or inequality and life expectancy, is some form of exogenous treatment. You need controlled experiments, quasi-experiments, or good so called instruments.

In crime versus punishment, scientists worked hard at finding this type of experiment, such as the recent prison reform in Italy which almost randomly gave different prisoners different punishments for the same crime. This experiment showed us that punishment does reduce crime. One such study with a clear identification is worth 1000 correlations, when endogeneity is a problem. Social Democrats would understand this if they understood causality.
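The aspirin analogy and the value of exogenous treatment can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: pain is measured on a 0-10 scale, aspirin lowers pain by exactly 2 points for everyone, and in the "observational" world only people with pain above 6 take the pill. The naive comparison then makes the medicine look harmful, while random assignment recovers the true effect.

```python
import random

def mean(values):
    return sum(values) / len(values)

def simulate(n, randomized, seed=1):
    """Return (pain of aspirin takers, pain of non-takers) one hour later.

    Assumed toy model: aspirin lowers pain by 2 points for everyone.
    """
    rng = random.Random(seed)
    takers, non_takers = [], []
    for _ in range(n):
        baseline = rng.uniform(0, 10)          # pain before any pill
        if randomized:
            takes_pill = rng.random() < 0.5    # coin flip, unrelated to pain
        else:
            takes_pill = baseline > 6          # only people in pain take it
        later = baseline - (2 if takes_pill else 0) + rng.gauss(0, 0.5)
        (takers if takes_pill else non_takers).append(later)
    return takers, non_takers

obs_takers, obs_non = simulate(10_000, randomized=False)
rct_takers, rct_non = simulate(10_000, randomized=True)

# Gap = average pain of takers minus average pain of non-takers.
obs_gap = mean(obs_takers) - mean(obs_non)
rct_gap = mean(rct_takers) - mean(rct_non)
print(f"observational gap: {obs_gap:+.2f}, randomized gap: {rct_gap:+.2f}")
```

In the observational world the takers still hurt more than the non-takers, because they started out with more pain. Only when the pill is assigned independently of baseline pain does the comparison show the medicine working.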

I don’t want to single out Social Democrats either. The Swedish Social Liberals (Folkparti) are almost as bad, as is evident from the editorial page of DN.

More on this:

In order to put the nail in the coffin of these charlatans, I further investigate the claim by Spirit Level that inequality causes bad health.

For 28 OECD countries, I measure 11 health outcomes from OECD Health Data 2009, and compare them with inequality as measured by post taxes and transfers Gini coefficient defined as “mid 2000s” by the OECD.

All health measures are the average for the years 1998-2007, in order to correspond with mid 2000s. I include more years in order to be generous to The Spirit Level: if I only look at 2005 their claims do even worse (for example, infant mortality is no longer statistically significant). The reason is that for some of these measures some countries have missing values in many years.

The 11 health measures are:

1. Life expectancy at Birth
2. Infant mortality, deaths per 1,000 live births
3. Suicides, deaths per 100,000 population
4. Cerebro-vascular diseases, deaths per 100,000 population
5. Cancer, deaths per 100,000 population
6. Diseases of the respiratory system, deaths per 100,000 population
7. Diabetes, deaths per 100,000 population
8. Tobacco consumption, % of population
9. Alcohol consumption, Litres per capita
10. Obesity, percentage of population
11. Overweight, percentage of population

Out of 11 health measures, Inequality only had a statistically significant relationship at the generous 10% level with two variables: Infant mortality (p 3.2%) and obesity (p 6.1%).

(Obesity is driven entirely by one fat and unequal country, the U.S. Without the U.S the p-value is 34.4%.) Infant mortality is the one variable I would definitely give them. Here are these two graphs.
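Checking whether a result hangs on a single country is easy to do: drop each country in turn and recompute the statistic. The sketch below does this for the correlation coefficient, using seven invented data points (Gini, obesity share) in which one country, labelled "Outlier", is extreme on both. These are not the OECD numbers, and the names are only illustrative.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data; "Outlier" plays the role of a single extreme country.
data = {
    "A": (0.24, 11.0),
    "B": (0.26, 14.0),
    "C": (0.28, 10.0),
    "D": (0.30, 15.0),
    "E": (0.32, 12.0),
    "F": (0.34, 13.0),
    "Outlier": (0.38, 32.0),
}

def leave_one_out(data):
    """Correlation recomputed with each country dropped in turn."""
    results = {}
    for dropped in data:
        kept = [v for name, v in data.items() if name != dropped]
        xs = [g for g, _ in kept]
        ys = [o for _, o in kept]
        results[dropped] = pearson_r(xs, ys)
    return results

full_r = pearson_r([g for g, _ in data.values()], [o for _, o in data.values()])
loo = leave_one_out(data)
print(f"all countries: r = {full_r:.2f}")
for name, r in loo.items():
    print(f"without {name}: r = {r:.2f}")
```

If the correlation collapses when one particular country is removed, the relationship is being driven by that country rather than by a pattern across the whole sample.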

Again, life expectancy at birth has no statistically significant relationship with inequality (p value 74.1%). I have already included the graph for 2005, the 1998-2007 average is close to identical. Bear with me as I show you the graphs for the other 8 variables.

Remember the nice, convincing graphs The Spirit Level presents to its readers? You will not find them here, when reproducing some of the claims.

Suicides are negatively related to inequality, the more unequal the country, the fewer suicides. The relationship is not statistically significant however (p 10.9%)

Deaths in Cerebro-vascular diseases are not statistically related to inequality (p 31.7%).

Deaths from Cancer are not statistically related to inequality (p 60.1%). In fact countries with higher inequality have fewer deaths in cancer.

Deaths from Diseases of the respiratory system are not statistically related to inequality (p 19.4%).

Death from Diabetes is not statistically related to inequality (p 16.3%).

Smoking is not statistically related to inequality (p 76.2%). In fact people smoke slightly less in countries with more inequality.

Drinking is not statistically related to inequality (p 59.2%). In fact people drink slightly less in countries with more inequality.

The share of overweight people is not statistically related to inequality (p 55.5%). (remember, obesity was related, this is the share of overweight but not obese people).

2 out of 11 variables being statistically significant is not particularly impressive at all. 4 out of the 11 health measures have the opposite sign from what they predict, with unequal countries doing better than equal ones! Yet the Spirit Level has through selective presentation given the impression that inequality is very strongly related to health outcomes. They must have known how weak the underlying data was. I conclude that they are simply misleading their readers.
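To see why 2 significant results out of 11 is unimpressive at the 10% level, it helps to ask how often that would happen by pure chance. The short calculation below makes the simplifying assumption that the 11 tests are independent, which is not really true of health measures (they are related to each other), so treat the number as a rough benchmark only.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(at least k of n independent tests come out 'significant')
    when there is no true relationship in any of them."""
    return sum(comb(n, j) * alpha**j * (1 - alpha) ** (n - j) for j in range(k, n + 1))

n_tests, alpha = 11, 0.10
print(f"expected false positives: {n_tests * alpha:.1f}")                     # 1.1
print(f"P(at least 2 significant): {prob_at_least(2, n_tests, alpha):.3f}")  # 0.303
```

So even if inequality had no relationship with any of the 11 measures, one would expect about one "significant" result, and two or more would show up roughly 30% of the time under this assumption.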

One more, just for fun. The under-appreciated blogger Dan Nordling noticed an odd claim by The Spirit Level, that more than 25% of Americans and Brits were mentally ill, compared to only 10% of Germans and Italians. Such enormous disparities in illness between similar countries seem oddly high, quite frankly. I had never heard anywhere that Brits had two and a half times more mental illness than Germans.

I tracked down international comparisons of mental health from the WHO. The WHO measures age-standardized DALYs per 100,000 for what they refer to as "Neuropsychiatric conditions", which include:

1. Unipolar depressive disorders
2. Bipolar disorder
3. Schizophrenia
4. Epilepsy
5. Alcohol use disorders
6. Alzheimer and other dementias
7. Parkinson disease
8. Multiple sclerosis
9. Drug use disorders
10. Post-traumatic stress disorder
11. Obsessive-compulsive disorder
12. Panic disorder
13. Insomnia (primary)
14. Migraine

The WHO results are much more intuitive. The U.K. has 10% more mental illness than Germany, not two and a half times more. The U.S. has 25% more mental illness than Belgium, not more than twice as much (the U.S. has much more drug use). Does the Spirit Level stand up to closer scrutiny?

First, here is the graph from The Spirit Level, too convincing by half as they say.

Here is the graph linking Gini from the OECD to mental health problems.

As you notice, we have another variable that is not only not statistically significant (p value 28.6%), but that goes in the opposite direction of what The Spirit Level claims: according to WHO data more unequal countries have less mental illness!

Again, the lesson is: Don't trust anything written by these people. They believe in their story so much that they are willing to fudge the data if that is what is needed to convince the public.

Update II

It just gets better and better. Just one more for the heck of it.

Another fishy looking claim in "The Spirit Level" is that more equal countries are more innovative. Here is another one of those really convincing graphs:

Notice that the United States is one of the least innovative countries according to "The Spirit Level". Now, no matter how dogmatically leftist you are, it is hard to claim that the U.S., the most technologically advanced country in the world, winner of 60% of scientific Nobel prizes in the post war period, is one of the least innovative advanced nations on earth, no more innovative than Portugal.

So I went to the homepage of the World Intellectual Property Organization. They had themselves calculated a measure of patents adjusted for population, "Resident patent filings per million population (1995-2007)".

Here is the relationship between patents and inequality (the outliers are Japan and South Korea, I knew from before that Japan is much more patent intensive than others, although I don't know why).

The correlation is not statistically significant, and once again goes in the opposite direction of what the book claims (inequality is correlated with more patents).

Another of the book's graphs disappears when you try to independently replicate it. If I was not afraid of taxing the patience of my readers, I think I could do this for days, deconstructing this house of cards.

In the comments I suggested people simply refute Wilkinson by pointing out that there is no statistically significant relationship between life expectancy and Gini in the OECD.

I should say that this is the easy question. He does not have a response to it, so they hide it in their book. Everyone will understand this critique.

It is not the true core of the problem. A trained economist would ask him: How do you account for the endogeneity problem? This means: how does he establish causality when he has variables that are related in complex ways?

He claims Inequality causes bad health outcomes. But bad health can cause inequality. More importantly, third factors, basically all social problems, simultaneously lead to low health outcomes and to low income, causing what is referred to as a spurious correlation.

For example in Sweden a social problem is poor integration of immigrants. The unemployed immigrants in Rosengård have lower health outcomes, they smoke more, use more drugs, are often victims of crime. And they have lower income than other Swedes. In the data this looks like a correlation between inequality and bad health. But it is just a correlation, not causality.
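A spurious correlation of this kind is easy to reproduce in a simulation. In the toy model below, an unobserved "third factor" (standing in for social problems in general) drives both income and health, and income has no direct effect on health at all. All numbers are invented, and the variable names are only illustrative.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing its least-squares fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(7)
n = 5000
# Assumed toy model: the third factor moves both outcomes;
# income has NO direct effect on health in this simulation.
third_factor = [rng.gauss(0, 1) for _ in range(n)]
income = [z + rng.gauss(0, 1) for z in third_factor]
health = [z + rng.gauss(0, 1) for z in third_factor]

raw_r = pearson_r(income, health)
controlled_r = pearson_r(residuals(income, third_factor),
                         residuals(health, third_factor))
print(f"raw correlation: {raw_r:.2f}, "
      f"after controlling for the third factor: {controlled_r:.2f}")
```

The raw correlation between income and health is strong even though neither causes the other here, and it vanishes once the third factor is controlled for. In real data the third factor is often not observed, so this kind of control is not always available.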

It is not the fact that Swedes in Malmö and Vellinge are rich that is CAUSING Rosengård to be dysfunctional, as Wilkinson claims. If Vellinge had an economic crisis and became poor, this would have no effect on health in Rosengård (or probably a negative effect, since the hospitals would have less money). The causality is more complex.

People in corrupt southern Italy have lower health outcomes and lower economic outcomes than North Italians. The causal link is that the Mafia, lack of trust, and low education make south Italians poorer and give them lower health outcomes.

The Spirit Level thinking instead childishly interprets the complex relationship as follows: the fact that North Italians are rich makes South Italians unhealthy, because of the stress of knowing they are doing worse than North Italy.

Because of the problem of what econometricians call reverse causality and missing variables, the correlation studies used in the Spirit Level are not accepted as scientific evidence by trained economists.

A very bad sign for their hypothesis is that as soon as you put in some controls the relationship between inequality and health vanishes. The micro-studies with controls that I linked to find no relationship (and some factors are unobservable, so we cannot even control for them). Using levels for OECD countries we have no statistically significant relationship. Using levels for UN data we still have no statistically significant relationship, and even find the opposite of what the book claims. Using change we find no statistically significant relationship, and the opposite of what they claim.

This does not mean Wilkinson is wrong. It just means he has no evidence for his hypothesis. Wilkinson and people who think inequality causes lower health (for example through stress) need to find an exogenous experiment to verify their hypothesis. Until they have done that we cannot accept their claims as science. But not only have they not done that (to my knowledge), they are going ahead and selling their story as if they had evidence!

This is deeply unethical, because ordinary people trust academics. Ordinary people do not necessarily know about the endogeneity problem in empirics. They think that if a social scientist claims he has 800 studies, and puts up some correlations on a powerpoint, then the weight and status of science is behind him. Naïve DN readers trust “science”.

What the Spirit Level is doing is essentially fraud.


Wilkinson responds again:


"Sorry - far to busy to follow up this stuff. I know the right will be doing everything to get rid of our material but we don't have time for blogs etc now. I notice country names are not given and assume he has included countries which are not among the richest in the world and so should control for GNP before looking at inequality. Should also try looking at mortality among infants and working age populations."

My answer:

"* I do include country names. My first regression is EXACTLY the same 21 countries Wilkinson uses, the only difference is that I use Life Expectancy, instead of the "index" they have built. The relationship is not statistically significant.

* I have run a simple regression of life expectancy on Gini and Per capita income (all data from OECD for mid 2000s and 2005) for the 28 OECD countries:

p value for Gini: 0.783
p value for per capita GDP: 0.55

This means neither value is even close to being statistically significant (why is GDP not statistically significant? One reason is that the OECD countries are already a selected sample of high-income countries, so per capita income is a restricted variable).

Now Wilkinson's: 21 countries
p value for Gini: 0.451
p value for per capita GDP: 0.986

Not statistically significant. Not even close.

* I have already run infant mortality, and have written that infant mortality (and to a lesser extent obesity) are the only health variables out of (now) 13 investigated that are linked to inequality in a statistically significant way.

* I traced down adult mortality for people aged 15 to 60 from the WHO. It is not even close to being related in a statistically significant way to inequality. For the full 28 sample the p value is 0.63, for his selected sample of 21 countries the p value is 0.67.

Please don't waste more of my time with wild goose chases that just weaken your own story, and answer the direct question: Why is life expectancy not related in a statistically significant way to inequality, if inequality is a major killer?"

By the way, notice that he had time for a 750 word response as recently as yesterday. How convenient that he doesn't have time now...


  1. I might have the opportunity to go and see a talk (which might be recorded) by one of the authors of the Spirit Level in a few weeks time in Oxford. If you could come up with one question designed to be maximally destructive to the argument he is going to put forward, what would it be?

  2. Yes:
    How can he claim that inequality kills, when there is no statistically significant relationship between the standard measure of inequality (Gini) and the standard measure of health (Life expectancy)?

    Remember: This is true regardless if he uses OECD data or UN data.

    Why did he use an odd measure of inequality and some index instead of the standard measures? When the standard measures refute his story, why does he not mention this to his audience, as a scientist would?

    See my updated post.

  3. Dear Tino,

    For someone making such a fuzz about causality, you should consider taking thinking more seriously.

    Consider the following case: (1) There exists a set of policies that can reduce income inequality but that do not require any direct health measures (think redistributing taxes). (2) Across countries there are varying preferences for income equality.

    Given (1) and (2) a aggregate correlation between income inequality and health is unlikely unless more equal incomes causally effects health (via absolute or relative income effects).

    If you deny (1) and (2) then your aversion towards a causal interpretation is ok. If you maintain (1) and (2) you will have a harder time to argue against a causal interpretation.

    [Argument left out awaiting your comment]

    In this respect I would follow Kant, that causality is something that the cognitive subject necessarily adds to experience. It is not a property of das Ding an sich.

    Your comment to Hannes also demonstrates that your are really not interested in empirical results. Of course you can dismiss the BJM study based on large individual database put then you condemn yourself to a sectarian existence where your influence outside the sphere of your brethren will be very small.

    One observation is that you approve of correlation studies when they support your ideas (the case studies referred above) but not when they go against you (the BJM meta study). Paper mining, I would call that.

    Finally, if you would like to challenge Wilkinsons results on aggregate correlation between health and inequality, then you would really have to address his 2006 meta study, and preferably get it published.

    Moreover, the challenge Wilkinson makes in the Spirit level is for researchers to present an alternative indicator that correlates as well as his inequality measure with the different socio-economic indicators. This is, according to expert statisticians, a recommended approach to empirical data. Every individual outcome measure contains different types of error, But if you can demonstrate that a variable is correlated with a broad range of outcome measures, this makes it increasingly likely that your variable capture an important dimension in your data.

  4. Bo:

    1. Your claim that given income re-distribution, “a aggregate correlation between income inequality and health is unlikely” is just absurd. You know little about health research. Health is for example very strongly related to human capital, even when there is income redistribution. Health is very strongly connected to Social Capital, even when there is income redistribution. In health theory they often talk about the connection of health capital and the discount rate, which is economic terminology for self-control and how well you take care of yourself.

    Let me give you a simple example. In Sweden 60% of people on “Social Bidrag” smoke, compared to a much smaller fraction of those with college degrees. Do you think smoking affects health? Do you think it is a coincidence that the people who drop out of school, smoke, commit crime, have long periods of unemployment, also earn less than others?

    Now, which do you think is more likely: That it is factors about those people (such as home environment, immigrant status) that make them do badly in various measures, or that the fact that there are more rich people CAUSES them to smoke, not work, not study, out of stress?
    The second is the theory put forward by

    2. You have accepted the typical Swedish Social Democratic idea that everything is policy. That is why you think Swedes live longer because of Social Democratic rule, even though every single western nation lives much longer now than in 1932. Life is more complex than that. Demography, culture and history are reasons that regions with essentially the same policy (Småland and Norrland, southern Italy and Northern Italy, Minnesota and Alabama) have different outcomes.


    “Your comment to Hannes also demonstrates that your are really not interested in empirical results. Of course you can dismiss the BJM study based on large individual database put then you condemn yourself to a sectarian existence where your influence outside the sphere of your brethren will be very small."

    Your comment to this shows that you do not understand modern methodology in social sciences. I “condemn myself” to the ‘sect’ that consists of every single student in economics, public policy, or political science that has graduated from every university the last 20 years and has understood the undergraduate econometrics class: correlation is not causality when you have endogeneity/missing variables.

    The size of the database has no bearing whatsoever if there is endogeneity in the variables.

    Let me say this more simply. If I as a student put up Wilkinson's graph at a seminar in a top U.S. university or Stockholm School of Economics and claimed it proved causality, they would either laugh at me or become angry that I am wasting their time. If I defended it because the database was "large" they would fail me in a statistics class (and believe me, I am not a great student).

  5. 5.

    “One observation is that you approve of correlation studies when they support your ideas (the case studies referred above) but not when they go against you (the BJM meta study). Paper mining, I would call that.”
    No dear Bo, since I EXPLICITLY wrote that the correlation studies did “not solve” the causality problem, it only mitigated it. Are you illiterate?

    6. “if you would like to challenge Wilkinsons results on aggregate correlation between health and inequality, “

    I just did. There is no statistically significant correlation between life expectancy and inequality among Wilkinson's 21 or my 28 OECD countries. Do you have an answer to this?

    7. “then you would really have to address his 2006 meta study, and preferably get it published.”

    I don’t need to refute every single junk study that has been written in the world to be able to point out they are junk, especially since I told you over and over again that their methodology is flawed. In serious academia people are ignoring Wilkinson already, the only reason I am spending time on this is because he is selling his story to an audience that does not understand the flaws: the public.

    Second, why in god’s name would I trust Wilkinson's meta-study, when I just demonstrated to you and the rest of the world that he is a data-miner?

    8. “Moreover, the challenge Wilkinson makes in the Spirit level is for researchers to present an alternative indicator that correlates as well as his inequality measure with the different socio-economic indicators.”

    Bo: He is tricking you. His measures are not strongly tied to inequality at all. His most important relationship, between health and inequality, only seems to exist because he used an “index” instead of actual life expectancy (he also cherry-picked his sample).

    You really have to be ideologically desperate to want to believe him even though he is fooling you.


    “This is, according to expert statisticians, a recommended approach to empirical data. Every individual outcome measure contains different types of error, But if you can demonstrate that a variable is correlated with a broad range of outcome measures, this makes it increasingly likely that your variable capture an important dimension in your data.”

    No Bo, here you misunderstand systematic and unsystematic error. If there is systematic bias in form of endogeneity, it does not matter how big your sample is or how many variables it ties to.

    The variable you are talking about here, inequality, essentially captures one group doing worse than another in the same country. If you have a group that does bad in earning income, it is very likely they will do bad in health (all health measures are connected, because they all depend on how well you take care of yourself), in social measures (all social measures are connected, since they all depend on human capital, trust, social capital).

  6. I have updated my blog with a short response from Wilkinson.

  7. Tino, my congratulations to an interesting and convincing answer.

  8. @Tino: Check the comments field in my blog again.

  9. Wilkinson has sent me new Gini correlations:

    Think you will find it interesting!


  11. If you have a real case to refute Wilkinson then please publish in a peer-reviewed journal. Though some peer-reviewed material may be poor quality, the quality is far better than that on blogs.
    For your case: Gini is not the only measure of inequality that should be used.
    (See my post on your other page).
    If you can't get anything into a peer-reviewed journal then your case is almost certainly unfounded.
    Good Luck!
