I’m currently getting into Data Science, and as a way to practice I’m exploring data from previous Portuguese Local Elections.

One interesting thing I found is that municipalities with more enrolled voters also have a higher percentage of people not voting. There seems to be some kind of inverse exponential (that is, roughly logarithmic) relationship. Each data point is a municipality.

Pearson Correlation Coefficient: 0.453047145972

p-value: 5.4002095121e-17

This is clearer if I use the natural logarithm of the number of voters:

Pearson Correlation Coefficient: 0.69994788675

p-value: 1.20211071448e-46
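
For anyone who wants to try this kind of comparison, here is a minimal sketch of computing the Pearson coefficient on the raw and on the log-transformed number of voters. It is not the code behind the numbers above: the data is synthetic and made up for illustration, and the names `pearson_r`, `voters` and `abstention` are hypothetical. It uses only the standard library; `scipy.stats.pearsonr` returns the coefficient together with a p-value if you have SciPy installed.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in data (NOT the real election data): abstention is
# generated as a linear function of log(voters) plus Gaussian noise.
rng = random.Random(0)
log_voters = [rng.uniform(7, 13) for _ in range(300)]
voters = [math.exp(lv) for lv in log_voters]
abstention = [20 + 4 * lv + rng.gauss(0, 3) for lv in log_voters]

# Correlation with the raw number of voters.
r_raw = pearson_r(voters, abstention)

# Correlation after taking the natural logarithm of the number of voters.
r_log = pearson_r([math.log(v) for v in voters], abstention)

print("Pearson r (raw voters):", r_raw)
print("Pearson r (log voters):", r_log)
```

Because the synthetic abstention here is built to be linear in the logarithm of the number of voters, the log transform gives the higher coefficient, which is the same pattern as in the two results above.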

The results are similar when using data from the 2009 and 2005 elections.

Pearson Correlation Coefficient: 0.579329081296

p-value: 5.27916030293e-29

Pearson Correlation Coefficient: 0.579329081296

p-value: 5.27916030293e-29

Funny. Maybe in smaller places people feel their votes make a bigger difference? Does the relationship hold in other elections beyond local elections?

Correlation Coefficient: 0.209724326747

p-value: 0.000209736784349

Correlation Coefficient: 0.113888507278

p-value: 0.0458131679479

Well, the correlation is weaker and less significant, but it still exists. Has anyone noticed this before? Or am I doing something wrong?

You can find data and code here.