The Value of Outliers

Tue, 03 Oct 2017 02:49:30 -0700

Tags: academic, philosophical


Moving coast-to-coast has taken most of my energy since the last post, but we're finally established in Vancouver, so I can come back to blogging more regularly. This post takes a piece of advice from the classic book How to Lie with Statistics and extends it to groups and society in general. It continues ideas from two other blog posts (What Random Forests Tell Us About Democracy and Have Humans Evolved to be Inaccurate Decision Makers?) on statistics, decision making and politics.

Let's sample 10 (pseudo)random numbers from a normal distribution centered around 100, with a sigma of 10:

       >>> import random
       >>> [int(random.normalvariate(100, 10)) for _ in range(10)]
       [85, 99, 97, 78, 87, 93, 91, 112, 90, 91]

Now, if you look at these numbers, you'll be tempted to conclude that 112 is just... wrong. A measurement error. That it does not belong there. However, the average of the 10 numbers is 92.3, still far from the real mean (100) but within the sigma we used to generate the sample (10). If we were to drop 112, the average of the remaining numbers would go down to 90.1, making it a worse estimate than before.
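The arithmetic is easy to check. Here is a minimal sketch that reuses the ten numbers printed above; the variable names (sample, trimmed, TRUE_MEAN) are only illustrative, and it says something about this one sample, not about outliers in general.

```python
from statistics import mean

# The ten values from the sample shown earlier in the post.
sample = [85, 99, 97, 78, 87, 93, 91, 112, 90, 91]
TRUE_MEAN = 100  # centre of the distribution the sample was drawn from

# The same sample with the apparent outlier (112) removed.
trimmed = [x for x in sample if x != 112]

mean_all = mean(sample)
mean_trimmed = mean(trimmed)

# Print each average next to its distance from the real mean.
print(round(mean_all, 1), round(abs(TRUE_MEAN - mean_all), 1))          # 92.3 7.7
print(round(mean_trimmed, 1), round(abs(TRUE_MEAN - mean_trimmed), 1))  # 90.1 9.9
```

For this sample, dropping 112 moves the average from 7.7 away from the real mean to 9.9 away.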

I believe the same happens in the realm of ideas. If each person has a piece of the truth, shutting down their contributions, irrespective of how far from the truth they might sound, will lead you farther away from the truth. A related concept in the business world is Groupthink. This of course does not mean that outliers need to dominate, just that they should not be eliminated completely.

And if you haven't read the 1954 book by Darrell Huff, it is very short and makes for a great read. Its starting premise, "democracy needs voters informed on basic statistical matters", is as up-to-date in our data-driven world as ever.

