The Squawk Point

Organisational Mechanics


There Are Lies, Damned Lies and Descriptive Statistics

1 February, 2018 by James Lawther 7 Comments

Anscombe’s quartet

Here are some interesting numbers for you. Interesting in a geeky sort of way…

[Table image: Anscombe's Quartet, four sets of X and Y readings]

They are 4 sets of readings for two variables, X and Y.

That is horribly algebraic. It brings back memories of a comprehensive school deep in the 1980s. All polyester blazers and fermenting gym kit. Let me bring the X’s and the Y’s a little more up to date:

  • X versus Y
  • Sales rate versus call handle time
  • Information collected versus first time fix
  • Quality versus cost

There are lots of X’s and Y’s in business. My time in Mr Gilpin’s maths class wasn’t wasted.

The question

Are these X’s and Y’s different or are they the same? How do they relate to each other?

In the world of big data and analytics (when did analytics become a noun?) the solution is easy to find: draw up some “descriptive statistics”. Summarise the data so that you can look at it.

Nothing to see here

I have saved your spreadsheet blushes and calculated the statistics for you. Here they are…

[Table image: descriptive statistics for Anscombe's Quartet]

What do they say?

  • The mean of X and the mean of Y — all four are the same.
  • The standard deviation of X and the standard deviation of Y — all four are the same.
  • The correlation between X and Y — all four are the same.
  • The regression equation for X and Y — all four are the same.
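If you would rather check the sums than take my word for it, here is a minimal sketch in Python. The numbers are the values usually published for Anscombe's quartet (the table in this post is an image, so they are typed in here as an assumption), and the names QUARTET, describe and SUMMARY are made up for the illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Values as usually published for Anscombe's quartet (assumed to match the
# table shown in the post, which is only available as an image).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def describe(xs, ys):
    """Return the descriptive statistics listed in the post, rounded to 2 places."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)          # spread of X about its mean
    syy = sum((y - my) ** 2 for y in ys)          # spread of Y about its mean
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx                             # least-squares line: y = intercept + slope * x
    return {
        "mean_x": round(mx, 2),
        "mean_y": round(my, 2),
        "sd_x": round(stdev(xs), 2),              # sample standard deviation
        "sd_y": round(stdev(ys), 2),
        "r": round(sxy / sqrt(sxx * syy), 2),     # Pearson correlation
        "slope": round(slope, 2),
        "intercept": round(my - slope * mx, 2),
    }


SUMMARY = {name: describe(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARY.items():
    print(name, stats)
```

Run it and compare the four printed rows. Rounded like this, there is very little to tell them apart.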

So there you have it, the 4 groups of X’s and Y’s are all the same. Nothing has changed, there is nothing to worry about.

Or is there?

If I draw some scatter plots the data sets look very different.
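You can try this at home. A proper charting tool would do a prettier job, but the rough sketch below draws each data set as stars on a grid of characters using nothing beyond plain Python. The data are the usually published quartet values (an assumption, since the table in this post is an image), and ascii_scatter, PLOTS and the grid sizes are invented for the illustration.

```python
# Values as usually published for Anscombe's quartet (assumed to match the
# table shown in the post, which is only available as an image).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def ascii_scatter(xs, ys, width=40, height=12, x_range=(0, 20), y_range=(0, 14)):
    """Draw the points as '*' on a character grid with the origin bottom-left."""
    grid = [[" "] * width for _ in range(height)]
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    for x, y in zip(xs, ys):
        # Scale each reading to a column and a row on the grid.
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"         # row 0 is printed at the top
    lines = ["|" + "".join(cells).rstrip() for cells in grid]
    return "\n".join(lines) + "\n+" + "-" * width


# The same axes are used for all four sets so the pictures are comparable.
PLOTS = {name: ascii_scatter(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, plot in PLOTS.items():
    print(f"Set {name}")
    print(plot)
    print()
```

The grid is coarse, so nearby points can land on the same star, but the shapes should still be easy to tell apart at a glance.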

The point of the story

If you want to understand what is going on in your business, don’t rely on the accountants and analysts with their beautiful tables of numbers. Draw some graphs and look at the data instead.

Better still, go and have a look at the shop floor. It will be far more interesting than your maths class ever was.


Image by Blondinrikard Fröberg

Filed Under: Blog, Operations Analysis Tagged With: Anscombe's quartet, average, data is not information, gemba, key performance indicators, management by wandering around

About the Author

James Lawther

James Lawther is a middle-aged, middle manager.

To reach this highly elevated position he has worked in numerous industries, from supermarket retailing to tax collecting.  He has had several operational roles, including running the night shift in a frozen pea packing factory and carrying out operational research for a credit card company.

As you can see from his C.V. he either has a wealth of experience or is incapable of holding down a job. If the latter is true this post isn’t worth a minute of your attention.

Unfortunately, the only way to find out is to read it and decide for yourself.

www.squawkpoint.com/

Comments

  1. Emma Robbins says

    2 February, 2018 at 8:29 pm

    Very clever…..

    • James Lawther says

      2 February, 2018 at 8:30 pm

      Glad you liked it. Shame it wasn’t me who thought of it.

  2. Chip Bell says

    5 February, 2018 at 2:04 pm

    This is an absolutely delightful post using a clever way to make an important point. It reminds me of a great piece I read in the ’70s on “Salt Passage Research.” http://bit.ly/2GRGyjB. But the best warning about the illusion of stats to communicate the truth comes from John Steinbeck’s book Sea of Cortez, in which he describes a fishing expedition off the coast of Los Cabos, Mexico in this way:

    “The Mexican sierra has 17 plus 15 plus nine spines in the dorsal fin. These can easily be counted. But if the sierra strikes hard on the line so that our hands are burned, if the fish sounds and nearly escapes and finally comes in over the rail, his colors pulsing and his tail beating in the air, a whole new relational externality has come into being – an entity which is more than the sum of the fish plus the fisherman.

    The only way to count the spines of the sierra unaffected by this second relational reality is to sit in a laboratory, open an evil-smelling jar, remove a stiff colorless fish from the formalin solution, count the spines and write the truth. There you have recorded a reality which cannot be assailed – probably the least important reality concerning either the fish or yourself.”

    I have fished for Mexican Sierra off the coast of Los Cabos and I can verify his accurate description of the experience, despite what the research may report.

    Thanks, James, for your great work on this! May you never make a Type 2 error and always remain a few standard deviations away from the mean of the gullible crowd!!

  3. Michael Lowenstein says

    5 February, 2018 at 2:36 pm

    Once again, making the case for understanding the differences between correlation/regression (simple or multiple) and the causation, or qualitatively and quantitatively drawn real drivers of results. Maybe it shouldn’t be so surprising to me that, even with so much evidence to challenge use of correlation data, so few analysts and companies seem to comprehend this.

  4. Stewart Irvine says

    5 February, 2018 at 5:22 pm

    Interesting read, you summed it up well. Data visualization is crucial in developing a practical statistical model, but just like statistics, data visualization can be misinterpreted or misrepresented because users may not understand the fundamental concerns with charts, graphs and maps that apply to their world.

  5. James Lawther says

    5 February, 2018 at 9:20 pm

    Thank you for the comments, glad you enjoyed the post.

    Chip, in England we have sticklebacks, and as any schoolboy with a net knows, they are about half an inch long and have either 3 or 5 spines on the dorsal fin. Perhaps the Mexican Sierra is its bigger brother… Great story.

  6. Andrew Rudin says

    7 February, 2018 at 9:28 am

    Great article, well stated. A few years ago, I read an observation, “statistics aren’t facts, they’re interpretations.” I don’t know who wrote it, but I never forgot the idea. Since then, whenever I read a statistic, the first question I ask myself is ‘what is the point the writer is attempting to prove or reinforce?’ Then, I seek to understand what his or her vested interest is. I often find one.

    A good book on the topic is A Field Guide to Lies: Critical Thinking in the Information Age by Daniel Levitin. I recommend it to anyone who considers statistics when making decisions.




Creative Commons

This information from The Squawk Point is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.