
Lies, damn lies and statistics

The phrase "Lies, damn lies and statistics" was popularised by Mark Twain and describes the persuasive power of statistics, particularly when they are used to bolster a weak argument or make a spurious claim.

As a performance tester, I’m often given raw data and asked to find some hidden meaning in it.

Often this is pretty simple: test data may clearly demonstrate that more than 100 users on a system cause it to crash, or cause response times to become unacceptable to users.

Determining cause and effect is harder when changing one variable in a test doesn’t produce a dramatic outcome. When looking for patterns in data, it’s important to repeat tests under controlled conditions so that a one-off result doesn’t lead you to the wrong conclusion.
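One rough way to apply this is to repeat each test several times and check whether the difference between two configurations is larger than the run-to-run variation. The sketch below is only an illustration: the function names, the threshold of two standard deviations and the response times are all invented for the example, and the rule of thumb is not a substitute for a proper significance test.

```python
import statistics


def summarise(runs):
    """Return (mean, sample standard deviation) for a list of run results."""
    return statistics.mean(runs), statistics.stdev(runs)


def difference_exceeds_noise(baseline_runs, changed_runs, threshold=2.0):
    """Crude check: is the gap between the means bigger than the noise?

    'Noise' here is the larger of the two standard deviations, and
    'threshold' is an arbitrary multiplier chosen for illustration.
    """
    base_mean, base_sd = summarise(baseline_runs)
    new_mean, new_sd = summarise(changed_runs)
    noise = max(base_sd, new_sd)
    return abs(new_mean - base_mean) > threshold * noise


# Invented mean response times in milliseconds, five repeated runs each.
baseline = [410, 395, 420, 405, 400]
after_small_change = [415, 398, 409, 421, 402]
after_big_change = [480, 470, 495, 488, 476]

print(difference_exceeds_noise(baseline, after_small_change))
print(difference_exceeds_noise(baseline, after_big_change))
```

With these made-up numbers, the first comparison falls inside the run-to-run noise and the second does not. A single run from each configuration could easily have suggested otherwise.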

The statistics below, from Tyler Vigen’s “Spurious Correlations” site, demonstrate just how misleading correlations can be.

For example: Who’d have thought that training more biomedical scientists could cause an increase in alcohol poisoning?


But thank goodness that we’re eating less beef than we used to…
…as you can see below, eating less beef is reducing lightning strikes and saving lives.
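Two unrelated measurements can correlate strongly simply because both drift in the same direction over time. The sketch below uses two invented series (they are not taken from Tyler Vigen’s data) and a hand-written Pearson correlation to show the effect. It then compares step-to-step changes instead of raw values, which is one simple way of removing a shared trend. The names `pearson` and `differences` are just illustrative helpers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def differences(xs):
    """Step-to-step changes, which removes a steady upward or downward trend."""
    return [b - a for a, b in zip(xs, xs[1:])]


# Invented data: two unrelated quantities that both happen to rise over time.
series_a = [10, 12, 13, 16, 18, 19, 22, 24, 25, 28]
series_b = [100, 103, 106, 109, 110, 111, 112, 114, 116, 118]

raw_r = pearson(series_a, series_b)
detrended_r = pearson(differences(series_a), differences(series_b))

print(f"correlation of raw values:   {raw_r:.2f}")
print(f"correlation of the changes:  {detrended_r:.2f}")
```

The raw values correlate almost perfectly, yet the changes from one step to the next show no relationship at all. In test results, the same thing can happen when two metrics both grow with test duration or load and are then read as one causing the other.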

As well as not leaping to conclusions, testers should resist the temptation to extrapolate results beyond the range they actually tested. Sometimes the answer just isn’t where you’re looking, in which case you should look again or approach the problem from a different angle, as our friend Mark Tomlinson explained at TestBash 3 last month.

To hear more about web performance and testing from Trust IV, subscribe to our monthly newsletter.
