Statistical thinking will one day be as necessary for effective citizenship as the ability to read and write.
-H. G. Wells, whose science fiction novel The War of the Worlds inspired the 1938 radio adaptation that created national hysteria

British MD and quack buster Ben Goldacre, contributor of the next chapter, is well known for illustrating how people can be fooled by randomness. He uses the following example: If you go to a cocktail party, what's the likelihood that two people in a group of 23 will share the same birthday? One in 100? One in 50? In fact, it's one in two. Fifty percent.
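That fifty-fifty claim is easy to verify with the standard birthday-problem arithmetic. Here is a minimal sketch in Python, assuming 365 equally likely birthdays and ignoring leap years:

```python
# Minimal sketch of the standard birthday-problem arithmetic,
# assuming 365 equally likely birthdays and ignoring leap years.
def shared_birthday_probability(group_size: int) -> float:
    prob_all_distinct = 1.0
    for i in range(group_size):
        # each new arrival must avoid every birthday already taken
        prob_all_distinct *= (365 - i) / 365
    return 1 - prob_all_distinct

print(round(shared_birthday_probability(23), 3))  # 0.507: better than one in two
```

The trick is that 23 people form 253 distinct pairs, and every pair is a separate chance for a match.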
To become better at spotting randomness for what it is, it's important to understand the concept of "p-value," which you'll see in all good research studies. It answers the question: how confident are we that this result wasn't due to random chance?
To demonstrate (or imply) cause-and-effect, the gold standard for studies is a p-value of less than 0.05 (p < 0.05), which means a less than 5% likelihood that the result can be attributed to chance. A p-value of less than 0.05 is also what most scientists mean when they say something is "statistically significant."
An example makes this easy to understand.
Let's say you are a professional coin flipper, but you're unethical. In hopes of dominating the coin-flipping gambling circuit, you've engineered a quarter that should come up heads more often than a normal quarter. To test it, you flip it and a normal quarter 100 times each, and the results seem clear: the "normal" quarter came up heads 50 times, and your designer quarter came up heads 60 times!
Should you take out a second mortgage and head to Vegas?
[image]
The above sample size estimation tool, created by the web design and analytics firm WebShare, says: probably not, if you want to keep the house.
If we look at the 20% improvement column (60 heads vs. 50 heads = 10 more heads) at the top and scan down to see how many coin flips you'd need per coin to be 95% confident in your results (p = 0.05), you'd need 453 flips.
In other words, you'd better make sure that 20% difference holds up over at least 453 flips with each coin. Ten extra heads out of 100 flips doesn't prove cause-and-effect at all.
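You can sanity-check both halves of this example without the tool. Below is a minimal sketch in Python using the statsmodels library (my choice of library, applying a standard two-proportion z-test and power calculation; the WebShare tool's exact formula isn't specified, so expect the same ballpark rather than its exact 453):

```python
# Minimal sketch, assuming a standard two-proportion z-test and a
# normal-approximation power calculation (not necessarily the same
# formula the WebShare tool uses, so expect the ballpark, not 453).
from statsmodels.stats.proportion import proportions_ztest, proportion_effectsize
from statsmodels.stats.power import NormalIndPower

# Is 60 heads vs. 50 heads out of 100 flips each "statistically significant"?
z_stat, p_value = proportions_ztest(count=[60, 50], nobs=[100, 100])
print(f"p = {p_value:.3f}")  # about 0.16: far above 0.05, so no

# Flips per coin needed to reliably detect a true 50% -> 60% difference
effect = proportion_effectsize(0.6, 0.5)  # Cohen's h for the two proportions
flips_needed = NormalIndPower().solve_power(effect_size=effect, alpha=0.05, power=0.95)
print(round(flips_needed))  # roughly 320 per coin: hundreds, as the tool says
```

Dropping the demanded power to 80% cuts the requirement to roughly 200 flips per coin; either way, 100 flips per coin is nowhere near enough.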
Three points to remember about p-values and "statistical significance":
* Just because something seems miraculous doesn't mean it is. People are fooled by randomness all the time, as in the birthday example.
* The larger the difference between groups, the smaller the groups can be. Critics of small trials or self-experimentation often miss this. If something appears to produce a 300% change, you don't need that many people to show significance, assuming you're controlling variables.
* It is not kosher to combine p-values from multiple experiments to make something more or less believable. That's another trick of bad scientists and a mistake of uninformed journalists.
TOOLS AND TRICKS.
The Black Swan by Nassim Taleb (www.fourhourbody.com/blackswan) Taleb, also author of the bestseller Fooled by Randomness, is the reigning king when it comes to explaining how we fool ourselves and how we can limit the damage. Our instinct to underestimate the occurrence of some events, while overestimating others, is a principal cause of enormous pain. This book should be required reading.
The Corporation, DVD (www.fourhourbody.com/corporation) This is a disturbing documentary about the American corporation and its relentless pursuit of profit at the expense of our culture. This film gives you a glimpse into how heavily companies can skew health reports when they have a vested interest in the findings. See the next chapter.

"List of Cognitive Biases" (www.fourhourbody.com/biases) We are all susceptible to cognitive biases, including the scientists who produce "bad science." Review the list at this URL and ask yourself whether you're mindlessly accepting as fact things you hear or read.
End of Chapter Notes
3. Also called population, cohort, or epidemiological studies.
4. The one exception is if the effect is so huge that it can't be explained in any other way. For instance, the twentyfold increased risk of lung cancer that is associated with cigarette smoking in multiple studies.
5. Michael Pollan, "Unhappy Meals," New York Times, January 28, 2007, sec. Magazine.
6. Caloric value of 100 g / 100 g = X / number of grams weighed. For example, if 100 g of a food has 250 calories and your portion weighs 40 g, then X = 250 × 40 / 100 = 100 calories.
7. Even without such tools, if you have large samples and the analysis is good, it is sometimes possible to correct for subjective error and reconstruct the information you want.
SPOTTING BAD SCIENCE 102.
So You Have a Pill...
This chapter was written by Dr. Ben Goldacre, who has written the weekly "Bad Science" column in the Guardian since 2003 and is a recipient of the Royal Statistical Society's Award for Statistical Excellence in Journalism. He is a medical doctor who, among other things, specializes in unpacking sketchy scientific claims made by scaremongering journalists, questionable government reports, evil pharmaceutical corporations, PR companies, and quacks.
What I'm about to tell you is what I teach medical students and doctors-here and there-in a lecture I rather childishly call "Drug Company Bullshit". It is, in turn, what I was taught at medical school,1 and I think the easiest way to understand the issue is to put yourself in the shoes of a big pharma researcher.
You have a pill. It's OK, maybe not that brilliant, but a lot of money is riding on it. You need a positive result, but your audience aren't homeopaths, journalists or the public: they are doctors and academics, so they have been trained in spotting the obvious tricks, like "no blinding", or "inadequate randomisation". Your sleights of hand will have to be much more elegant, much more subtle, but every bit as powerful.
What can you do?
Well, firstly, you could study it in winners. Different people respond differently to drugs: old people on lots of medications are often no-hopers, whereas younger people with just one problem are more likely to show an improvement. So only study your drug in the latter group. This will make your research much less applicable to the actual people that doctors are prescribing for, but hopefully they won't notice. This is so commonplace it is hardly worth giving an example.
Next up, you could compare your drug against a useless control. Many people would argue, for example, that you should never compare your drug against placebo, because it proves nothing of clinical value: in the real world, nobody cares if your drug is better than a sugar pill; they only care if it is better than the best currently available treatment. But you've already spent hundreds of millions of dollars bringing your drug to market, so stuff that: do lots of placebo-controlled trials and make a big fuss about them, because they practically guarantee some positive data. Again, this is universal, because almost all drugs will be compared against placebo at some stage in their lives, and "drug reps"-the people employed by big pharma to bamboozle doctors (many simply refuse to see them)-love the unambiguous positivity of the graphs these studies can produce.
Then things get more interesting. If you do have to compare your drug with one produced by a competitor-to save face, or because a regulator demands it-you could try a sneaky underhand trick: use an inadequate dose of the competing drug, so that patients on it don't do very well; or give a very high dose of the competing drug, so that patients experience lots of side-effects; or give the competing drug in the wrong way (perhaps orally when it should be intravenous, and hope most readers don't notice); or you could increase the dose of the competing drug much too quickly, so that the patients taking it get worse side-effects. Your drug will shine by comparison. You might think no such thing could ever happen. If you follow the references in the back, you will find studies where patients were given really rather high doses of old-fashioned antipsychotic medication (which made the new-generation drugs look as if they were better in terms of side-effects), and studies with doses of SSRI antidepressants which some might consider unusual, to name just a couple of examples. I know. It's slightly incredible.
Of course, another trick you could pull with side-effects is simply not to ask about them; or rather-since you have to be sneaky in this field-you could be careful about how you ask. Here is an example. SSRI antidepressant drugs cause sexual side-effects fairly commonly, including anorgasmia. We should be clear (and I'm trying to phrase this as neutrally as possible): I really enjoy the sensation of orgasm. It's important to me, and everything I experience in the world tells me that this sensation is important to other people, too. Wars have been fought, essentially, for the sensation of orgasm. There are evolutionary psychologists who would try to persuade you that the entirety of human culture and language is driven, in large part, by the pursuit of the sensation of orgasm. Losing it seems like an important side-effect to ask about.
And yet, various studies have shown that the reported prevalence of anorgasmia in patients taking SSRI drugs varies between 2 per cent and 73 per cent, depending primarily on how you ask: a casual, open-ended question about side-effects, for example, or a careful and detailed enquiry. One 3,000-subject review on SSRIs simply did not list any sexual side-effects on its twenty-three-item side-effect table. Twenty-three other things were more important, according to the researchers, than losing the sensation of orgasm. I have read them. They are not.
But back to the main outcomes. And here is a good trick: instead of a real-world outcome, like death or pain, you could always use a "surrogate outcome", which is easier to attain. If your drug is supposed to reduce cholesterol and so prevent cardiac deaths, for example, don't measure cardiac deaths; measure reduced cholesterol instead. That's much easier to achieve than a reduction in cardiac deaths, and the trial will be cheaper and quicker to do, so your result will be cheaper and more positive. Result!
Now you've done your trial, and despite your best efforts things have come out negative. What can you do? Well, if your trial has been good overall, but has thrown out a few negative results, you could try an old trick: don't draw attention to the disappointing data by putting it on a graph. Mention it briefly in the text, and ignore it when drawing your conclusions. (I'm so good at this I scare myself. Comes from reading too many rubbish trials.) If your results are completely negative, don't publish them at all, or publish them only after a long delay. This is exactly what the drug companies did with the data on SSRI antidepressants: they hid the data suggesting they might be dangerous, and they buried the data showing them to perform no better than placebo. If you're really clever and have money to burn, then after you get disappointing data, you could do some more trials with the same protocol in the hope that they will be positive. Then try to bundle all the data up together, so that your negative data is swallowed up by some mediocre positive results.
Or you could get really serious and start to manipulate the statistics. For two pages only, this will now get quite nerdy. Here are the classic tricks to play in your statistical analysis to make sure your trial has a positive result.
Ignore the protocol entirely Always assume that any correlation proves causation. Throw all your data into a spreadsheet programme and report-as significant-any relationship between anything and everything if it helps your case. If you measure enough, some things are bound to be positive just by sheer luck.
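Here is a minimal simulation of that spreadsheet trawl in Python (all data is generated noise; the group sizes and number of outcomes are arbitrary choices for illustration):

```python
# Minimal sketch: twenty unrelated "outcomes" of pure noise, each tested
# between a null treatment group and a null placebo group.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
significant = []
for outcome in range(20):
    treated = rng.normal(size=50)    # no true effect anywhere
    placebo = rng.normal(size=50)
    _, p = stats.ttest_ind(treated, placebo)
    if p < 0.05:
        significant.append(outcome)

# With 20 independent null tests, the chance of at least one false positive
# is 1 - 0.95**20, roughly 64%. Report only the winners and you look great.
print(significant)
```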
Play with the baseline Sometimes, when you start a trial, quite by chance the treatment group is already doing better than the placebo group. If so, then leave it like that. If, on the other hand, the placebo group is already doing better than the treatment group at the start, then adjust for the baseline in your analysis.
Ignore dropouts People who drop out of trials are statistically much more likely to have done badly, and much more likely to have had side-effects. They will only make your drug look bad. So ignore them, make no attempt to chase them up, do not include them in your final analysis.
Clean up the data Look at your graphs. There will be some anomalous "outliers", or points which lie a long way from the others. If they are making your drug look bad, just delete them. But if they are helping your drug look good, even if they seem to be spurious results, leave them in.
"The best of five...no...seven...no...nine!"
If the difference between your drug and placebo becomes significant four and a half months into a six-month trial, stop the trial immediately and start writing up the results: things might get less impressive if you carry on. Alternatively, if at six months the results are "nearly significant", extend the trial by another three months.
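A small simulation shows how well this works. In this Python sketch (invented null data, with an assumed interim analysis every ten patients per arm), the drug and placebo are identical, yet stopping at the first p < 0.05 declares "success" far more than 5% of the time:

```python
# Minimal sketch: drug and placebo are identical (no effect), but the data
# is tested after every 10 patients per arm and the trial stops at p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
trials, stopped_early = 2000, 0
for _ in range(trials):
    drug = rng.normal(size=100)      # null data: the "drug" does nothing
    placebo = rng.normal(size=100)
    for n in range(10, 101, 10):     # interim analyses at 10, 20, ..., 100 patients
        _, p = stats.ttest_ind(drug[:n], placebo[:n])
        if p < 0.05:
            stopped_early += 1       # declared a "significant" result and stopped
            break

# A single pre-planned test would be wrong about 5% of the time;
# with ten peeks, the false-positive rate comes out well above 5%.
print(stopped_early / trials)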
Torture the data If your results are bad, ask the computer to go back and see if any particular subgroups behaved differently. You might find that your drug works very well in Chinese women aged fifty-two to sixty-one. "Torture the data and it will confess to anything", as they say at Guantanamo Bay.
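And a sketch of the torture chamber itself: one null dataset, fifty arbitrary subgroup definitions (all invented for the example), and the smallest p-value reported as the finding:

```python
# Minimal sketch: the overall trial is null, but trying 50 arbitrary
# subgroup definitions usually turns up at least one "responder" subgroup.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
outcome = rng.normal(size=400)             # invented null data: no drug effect
on_drug = rng.integers(0, 2, size=400).astype(bool)   # 400 patients, two arms

best_p = 1.0
for _ in range(50):                        # e.g., "women aged fifty-two to sixty-one"
    subgroup = rng.random(400) < 0.25      # an arbitrary quarter of the patients
    _, p = stats.ttest_ind(outcome[on_drug & subgroup],
                           outcome[~on_drug & subgroup])
    best_p = min(best_p, p)

print(best_p)  # usually below 0.05: the data "confessed" under torture
```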
Try every button on the computer If you're really desperate, and analysing your data the way you planned does not give you the result you wanted, just run the figures through a wide selection of other statistical tests, even if they are entirely inappropriate, at random.
And when you're finished, the most important thing, of course, is to publish wisely. If you have a good trial, publish it in the biggest journal you can possibly manage. If you have a positive trial, but it was a completely unfair test, which will be obvious to everyone, then put it in an obscure journal (published, written and edited entirely by the industry). Remember, the tricks we have just described hide nothing, and will be obvious to anyone who reads your paper, but only if they read it very attentively, so it's in your interest to make sure it isn't read beyond the abstract. Finally, if your finding is really embarrassing, hide it away somewhere and cite "data on file". Nobody will know the methods, and it will only be noticed if someone comes pestering you for the data to do a systematic review. Hopefully, that won't be for ages.
End of Chapter Notes
1. In this subject, like many medics of my generation, I am indebted to the classic textbook How to Read a Paper by Professor Greenhalgh at UCL. It should be a best-seller. Testing Treatments by Imogen Evans, Hazel Thornton, and Iain Chalmers is also a work of great genius, appropriate for a lay audience, and, amazingly, also free to download from www.jameslindlibrary.org. For committed readers I recommend Methodological Errors in Medical Research by Bjorn Andersen. It's extremely long. The subtitle is An Incomplete Catalogue.
THE SLOW-CARB DIET-194 PEOPLE

The following Slow-Carb Diet data was collected with detailed questionnaires using CureTogether.com. 194 people responded to all questions, and 58% indicated it was the first diet they had ever been able to stick with.
The subjects were recruited via my top-1,000 blog (www.fourhourblog.com), Twitter (www.twitter.com/tferriss), and Facebook (www.facebook.com/timferriss).
                  Average Weight Lost (lbs.)    Number of People
Everyone                      21                      194
Vegetarian                    23                       10
Nonvegetarian                 21                      178
Age
15-20                         16                       19
21-30                         20                       86
31-40                         22                       56
41-50                         21                       26
51-60                         30                        5
61+                           11                        2
Men                           23                      150
Women