How to Set the Bullshit Filter When the Bullshit is Thick

A while back I wrote a short piece in the New York Times Magazine about a researcher named John Ioannidis who had found that over half of all new research findings later prove false:

Many of us consider science the most reliable, accountable way of explaining how the world works. We trust it. Should we? John Ioannidis, an epidemiologist, recently concluded that most articles published by biomedical journals are flat-out wrong. The sources of error, he found, are numerous: the small size of many studies, for instance, often leads to mistakes, as does the fact that emerging disciplines, which lately abound, may employ standards and methods that are still evolving. Finally, there is bias, which Ioannidis says he believes to be ubiquitous. Bias can take the form of a broadly held but dubious assumption, a partisan position in a longstanding debate (e.g., whether depression is mostly biological or environmental) or (especially slippery) a belief in a hypothesis that can blind a scientist to evidence contradicting it. These factors, Ioannidis argues, weigh especially heavily these days and together make it less than likely that any given published finding is true.

Now I’m delighted (and chagrined, too, I admit, that I didn’t do the damn story) to see that David H. Freedman, author of Wrong: Why experts keep failing us — and how to know when not to trust them — has profiled Ioannidis at length in the current Atlantic.

He’s what’s known as a meta-researcher, and he’s become one of the world’s foremost experts on the credibility of medical research. He and his team have shown, again and again, and in many different ways, that much of what biomedical researchers conclude in published studies—conclusions that doctors keep in mind when they prescribe antibiotics or blood-pressure medication, or when they advise us to consume more fiber or less meat, or when they recommend surgery for heart disease or back pain—is misleading, exaggerated, and often flat-out wrong. He charges that as much as 90 percent of the published medical information that doctors rely on is flawed. His work has been widely accepted by the medical community; it has been published in the field’s top journals, where it is heavily cited; and he is a big draw at conferences. Given this exposure, and the fact that his work broadly targets everyone else’s work in medicine, as well as everything that physicians do and all the health advice we get, Ioannidis may be one of the most influential scientists alive. Yet for all his influence, he worries that the field of medical research is so pervasively flawed, and so riddled with conflicts of interest, that it might be chronically resistant to change—or even to publicly admitting that there’s a problem.

This is an important story, for it — or rather, Ioannidis’s work — calls into question how much we can trust the evidence base that people are calling on to support evidence-based practice. According to Ioannidis, there’s scarcely a body of medical research that’s not badly undermined by multiple factors that will create either bias or error. And these errors persist, he says, because people and institutions are invested in them.

“Even when the evidence shows that a particular research idea is wrong, if you have thousands of scientists who have invested their careers in it, they’ll continue to publish papers on it,” he says. “It’s like an epidemic, in the sense that they’re infected with these wrong ideas, and they’re spreading it to other researchers through journals.”

This presents some really difficult problems for doctors, patients, and science and medical journalists. Ioannidis is not saying all studies are wrong; just a good healthy half or so of them, often more. In a culture that, for good reason, wants testable knowledge to draw on, what are we to draw on if the better part of the tests (the papers and findings, that is) are false? You can throw up your hands. You could, alternatively, figure that this wrong-much-of-the-time dynamic still leaves us ahead overall: advanced beyond where we were before, perhaps, but still not as far as we’d like.

The latter response makes some sense. But it’s made more problematic by the high stakes involved when we’re talking about high-impact (and expensive) treatments like surgery or heavy-duty pharmaceuticals. A stunning review a couple years ago, for instance, found that the second-generation antipsychotics developed in the 1980s, hailed then as more effective and with fewer side effects than the prior generation, actually worked no better and caused (different) side effects just as bad, even though they cost about 10 times as much.

Enormous expense and, I suspect, not a little harm. The hype and false confidence around those drugs (the conviction that they improved on the drugs available before) probably led many doctors to prescribe them (and patients to take them) when they might have taken a pass on prescribing the earlier generation. As with the generation of antidepressants popularized at around the same time, these ‘newer, better’ drugs gave fresh impetus to pharmacological responses to mental health issues just as the profession and the culture were growing cynical about existing meds. They resuscitated belief in psychopharmacology. But that new life was based on false data. The consequence was not trivial; it created a couple of decades, and counting, of heavy reliance on psychopharmaceuticals whose benefits were oversold and drawbacks downplayed.

There’s error and there’s error. It’s one thing to be wrong about low-impact treatments: to be wrong, for instance, about how much a low-impact drug like aspirin or glucosamine helps modest knee pain in athletes, or how much benefit you get from walking versus running, or whether coffee makes you smarter or just makes you feel smarter. The stakes run much higher when the treatments cost a lot in money or health. Yet little in our regulatory, medical, or journalistic cultures or practices acknowledges that.

Ioannidis hints at a way to compensate for this. He notes that the big expensive false reports tend to be generated and propagated by big moneyed interests. Ideally, skepticism should be applied accordingly. It’s not even that this science is more likely to be wrong (though that may be). It’s that the consequences may be more expensive. Here, as elsewhere, the smell of money should sharpen your bullshit filter.

update/addendum, 14 Oct, 2010, 2:01 PM EDT:

For yet more perspective on this, I recommend reading not only the Atlantic article cited above, but two others: Ioannidis’s big-splash 2005 paper in PLoS Medicine (quite readable), “Why Most Published Research Findings Are False,” and a follow-up by some others, “Most Research Findings are False, But Replication Helps.” If you’re feeling hopeless from the above, as several people have expressed below and on Twitter, these may help.

It also helps to keep in mind the corollaries or risk factors that Ioannidis sets out in that 2005 paper. They’re useful in adjusting your BS filter and in identifying the sorts of disciplines and fields and findings that deserve more skepticism.

Those corollaries:

Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.

Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.

Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.

Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.

Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.

Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.

He elaborates on this fruitfully.
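To make the corollaries a bit more concrete: in the 2005 paper, Ioannidis writes the chance that a claimed finding is true (the positive predictive value, or PPV) as (1 − β)R / (R − βR + α), where R is the pre-study odds that a tested relationship is real, 1 − β is the study’s power, and α is the significance threshold. He then adds a bias term u, the fraction of would-be negative results that get reported as positive anyway. What follows is only a rough sketch of that formula in Python; the function name, argument names, and default values are illustrative assumptions, not anything taken from the paper.

```python
def ppv(pre_study_odds, power=0.8, alpha=0.05, bias=0.0):
    """Post-study probability that a claimed positive finding is true.

    pre_study_odds: R, ratio of true to false relationships being tested
    power:          1 - beta, chance of detecting a true relationship
    alpha:          type I error rate (significance threshold)
    bias:           u, share of non-findings that get reported as findings
    """
    R, u = pre_study_odds, bias
    beta = 1.0 - power
    # True relationships that end up reported as positive
    true_positives = (1 - beta) * R + u * beta * R
    # Every relationship that ends up reported as positive, true or not
    all_positives = R + alpha - beta * R + u - u * alpha + u * beta * R
    return true_positives / all_positives


if __name__ == "__main__":
    # A well-powered trial, even pre-study odds, a little bias
    print(round(ppv(1.0, power=0.80, bias=0.10), 2))  # 0.85
    # A small exploratory study, long odds, more bias
    print(round(ppv(0.1, power=0.20, bias=0.30), 2))  # 0.12
```

The two example calls use scenarios of the kind the paper tabulates. Lower the power (smaller studies, smaller effects), lengthen the odds (more fishing), or raise the bias (flexibility, money, prejudice), and the PPV drops, which is roughly what the corollaries say in words.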

Finally, J.R. Minkel alerts me to a post at Seth’s blog that looks like a good addition. (I lack time to read it thoroughly at the moment b/c I have to finish up an assignment. Trying to, you know, get it right, against the odds.)

If in doubt, it’s always safe and sensible to apply to any novel finding the old maxim that the great oceanographer Henry Bryant Bigelow reminded his brother of when his brother reported seeing a donkey sail by during a hurricane in Cuba: “Interesting if true.”

Double-Stopped Bach & Rain Delays: A Very Different Music for Airports

Searching YouTube for a performance of one of my favorite pieces of music (and something to offer my readers while I try to finish a big feature about schizophrenia that’s been keeping me busy), I came across this mashup of Bach and an airport rain delay, and it took me only a few seconds to realize I liked this unlikely combination. It fits the mashed-up, contrapuntal, almost discordant feel of this luscious adagio. The piece features a violin playing nothing but double stops (that is, two strings at a time, all the time, requiring great, fluid dexterity from the left hand to keep changing the double stop, and a fine touch, firm but supple, with the bow hand) laid over a spidery, mesmerizing line laid down by the keyboard. It is easily one of the most beautiful and haunting pieces of music ever written, unique in the literature and very different from the rest of the violin-and-harpsichord sonatas that this adagio is taken from.

My one disappointment here is that the keyboard is a piano. It’s lovely, but the crucial spidery feel is lost. For the pure music, I can recommend this very affordable Naxos version, with Lucy van Dael and Bob van Asperen.

Despite my reservations about the piano, I must say the version by Jaime Laredo and Glenn Gould, below, is rather stunning — and being Gould, quite different. Note how Laredo (possibly the world’s most overlooked great violinist) plays the first few bars of the double stops almost separately as two notes, before going to the full chordal treatment.

In any case, I like this mashup. That may be partly because I find some good sound-isolating earplugs and Bach a good way to endure airports — an experience echoed here. Just don’t get so entranced you miss your plane.

The Laredo/Gould version:

Bike Fever: Riding the Green Wave to Work in Copenhagen

The Green Wave in Copenhagen from Copenhagenize on Vimeo.

I brought no car to London but soon bought a bike. I’ve found the riding here to be wonderful, exhilarating, and — notwithstanding a few wooooooops-OTHER-side-of-the-road braincramp scares — safer-feeling than my last extensive urban riding, years ago, in Houston. Why? These London drivers, though generally fast and sometimes aggressive with pedestrians, seem to accept that bikes get their space. It helps too that the city has some great-to-decent bike lanes and routes. It’s a pretty bike-friendly place, and getting more so.

But no one beats Copenhagen for bike-friendliness. I’ve not been. Above, however, you can take a ride down one of Copenhagen’s “green waves,” a major commuter route in which the lights are timed to let a cyclist going 20 km/h (a nice moderate 12 mph) sail into the city center without stopping. The vid shows exactly this, and gives a hint of that adrenal rush one gets riding in a city. Though here in London, I prefer a Led Zeppelin soundtrack.

Or — because it’s there — Queen:

The Bomb as a Really Expensive Marketing Tool

One of the best history books I’ve ever read is Richard Rhodes’ The Making of the Atomic Bomb. I read it more than two decades ago, and that’s how good it is: It sticks with me now as a stunning narrative and a deeply informative history. If you ever even pause while wandering by a place like WIRED, much less come in the door, you would love this book. Tech, science, culture, incredible human stories, da bomb, all handled masterfully.

I’m pleased to see Rhodes is still talking about the bomb. This time, in a talk about the future obsolescence of nuclear weaponry, he apparently discussed how most of the immense amounts we spent on nukes were essentially spent to market security.

From the incredible kottke:

Richard Rhodes recently gave a Long Now talk called The Twilight of the Bombs about the future obsolescence of nuclear weaponry. From Stewart Brand’s summary of the talk:

How much did the Cold War cost everyone from 1948 to 1991, and how much of that was for nuclear weapons? The total cost has been estimated at $18.5 trillion, with $7.8 trillion for nuclear. At the peak the Soviet Union had 95,000 weapons and the US had 20 to 40,000. America’s current seriously degraded infrastructure would cost about $2.2 trillion to fix — all the gas lines and water lines and schools and bridges. We spent that money on bombs we never intended to use — all of the Cold War players, major and minor, told Rhodes that everyone knew that the bombs must not and could not be used. Much of the nuclear expansion was for domestic consumption: one must appear “ahead,” even though numbers past a couple dozen warheads were functionally meaningless.

Wish I’da been there.

And I like this addition from Brand, at bottom of his post:

At dinner Rhodes reflected that nuclear weapons may come to be seen as a strange fetishistic behavior by nations at a certain period in history. They were insanely expensive and thoroughly useless. Their only function was to keep a bizarre form of score.


Vid: 1945-1998, by Isao Hashimoto, Japan, © 2003. This shows the 2,053 nuclear explosions that humans set off from 1945 to 1998.

The Choke Chamber: In Which I Miss A Putt and Fork Over a Fiver

Sian Beilock, the author of “Choke” whose work I wrote about in a feature-length post published a few days ago, knows a lot of ways to make a person fail under pressure. Below, in a slightly tweaked version of an alternative opening for the feature, one I eventually left on the cutting-room floor for structural reasons, is an account of how she worked that magic on me when I visited her last year in Chicago.

If you are an athlete, even of the weekend sort, you best steer clear of the Human Performance Lab, for it exists pretty much just to make people choke — and possibly no one knows more about how to make people choke than the researcher who runs that lab, University of Chicago cognitive psychologist Sian Beilock.

I already knew this when I entered the “Choke Chamber,” as I prefer to call the plain, windowless room that is the Lab. In the weeks before flying to Chicago I had read the papers in which Beilock described the cruel pressures she applied to people in this little room. I entered hoping that knowing these ruses would protect me.

Beilock, with whom I had just enjoyed an amiable lunch, pitted me against a research assistant named Chase Coelho. We were to play a simple putting-accuracy game. From different spots on the smooth, synthetic putting green that formed the lab’s floor, we would putt toward a red dot of tape in the room’s center, trying to leave the ball right on the dot. The winner would be the one whose five putts, in the “real round,” had the least total error.

We practiced a few minutes, chatting, while Beilock looked on holding a clipboard. Then Beilock picked up a measuring tape, and we began to play. “Two rounds, five putts each,” said Beilock. The first would be a “practice round” that supposedly didn’t count, even though she would measure our errors. In the second round, the competition round, would come the pressure. I just didn’t know how.

I did pretty well the first round. I missed by a total of 59 centimeters over 5 putts; Chase missed by 52.

Then came the money round. Just before we started, a couple strangers (grad students, I correctly guessed) just happened to walk in and assume positions against opposite walls. One on either side of me. No one introduced them. I said hello; they nodded. There they stood the rest of the game, unintroduced and unspeaking, watching our friendly little game.

This, I knew, was meant to bring a bit of audience pressure, or “spotlight anxiety.”

With these two in place, Beilock suggested we “add a little wager to make this interesting,” creating some basic financial pressure. Finally (a nice touch, this) she introduced an “image of failure” by suggesting I choose the amount of the wager, “in case you lose.”

We began the second round.

“After you,” said Chase. As I set my ball on the floor, he said, “I can see you’ve played some.” This I recognized as a ploy to apply both peer pressure and high expectations. It was working. I could feel myself getting tense. But I listened to myself breathe and still putted well, twice.

But these people don’t let up. They know how to turn even success against you. When I left my second putt within two inches of the target, the best putt of the day so far, Chase said, “He can’t do that twice.” Then Beilock commented, “That’s an unusual grip you’ve got there, with your finger along the back,” a not terribly subtle attempt to provoke in me a destructive “explicit monitoring” of my mechanics. Finally, before my fifth putt, she said, “Last one!” It was a ploy so blatant that it drew laughter from Beilock, Chase, me, and even the stonefaces leaning against the wall.

Yet though I recognized most of the tactics, they worked. Three of my putts improved on my first-round average, missing by only two or three inches each. The good putter in me was getting better. But Beilock’s gripe about my grip, just before my third putt, hit home; it caused me to overfocus on the finger along the back of the club, and I drilled the ball 40 centimeters long. Her last, most ridiculous ploy worked as well; after she said, “Last one!” and we all laughed … I knocked the final putt 22 centimeters long. These two chokes produced more error — 62 centimeters — than the combined errors from my entire first round. My 59 first-round score jumped 34 percent, to 79. Chase went from 52 to 62, or 19 percent. He got worse too. But not as bad as I did.

I felt a little better when they told me — later, of course — that Chase was a scratch golfer and had played a high level of golf in college. I liked the guy. But he still walked off with my five bucks.
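For anyone who wants to check the scorecard, the numbers from the putting game can be reproduced with a few lines of Python. This is just a quick sketch using the figures reported above; the variable and function names are made up for illustration.

```python
CM_PER_INCH = 2.54


def percent_increase(before, after):
    """Percentage change from a first-round total to a second-round total."""
    return (after - before) / before * 100


# Total error in centimeters: (practice round, money round)
my_rounds = (59, 79)
chase_rounds = (52, 62)

# The two putts that went long after the grip comment and "Last one!"
choked_putts_cm = (40, 22)

my_jump = percent_increase(*my_rounds)        # about 34 percent
chase_jump = percent_increase(*chase_rounds)  # about 19 percent
choke_total_cm = sum(choked_putts_cm)         # 62 cm, more than the whole first round

# Average miss on the three putts that were not chokes, in inches
other_three_avg_in = (my_rounds[1] - choke_total_cm) / 3 / CM_PER_INCH

if __name__ == "__main__":
    print(round(my_jump), round(chase_jump), choke_total_cm)
    print(round(other_three_avg_in, 1))
```

The last figure comes out a bit over two inches per putt, which squares with three putts “missing by only two or three inches each.”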

Choose one: Squirrels masturbate (a) because they can or (b) to clean

Hard to imagine anyone missed this, but just in case: Over at Not Exactly Rocket Science, Ed Yong explains that squirrels masturbate to avoid sexually transmitted infections. At least, that’s the new hypothesis from Jane Waterman, a scientist who has been watching certain squirrels very closely.

The goal was to try to answer the mystery: Why, from an evolutionary point of view, would an animal masturbate? (“Because it can” — the punch line to an old joke — is not considered adequate.)

With sperm being so important, it’s odd that some Cape ground squirrels regularly waste theirs. Yet that’s exactly what Jane Waterman saw while studying wild squirrels in Namibia. Some of them would masturbate, apparently squandering their precious sperm. What does squirrel masturbation look like? Apparently, it’s rather acrobatic. I’ll let Waterman describe it herself:

“An oral masturbation was recorded when a male sat with head lowered and an erect penis in his mouth, being stimulated with both mouth (fellatio) and forepaws (masturbation), while the lower torso moved forward and backwards in thrusting motions, finally culminating in an apparent ejaculation, after which the male appeared to consume the ejaculate.”

Many mammals masturbate including humans, other primates, rodents, and more. In every case, the same question remains – why waste the sperm? The most obvious explanation is that it’s what males do when they’re horny, but unsuccessful with it. This is the loftily named “sexual outlet hypothesis”. If it’s right, masturbation isn’t adaptive – it’s just a side effect of the intense sexual arousal generated in species where males mate with many females.

An alternative is that masturbation is actually beneficial. By flushing old sperm from the male’s testicles, it gets a higher proportion of competitive or fertile sperm ready for the next potential mating.

But Waterman thinks that both of these hypotheses are wrong, at least when it comes to Cape ground squirrels. She should know; she spent around 2000 hours spying on the animals with a pair of binoculars, noting every interaction between them, and every sexual act among the local males.

He’s not making this up. How he finds this stuff I’ve no idea. But do check out the whole story. It’s quite fascinating.

Yong’s post.

Reference: Waterman, J. (2010). The Adaptive Function of Masturbation in a Promiscuous African Ground Squirrel. PLoS ONE, 5(9). DOI: 10.1371/journal.pone.0013060

A Good Idea for a Vid Promo for Good Ideas

I rather like this little promo video for Steven Johnson’s new book, Where Good Ideas Come From. As he notes in his own little blog post about it, there’s not a lot to it. It’s just Johnson talking, or possibly reading from his intro, and an artist drawing a sketch about how ideas form. But there’s something both relaxing and pleasantly engaging about watching the drawing develop while Johnson reads about how ideas develop.

Is David Simon the Right Pick for a MacArthur?

I’d say No.

Okay: He made The Wire. It’s a great show. But the criteria for the MacArthur awards are

  1. extraordinary creativity,
  2. promise for future advances, and
  3. “potential for the fellowship to facilitate subsequent creative work”

Simon clearly scores on 1 and 2. But 3? It’s hard to see how another $500,000 and some recognition will significantly increase David Simon’s capacity to keep doing his work. (To his credit, he said he felt he didn’t quite belong among the other 22 who got awards this year.) For other people, meantime, it would make all the difference.

It may seem petty to gripe about this. Yet it seems a shame when, somewhere out there, some person doing great but unrecognized creative work is either giving it up or doing only a fraction of what they could do because they have to keep a day job or lack other necessary resources. There must be thousands. The MacArthur program generally does a great job of finding these people. It found some wonderful ones this time around, like jazz pianist and composer Jason Moran, violinist and music educator Sebastian Ruth, bee researcher Marla Spivak, novelist Yiyun Li, and historians Shannon Lee Dawdy and Annette Gordon-Reed. It’s also nice to see, in a day when science funding is under fire, some scientists score this prize, my favorite this time being biophysicist John Dabiri, who uses jellyfish to study everything from evolution to the fluid dynamics of the human heart.

But I wish more of these prizes would go to the untenured, the independent, and others who work rather in the dark. People, for instance, like Ted Ames, a fisherman who worked for years collecting fishermen’s tales in Maine and turned them into a novel and important finding about cod populations along the New England coast. Both Ames and that study went almost completely unnoticed until the MacArthur award threw light on them. The MacArthur foundation would seem to have the resources to find more such people: people to whom some limelight and $500,000 over 5 years would be utterly novel and transformative, rather than an addition to existing security and recognition.

Please: Less TED, more Ted.

Simon was one of 23 award recipients. I’d love to know who was 24th on the list.

Malcolm Gladwell: Twitter, You’re No Martin Luther King

Malcolm Gladwell has roiled things up with an article arguing that fans of social media tools like Twitter and Facebook are wildly overstating the powers of these tools to change things. As an example, he uses the civil rights movement as it rose out of the Montgomery bus boycott and (more immediately) the Greensboro Four, the four black students who catalyzed the movement by sitting down at a Woolworth’s lunch counter and asking for a cup of coffee.

As usual, this piece by Gladwell contains some wonderful writing, research, and storytelling. It certainly refreshes one’s awe at what the civil rights movement accomplished and at the extreme discipline and courage of its leaders and activists. Things go not so well, however, when Gladwell uses this extraordinary example to argue that those who champion Twitter, Facebook, and other social media tools (and, by extension, the power of the internet to forge new connections and ideas) are a bit wild-eyed and delusional.

His pushback produced some predictable retching on Twitter. My favorite critique so far, however (remarkably quick, as Gladwell’s story just came out), comes from Alexis Madrigal at the Atlantic. Madrigal offers a smart and evenhanded critique. He praises some of Gladwell’s stronger points, then makes a couple of quibbles: first, that Gladwell is wrong when he says Twitter forms only weak ties; and, second, that he overstates when he claims that networks can’t have hierarchies.


The Tight Collar: The New Science of Choking Under Pressure

The Collar

Late in May 2008, perched in superb seats a few rows behind home plate at Chicago’s Cellular Field, I took in a White Sox-Indians game with Sian Beilock, a professor of psychology at the University of Chicago who studies what is surely, other than serious injury, the most feared catastrophe in sports: the choke.

This is an opportune time to finally run this feature, for the subject of the story, University of Chicago psychology professor Sian Beilock, has just published a book, Choke: What the Secrets of the Brain Tell You About Getting It Right When You Have To. She was working on the book when I researched and wrote this story in summer and fall of 2008. It was a sort of dream assignment for me: baseball and cognitive neuroscience. I went to Chicago and visited Beilock in her lab, where she made me choke in a putting game. (I lost $5 in the deal, too, which I forgot to bill the Times for.) That evening we went to a White Sox game to see someone choke, and were not disappointed. And later that summer, I went to see the White Sox play the Red Sox in Fenway — a splendid, tense game, one of the best I’ve ever seen, in which Chicago lost even as one of its stars redeemed himself while going hitless. Meanwhile, I was introduced to a novel view of what generates or destroys performance under pressure.

Beilock, who not long ago played some high-level lacrosse at University of California, San Diego, traces her own interest in choking back to high school, when she discovered that during the tense, game-beginning face-offs, she more often gained control of the ball if she sang to herself, “to keep me from thinking too much.” Later, in grad school, it occurred to her that if you could avoid choking by engaging your brain with singing, it followed that choking must rise from what neuroscientists like to call mechanisms — that is, systematic, causal chains of brain activity.

She has spent much of her time since then exposing and exploring those mechanisms. Her labs include a putting room where she can find a way to make virtually anyone screw up putts that were easy just moments before. Her work has brought her absurdly early tenure, a rain of prizes and grants, and a flashy book contract. She is a kind of queen of choke.

Which is what brought us to Cellular Field. I’d hate to say we were wishing for someone to choke; more like waiting. And given that baseball offers a hundred openings for pressure’s effects, and that this was a tense game between teams vying for first place — the White Sox led their longtime division rivals, the Indians, by a game and a half — we could wait in confidence, knowing that at some point a player would “suffer,” as Beilock politely phrased it, “a decrement under pressure.”

The game did not disappoint. Through seven innings the pitchers dominated, and the pressure slowly rose. Then, in the eighth, the White Sox, leading 2-1, got a chance to put the game away when the Indians’ pitcher C.C. Sabathia finally tired and was replaced by Jensen Lewis, a rookie, just as the White Sox were sending up their best hitters.

Lewis, perhaps suffering a bit of a decrement himself, walked the first hitter and then surrendered a double that left runners at second and third. When White Sox slugger Jim Thome, who had already homered once, came to bat, Lewis, on orders from the bench, walked him intentionally to get to the next batter.

A certain weight — the weight of great opportunity — falls upon any hitter who steps to the plate with the bases loaded. It falls heavier when the pitcher has just intentionally walked the previous batter.
