Is it Prozac? Or Placebo?

New research suggests that the miracles promised by antidepressants may be largely due to the placebo effect.

Janis Schonfeld recalls the events that started her on her recovery from 30 years of depression with snapshot clarity: the newspaper ad she saw in 1997 seeking subjects for an antidepressant study; the chair she was sitting in when she called UCLA’s Neuropsychiatric Institute; the window she was looking out of when she first spoke with Michelle Abrams, the research nurse who shepherded her through the trial. She remembers being both nervous and hopeful when she arrived at the institute, and a little uncomfortable when a technician put gel on her head, attached a nylon cap shot through with electrodes, and recorded her brain activity for 45 minutes. But most of all she remembers getting the bottle of her new pills in a brown paper bag from the hospital pharmacy. “I was so excited,” she told me. “I couldn’t wait to get started on them.”

Within a couple of weeks, Schonfeld, then a 46-year-old interior designer, got quickly and dramatically better, able once again to care for herself and her husband and daughter, no longer so convinced of her own worthlessness that she’d consider killing herself. For the next two months, she came back weekly for more interviews and tests and EEGs. And by the end of the study, Schonfeld seemed to be yet another person who owed a nearly miraculous recovery to the new generation of antidepressants — in this case, venlafaxine, better known as Effexor.

But during her final visit to the institute, one of the doctors directing the research sat her down to deliver some disturbing news. “He told me I hadn’t been taking a medicine at all. I’d been on a placebo. I was totally shocked.” So was nurse Abrams. Both women knew that half the test subjects were getting placebos and that Schonfeld might be among them. But not only was she feeling better — she’d even experienced nausea, a side effect commonly associated with Effexor, so they had each assumed that she was in the drug group. Schonfeld was so certain of this that at first she didn’t believe the doctor. “I said to him, ‘Are you sure? Check those records again.’” But there was no doubt. The brown bag contained nothing but sugar pills. Which didn’t mean, he was quick to add, that she was making anything up, but only that her improvement couldn’t possibly be due to the pharmacological effects of the pills.

Schonfeld’s experience is hardly unique, although you wouldn’t know it from the ubiquitous advertisements for antidepressants — nor, if you were a doctor, would you know just how common it is from reading the medical journals. Psychiatrists and other mental-health professionals (I am a practicing therapist) know that any given antidepressant has only about a 50 percent chance of working with any given person. But what most people — patients and clinicians alike — don’t know is that in more than half of the 47 trials used by the Food and Drug Administration to approve the six leading antidepressants on the market, the drugs failed to outperform sugar pills, and in the trials that were successful, the advantage of drugs over placebo was slight. As it would hardly help drug sales, pharmaceutical companies don’t publish unsuccessful trials, so University of Connecticut psychology professor Irving Kirsch and his co-authors used the Freedom of Information Act to extract the data from the FDA. What they found has led them, and other researchers who’ve investigated antidepressants’ relatively poor showing against placebos, to conclude that millions of people may be spending billions of dollars on medicines that owe their popularity as much to clever marketing as to chemistry, and suffering serious side effects — not to mention becoming dependent on drugs for healing they might be able to accomplish without them — in the bargain.

But many doctors remain convinced that antidepressants do work, that the flaw lies not in the capsules themselves but in the studies used to evaluate them. Clinical trials can consume half a drug’s patent life. And so pressure to bring the medicine to market leads researchers to adopt strategies — such as recruiting people whose depression is too mild to yield powerful results — better suited to clearing regulatory hurdles than generating useful scientific knowledge. That, and not the power of suggestion, is why antidepressants barely outperform placebos, these scientists say.

While some of this debate breaks down along familiar lines — psychologists resisting the tendency to reduce all mental suffering to biology versus psychiatrists more comfortable with matter than spirit — no one disputes that the statistics about antidepressant efficacy are dismal, and that they do little to clarify the question of whether people who get better on antidepressants do so because they are taking Prozac or Zoloft or because they are taking a pill — any pill.

BEFORE SCIENCE TOOK OVER the healing arts and focused physicians’ attention on biological causes of disease, mystics and alchemists and flimflam artists alike offered potions and powders to the ailing. Some of these remedies were bizarre, like usnea — the moss from the skull of a hanged man, used to treat nervous illness — and others merely fanciful, like powdered unicorn horn. Some were truly dangerous, like calomel, a mercury-based laxative that may have hastened George Washington’s death from the cold he famously caught while riding on a rainy night. Some — notably cinchona bark, the source of quinine — turned out to have actual healing powers, but there were so few of these that in 1860 Oliver Wendell Holmes, the doctor who fathered a Supreme Court justice, wrote, “If the whole materia medica could be sunk to the bottom of the sea, it would be all the better for mankind and all the worse for the fishes.”

But Holmes was not entirely correct. Despite their lack of specific healing properties, many ancient medicines worked — or at least people often got better after taking them, as they still do. Most illnesses remit as part of their natural course, but the placebo effect occurs far too frequently to be mere coincidence. No one really understands why, but doing something for an illness — especially if that something involves a pill — is usually better than doing nothing at all.

There’s no money to be made in sugar pills, so drug companies, which fund much of the drug research in the United States, have not looked very hard into this question. But placebos do figure prominently in their studies — as a stalking-horse for the potential new medications. Because any drug may well be acting as a placebo, it is not a sufficient test simply to give a new compound to sick people to see if they get better. To rule out the possibility that patients are recovering because of faith or a good sales pitch, and to ensure that the drug works by virtue of its biochemical properties, the FDA has, since the late 1970s, required that all drugs be tested against placebos. Typically, between 35 and 45 percent of people given placebos improve. If a candidate drug outperforms a placebo in two independent studies, and if it does so without untoward side effects, the FDA will approve it for use.

The FDA does not consider, however, the relative advantage that new drugs show over placebo. So long as the difference is statistically significant — meaning that the results are not merely random — a drug can be advertised as “safe and effective” whether clinical trials proved it to be 5 percent or 50 percent or 500 percent more effective than an inert pill. In the case of the Prozac generation of antidepressants, marketing efforts have paid off wildly. Some 92 million prescriptions were written for the top six antidepressants in 2002, a ubiquity that has, far more than any research, helped to bolster the theory that depression is the result of a biochemical imbalance that the drugs cure — a theory that has not been proved, despite more than 40 years of trying.
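
To see why that matters, consider a back-of-the-envelope simulation (a hypothetical sketch in Python, with invented numbers roughly the size of those Kirsch’s analysis, described below, turned up, not data from any actual trial): give a trial enough patients, and even a tiny average advantage becomes statistically significant.

```python
# Hypothetical sketch: statistical vs. clinical significance.
# All numbers are invented for illustration; they come from no real trial.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 400  # patients per arm; large samples make small effects "significant"

# Simulated improvement scores on the 52-point Hamilton depression scale
placebo = rng.normal(loc=8.0, scale=7.0, size=n)  # mean 8-point improvement
drug = rng.normal(loc=10.0, scale=7.0, size=n)    # mean 10-point improvement

t_stat, p_value = stats.ttest_ind(drug, placebo)
print(f"difference: {drug.mean() - placebo.mean():.1f} points, p = {p_value:.4f}")
# A p-value below 0.05 lets the drug be called "safe and effective,"
# even though a two-point difference may mean little to a patient.
```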

But critics, psychologists and psychiatrists alike, have been suspicious of the drugs since they were introduced, and it turns out they have some striking data on their side. “In the early ’90s, many of our psychiatric colleagues felt that patients did not do as wonderfully as all these reports of ‘magic pills’ would suggest,” recalled psychologist Roger Greenberg, a professor at the State University of New York’s Upstate Medical University. “So we went back to the literature.” Greenberg [no relation to the author] and his team analyzed all the data from Prozac’s clinical trials that had been published. They determined that the new drug showed negligible advantage over earlier antidepressants and that two-thirds of the patients would do as well or better with placebos.

Greenberg started with material hidden in the plain light of professional journals, but a bit of detective work by Irving Kirsch and his research team has turned up even more disturbing evidence about the low rates of antidepressant effectiveness. Kirsch is a soft-spoken and slight man who has spent more than 30 years studying the placebo effect. He has a native suspicion of biological explanations of depression and sees in the placebo effect the potential for self-healing without resorting to expensive and possibly dangerous drugs. While many researchers duplicated and refined Greenberg’s initial findings, Kirsch knew that there was a body of results that no one was looking at. Manufacturers don’t have to publish all their data in journals, but they do have to report every trial to the FDA. “This was all so controversial,” he told me. “And the defenders claimed that our data didn’t tell the whole story. So we figured, why not use the Freedom of Information Act to investigate?”

Kirsch requested the complete files on the six most widely prescribed antidepressants approved between 1987 and 1999: Prozac, Zoloft, Paxil, Effexor, Serzone, and Celexa — drugs that together had $8.3 billion in worldwide sales in 2002. Within a month, he had an even less drug-friendly story than the one told in the journals. In “The Emperor’s New Drugs,” published in the July 2002 issue of the American Psychological Association’s Prevention & Treatment, Kirsch’s team presented their findings: Of the 47 trials conducted for the six drugs, only 20 of them showed any measurable advantage of drugs over placebos, a much lower number than turns up in published research. This was not entirely unexpected — “publication bias” has long been known to be a problem in assessing the effectiveness of drugs — and Kirsch is quick to point out that even these meager numbers “leave no doubt that there is a difference between drug and placebo. But I was surprised at how small the difference was in clinical terms. The studies all used the same measure” — the Hamilton Depression Rating Scale, the nearly universal way clinicians assess a patient’s level of depression — “so it was easy to see how much clinical improvement there really was.” And there really wasn’t much at all: The average patient on drugs improved by about 10 points on the 52-point Hamilton, while a placebo patient improved by a little more than eight. “A two-point difference on the Hamilton — it’s just clinically meaningless. Trivial,” Kirsch says. “You can get that from having an improvement in sleep patterns, and if one of the side effects of the drugs is to induce drowsiness, the whole difference could be right there.” (Indeed, critics say the Hamilton is skewed toward physical symptoms of depression, those most likely to be affected by medication.)

Kirsch received copies of memos indicating that regulators had, in at least one case, raised questions about clinical significance. In 1998, Paul Leber, then director of the FDA’s Division of Neuropharmacological Drug Products, wrote of Celexa, “There is clear evidence from more than one adequate and well-controlled clinical investigation that [Celexa] exerts an antidepressant effect. The size of that effect, and more importantly, the clinical value of that effect is not something that can be validly measured, at least not in the kind of experiments conducted.” A deputy agreed: “It is difficult to judge the clinical significance of this difference,” he wrote, but added that this shouldn’t be an impediment for bringing Celexa to market because “similar findings for…other recently approved antidepressants have been considered sufficient.”

Kirsch argues that by the FDA’s own logic, it’s not even clear if the drugs’ small advantage is truly pharmacological. In trials, every drug response is assumed to be partially a placebo response, and the drug effect is only the additional benefit — in the case of the antidepressant studies, less than two points out of ten, or 20 percent of the overall improvement. This means, he said, that “80 percent of the drug effect is the placebo effect.” And even the remaining 20 percent could be due to placebo effects enhanced by the drugs’ side effects, amplified by the way the trials are conducted. “A person is brought into a clinical trial and told, ‘You may be getting placebo or drug. The real drug has the following side effects.’ Put yourself in this position. You’re certainly curious about what you’re getting. And you want to get better. You notice that your mouth is getting dry, which is one of the side effects they told you about, and that leads you to conclude that you’ve been assigned to the drug condition. Presumably, a placebo works by affecting a person’s expectancy about what is going to happen. If you know you’ve been assigned to the drug condition, you may have a stronger placebo effect because you’re now more convinced that you’re getting something that’s going to help you.” Greenberg’s research shows that both patients and raters in clinical trials often “break the blind” by guessing which condition they have been assigned and that the most powerful drug effects are reported when this occurs. The guesses don’t even have to be accurate. Janis Schonfeld experienced side effects on placebo, and this was part of what led her (and nurse Abrams, who was scoring the Hamilton) to assume she was on drugs. According to Kirsch’s theory, Schonfeld’s strong response (and Abrams’ rating of her progress) may have come about because they thought — due to symptoms caused by the power of suggestion — that she was on the drug.
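
Kirsch’s 80 percent figure is simple arithmetic on the average Hamilton improvements described above; here is a minimal sketch, using those rounded numbers rather than trial-level data:

```python
# Back-of-the-envelope version of Kirsch's arithmetic, using the rounded
# Hamilton averages cited above (about 10 points of improvement on drug,
# a little more than 8 on placebo); not trial-level data.
drug_improvement = 10.0     # average improvement on antidepressants
placebo_improvement = 8.0   # average improvement on sugar pills

# Trials assume the drug response includes a placebo response, so the
# purely pharmacological effect is only the difference between the two.
drug_specific_effect = drug_improvement - placebo_improvement  # about 2 points

placebo_share = placebo_improvement / drug_improvement
print(f"placebo share of the drug response: {placebo_share:.0%}")  # 80%
```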

Kirsch thinks it is possible to test his theory, but only with a radical redesign of the method used to validate drugs. Instead of two groups, a study would have four. Researchers would tell two groups of patients they were getting placebo and the other two that they were being given the drugs. But only half the patients would be told the truth. And the placebo would be a nonpsychoactive substance designed to mimic at least some of the side effects of the real drug. This way researchers could look directly at the role of suggestion in response to both placebo and drug. It is, however, currently considered unethical to deceive patients in this fashion.
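
Kirsch’s proposed layout matches what the research literature calls a balanced placebo design; the following schematic (a sketch of the logic, not a protocol from any actual study) shows its four cells:

```python
# Schematic of the four-cell trial Kirsch proposes (a design known in the
# research literature as a "balanced placebo design"); a sketch of the
# logic only, not a protocol from any actual study. The placebo would be
# an "active" one, mimicking some of the drug's side effects.

cells = [
    ("told drug",    "given drug",    "truthful"),
    ("told drug",    "given placebo", "deceptive"),
    ("told placebo", "given drug",    "deceptive"),
    ("told placebo", "given placebo", "truthful"),
]

for told, given, arm in cells:
    print(f"{told:12} | {given:13} | {arm}")

# Comparing cells by what patients are told isolates expectation;
# comparing by what they actually receive isolates pharmacology.
```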

But there is plenty of indirect evidence for Kirsch’s position, including a peculiar recent finding: Both placebo response and drug response for antidepressants have steadily increased over time, so much so that the best predictor of whether research shows positive results is the year the study was published. This result has yet to be explained, but Kirsch thinks it indicates the way the widespread publicity about antidepressants shapes patients’ expectations. “It suggests that over time the drugs have gotten more potent for reasons other than chemistry. I would suspect that it’s because of increased marketing.” Kirsch explains the way that marketing can capitalize on a central mechanism of depression: “The hopelessness of depression is the expectancy that a terrible state of affairs is not going to get better. Now you give somebody a treatment that’s been touted as the cure for the worst thing in their lives. What that does is to instill a hope, which is the opposite of depression.” Kirsch’s theory leads to an unsettling conclusion: Drug companies may have marketed their antidepressants beyond what statistics justify, but the barrage of advertising may also have inadvertently amplified the placebo effect and thus increased the effectiveness of the drugs they are selling.

WHEN I FILL OUT A treatment report explaining to an insurance company why they ought to pay for someone’s therapy, I am asked for a diagnosis. If the patient is depressed and not on antidepressants, I often must explain why not. Were it not for these bureaucratic demands — and for all the miracle-drug testimony found in advertising and casual talk — the FDA statistics would hardly be surprising or disturbing, because, like many clinicians, I have come to see that the effects of Prozac and its cousins are just about as pallid as those numbers would predict: The drugs are not panaceas, not solid evidence that depression is a chemical imbalance, but have proved to be moderately useful for some people (and moderately harmful to others). No scientist doubts the existence of the disconnect between the data and the way antidepressants are perceived and used, but Kirsch’s theory about it is far from the industry standard. Indeed, some simply dismiss it out of hand — like Donald Klein, a renowned psychiatry professor at Columbia University’s New York State Psychiatric Institute, who thinks that Kirsch’s work is so biased against antidepressants that, though asked, he declined to be among the respondents to “The Emperor’s New Drugs” — “for the same reason,” he told me, “that I don’t argue with creationists.”

Klein, who has conducted antidepressant trials for pharmaceutical companies, acknowledges that the data can leave the impression that the drugs don’t work very well. But he is among those who think this says more about the trials than the drugs. According to Klein, the FDA standard — two successful trials without untoward side effects — won’t elicit a full body of knowledge about new drugs, and may even limit what the tests can tell us. “The job of the pharmaceutical company is to get FDA approval,” he says. “So you want to go in with a dose which is effective but doesn’t create side effects. It’s a real problem. Drugs are not being tested for their optimum efficacy.” Nor, given this strategy, are they tested for their maximum side effects — which may be why reports of agitation and suicidal impulses in excess of what the trials found have dogged the Prozac generation of antidepressants since they were introduced.

Clinical trials can become a game for drug companies to win rather than a venue for generating scientific knowledge. And it’s a game that establishes perverse incentives, in part because a drug’s limited patent life — usually 20 years — starts running before its clinical trials, which can take a decade, even begin. “We’re talking real money here,” says Klein, noting it takes between $300 million and $500 million to develop a new drug. Klein told me that within the industry the clinical trial period is thought to cost “a million dollars a day. That adds some pressure for finishing trials fast.”

Despite the bottom-line approach, “there are lots and lots of compounds that get evaluated and never approved,” notes Lawrence Price, a psychiatrist who directs research at Brown University’s Butler Hospital. A more nuanced criterion for a successful trial is possible, but, says Price, “it would just take forever. It’s not that there aren’t important questions, but you would get so bogged down in trying to nail down the details that you would just never make any progress with newer compounds.”

You also might not make any progress if you waited around for severely depressed people to test drugs on. “The problem with antidepressant studies,” according to Klein, “is that anything that can be confused with ordinary unhappiness gets in” — which means that subjects in clinical trials are insufficiently depressed, too close to normal to show dramatic improvement. Price, who has conducted clinical trials of antidepressants for 25 years, points out that recruitment techniques like the one that attracted Janis Schonfeld to UCLA can lead to a skewed sample. “If you go out and advertise in the newspaper for depressed people,” says Price, “you are going to get less ill people than if you are taking people who are brought in via the emergency room.”

Relatively high-functioning, moderately depressed people, those most likely to enroll in and finish a trial, are, as it happens, more likely to register a high placebo response. There are no biochemical markers of depression, no blood test or X-ray that confirms its presence, so it can be judged only by its appearance — which means, in trials, by the Hamilton, a test of subjective states scored by clinicians whose employers are paid up to $10,000 for each patient who completes a study. “If the investigator has directed his/her research assistant to rate liberally on the Hamilton,” says Price, “then you are going to have more people meeting the entry criterion,” typically, at least 17 points — the line dividing mildly and moderately depressed. (One of Price’s colleagues estimates that Hamilton scores are inflated by up to five points for clinical trials.)

The drug companies, of course, want more than speedy trials. They want successful ones. “Placebo is a killer for them,” Price explains, “because if they spend $40 million on a trial and get a placebo response rate of 50 percent, then they’ve just wasted that $40 million. There’s a huge interest in trying to address the high placebo response rate in depression. How can it be lowered? How can you identify the sample of people in whom these compounds are really going to work?”

The study in which Janis Schonfeld enrolled may provide some answers to these questions — although somewhat inadvertently. Hoping to eliminate the trial-and-error method used to match patients with antidepressants, the UCLA doctors were using electroencephalograms to determine if there was some neurochemical difference between the brains of people who respond to Effexor and those who respond to Prozac. The researchers found the differences they were looking for, but they also got a surprise. The EEGs of placebo responders were different from those of the drug responders, and similar to each other, a phenomenon that had never before been observed and that may be the first step to identifying the neurochemistry of the placebo response. This was welcome news to the drug companies, who’d like nothing more than to eliminate placebo responders from their studies.

Take away the people most likely to show a strong placebo effect, include the people most likely to respond to a drug, and the statistics become more favorable for the manufacturers and provide less ammunition for critics like Greenberg and Kirsch. Psychologist David Antonuccio, a professor at the University of Nevada, claims that the deck is already stacked. In addition to publication bias, inflated Hamilton scores, and broken blinds, he points to the placebo washout period that starts every clinical trial: All patients are given a week of placebo treatment, and the strongest responders are eliminated from the study. The idea, of course, is to get a more accurate estimate of the true drug effect, but “if you put everybody on an antidepressant and washed out everyone who responds, people would say, ‘That’s a very biased strategy against the drugs.’ Well, I believe we have a strategy here that’s biased against the placebo condition.”

AFTER JANIS SCHONFELD was debriefed, she was given her reward for participating in the study: a one-year supply of Effexor. She didn’t consider not taking the drug. “They told me that I’d gotten a good start, that if I’d done well on placebo, I’d probably do better on the drug.” And so she did. “After about a week or maybe two weeks it was like a fog was lifted from my eyes. I realized I had spent much of the last 20 years in that fog.” Schonfeld took Effexor for two and a half years and then “one day I just thought, ‘You know, I don’t think I need this medication anymore.’ I spent three weeks weaning off of it. That was about a year and a half ago, and I haven’t really felt that I needed it since.” She emphatically rules out the possibility that her improvement was a result of placebo effects, amplified or otherwise.

To Kirsch, Schonfeld’s is a case of lost opportunity. “Why not say to her, ‘You did this’? People respond the way they were expecting to respond, so why not work on that expectation? Why not teach her the strategies that she can use to make herself feel better?” Antonuccio says, “Placebo is a valid intervention in and of itself,” adding that people like Schonfeld have ample contact with trained staff during trials, which may itself be what accounts for the high placebo rates. “It’s possible that psychological treatments are mostly placebo as well,” he says — not, as he is quick to add, that there’s anything wrong with that. “We just ought not to see the placebo effect as some sort of inferior response or condition.”

But even though, as Kirsch notes, “more placebos have been administered to research participants than any single experimental drug,” they remain poorly understood and used, for the most part, only inadvertently and haphazardly. The discovery of biological underpinnings of the placebo effect may change this, as drug researchers grasp the potential of turning yet another neurochemical pathway into a pharmaceutical market by developing a placebo drug. Bizarre as this sounds, it may be the only incentive that will lead a profit-driven health care industry toward an understanding of humanity’s oldest means of healing.
