Reducing Placebo Response: Triple Blinding & Setting Expectations

November 1, 2005

Applied Clinical Trials

Volume 0, Issue 0

Strategies for eliminating factors that influence the placebo response during the clinical trial process.

Multiple randomized clinical trials in depression and anxiety have failed to separate active drug from placebo. A number of reasons have been advanced to explain this observation, including subject and site interactions, trial design, site–sponsor interactions, and rating scale deficiencies. Key among the reasons for failure is the subject and rater interaction (the former with unstated expectations and beliefs and the latter with biases based on knowledge of the trial design and test article). The interaction between the two serves to skew the ratings in a way that reduces the apparent treatment effect, creating a so-called placebo response.

We propose a triple-blind design that additionally blinds the site's primary efficacy rater to the trial design and the subject's chart. This technique is standard in Alzheimer's trials for one of the primary efficacy scales. Combined with structured education of subjects about expectations and modified interactions between site staff and subjects, triple blinding may yield a more accurate measure of drug effect relative to placebo.

Common influences

Many trials in pain, anxiety, and depression fail because of the placebo response.1-7 In aggregate, subjects who receive a placebo may improve enough during the trial to erase the statistical difference from the active treatment arm(s). As many as 60% of depression trials fail to statistically separate the active drug from placebo, which results in the loss of millions of dollars and the failure of some compounds to advance to market.2,8 For example, of five citalopram studies for depression, only two showed a difference from placebo.9-12,18

Many reasons for the placebo effect are cited, but specific causes are difficult to quantify and have largely resisted correction by modification of subject requirements and trial design.5,6,9,13 The ethics of using a placebo arm is beyond the scope of this article. The fact that the FDA continues to require it makes continued efforts to reduce the placebo response a desirable objective.

The site–subject interaction is out of the control of the study sponsor. The interaction includes a supportive environment, assumed penetration of the blind, and unstated subject and staff expectations, all interactions that may adversely affect detecting the difference between treatment and placebo. Attempts to correct this by modifying the study design have met with mixed success, and studies continue to fail routinely. A variety of the site–subject interaction components will be discussed to set a context for the subsequent recommendations.

Subject psychological support and a therapeutic milieu

Typically, subject visits are conducted in a quiet and friendly clinic office. Subjects interact with the site staff, which usually includes a coordinator and research physician and possibly a separate psychometric rater. During the trial, subjects become increasingly familiar and comfortable with the staff and the office. This often triggers a therapeutic response in the subject.2-4,14

Since this effect occurs across all arms whether active or placebo, it is "additive" to any response from the pharmacological agent (i.e., in the setting of the clinic visit, there is a likelihood of improvement even if a subject receives no other intervention). The standard remedy is to minimize supportive and sympathetic gestures from the staff.14 These efforts are applied inconsistently across trial sites, with a patchwork of results.

Expectation of benefit

Another site–subject interaction is the unvoiced expectation of benefit on the part of the study staff and the research subject.5,6,14,15,19 This occurs for several reasons. One reason is that the site staff knows the statistical likelihood of subject improvement during clinical trials (i.e., they know that virtually all subjects regardless of treatment assignment will improve, if for no other reason than the placebo effect). Since everyone is likely to get at least a little better, this bias, in effect, predetermines the improvement.

In addition, many studies have multiple active treatment arms. For example, in a typical depression trial, these may consist of two doses of study drug and possibly an active control arm, plus the placebo, which results in a randomization ratio of three actively treated subjects for every one on placebo. With the rater's knowledge that three out of four subjects are under treatment, ratings will be biased in favor of improvement. Published studies observe that trials with more study arms are less likely to separate treatment from placebo.15,16 Sponsors have been unwilling, for reasons of increased costs and recruitment times, to increase the number of placebo patients to achieve a more balanced placebo-to-active-treatment ratio.
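The arithmetic behind that ratio can be made explicit. The following is a minimal sketch that assumes equal allocation to every arm, which the article implies but does not state; the function name is illustrative only.

```python
from fractions import Fraction


def active_fraction(active_arms: int, placebo_arms: int = 1) -> Fraction:
    """Fraction of subjects on an active treatment, assuming equal allocation per arm."""
    if active_arms < 0 or placebo_arms < 0 or active_arms + placebo_arms == 0:
        raise ValueError("need at least one arm and no negative arm counts")
    return Fraction(active_arms, active_arms + placebo_arms)


# Two doses of study drug plus one active control, against one placebo arm.
print(active_fraction(3))  # 3/4

# A simple two-arm trial for comparison.
print(active_fraction(1))  # 1/2
```

Under this assumption, each active arm added raises the rater's prior belief that any given subject is being treated.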

Penetrating the blind

Both subjects and raters speculate about which arm a subject has been randomized to. Often these guesses are based on the subject's reported or felt improvement of symptoms. Another common reason for suspecting active treatment is the report or experience of a side effect, perhaps a typical one for the drug class. Often the opposite occurs, with no apparent benefit or side effect, which convinces the subject or rater that the subject is on placebo. Sometimes the guess is correct, but often it is not, with subsequent bias of the ratings.5,17 This is called penetrating the blind. Study staff who know both the adverse event profile of the test article and the subject's reported adverse events in the chart may think that they know what the patient is taking; right or wrong, the ratings are tainted as a result.

Subject expectations

Even knowing that they might not get active treatment, many subjects report symptom improvement while taking a placebo. Getting a pill in the course of a trial initiates long-standing expectations of benefit based on prior experience in the medical setting.2-4 In addition, the consent process can sometimes over-sell the investigational product, engendering enthusiasm within the subject and setting up inflated hopes.2

It is not uncommon for the site to attempt to orient the subject's expectations with educational statements. These may include statements that the study drug may not work as intended, that the subject may get a placebo and consequently may not improve, and that symptoms should be reported honestly without attempting to "help" by accentuating the positive. There is no evidence that this practice is performed widely, uniformly between sites, or systematically within a site.

Ratings inflation

Most study designs require a minimum level of symptom severity to qualify for inclusion in the study. Measured severity tends to cluster just above the cut-off for inclusion in a non-Gaussian distribution. Worse, subsequent to randomization, scores improve in both groups to more closely resemble a normal distribution, inflating the placebo group's apparent response.

Knowledge of the cut-off score on the part of the rater apparently affects the baseline scores and muddies the subsequent distinction of the active and placebo groups. In response, some study designs have escalated the severity for inclusion so that even with the postrandomization deflation of scores there is still sufficient severity to identify a treatment effect. Another study design incorporates the opposite tack, reducing the required severity to avoid score inflation in the first place. Success of the different strategies is inconclusive at present.
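The mechanism can be illustrated with a toy simulation. This is a sketch and not the authors' analysis: the scale, the cut-off of 20, the noise levels, and the rule that a rater rounds borderline scores up to the cut-off are all assumptions made for illustration. No subject's true severity changes, so any "improvement" is an artifact of how baseline scores were obtained.

```python
import random
from statistics import mean

CUTOFF = 20        # assumed inclusion threshold on a hypothetical symptom scale
NUDGE_WINDOW = 3   # assumed: ratings this close below the cutoff get rounded up


def apparent_placebo_change(rater_knows_cutoff: bool, n: int = 20000, seed: int = 0) -> float:
    """Mean baseline-minus-follow-up rating among enrolled subjects with NO true change."""
    rng = random.Random(seed)
    baseline, followup = [], []
    for _ in range(n):
        true_severity = rng.gauss(19, 4)          # stable underlying severity
        rated = true_severity + rng.gauss(0, 2)   # noisy screening rating
        later = true_severity + rng.gauss(0, 2)   # noisy, unbiased follow-up rating
        if rater_knows_cutoff and CUTOFF - NUDGE_WINDOW <= rated < CUTOFF:
            rated = CUTOFF                        # borderline score nudged to qualify
        if rated >= CUTOFF:                       # subject is enrolled
            baseline.append(rated)
            followup.append(later)
    return mean(baseline) - mean(followup)


blinded = apparent_placebo_change(rater_knows_cutoff=False)
unblinded = apparent_placebo_change(rater_knows_cutoff=True)
print(f"apparent improvement, rater blinded to cutoff: {blinded:.2f}")
print(f"apparent improvement, rater aware of cutoff:   {unblinded:.2f}")
```

In this model both figures are positive, because selecting on a noisy score produces some apparent improvement by itself. Nudging borderline scores adds to it, which is the inflation described above.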


Since the site must interact with the subject, to some extent it will always influence the measured ratings. Any approach should moderate the site–subject interaction in a way that is practicable within the ability and scope of the study site but potentially reduce the adverse influences on the data. We suggest an approach that combines a blinded rater with standardized didactic instruction of the subject.

In many centers, the study coordinator also performs the psychometric ratings. The combination of coordinating time with the subject plus knowledge of their side effects and symptom changes can both serve to create a therapeutic relationship and suggest a treatment arm to the rater. To prevent this from happening, one tactic long used in Alzheimer's trials is to employ a separate coordinator and blinded rater.

The Clinician's Interview-Based Impression of Change (CIBIC, or more recently the CGIC) is a coprimary efficacy rating very similar to the Clinician's Global Impression of Change (CGI) score.17 It is a seven-point scale anchored in the middle at four with "no change," with ratings of "minimally," "moderately," or "very much" improved or worsened on either side. Because it is a subjective rating based on an interview of an informant and the subject, there is considerable room for influence from external factors. Many of the original Alzheimer's patients in the studies were on medications with known and expected gastrointestinal side effects. Therefore, the rater was blinded to the patient's chart and was instructed not to inquire about adverse events or other symptoms that potentially could unblind the rater.
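For readers unfamiliar with the scale, its seven points can be written out as a simple lookup. The labels come from the description above; the direction (lower scores meaning improvement) is an assumption here, since the text does not state it, and the names are illustrative.

```python
# Seven-point change scale anchored at 4 = "no change".
# Direction (1 = most improved) is an assumption, not stated in the article.
CGIC_LABELS = {
    1: "very much improved",
    2: "moderately improved",
    3: "minimally improved",
    4: "no change",
    5: "minimally worsened",
    6: "moderately worsened",
    7: "very much worsened",
}


def describe_cgic(score: int) -> str:
    """Return the verbal anchor for a score, rejecting values off the scale."""
    if score not in CGIC_LABELS:
        raise ValueError(f"score must be an integer from 1 to 7, got {score!r}")
    return CGIC_LABELS[score]


print(describe_cgic(4))  # no change
```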

A similar approach may help to reduce the placebo effect, although to be effective, a number of safeguards and ancillary procedures should be incorporated. For example, the rater should not be the coordinator or study physician and should have minimum contact with the subject. Ideally, the only contact should be during the rating of the primary outcome measure.

Although newer antidepressants and anxiolytics are less likely to have obvious blind-breaking side effects, the blind can still be broken, or presumed broken, by the rater, which biases the results. Blinding the primary efficacy rater to the chart can substantially eliminate that presumption. If possible, the rater should not interact with the subject at any time before or during participation in the study except to perform the rating. This includes avoiding the qualifying diagnostic interview and other intake procedures. Optimally, the rater would not know the study specifics, including the randomization ratio and the expected drug side effects. Operationally, the subject would encounter the rater only for the time it takes to do the primary efficacy ratings. Other site personnel would carry out all the secondary ratings and coordinator duties.

When side effects and other safety information are embargoed from the blinded rater, he or she will rely principally on the actual interview. In addition, personal contact between the rater and the subject will significantly decrease, which should minimize the subjectivity of the ratings and the personal rapport between the two. The rater should also be blinded to the threshold score for qualification of the subject into the study. Without that knowledge, inflation of baseline scores should be reduced; however, this ideal would be difficult to accomplish in many clinical settings.
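Where a site keeps subject records electronically, the embargo described above could be enforced with an allow-list that passes only the minimum a blinded rater needs. This is a sketch under stated assumptions: every field name and the function name are hypothetical, and a real site system would define its own.

```python
# Hypothetical field names; a real site system would define its own.
BLINDED_RATER_FIELDS = frozenset({"subject_id", "visit_number"})


def blinded_rater_view(record: dict) -> dict:
    """Return only the fields a blinded primary efficacy rater is allowed to see."""
    return {key: value for key, value in record.items() if key in BLINDED_RATER_FIELDS}


record = {
    "subject_id": "S-001",
    "visit_number": 3,
    "adverse_events": ["nausea"],        # could suggest the treatment arm
    "inclusion_cutoff": 20,              # could inflate baseline ratings
    "randomization_ratio": "3:1",        # could bias expectations of improvement
    "coordinator_notes": "reports feeling better",
}
print(blinded_rater_view(record))
```

An allow-list is used rather than a block-list so that any field added to the record later stays hidden from the rater by default.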

A critical corollary procedure would be to educate the subject about what to expect during the trial. Although the actual script may be left up to the site, the sponsor should specify the essential elements, much as they do with the informed consent. The intent is to help the subject understand the need for an honest report of their mood and to give them explicit permission to either improve or not improve. The site personnel would read the educational statement to the subject during the consent and review it at each succeeding visit. With reinforcement, this message should reduce the subject's anxieties about what to say in follow-up visits, increase understanding of the clinical trial process, and yield more accurate reports of the subject's actual experiences. The sponsor would incorporate steps to ensure conformity, with review of compliance made a part of the monitoring visit.


Any approach to reducing the placebo response must have some of the elements described here if studies are to improve separation of the placebo from treatment arms. Others have proposed alternative study designs for minimizing the placebo response, but none address the site–subject interactions and expectations in a systematic way. Sponsors looking to implement these proposals should begin at the outset of a clinical program by integrating the proposals into their site selection and site staff training. Changing site behavior is not easy, and the desired activities need to be reinforced at multiple opportunities. With education on both the rationale behind the proposals and specific site training, the site staff will more likely be willing and prepared to act accordingly.

Louis Kirby,* MD, is founder and medical director of Pivotal Research Centers, 13128 N. 94th Drive, Ste. 200, Peoria, AZ 85381, (623) 815-9714. Steve Borwege, MA, and James Christensen, MA, are psychometric raters with Pivotal Research Centers, Mesa, AZ. Christopher Weber, PhD, is former site manager with Pivotal Research Centers and is now with i3 Research. Craig McCarthy, MD, is principal investigator with Pivotal Research Centers, Mesa, AZ.


References

1. W. Brown, "The Placebo Response in Depression: A Modest Proposal," Harvard Mental Health Letter, 12 (6) (1995).

2. L. Lipton, "Placebo Response Can Confound Drug Trials," Psychiatric News (July 2000).

3. G. Andrews, "Placebo Response in Depression: Bane of Research, Boon to Therapy," The British Journal of Psychiatry, 178, 192–194 (2001).

4. A.M.E. Ker, J. Stadler, M. Viljoen, "The Role of Placebo and the Placebo Response in Clinical Research," Geneeskunde: The Medicine Journal, 42 (8) (September 2000).

5. D.O. Antonuccio, D.D. Burns, W.G. Danton, "Antidepressants: A Triumph of Marketing Over Science?" [Commentary on The Emperor's New Drugs: An Analysis of Antidepressant Medication Data Submitted to the U.S. Food and Drug Administration], Prevention & Treatment, 5 (25) 1–21 (2002).

6. R. Lane, "Placebo Response to Antidepressants," German Journal of Psychiatry, 2 (3) 1–11 (1999).

7. D.B. Oosterbaan, A.J. van Balkom, P. Spinhoven, R. Van Dyck, "The Placebo Response in Social Phobia," Journal of Psychopharmacology, 15 (3) 199–203 (2001).

8. M.E. Thase, "Antidepressant Effects: The Suit May Be Small, But the Fabric is Real" [Commentary on The Emperor's New Drugs: An Analysis of Antidepressant Medication Data Submitted to the U.S. Food and Drug Administration], Prevention & Treatment, 5 (32) (2002).

9. I. Kirsch, T.J. Moore, A. Scoboria, S.S. Nicholls, "The Emperor's New Drugs: An Analysis of Antidepressant Medication Data Submitted to the U.S. Food and Drug Administration," Prevention & Treatment, 5 (23) (2002).

10. T.P. Laughren, Recommendation for Approval Action for Celexa (citalopram) for the Treatment of Depression [Memorandum], U.S. Food and Drug Administration, 1–5 (2 July 1998).

11. Forest Laboratories, Safety Update, Regulatory Status Update, World Literature Update [Correspondence], U.S. Food and Drug Administration, 1–108 (22 May 1998).

12. S. Lee, J.R. Walker, L. Jakul, K. Sexton, "Does Elimination of Placebo Responders in a Placebo Run-in Increase the Treatment Effect in Randomized Clinical Trials? A Meta-analytic Evaluation," Depression and Anxiety, 19, 10–19 (2004).

13. D.L. Zimbroff, "Placebo Response in Antidepressant Trials" [Letter to the editor], The British Journal of Psychiatry, 178, 573–574 (2001).

14. J. Fritze and H.J. Möller, "Design of Clinical Trials of Antidepressants: Should a Placebo Control Arm Be Included?" CNS Drugs, 15 (10) 755–764 (2001).

15. H. Brody, "The Placebo Response: Recent Research and Implications for Family Medicine," The Journal of Family Practice, 49, 649–654 (2000).

16. L.P. Rehm, "How Can We Better Disentangle Placebo and Drug Effects?" [Commentary on The Emperor's New Drugs: An Analysis of Antidepressant Medication Data Submitted to the U.S. Food and Drug Administration], Prevention & Treatment, 5 (31) (2002).

17. M. Oremus, A. Perrault, L. Demers, C. Wolfson, "Review of Outcome Measurement Instruments in Alzheimer's Disease Drug Trials: Psychometric Properties of Global Scales," Journal of Geriatric Psychiatry & Neurology, 13 (4) 197–205 (2000).

18. P. Leber, Approval of Action on Forest Laboratories, Inc. NDA 20-822 Celexa (citalopram HBr) for the Management of Depression [Memorandum], U.S. Food and Drug Administration, 1–5 (2 July 1998).

19. R. Fuente-Fernandez and A. Stoessl, "The Biochemical Bases of the Placebo Effect," Science and Engineering Ethics, 10 (1) (2004).