Gender Bias in the Clinical Evaluation of Drugs

August 13, 2020
Younes Benjeaa

Yves Geysels

Gender remains one of the most underappreciated variables in the clinical development of drugs.

It is increasingly apparent that many physiological and pathological functions, as well as patterns of gene expression, differ between women and men, and gender differences have also appeared in the outcomes of treatments.1 Over the last decades, alarming evidence of gender-based differences in the safety profiles of treatments has accumulated. A study including 513,608 patients estimated that women have a 1.5- to 1.7-fold greater risk of developing adverse drug reactions than men.2 In 2001, a report from the US General Accounting Office (GAO) observed that 70% of drugs withdrawn from the market between 1997 and 2000 presented greater health risks for women.3 In addition, data suggest that women are overdosed: the dose of zolpidem, for example, was reduced by 50% in women after studies showed that women have 5 times the risk of driving impairment that men have after 10 mg of zolpidem.4 Gender differences have also been observed in the effectiveness of treatments: 20 years after the Physician’s Health Study (a clinical trial that excluded women), a large placebo-controlled primary-prevention trial involving 39,876 women demonstrated that daily low-dose aspirin has no significant effect on the risk of myocardial infarction or death from cardiovascular causes in women, in contrast to the results in men.5

These differences in treatment outcomes can be explained by well-studied gender differences in all phases of pharmacokinetics7,8 and by the growing pharmacodynamics literature describing gender differences in drug-receptor affinity, receptor density, and signal transduction pathways.7,8,9,10,11,12 Despite this evidence, gender is still one of the most underappreciated variables in the clinical development of drugs. To address this shortcoming, it is important to clearly define the role of clinical trials in identifying gender differences.

Enrolling an appropriate number of women

After the thalidomide disaster of the 1960s, it became common practice to protect fetuses from clinical trials rather than to protect them through well-controlled clinical studies. In 1977, the FDA excluded “women of child-bearing potential” from Phase I and early Phase II clinical research.13 This 1977 FDA guideline was, however, applied to all pharmaceutical research, and the expression “woman of childbearing potential” was defined too broadly as any woman capable of becoming pregnant, regardless of her sexual activity, use of contraceptives, sexual orientation, the possible sterility of her partner, or even her desire to have a child. Additionally, several publications indicated that teratogenicity can also be transmitted via sperm.14,15,16,17 This dangerous ban remained in force until 1993, and countless drugs were brought to market through clinical research that underrepresented women, exposing women (to this day) to drugs that were tested less on them, or not at all. The 1993 FDA guidance reversed the ban and recommended that clinical studies include men and women “in numbers appropriate to allow the detection of clinically significant gender differences in drug responses.”14,18

This raises several questions. Firstly, what is an appropriate enrollment of women? Gender bias is a complex problem and will not necessarily be solved simply by achieving a 50:50 distribution of participants’ gender. Beyond raw enrollment rates, a participation to prevalence ratio (PPR) above 0.8 indicates appropriate enrollment of women and accounts for gender differences in disease prevalence. At the beginning of the new millennium, the GAO suggested that women are “sufficiently represented in clinical trials”19 but several authors disagree.20,21,22 Regarding late-phase clinical trials, Poon et al. estimated in 2013 that the participation of women between 2007 and 2009 was 43.3%, and that 3.5% of the clinical trials still did not specify the sex of participants.22 In 2020, Jin et al. revealed that among 740 cardiovascular trials conducted between 2010 and 2017, only 38.2% of participants were women. Furthermore, a participation to prevalence ratio below 0.8 was found in arrhythmia, coronary heart disease, acute coronary syndrome, and heart failure trials.23
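As an illustration of the PPR threshold mentioned above, the short Python sketch below computes the ratio as the share of women among trial participants divided by the share of women among patients with the disease, and checks it against 0.8. The function names and the trial figures are hypothetical and chosen only for the example; they do not come from any of the cited studies.

```python
def participation_to_prevalence_ratio(women_enrolled, total_enrolled,
                                      women_share_of_disease):
    """Share of women in the trial divided by their share of the disease population."""
    if total_enrolled <= 0:
        raise ValueError("total_enrolled must be positive")
    if not 0 < women_share_of_disease <= 1:
        raise ValueError("women_share_of_disease must be in (0, 1]")
    trial_share = women_enrolled / total_enrolled
    return trial_share / women_share_of_disease


def is_adequately_represented(ppr, threshold=0.8):
    """Apply the 0.8 cut-off discussed in the text."""
    return ppr > threshold


# Hypothetical trial: 300 women among 1,000 participants, for a disease
# in which women make up 50% of patients (illustrative numbers only).
ppr = participation_to_prevalence_ratio(300, 1000, 0.50)
print(round(ppr, 2))                   # prints 0.6
print(is_adequately_represented(ppr))  # prints False
```

In this made-up case, a trial with 30% women looks far less adequate once prevalence is taken into account than the raw enrollment rate alone would suggest.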

Secondly, how early in drug development should the enrollment of women be a concern? Pinnow et al. estimate that only 30.6% of Phase I trial participants are women and that 34.1% of early trials enroll men exclusively.24 These findings indicate that the data emerging from early clinical trials, including dose tolerability, the dose and use of a drug, metabolic and pharmacologic data, side effects associated with increasing doses, and even the choice of the investigational drug tested in large-scale trials, are based mainly on data from men. Still, is Phase I early enough to worry about avoiding gender bias? Most of our fundamental knowledge of drugs comes from non-human models, including animals, tissues, and cell lines, but about 80% of non-clinical studies use only male animals.25 The differences in drug response between female and male animals could forecast differences in treatment outcomes between female and male patients. In cardiotoxicity studies of dofetilide in rabbits, 100% of females vs. 30% of males developed severe cardiac arrhythmias, and female rabbits developed cardiac arrhythmias at doses 50% lower than those given to males.26 This can be explained by the protective action of testosterone, which shortens baseline QT intervals.27 Since women are more sensitive to QT interval prolongation than men,28 testing drugs only on male animals could lead to a substantial underestimation of cardiotoxicity.

Gender-based analyses

Even if the rate of enrollment of women is appropriate, pooling data from both genders may still yield inexact results and problems of reproducibility. Several theoretical models and clinical trials have demonstrated that pooling data from women and men can mask important differences between the sexes in baseline data, in treatment response, and in sex × treatment interactions, and that it leads to biased results that are suited neither to women nor, strictly speaking, to men.29
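The masking effect described above can be shown with simple arithmetic. The Python sketch below uses entirely hypothetical event counts (not taken from any real trial) in which a drug lowers the event rate in men and raises it in women by the same amount, so that the pooled analysis shows no effect at all.

```python
# Hypothetical (events, participants) per arm; illustrative numbers only.
trial = {
    "men":   {"treated": (50, 500),  "control": (75, 500)},
    "women": {"treated": (100, 500), "control": (75, 500)},
}


def risk_difference(arms):
    """Event rate in the treated arm minus event rate in the control arm."""
    t_events, t_n = arms["treated"]
    c_events, c_n = arms["control"]
    return t_events / t_n - c_events / c_n


def pool(trial):
    """Add up events and participants across sexes, arm by arm."""
    pooled_arms = {}
    for arm in ("treated", "control"):
        events = sum(trial[sex][arm][0] for sex in trial)
        n = sum(trial[sex][arm][1] for sex in trial)
        pooled_arms[arm] = (events, n)
    return pooled_arms


rd_men = risk_difference(trial["men"])
rd_women = risk_difference(trial["women"])
rd_pooled = risk_difference(pool(trial))

print(f"men:    {rd_men:+.2f}")     # men:    -0.05
print(f"women:  {rd_women:+.2f}")   # women:  +0.05
print(f"pooled: {rd_pooled:+.2f}")  # pooled: +0.00
```

The example is deliberately extreme; in practice the pooled estimate would usually sit somewhere between the two subgroup effects rather than cancel exactly.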

The Digitalis Investigation Group clinical trial is a prime example of the complexity of subgroup analysis. In 1997, the results of the digoxin trial were published, and the outcome was positive. In 2002, Rathore et al. obtained a public-use copy of the Digitalis Investigation Group database and reanalyzed the data by gender, and the results were alarming. The results for men were identical to the original results, but in women, digoxin significantly increased mortality, and the digoxin-associated reduction in the rate of hospitalization for heart failure was smaller.30 The results of this post hoc analysis were not reproduced in large observational studies.31,32 When gender-based analyses of randomized controlled trials (RCTs) and observational studies are inconsistent, confusion may arise. On one hand, post hoc analyses of RCTs have less statistical power to detect an interaction between gender and effectiveness of treatment, even less power to detect an interaction between gender and uncommon adverse events, and are more prone to type I errors (false positives) when testing for interaction. On the other hand, observational studies are not randomized and are biased by unmeasured confounders that affect the interaction analysis. To mitigate this weakness of the observational design, propensity-matched analyses reduce the bias that confounding variables introduce into an estimate of the treatment effect. To avoid overinterpretation, gender-based analyses must therefore be performed, but under very strict conditions:

  1. Ensure that there are sufficient numbers of subjects in both gender subgroups.
  2. Provide a rationale for performing the subgroup analysis.
  3. Perform a statistical test of interaction between gender subgroups, which is necessary but not sufficient.
  4. Adjust p-values for the number of comparisons being made.
  5. Emphasize overall findings: results of gender-based analyses should only be used to generate hypotheses that should be confirmed through post-marketing (larger-scale) studies.

In 2007, it was reported that among RCTs performing a gender-based subgroup analysis, only 35% performed a proper subgroup analysis.33
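To make conditions 3 and 4 above more concrete, here is a minimal Python sketch under stated assumptions. It compares the log risk ratios of two subgroups with a z-test, which is one common way to test for interaction, and then applies a Bonferroni correction, which is only one of several possible adjustments for multiple comparisons. The event counts, the two extra p-values, and the function names are hypothetical, and the sketch does not reproduce the analysis of any cited study.

```python
import math


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Log risk ratio (treated vs. control) and its standard error."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    se = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    return log_rr, se


def interaction_test(subgroup_a, subgroup_b):
    """z-test for the difference between two subgroup log risk ratios."""
    log_rr_a, se_a = log_risk_ratio(*subgroup_a)
    log_rr_b, se_b = log_risk_ratio(*subgroup_b)
    z = (log_rr_a - log_rr_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value, normal approximation
    return z, p


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: (events treated, n treated, events control, n control)
women = (100, 500, 75, 500)
men = (50, 500, 75, 500)

z, p_interaction = interaction_test(women, men)

# Suppose two other subgroup comparisons were also made (hypothetical p-values).
adjusted = bonferroni([p_interaction, 0.04, 0.20])

print(f"z = {z:.2f}, unadjusted p = {p_interaction:.4f}")
print("adjusted p-values:", [round(p, 4) for p in adjusted])
```

A significant interaction in such a test only says that the treatment effect appears to differ between the subgroups; as the conditions above stress, it remains a hypothesis to be confirmed in larger studies.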

Regarding sample size, analyzing women and men separately does not necessarily require doubling the number of patients to maintain statistical power. More efficient designs, including the factorial design (in which two experimental factors with multiple levels are tested, and data are collected across all possible combinations of factors and levels), maintain power at low cost, increasing the sample size by only 14 to 33%.34
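The layout of a factorial design can be sketched in a few lines of Python. The example below simply enumerates every combination of two hypothetical factors (sex and treatment) and spreads a hypothetical total of 40 subjects evenly across the resulting cells. It illustrates the structure of the design only; it is not a power calculation and does not reproduce the 14 to 33% estimate cited above.

```python
from itertools import product

# Hypothetical factors and levels, chosen for illustration.
factors = {
    "sex": ["female", "male"],
    "treatment": ["placebo", "drug"],
}


def factorial_cells(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def allocate(total_subjects, n_cells):
    """Split subjects as evenly as possible across the cells."""
    base, extra = divmod(total_subjects, n_cells)
    return [base + (1 if i < extra else 0) for i in range(n_cells)]


cells = factorial_cells(factors)
sizes = allocate(40, len(cells))

for cell, size in zip(cells, sizes):
    print(cell, size)
```

Because every subject contributes to the estimate of both the sex effect and the treatment effect, the same participants are used twice, which is where the efficiency of the design comes from.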


In 2020, it is estimated that more than 300,000 clinical trials start every year. This represents therapeutic opportunities but also potential harm for subgroups that are not well represented in clinical trials. To be reliable and useful, clinical trials need to be both internally and externally valid. In recent years, the lack of external validity has prompted a call for more pragmatic “real life data” trials and for the production of data that are applicable to all categories of patients. In this era of personalized medicine, “one-size-fits-all” is no longer acceptable. Patients need treatments that are well suited to their subgroup, and women and men are different. Sex is a fundamental variable that should be used to disaggregate data and explain differences in treatment outcomes. To avoid gender bias, two criteria are decisive: inclusion of women in appropriate numbers and correct gender-based analysis. In late-phase clinical trials, there has been improvement, but men still predominate. In 1993, when the ban on including women in clinical trials was lifted, one of the main concerns was the underrepresentation of women in cardiovascular trials. In February 2020, the same concern is still being raised about drug trials in cardiovascular diseases, the leading cause of death. In early-phase clinical trials, women are still largely underrepresented or even excluded, fostering a gender bias in data on dose tolerability, appropriate dosing, metabolism, and clinical pharmacology. Moreover, when research lacks or excludes female subjects, the guidelines should clearly state that the evidence has been obtained mainly from men. Once a drug is marketed, the scientific or patient leaflet should mention the ratio and total number of women and men who participated in the clinical trials. Regarding subgroup analyses, it is estimated that only half of clinical trials perform gender-based analysis and only 35% conduct proper subgroup analyses.
Investigators should follow guidelines to ensure the proper conduct of subgroup analyses and to prevent misleading conclusions from being adopted by clinicians. Overall, the inclusion of women and gender-based analysis must continue to improve in order to ensure external validity, reduce gender bias, avoid distrust in controlled clinical trials, and protect women’s health from drugs that are less suited to them.

Younes Benjeaa obtained a Master's degree in Biomedical Sciences from the University of Namur, Belgium. Yves Geysels is a Professor of Clinical Trials at the University of Namur, Faculty of Medicine, Department of Biomedical Sciences, Belgium. He is the Honorary President of the Belgian Association of Clinical Research Professionals (BeACRP).


  1. Wizemann TM, Pardue ML, Committee on Understanding the Biology of Sex and Gender Differences, Board on Health Sciences Policy, Institute of Medicine. “Exploring the Biological Contributions to Human Health: Does Sex Matter?” 2001.
  2. Martin RM, Biswas PN. “Age and sex distribution of suspected adverse drug reactions to newly marketed drugs in general practice in England: analysis of 48 cohort studies.” British Journal of Clinical Pharmacology. 1998;46(5):505-11.
  3. Heinrich J. “Drug safety: most drugs withdrawn in recent years had greater health risks for women.” GAO. 2001.
  4. Farkas RH., Unger EF. “Zolpidem and driving impairment—identifying persons at risk.”N Engl J Med 2013;369(8):689-691.
  5. Ridker PM, Cook NR. “A Randomized Trial of Low-Dose Aspirin in the Primary Prevention of Cardiovascular Disease in Women.” N Engl J Med. 2005;352:1293-1304.
  6. Gleiter CH, Gundert-Remy U. “Gender differences in pharmacokinetics.” European Journal of Drug Metabolism and Pharmacokinetics. 1996;21:123-128.
  7. Gandhi M, Aweeka F. “Sex Differences in Pharmacokinetics and Pharmacodynamics.” Annual Review of Pharmacology and Toxicology. 2004;44(1):499-523.
  8. McQueen JK, Wilson H. “Estradiol-17β increases serotonin transporter (SERT) mRNA levels and the density of SERT-binding sites in female rat brain.” Molecular Brain Research. 1997;45(1):13-23.
  9. Jovanovic H, Lundberg J. “Sex differences in the serotonin 1A receptor and serotonin transporter binding in the human brain measured by PET.” NeuroImage. 2008;39(3):1408-1419.
  10. Vizgirda VM, Wahler GM. “Mechanisms of sex differences in rat cardiac myocyte response to beta-adrenergic stimulation.” Am J Physiol Heart Circ Physiol. 2002;282(1):256-263.
  11. Craft RM, Bernal SA. “Sex differences in opioid antinociception: kappa and mixed action agonists.” Drug Alcohol Depend. 2001;63:215-28.
  12. Lew KH, Ludwig EA. “Gender-based effects on methylprednisolone pharmacokinetics and pharmacodynamics.” Clin Pharmacol Ther. 1993;54:402-414.
  13. Liu K, Dipietro Mager A. “Women's involvement in clinical trials: historical perspective and future implications.” Pharmacy Practice. 2016;14(1):708.
  14. Auroux M, Dulioust E. “Cyclophosphamide in the F0 male rat: physical and behavioral changes in three successive adult generations.” Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis. 1990;229(2):189-200.
  15. Zakhem GA. “Infertility and teratogenicity after paternal exposure to systemic dermatologic medications: A systematic review.” Journal of the American Academy of Dermatology. 2019;80(4):957-969.
  16. Mouyis M, Flint JD. “Safety of anti-rheumatic drugs in men trying to conceive: A systematic review and analysis of published evidence.” Semin Arthritis Rheum. 2019;48(5):911-920.
  17. Lutwak-Mann C, Schmid K, Keberle H. “Thalidomide in rabbit semen.” Nature. 1967;214(5092):1018-1020.
  18. “Guidelines for the study and evaluation of gender differences in the clinical evaluation of drugs”; notice. Fed Regist. 58(139):39406-39416. 22 Jul 1993.
  19. US Government Accountability Office (GAO). “Women’s Health: Women sufficiently represented in new drug testing, but FDA oversight needs improvement.” Washington, DC. 2001. Report GAO-01-754.
  20. Geller S, Adams M, Carnes M. “Adherence to federal guidelines for reporting of sex and race/ethnicity in clinical trials.” J Womens Health. 2006;15(10):1123-1131.
  21. Geller S, Koch A, Pellettieri, Carnes M. “Inclusion, analysis, and reporting of sex and race/ethnicity in clinical trials: have we made progress?” J Womens Health. 2011;20(3):315-320.
  22. Poon R, Khanijow K. “Participation of women and sex analyses in late-phase clinical trials of new molecular entity drugs and biologics approved by the FDA in 2007-2009.” Journal of Women's Health. 2013;22(7):604-16.
  23. Jin X, Chandramouli C. “Women’s Participation in Cardiovascular Clinical Trials From 2010 to 2017.” Circulation. 2020;141(7):540-548.
  24. Pinnow E, Sharma P. “Increasing Participation of Women in Early Phase Clinical Trials Approved by the FDA.” Women's Health Issues. 2009;19(2):8.
  25. Beery AK, Zucker I. “Sex bias in neuroscience and biomedical research.” Neurosci Biobehav Rev. 2011;35(3):565-72.
  26. Lu HR, Remeysen P. “Female Gender is a Risk Factor for Drug-Induced Long QT and Cardiac Arrhythmias in an In Vivo Rabbit Model.” Journal of Cardiovascular Electrophysiology. 2001;12:538-545.
  27. Drici M, Burklow T. “Sex hormones prolong the QT interval and downregulate potassium channel expression in the rabbit heart.” Circulation. 1996;94:1471-1474.
  28. Drici MD, Clement N. “Is gender a risk factor for adverse drug reactions? The example of drug-induced long QT syndrome.” Drug Saf. 2001;24(8):575-585.
  29. Tannenbaum C, Ellis R. “Sex and gender analysis improves science and engineering.” Nature. 2019;575:137-146.
  30. Rathore SS, Wang Y. “Sex-based differences in the effect of digoxin for the treatment of heart failure.” N Engl J Med. 2002;347:1403-11.
  31. Andrey JL, Romero S. “Mortality and morbidity of heart failure treated with digoxin. A propensity-matched study.” Int J Clin Pract. 2011;65:1250-8.
  32. Flory JH, et al. “Observational cohort study of the safety of digoxin use in women with heart failure.” BMJ Open. 2012;2(2):e000888.
  33. Aulakh A, Anand S. “Sex and Gender Subgroup Analyses of Randomized Trials: The Need to Proceed With Caution.” Women's Health Issues. 2007;17:342-350.
  34. Buch T, Moos K. “Benefits of a factorial design focusing on inclusion of female and male animals in one experiment.” J Mol Med. 2019;97:871-877.