“Will it work?” is the question usually asked first and most frequently about any contraceptive method.1 Although this question cannot be answered with certainty for any particular couple, clinicians and counselors can try to help patients understand something of the difficulty of quantifying efficacy.
It is useful to distinguish between measures of contraceptive effectiveness and measures of the risk of pregnancy during contraceptive use. Many persons, including clinicians and clients, prefer positive rather than negative statements; instead of the negative statement that 20% of women using a method become accidentally pregnant during their first year of use, they prefer the alternative positive statement that the method is 80% effective. However, it does not follow that the method is 80% effective, because it is not true that 100% of these women would have become pregnant if they had not been using contraception. If 90% of these method users would have become pregnant had they used no method, then the use of the method reduced the number of accidental pregnancies from 90% to 20%, a reduction of 78%. In this sense, the method could be said to be 78% effective at reducing pregnancy in the first year. But if only 60% of these women would have become pregnant if they did not use contraception, then the method would be only 67% effective. Because no study can ascertain the proportion of women who would have become pregnant had they not used the contraceptive method under investigation, it is simply not possible to measure effectiveness directly. Therefore, we focus attention entirely on pregnancy rates or probabilities of pregnancy during contraceptive use, which are directly measurable. However, we continue to use the term 'effectiveness', in its loose everyday sense of how well a method works, throughout this chapter, and we use the terms 'effectiveness' and 'efficacy' interchangeably. We also provide estimates of the proportion of women who would become pregnant if they did not use contraception, so that the reader may calculate rough effectiveness rates if they are needed.
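The arithmetic in this passage can be sketched in a few lines of Python. The function name `effectiveness` and its parameter names are illustrative and do not come from the original text. The sketch computes the proportional reduction in pregnancies relative to an assumed no-method pregnancy rate, a quantity that no study can observe directly.

```python
def effectiveness(rate_with_method: float, rate_without_method: float) -> float:
    """Proportional reduction in first-year pregnancies relative to using no method.

    Both arguments are proportions between 0 and 1. The no-method rate is an
    assumption supplied by the reader; it cannot be measured in a trial.
    """
    if not 0 < rate_without_method <= 1:
        raise ValueError("rate_without_method must be in (0, 1]")
    return (rate_without_method - rate_with_method) / rate_without_method


# 20% of users become pregnant in the first year (the figure used in the text).
# The three baselines are the ones discussed in the text: 100%, 90%, and 60%.
for assumed_baseline in (1.00, 0.90, 0.60):
    share = effectiveness(0.20, assumed_baseline)
    print(f"assumed no-method rate {assumed_baseline:.0%}: about {share:.0%} effective")
```

The same 20% pregnancy rate yields a different "effectiveness" for each assumed baseline, which is why the chapter reports pregnancy probabilities rather than effectiveness.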
Four pieces of information about contraceptive efficacy would help couples to make an informed decision when choosing a contraceptive method:
The difference between failure rates during imperfect use and failure rates during perfect use shows how forgiving of imperfect use a method is. The difference between failure rates during typical use and failure rates during perfect use shows the consequences of imperfect use. This difference depends both on how unforgiving of imperfect use a method is and on how hard it is to use that method perfectly. Only the first two pieces of information are currently available.
Our current understanding of the literature on contraceptive efficacy is summarized in Table 1.2 In the second column, we provide estimates of the probabilities of pregnancy during the first year of typical use of each method in the United States. For most methods, these estimates were derived from the experience of women in the 1995 National Survey of Family Growth (NSFG) or the 1995 and 2002 NSFGs so that the information pertains to nationally representative samples of users.3, 4 We based the probabilities of pregnancy for the cervical cap and the sponge on results of two clinical trials in which women were randomly assigned to use the diaphragm or sponge, or the diaphragm or cervical cap.2 Our estimates for methods such as intrauterine contraceptives, the implant, and sterilization were derived from large clinical investigations.2 The estimate for the female condom is based on the only clinical trial of this method.2 The estimate for chance was based on evidence from clinical investigations.2 We assumed that the pregnancy rate during typical use of Evra and NuvaRing would be the same as that for the pill, because the only published clinical trials showed no differences in efficacy between the pill and the patch or the pill and the ring.5, 6
Table 1. Percentage of women experiencing an unintended pregnancy during the first year of typical use and the first year of perfect use of contraception and the percentage continuing use at the end of the first year (United States).
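|Method (1)|Percentage of women experiencing an unintended pregnancy within the first year of use: typical use* (2)|Percentage of women experiencing an unintended pregnancy within the first year of use: perfect use† (3)|Percentage of women continuing use at 1 year‡ (4)|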
|Fertility awareness-based methods|
|Standard days method||5|
Two day method
Combined pill and progestin-only pill
Mirena (LNg IUS)
Emergency contraception: Insertion of a copper intrauterine contraceptive or taking emergency contraceptive pills after unprotected intercourse substantially reduces the risk of pregnancy‡‡
Lactational amenorrhea method (LAM): LAM is a highly effective, temporary method of contraception§§
Source: Trussell J. Contraceptive efficacy. In: Hatcher RA, Trussell J, Nelson AL, Cates W, Kowal D, Policar M. Contraceptive Technology: Twentieth Revised Edition. New York NY: Ardent Media, 2011.
Pregnancy rates during typical use reflect how effective methods are for the average person who does not always use methods correctly or consistently. It is important to understand that typical use does not imply that a contraceptive method was actually used. In the NSFG and in most clinical trials, a woman is “using” a contraceptive method if she considers herself to be using that method. So typical use of the condom could include actually using a condom only occasionally, and a woman could report that she is “using” the pill even though her supplies ran out several months ago. In short, “use” – which is identical to “typical use” – is an elastic concept that depends entirely on an individual woman's perception.
In the third column, we provide our best guess of the probabilities of method failure (pregnancy) during the first year of perfect use. A method is used perfectly when it is used consistently according to a specified set of rules. For many methods, perfect use requires use at every act of intercourse. Virtually all method failure rates reported in the literature have been calculated incorrectly and are too low. (See the discussion of methodologic pitfalls below.) Hence, we cannot justify our estimates rigorously except those for three fertility awareness-based methods,7, 8, 9 the diaphragm,10 the sponge,10 the male condom,11, 12, 13 the female condom,14 spermicides,15 methods with little scope for user error (implants, injectables, and sterilization), and the pill, patch, and intrauterine contraceptives, which have extensive clinical trials with very low pregnancy rates.2 Even the estimates for the fertility awareness-based methods, female condom, diaphragm, sponge, and spermicides are based on only one or two studies. Our hope is that our understanding of efficacy during perfect use for these and other methods will be enhanced by additional studies.
The fourth column displays the first-year probabilities of continuing use. They are based on the same sources used to derive the estimates in the second column (typical use). More complete explanations of the derivations of the statistics in Table 1 are provided elsewhere.2
It is interesting to compare these estimates with pregnancy rates observed among women using isotretinoin, which is effective in treating severe acne but is also teratogenic. To minimize pregnancies among women undergoing treatment, the manufacturer and the U.S. Food and Drug Administration implemented a pregnancy prevention program. Among 76,149 women who reported using contraception, 268 became pregnant, yielding a rate of 3.6 per 1000 20-week courses of therapy;16 this rate, if constant for a year, would be equivalent to an annual probability of pregnancy of 0.9%. Estimated annual probabilities of pregnancy were 0.8%, 2.1%, and 2.6% among women who reported using oral contraceptives, diaphragms, and condoms, respectively. Thus, women using diaphragms achieved lower rates of pregnancy than we estimate would occur during perfect use, and those using condoms and oral contraceptives experienced about the same pregnancy rates that would be expected during perfect use. Pregnancy rates for women using any of these three methods, however, were substantially below rates generally observed during typical use; this finding would appear to indicate that understanding of the teratogenic risks of isotretinoin substantially enhanced correct and consistent use. It is also possible that women in this study had lower-than-average fecundity because acne is a marker for excess androgen production resulting from anovulation,17 that they lowered their coital frequency during treatment, or that they underreported their number of pregnancies (and abortions).
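The conversion from a rate per 20-week course to an annual probability can be approximated as follows. The text does not state the exact formula used, so this is a sketch under two assumptions: the per-course probability stays constant and compounds over the year, and a year is taken as 52 weeks. The function and parameter names are illustrative.

```python
def annual_probability(prob_per_course: float,
                       weeks_per_course: float = 20.0,
                       weeks_per_year: float = 52.0) -> float:
    """Annual probability of pregnancy implied by a constant per-course probability.

    Assumes the per-course risk is constant and independent across courses,
    so the probability of avoiding pregnancy compounds over the year.
    """
    courses_per_year = weeks_per_year / weeks_per_course
    return 1.0 - (1.0 - prob_per_course) ** courses_per_year


# 3.6 pregnancies per 1000 20-week courses of therapy (figure from the text).
rate_per_course = 3.6 / 1000
print(f"implied annual probability: {annual_probability(rate_per_course):.1%}")
```

Under these assumptions the result rounds to the 0.9% quoted in the text.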
Using two methods at once dramatically lowers the risk of unintended pregnancy, provided they are used consistently. If one of the methods is a condom or vaginal barrier, protection from disease transmission is an added benefit. For example, the probabilities of pregnancy during the first year of perfect use of male condoms and spermicides are estimated to be 2% and 18%, respectively, in Table 1. It is reasonable to assume that during perfect use the contraceptive mechanisms of condoms and spermicides operate independently, since lack of independence during typical use would most likely be due to imperfect use (either use both methods or not use either). The annual probability of pregnancy during simultaneous perfect use of condoms and spermicides would be 0.03%, about the same as that achieved by the implant and much lower than that achieved by the pill (0.3%) during perfect use.18
We confine attention to the first-year probabilities of pregnancy solely because probabilities for longer durations are generally not available. There are three main points to remember about the effectiveness of contraceptive methods over time. First, the risk of pregnancy during either perfect or typical use of a method should remain constant over time for an individual woman with a specific partner, providing that her underlying fecundity and frequency of intercourse do not change (although it is possible that the risk for a woman could decline during typical use of certain methods because she learns to use her method correctly and consistently). Second, in contrast, the risk of pregnancy during typical use of a method will decline over time for a group of users, primarily because those users prone to fail do so early, leaving a pool of more diligent contraceptive users or those who are relatively infertile or who have lower coital frequency. This decline will be far less pronounced among users of those methods with little or no scope for imperfect use. The risk of pregnancy during perfect use for a group of users should decline as well, but this decline will not be as pronounced as that during typical use, because only the relatively more fecund and those with higher coital frequency are selected out early. For these reasons, the probability of becoming pregnant during the first year of use of a contraceptive method will be higher than the probability of becoming pregnant during the second year of use. Third, probabilities of pregnancy cumulate over time. Suppose that 15%, 12%, and 8% of women using a method experience a contraceptive failure during years 1, 2, and 3, respectively. The probability of not becoming pregnant within 3 years is calculated by multiplying the probabilities of not becoming pregnant for each of the 3 years: 0.85 times 0.88 times 0.92, which equals 0.69. Thus, the percentage becoming pregnant within 3 years is 31% (= 100% – 69%).
The lesson here is that the differences among probabilities of pregnancy for various methods will increase over time. For example, suppose that each year the typical proportion of women becoming pregnant while taking the pill is 8% and while using the diaphragm is 16%. Within 5 years, 34% of pill users and 58% of diaphragm users will become pregnant.
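The cumulation of annual probabilities described above can be checked with a short Python sketch. The function name `cumulative_probability` is illustrative. The sketch multiplies the yearly probabilities of not becoming pregnant and subtracts the product from one, exactly as in the worked examples in the text.

```python
from math import prod


def cumulative_probability(annual_probs):
    """Probability of at least one pregnancy over consecutive years.

    annual_probs holds the probability of pregnancy in each year, expressed
    as proportions, for women who have not yet become pregnant.
    """
    return 1.0 - prod(1.0 - p for p in annual_probs)


# Three-year example from the text: 15%, 12%, and 8% in years 1, 2, and 3.
print(f"within 3 years: {cumulative_probability([0.15, 0.12, 0.08]):.0%}")

# Five-year comparison from the text, assuming a constant annual probability.
print(f"pill, 5 years: {cumulative_probability([0.08] * 5):.0%}")
print(f"diaphragm, 5 years: {cumulative_probability([0.16] * 5):.0%}")
```

An 8-percentage-point difference in the annual probability grows to a gap of roughly 24 percentage points after 5 years, which is the point made in the text.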
Data from the 1995 NSFG can be used to estimate age-specific contraceptive failure rates to produce a total lifetime contraceptive failure rate (i.e. the number of contraceptive failures that the typical woman would experience in a lifetime if she used reversible methods of contraception continuously, except for the time spent pregnant after a contraceptive failure) from exact age 15 to exact age 45. This estimate is of course based on the standard synthetic-cohort assumption. In this case, the assumption is that the typical woman uses at each age the same mix of methods observed at each age in the NSFG and experiences the same rate of contraceptive failure observed at that age. The typical woman who uses reversible methods of contraception continuously from ages 15–45 years would experience 1.8 contraceptive failures. If we consider both reversible methods and sterilization, the typical woman would experience only 1.3 contraceptive failures from ages 15–45 years.19
Both providers of contraception and their patients can better understand why the answer to the simple question “Will it work?” is such a complicated issue if we recall that many factors influence efficacy. Factors that affect contraceptive failure rates and probabilities reported in the literature can be usefully divided into three categories: (1) the inherent efficacy of the method when used correctly and consistently (perfect use), and the technical attributes of the method that facilitate or interfere with proper use; (2) characteristics of the user; and (3) competence and honesty of the investigator in planning and executing the study and in analyzing and reporting the results.
For some methods, such as sterilization, implants, and intrauterine contraceptives, the inherent efficacy is so high and proper and consistent use is so nearly guaranteed that extremely low pregnancy rates are found in all studies, and the range of reported pregnancy rates is quite narrow. For other methods, such as the pill and injectable, inherent efficacy is high, but there is still room for potential misuse (e.g. forgetting to take pills or failure to return on time for injections), so that the second factor can contribute to a wider range of reported probabilities of pregnancy. In general, the studies of sterilization, injectable, implant, pill, and IUD use have been competently executed and analyzed. Studies of periodic abstinence, spermicides, and the barrier methods display a wide range of reported probabilities of pregnancy because the potential for misuse is high, the inherent efficacy is relatively low, and the competence of the investigators is mixed.
Characteristics of the users can affect the pregnancy rate for any method under investigation, but the impact is greatest when the pregnancy rates during typical use are highest, either because the method has less inherent efficacy or because it is hard to use consistently or correctly.
The user characteristic that is probably most important is imperfect use of the method. Unfortunately, nearly all investigators who have attempted to calculate 'method' and 'user' failure rates have done so incorrectly. Investigators routinely separate the unintended pregnancies into two groups. By convention, pregnancies that occur during a month when a method was used improperly are classified as user failures (even though, logically, a pregnancy might be because of failure of the method, if it was used correctly on some occasions and incorrectly on others), and all other pregnancies are classified as method failures. But investigators do not separate the exposure (i.e. the denominator in the calculation of failure rates) into these two groups.
For example, assume that there are two method failures and eight user failures during 100 women-years of exposure to the risk of pregnancy. Then the common calculation is that the user failure rate is 8% and the method failure rate is 2%; the sum of the two is the overall failure rate of 10%. By definition, however, method failures can occur only during perfect use, and user failures cannot occur during perfect use. If there are 50 years of perfect use and 50 years of imperfect use in the total of 100 years of exposure, then the method failure rate would be 4% and the user failure rate would be 16%; the difference between the two rates (here 12%) provides a measure of how forgiving of imperfect use the method is. However, because investigators do not generally inquire about perfect use except when a pregnancy occurs, the proper calculations cannot be performed. The importance of perfect use is shown in the few studies in which the requisite information on quality of use was collected. For example, in a World Health Organization study of the ovulation method of periodic abstinence, the proportion of women becoming pregnant among those who used the method perfectly during the first year was 3.1%, whereas the corresponding proportion failing during a year of imperfect use was 86.4%.9 In a large clinical trial of the cervical cap conducted in Los Angeles, among the 5% of the sample who used the method perfectly, the fraction failing during the first year was 6.1%. Among the remaining 95% of the sample who at least on one occasion used the cap imperfectly, the first-year probability of pregnancy was nearly twice as high (11.9%).20
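The difference between the common calculation and the correct one can be made concrete with the numbers from the example above. This is a minimal sketch; the function name `failure_rate` and the variable names are illustrative. The only change between the two calculations is the denominator: total exposure in the common version, and exposure split by quality of use in the correct version.

```python
def failure_rate(pregnancies: float, woman_years: float) -> float:
    """Pregnancies per 100 woman-years of exposure."""
    return 100.0 * pregnancies / woman_years


method_failures, user_failures = 2, 8   # pregnancies, from the example in the text
perfect_years, imperfect_years = 50, 50  # exposure split assumed in the text
total_years = perfect_years + imperfect_years

# Common (incorrect) calculation: both rates use total exposure as the denominator.
naive_method = failure_rate(method_failures, total_years)   # 2 per 100 woman-years
naive_user = failure_rate(user_failures, total_years)       # 8 per 100 woman-years

# Correct calculation: each numerator is matched to its own exposure.
true_method = failure_rate(method_failures, perfect_years)  # 4 per 100 woman-years
true_user = failure_rate(user_failures, imperfect_years)    # 16 per 100 woman-years

print(f"naive:   method {naive_method:.0f}%, user {naive_user:.0f}%")
print(f"correct: method {true_method:.0f}%, user {true_user:.0f}%")
print(f"forgiveness gap: {true_user - true_method:.0f} percentage points")
```

Because perfect-use exposure is always smaller than total exposure, dividing method failures by total exposure understates the method failure rate.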
FREQUENCY OF INTERCOURSE
Among those who use a method consistently and correctly (perfect users), the most important user characteristic that determines the risk of pregnancy is frequency of intercourse. For example, in a study in which users were randomly assigned to either the diaphragm or the sponge, diaphragm users who had intercourse four or more times a week became pregnant in the first year twice as frequently as did those who had intercourse fewer than four times a week.21 In that clinical trial, among women who used the diaphragm at every act of intercourse, only 3.4% of those who had intercourse fewer than three times a week became pregnant in the first year compared with 9.7% of those who had intercourse three or more times per week.10
A woman's biologic capacity to conceive and bear a child declines with age. This decline is likely to be pronounced among those who are routinely exposed to sexually transmitted infections such as chlamydia and gonorrhea. Among those not so exposed, the decline is likely to be moderate until a woman reaches her late 30s.22 Although many investigators have found that contraceptive failure rates decline with age, this effect almost surely overstates the pure age effect because age in many studies primarily captures the effect of coital frequency, which declines both with age and with marital duration.23 User characteristics such as race and income seem to be less important determinants of contraceptive failure.
Influence of the investigator
The competence and honesty of the investigator also affect the published results. The errors committed by investigators range from simple arithmetic mistakes to outright fraud.24 One well-documented instance of fraud involved the Dalkon shield. In a two-page article published in the American Journal of Obstetrics and Gynecology, a first-year probability of pregnancy of 1.1% was presented and the claim made that “only the combined type of oral contraceptive offers slightly greater protection”.25 It was not revealed by the researcher that some women had been instructed to use spermicides as an adjunctive method to reduce the risk of pregnancy, nor that the author was part owner of the Dalkon Corporation. Furthermore, the author never subsequently revealed (except to the A.H. Robins Company, which bought the shield from the Dalkon Corporation but did not reveal this information either) that as the original trial matured, the first-year probability of pregnancy more than doubled.26
The system of drug testing in the United States, which demands that the company wishing to market a drug be responsible for conducting studies to assess its efficacy and safety, provides incentives for the unscrupulous to present less-than-honest results. Some actions that are not deliberately dishonest are, nevertheless, not discouraged by the incentives in the current system. For example, a woman who becomes pregnant may be discarded from a clinical trial if the researcher decides that she did not fit the protocols after all. Or one can be less than vigilant in trying to contact patients lost to follow-up (LFU). The standard assumption made at the time of analysis is that women who are LFU experience unintended pregnancy at the same rate as those who are observed. This assumption is probably innocuous when the proportion of LFU patients is small. But in many studies, the proportion LFU may be 20% or higher, so that what really happens to these women could drastically affect the estimate of the proportion becoming pregnant. Our strong suspicion is that women LFU are more likely to experience a contraceptive failure than are those still in the trial. For example, one study found that the pregnancy rate for calendar rhythm rose from 9.4 to 14.4 per 100 women-years of exposure as a result of resolution of patients LFU.27
Several methodologic pitfalls can snare investigators. One of the most common is a misleading measure of contraceptive failure called the Pearl index, which is obtained by dividing the number of unintended pregnancies by the number of years of exposure to the risk of unintended pregnancy contributed by all women in the study. This measure can be misleading when one wishes to compare pregnancy rates obtained from studies with different average amounts of exposure. The likelihood of pregnancy declines over time because those most likely to become pregnant do so at earlier durations of contraceptive use and exit from observation. Those still using after long durations are unlikely to become pregnant, so that an investigator could (wittingly or unwittingly) drive the reported pregnancy rate toward zero by running the trial 'forever'. Two investigators using the NSFG could obtain Pearl index pregnancy rates per 100 women-years of exposure for the condom.28 One investigator (who got 4.4) allowed each woman to contribute a maximum of 5 years of exposure, while the other investigator (who got 7.5) allowed each woman to contribute only 1 year. Which investigator is incorrect? Neither. The two rates are simply not comparable. In contrast, life-table measures of contraceptive failure are easy to interpret and control for the distorting effects of varying durations of use. Another problem occurs when deciding which pregnancies to count. Most studies count only the pregnancies observed and reported by the women. If, instead, a pregnancy test were administered every month, the number of pregnancies (and, hence, the pregnancy rate) would increase because early fetal losses not observed by the woman would be added to the number of observed pregnancies. Such routine pregnancy testing in the more recent contraceptive trials has resulted in higher pregnancy rates than would otherwise have been obtained and makes the results not comparable to those from other trials.
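The dependence of the Pearl index on length of follow-up can be illustrated with a small deterministic model. Everything in this sketch is hypothetical: the two-group cohort, the monthly risks, and the function names are invented for illustration and are not taken from the NSFG or from any study cited here. The model assumes that women leave observation only by becoming pregnant and that each woman at risk at the start of a month contributes a full month of exposure.

```python
def expected_pearl_index(groups, max_months):
    """Expected pregnancies per 100 woman-years with follow-up capped at max_months.

    groups is a list of (share_of_cohort, monthly_pregnancy_risk) pairs whose
    shares sum to 1. Women exit observation only when they become pregnant.
    """
    pregnancies = 0.0
    exposure_months = 0.0
    for share, monthly_risk in groups:
        at_risk = share
        for _ in range(max_months):
            exposure_months += at_risk          # one month of exposure per woman at risk
            failures = at_risk * monthly_risk   # expected pregnancies this month
            pregnancies += failures
            at_risk -= failures                 # pregnant women leave observation
    return 100.0 * pregnancies / (exposure_months / 12.0)


def first_year_life_table(groups):
    """Cumulative probability of pregnancy by 12 months (a life-table measure)."""
    not_pregnant = sum(share * (1.0 - risk) ** 12 for share, risk in groups)
    return 1.0 - not_pregnant


# Hypothetical cohort: half the users are failure-prone, half are not.
mixed_cohort = [(0.5, 0.02), (0.5, 0.002)]

print(f"Pearl index, 1-year cap: {expected_pearl_index(mixed_cohort, 12):.1f}")
print(f"Pearl index, 5-year cap: {expected_pearl_index(mixed_cohort, 60):.1f}")
print(f"first-year life-table probability: {first_year_life_table(mixed_cohort):.1%}")
```

In this model the Pearl index falls as the follow-up cap lengthens because the failure-prone users leave early, while the first-year life-table probability refers to a fixed duration and does not depend on how long the study runs. With a homogeneous cohort the Pearl index would be the same for any cap.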
Other more technical errors that have biased reported results are discussed elsewhere.9, 24, 29
The incentives to conduct research on contraceptive failure vary widely from method to method. Many studies of the pill and IUD exist because companies wishing to market them must conduct clinical trials to show their efficacy. In contrast, few studies of withdrawal exist because there is no financial reward for investigating this method. Moreover, researchers face differing incentives to report unfavorable results. The vasectomy literature is filled with short articles by clinicians who have performed 500 or 1000 or 1500 vasectomies. When they report pregnancies (curiously, pregnancy is seldom mentioned in discussions of vasectomy 'failures', which focus on the continued presence of sperm in the ejaculate), their pregnancy rates are invariably low. Surgeons with high pregnancy rates simply do not write articles calling attention to their poor surgical skills. Similarly, drug companies do not commonly publicize their failures. Even if investigators prepared reports describing failures, journal editors would not be likely to publish them.
This brief review of contraceptive efficacy leads to the following conclusions:
Grady WR, Klepinger DH, Nelson-Wally A. Contraceptive characteristics: The perceptions and priorities of men and women. Fam Plann Perspect 1999;31:168–175
Trussell J. Contraceptive efficacy. In: Hatcher RA, Trussell J, Nelson AL, Cates W, Kowal D, Policar M (Eds). Contraceptive Technology: Twentieth Revised Edition. New York NY: Ardent Media, 2011
Trussell J. Estimates of contraceptive failure from the 1995 National Survey of Family Growth. Contraception 2008;78:85
Kost K, Singh S, Vaughan B, Trussell J, Bankole A. Estimates of contraceptive failure from the 2002 National Survey of Family Growth. Contraception 2008;77:10–21
Audet MC, Moreau M, Koltun WD et al. Evaluation of contraceptive efficacy and cycle control of a transdermal contraceptive patch vs an oral contraceptive. JAMA 2001;285:2347–2354
Oddsson K, Leifels-Fischer B, de Melo NR et al. Efficacy and safety of a contraceptive vaginal ring (NuvaRing) compared with a combined oral contraceptive: a 1-year randomized trial. Contraception 2005;71:176–182
Arévalo M, Jennings V, Sinai I. Efficacy of a new method of family planning: the Standard Days Method. Contraception 2002;65:333–338
Arévalo M, Jennings V, Nikula M, Sinai I. Efficacy of the new TwoDay Method of family planning. Fertil Steril 2004;82:885–892
Trussell J, Grummer-Strawn L. Contraceptive failure of the ovulation method of periodic abstinence. Fam Plann Perspect 1990;22:65–75
Trussell J, Strickler J, Vaughan B. Contraceptive efficacy of the diaphragm, the sponge and the cervical cap. Fam Plann Perspect 1993;25:100–105, 135
Frezieres RG, Walsh TL, Nelson AL et al. Evaluation of the efficacy of a polyurethane condom: Results from a randomized, controlled clinical trial. Fam Plann Perspect 1999;31:81–87
Walsh TL, Frezieres RG, Peacock K, Nelson AL, Clark VA, Bernstein L. Evaluation of the efficacy of a nonlatex condom: results from a randomized, controlled clinical trial. Perspect Sex Reprod Health 2003;35:79–86
Steiner MJ, Dominik R, Rountree RW, Nanda K, Dorflinger LJ. Contraceptive effectiveness of a polyurethane condom and a latex condom: a randomized controlled trial. Obstet Gynecol 2003;101:539–547
Farr G, Gabelnick H, Sturgen K et al. Contraceptive efficacy and acceptability of the female condom. Am J Public Health 1994;84:1960–1964
Raymond EG, Chen PL, Luoto J. Contraceptive effectiveness and safety of five nonoxynol-9 spermicides: a randomized trial. Obstet Gynecol 2004;103:430–439
Mitchell AA, Van Bennekom CM, Louik C. A pregnancy-prevention program in women of childbearing age receiving isotretinoin. N Engl J Med 1995;333:101–106
Speroff L, Glass RH, Kase NG. Clinical gynecologic endocrinology and infertility, 5th ed. Baltimore, MD: Williams and Wilkins; 1994
Kestelman P, Trussell J. Efficacy of the simultaneous use of condoms and spermicides. Fam Plann Perspect 1991;23:226–227, 232
Trussell J, Vaughan B. Contraceptive failure, method-related discontinuation and resumption of use: Results from the 1995 National Survey of Family Growth. Fam Plann Perspect 1999;31:64–72, 93
Richwald GA, Greenland S, Gerber MM et al. Effectiveness of the cavity-rim cervical cap: Results of a large clinical study. Obstet Gynecol 1989;74:143–148
McIntyre SL, Higgins JE. Parity and use-effectiveness with the contraceptive sponge. Am J Obstet Gynecol 1986;155:796–801
Menken J, Trussell J, Larsen U. Age and infertility. Science 1986;233:1389–1394
Trussell J, Westoff CF. Contraceptive practice and trends in coital frequency. Fam Plann Perspect 1980;12:246–249
Trussell J, Kost K. Contraceptive failure in the United States: A critical review of the literature. Stud Fam Plann 1987;18:237–283
Davis HJ. The shield intrauterine device. A superior modern contraceptive. Am J Obstet Gynecol 1970;106:455–456
Mintz M. At any cost: Corporate greed, women, and the Dalkon shield. New York, NY: Pantheon Books; 1985
Tietze C, Poliakoff SR, Rock J. The clinical effectiveness of the rhythm method of contraception. Fertil Steril 1951;2:444–450
Trussell J, Menken J. Life table analysis of contraceptive failure. In: Hermalin AI, Entwisle B (eds), The Role of Surveys in the Analysis of Family Planning Programs, pp 537–571. Liege, Belgium: Ordina Editions; 1982
Trussell J. Methodological pitfalls in the analysis of contraceptive failure. Stat Med 1991;10:201–220