
When Bad Evidence Happens to Good Treatments

May 15, 2008

By Carr, Daniel B

This article has as its starting point the American Society of Regional Anesthesia and Pain Medicine (ASRAPM) 2004 Bonica Lecture, “Pain Treatments: Elegant Theories, Inelegant Universe.” That talk summarized over a decade of work I had been privileged to undertake with U.S. federal agencies producing clinical practice guidelines and evidence reports,1-4 the American Society of Anesthesiologists on guidelines,5,6 the Cochrane Collaboration7 on systematic reviews and meta-analyses, and other initiatives to help pain medicine benefit from evidence-based practice. This interest is ongoing.8 It was a great honor to deliver that lecture, particularly since my career as an anesthesiologist treating pain had started early enough to offer the personal privilege of friendship with John Bonica himself (Fig 1). Yet even as I was preparing that talk, the climate of pain medicine was changing. It was already clear that powerful stakeholders in the healthcare enterprise were looking to evidence-based medicine (EBM) for answers about effectiveness, cost effectiveness, appropriateness, and even efficacy beyond what EBM could reasonably provide.9 Since then, over- and mis-application of EBM to support health policies such as “pay for performance” and to restrict payment has created a crisis.10 This ongoing crisis threatens the survival of important forms of pain therapy, restricting health care offered by many members of ASRAPM and the clinical infrastructure needed to sustain it. Faced with a choice between submitting a timely, largely unread report about past efforts, versus a tardy one that might offer a basis to confront a deteriorating situation of widespread challenges and denials for payment based upon mis- and over-interpretations of EBM, I chose the latter. I thank the Editor for his forbearance in allowing me ample time to revisit and extend the original content.
In doing so, I realized that the practice of EBM dates to antiquity, but what is new today is that EBM is being used as a rationale to restrict physician payment and/or autonomy. Ironically, as the Human Genome Project has generated optimism about tomorrow’s personalized medicine based upon pharmacogenomics,11 the already individualized pain care provided today in multidisciplinary pain treatment programs is under attack. This essay surveys the allure and power of reductionism, but also seeks to counter the fallacy of citing oversimplified group statistics as a basis for denying clinical care. By presenting a taxonomy of bad evidence that may befall, and already has befallen, good treatments, I hope to provide colleagues with a basis for restoring balance between health policies based upon statistical aggregates, and patient-centered care that accommodates individual diversity.

Goals of Pain Medicine

Healthcare professionals and patients worldwide have through numerous organizations, in cooperation with governmental and regulatory authorities nationally and internationally, argued that pain relief is a fundamental human right.12 This advocacy has occurred via a multidimensional process, only some of whose activities have been deliberately coordinated,13,14 e.g., by the World Health Organization.15 The spontaneous, grass-roots nature of most efforts along these lines speaks for the action of widespread (if poorly understood) social forces that typically emerge at historic moments to actualize an idea whose time has come.16,17 Table 1 summarizes the essence of pain practice in an ideal world in which this right has become recognized. All guidelines on the topic of pain control emphasize the importance of individualized pain assessment and related patient-reported outcomes as the practical basis for the entire enterprise.18 Indeed, it may be argued that the key driver for the recent explicit focus upon pain control in organized healthcare was a fundamental shift in orientation of the entire healthcare enterprise to become more patient-centered.19 From that perspective, inquiring regularly about the presence and intensity of pain is a quick and simple means for every healthcare system to remind patients regularly of its concern for the quality of their subjective experience.19

Fig 1. Evidence of this grateful author’s debt to John Bonica.

EBM: An Archaic Concept

Every culture has produced oral narratives of illness or injury20,21 and efforts to treat these, later documented in formal medical writings such as papyri, and archeological relics such as trephined skulls or amputated limbs. Since prehistory, screening of natural substances for their medicinal properties has occurred, along with observations of the effects of surgery, and the accumulated knowledge carried forward in spoken or written form. The meltback of an Alpine glacier about 15 years ago to reveal what until that time was the oldest intact cadaver ever recovered gave immediate evidence that even in the Bronze Age, pain treatment already relied upon structured interventions. This author suggested that the then-mysterious tattoos on the newly discovered “Ice Man’s” flank, ipsilateral knee and ankle were therapeutic tattooing, like acupuncture, to treat likely sciatic pain.22 Subsequently, computed tomography (CT) scanning disclosed disk degeneration and facet arthritis of the cervical and lumbar spine, making it overwhelmingly likely that the Ice Man had experienced classical sciatic symptoms, had a means to describe them, and had sought and received injection therapy from someone else (because he could not have tattooed his own flank so delicately). The tattoos’ patterns and placement are sufficiently precise as to make it overwhelmingly likely that this technique had been refined over years, possibly generations, before being applied to the Ice Man.

Today we might apply the terms “case report,” “case series,” or “N of 1 trial” to describe the nature of the evidence for treatments that emerged in antiquity. Outcomes assessment has also been practiced for thousands of years. The writings of Plato around 400 B.C. (as cited in Osler’s survey of the practice of medicine in ancient Greece) describe how the annual reappointment of state-supported physicians involved their appearing before the community.23 In these public hearings, citizens who had been treated by that physician during the prior year (or their surviving kin!) could praise or condemn the clinical outcomes achieved, thereby influencing the physician’s reappointment. Socrates, quoted by Plato, said, “If you and I were physicians, and were advising one another that we were competent to practice, should I not ask you, and would you not ask me, Well, what about Socrates himself, has he not good health? And was anyone else ever known to be cured by him, whether slave or freeman?” Equally interestingly, the ancient Greeks distinguished between the labor-intensive, “scientific” approach to diagnosis and therapy that was available for citizens, i.e., freemen, versus formulaic empirical therapies hastily provided on the basis of symptoms alone. The latter, inexpensive care was given to slaves in the original model for a 2-tiered medical system. Again, quoting Plato, “And did you ever observe that there are two classes of patients in states, slaves and freemen; and the slave doctors run about and cure the slaves, or wait for them in the dispensaries. Practitioners of this sort never talk to their patients individually, or let them talk about their own individual complaints. The slave doctor prescribes what mere experience suggests, as if he had exact knowledge; and when he has given his orders, like a tyrant, he rushes off with equal assurance to some other servant who is ill; and so he relieves the master of the house of the care of his invalid slaves.
But the other doctor, who is a freeman, attends and practices upon freemen; and he carries his inquiries far back, and goes into the nature of the disorder; he enters into discourse with the patient and his friends, and is at once getting information from the sick man, and also instructing him as far as he is able, and will not prescribe for him until he has first convinced him. At last, when he has brought the patient more and more under his persuasive influences and set him on the road to health, he attempts to effect a cure. Now which is the better way of proceeding in a physician?”

Table 1. Pain Medicine in an Ideal World

Formal clinical trials no doubt also took place in the ancient world. The Old Testament clearly recounts, in the Book of Daniel,24 what we would now describe as a prospective, nonrandomized, controlled clinical trial conducted around 600 B.C. Despite its low power (n = 10 in each of the experimental and control groups), it emphatically confirmed in both qualitative and quantitative dimensions the benefits of a diet of grain and water, versus the royal household diet of meat and wine. It clearly anticipated today’s interest in “nutraceuticals”!

Another seemingly modern insight of current-day EBM is the importance of using methods that minimize bias. However, scholarly writings throughout history well articulate the detrimental effects of irresponsible reporting, including distortion on the basis of conflict of interest. For example, the Arabic scholar Hadjrat Ali wrote in 400 A.D.: “Understand information you hear with the reasoning of responsibility, not the reasoning of the reporter. For there are many reports of knowledge, but few are responsible.”25 The medieval Jewish sage Maimonides, writing in the twelfth century A.D., cautioned: “If anyone declares to you that he has actual proof, from his own experience, of something that he requires for the confirmation of his theory, even though he be considered a man of great authority, truthfulness, earnest words and morality, yet, just because he is anxious for you to believe his theory, you should hesitate.”26 Kerr White has surveyed the sophisticated insights provided by the seventeenth and eighteenth century scholars, William Petty of England, and Pierre-Charles-Alexandre Louis of France, on the need to apply bias-free numerical methods to study the effectiveness of medical treatments.27 Florence Nightingale was another early champion of data-based medical practice. According to White, this numerical approach fell into disfavor during the second half of the nineteenth century as bacteriological research became the dominant focus of medical inquiry, only to resurface in the twentieth century with the rise of health services research.
A century ago, Bernard Shaw in his play, The Doctor’s Dilemma, addressed the drama that could ensue when a choice must be made as to who can receive the single available dose of a lifesaving therapy.28 Although he didn’t use the phrases “healthcare rationing,” or “psychology of change,” he did employ astonishingly modern language in saying, “Doctors are not trained in the use of evidence, nor in biometrics, nor in the psychology of human credulity, nor in the incidence of economic pressure. They must believe, on the whole, what their patients believe…. That is why all changes come from the laity.” Shaw’s writings were not ahead of his time, nor were they neglected by any means. His plays were performed before large audiences to wide acclaim, he was an influential public celebrity, and he won a Nobel Prize in literature as well as an Academy Award.

Despite tenets of EBM having been recognized for millennia, the first prospective, randomized, controlled clinical trial (RCT) did not occur until 1948. The subsequent rise of the RCT29-30 and the combining of RCTs through meta-analyses31-32 and systematic reviews33 have been described in numerous articles and monographs. However, only within my career as a pain medicine physician have I witnessed the routine exploitation of ostensibly negative results from RCTs and meta-analyses of RCTs, as a basis to deny payment for new drugs that are asserted by payors to have no demonstrable differences from older and cheaper ones, nondrug modalities such as transcutaneous electrical nerve stimulation, or invasive pain therapies such as neuromodulation. The over-application, and in many cases frank mis-application, of provisional conclusions from the still-evolving methodology of EBM as a basis to deny care not only harms pain sufferers, but has also eroded the financial underpinning of the infrastructure needed to provide multidisciplinary pain care. How often have pain clinicians been told by payors that “insufficient evidence to prove efficacy” is de facto (through withholding of payment) equated with “sufficient evidence to prove ineffectiveness”?

Human Cognition Is Reductionist

I do not believe that EBM’s pervasiveness in the modern medical scene results primarily from its being forced upon healthcare providers by financially-minded auditors as a scheme to profit from the denial of necessary care. Instead, I believe that the reductionist posture of EBM makes it innately attractive and hence influential throughout medicine and other healthcare professions. EBM comprises a series of valuable albeit easily misused tools.31 A common feature of many of the tools of EBM is that they appear to offer welcome methods to distill the torrent of biomedical information, whose current volume overwhelms the absorptive capacity of even the most diligent clinician.7 The Bonica Lecture began by pointing out that reduction of an irritatingly complex, disorganized, everyday reality into a simpler, timeless set of rules and relationships, is innately satisfying to all people. Hard-wired human instinct seeks organizing principles: “this reduction [of nature] to a few laws, to one law, is not a choice of the individual, it is the tyrannical instinct of the mind.”34 Throughout human history, reductionist world views have relied upon hidden, deeper dimensions not obvious in our everyday lives.35,36 At moments when we recognize hidden order,37 we experience a “eureka” emotion in which intellectual satisfaction is mingled with joy, wonder, and even, sometimes, awe that approaches a religious experience.35,37 Think of Pythagoras’s sacrifice of a hundred oxen when he discovered the eternal relationship between the lengths of the sides of a right triangle. Thousands of years before Pythagoras, northern Europeans carried out feats of Neolithic engineering to build stone circles precisely aligned to recognize solstices and equinoxes, visible dimensions of an orderly, unseen universe with cycles of birth, death, and regeneration. Recognition of underlying, invisible, unifying principles is the basis for mathematics and physics. 
Yet the refutation of reductionist principles has played a pivotal role in postmodern mathematics,38 such as the recognition that mathematical logic faces intrinsic limits on its ability to categorize complex statements as true or false (see below). Current day physics is also facing its growing disillusionment that the satisfyingly elegant construct termed string theory may add esthetic attractiveness, but nothing of substance, to the models that preceded it.39

The “Why” Chromosome and the Allure of Elegance

Discernment of timeless truths in an abstract reality transcending the messy, transient observations of everyday life occurs in the arts as well as science, and in both fields is embodied in the concept of elegance. The Oxford English Dictionary40 defines elegance as “Characterized by refined grace of form (usually as the result of art or culture) …. of scientific properties, contrivances etc: ‘neat’, pleasing by ingenious simplicity and effectiveness.” Interestingly, among the examples of the use of this word appears one from 1788: “Physicians call a medicine which contains efficient ingredients in a small volume, and of pleasant or tolerable taste, an elegant medicine.” The word “elegant” is widely used in mathematics and theoretical physics. It may refer to a proof that is ingenious and hence requires fewer steps than a more pedestrian, plodding approach, or a predictive physical model that requires relatively few variables and formulae to unify and explain a large quantity of data.41-45 In the course of assembling material for the Bonica Lecture, I collected advertisements in which the word “elegance” appeared, or that were such obvious embodiments of elegance that they contained no words at all. Because they are so common (indicative of the marketing allure of elegance) they were easy to come by. Many were in black and white, that one photo magazine termed “timeless” in comparison to color pictures.

The genetically-based human instinct to construct underlying, simpler patterns and principles out of diverse observations and explicitly to reflect upon these constructs (which one may name the “why” chromosome) differentiates us from all other species. This spirit “is a primordial human property that reveals itself whenever human beings live or material vestiges of former life exist.”46 It is difficult to know if the oldest surviving representational art (estimated to date from about 25,000 years ago in caves in what is now France and Spain) is primitive because the artists could not do any better, or because even long ago they deliberately conveyed the essence of objects they sought to depict. To point out the obvious, any painted or sculpted representation reflects the artist’s judgment as to which features are essential about the object, so that others seeing the work will recognize it as the object depicted. Yet already by 4500 years ago, subtle details in stylized sculptures produced by artists in the Greek Cycladic islands reveal a refined, even whimsical esthetic sophistication. Recently, scholars have deduced that as Cycladic sculpture evolved, successive epochs employed progressively more sophisticated geometrical methods to codify anatomical proportions, and applied these methods as formulaic templates to sculpt figures efficiently from available pieces of marble (Fig 2).47 These early Bronze Age sculptors not only anticipated Leonardo da Vinci’s geometrical modeling of the proportions underlying human anatomy, but also modern regional anesthesiologists’ application of geometric schema based upon anatomic landmarks to guide needle placement (e.g., for nerve root or peripheral nerve blocks). Writing of these ancient insights, Renfrew48 has said “As isolated works some of them are indeed memorable….
But taken together they are no longer seen as … unexpectedly fortunate successes by craftsmen experimenting with a promising material at the dawn of civilization …. Early in date they may be … but they appeal to us today as quite astonishing achievements.”

The deliberate, explicit search for hidden order is well documented in early studies of geometry, that arose around the Mediterranean. Queen Dido of Carthage in 850 B.C. was the first to solve the problem of how to slice the hide of a bull so as to enclose the greatest possible area of land. “By pure intuition” she realized that if the hide were cut so as to form a long leather cord, the greatest amount of land would be enclosed by placing the cord upon the ground in the shape of a circle.49 Heron of Alexandria provided a similarly intuitive solution in 70 A.D. of the “catoptric” problem, namely, to determine the shortest path traced by a ray of light in bouncing off a mirror between source and observer. In a simple geometrical proof, albeit one generated intuitively, Heron showed that the shortest path was one whose angle of incidence equaled the angle of reflection, which is the actual path followed by light reflected from a mirror.50 This glimpse into a previously hidden parsimony of nature anticipated Fermat’s discovery over a millennium later that the trajectory of a ray of light passing through media of differing density is always one that minimizes the time taken, not the distance traveled. By 300 A.D., the Greek mathematician Pappus had assembled a compendium of geometry problems dealing with minima and maxima. The rise of Western philosophy by this time had been accompanied by numerous clear statements of a deep, explicit belief in fundamental, if not always visible, laws established by a benign deity. For example, Epictetus (55-135 A.D.) wrote, “Remember that the divine order is intelligent and fundamentally good. Life is not a series of random, meaningless episodes, but an ordered, elegant whole that follows ultimately comprehensible laws.”51

Fig 2. A template for representing human proportions employed by Cycladic sculptors around 2500 B.C.E. divided the height of the entire figure into 5 equal parts, each the length of the head (“canonical figure of quintile modularity”). Earlier sculptors used multiples of 3 or 4, and later ones used 6. Reproduced with permission of Prof. Lord Colin Renfrew and the Museum of Cycladic Art, Athens.48

The invention of calculus produced even more sophisticated insights, expressed in quantitative terms. In one famous episode, Newton returned home after a long day at the Royal Mint to find a letter from Bernoulli. The letter contained Bernoulli’s challenge to deduce the shape of a wire along which a hollow bead would slide, so as to minimize the time to slide downwards and sideways between 2 points of unequal height. Newton worked late into the night, sending Bernoulli the solution (not a straight line, but instead the “brachistochrone” curve) in the next day’s mail.52 All these problems since antiquity have in common that they explore not just relationships that are in front of one’s eyes, such as those between a circle’s diameter and its area, or a sphere’s diameter and its volume, but instead ask whether these obvious relationships are a manifestation of subtler, timeless principles involving minimization or maximization of unseen variables. Application of these and similar methods (now termed the “calculus of variations”) to the study of moving objects (mechanics) by D’Alembert, Lagrange, Hamilton and others led to the astonishing insight that every trajectory of every moving object in the universe, from billiard balls to projectiles to planets, could be calculated on the basis that each one always follows a path that minimizes the difference between its kinetic and potential energies.53 Maupertuis, an eighteenth century pioneer in mathematical physics, wrote that “These simple and beautiful laws are perhaps the only ones that the Creator and Organizer of all things has to put in place to carry out the workings of the visible world. Our principle . . .
follows necessarily from the most wise use of His strength.”54 The primal human instinct to search for hidden principles of order, and the commingling of joy and awe when such principles are revealed for the first time, have recently been posited by several authors as the biological basis of religious experience.55-57 Whether or not this is true, the application of statistics and reductionist graphics in health services research, embodied in forest plots32 or cumulative meta-analysis,58 no doubt evokes the same “eureka” satisfaction of transcending complicated, messy data, as does the calculus of variations (Fig 3). After all, as Chaitin has pointed out, “The belief that the universe is rational, lawful, is of no value if the laws are too complicated for us to comprehend, and is even meaningless if the laws are as complicated as our observations, since the laws would then be no simpler than the world they are supposed to explain …. But simplicity certainly reflects what we mean by understanding: understanding is compression.”60

Fig 3. Randomized controlled trials addressing the reduction of postoperative pulmonary atelectasis during epidural versus systemic opioid analgesia are displayed as a forest plot on the left and a cumulative meta-analysis on the right. A forest plot depicts, in its bottom line, the aggregate confidence interval derived from synthesizing the results of separate studies, each with its own individual confidence interval shown on a separate line. In cumulative meta-analysis, a running aggregate confidence interval is recalculated and depicted for each study in its chronological order of publication. The figure on the right evokes in this author a feeling of recognizing an underlying truth that emerges over time from a diverse group of isolated observations. Reprinted with permission from Ballantyne et al.59
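
The arithmetic behind the two displays in Fig 3 can be sketched briefly. The example below is a minimal illustration, assuming a fixed-effect, inverse-variance method of pooling (one common choice; the cited analysis may have pooled its trials differently). The trial values, and the function names `pool_fixed_effect` and `cumulative_meta_analysis`, are invented for illustration and are not the data or methods of Ballantyne et al.

```python
import math

# Hypothetical trials: (publication year, log risk ratio, standard error).
# These numbers are invented for illustration only.
trials = [
    (1985, -0.60, 0.50),
    (1989, -0.20, 0.40),
    (1993, -0.45, 0.30),
    (1996, -0.30, 0.25),
]


def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled estimate, lower 95% limit, upper 95% limit),
    all on the log risk ratio scale.
    """
    weights = [1.0 / se ** 2 for _, _, se in studies]
    total_weight = sum(weights)
    estimate = sum(w * y for w, (_, y, _) in zip(weights, studies)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return estimate, estimate - 1.96 * pooled_se, estimate + 1.96 * pooled_se


def cumulative_meta_analysis(studies):
    """Re-pool after each successive trial, in chronological order."""
    ordered = sorted(studies, key=lambda s: s[0])
    rows = []
    for i in range(len(ordered)):
        rows.append((ordered[i][0],) + pool_fixed_effect(ordered[: i + 1]))
    return rows


# Each printed row is one line of a cumulative meta-analysis; the final
# row is the bottom (aggregate) line of the corresponding forest plot.
for year, est, low, high in cumulative_meta_analysis(trials):
    print(f"{year}: RR {math.exp(est):.2f} "
          f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

Under this pooling method the aggregate confidence interval can only narrow as trials accumulate, which is part of what makes the right-hand panel so visually persuasive. The narrowing reflects growing sample size alone and says nothing about whether the pooled trials were comparable.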

But Pain-Related Evidence is Rarely Elegant

The clinical practice guideline movement within the field of pain medicine began about 15 years ago initially with federal guidelines developed to address acute pain1 and later, cancer pain.2 These were the first federal practice guidelines to be evidence-based, in the sense that they made explicit use of literature retrieval and synthesis techniques designed to minimize bias. The initial guideline, on acute pain, equated the nature of the available evidence with the validity of that evidence, i.e., it assigned the highest place to meta-analyses of randomized controlled trials and the lowest to expert opinion (Table 2). Yet by the time we were preparing the second guideline, we understood that just because an RCT had been conducted to address a question, its results (or even the results obtained by combining several RCTs) did not necessarily carry greater strength and consistency than higher quality studies of less rigorous design. For example, a rigorously designed study might enroll so few patients, or be carried out in such a narrow population or setting, as to be less persuasive than an observational trial conducted in a large and diverse patient cohort. We therefore broadened our evidence hierarchy (Table 2).

Unfortunately, fundamental problems with the nature of the evidence could not be overcome by efforts to categorize it using progressively more sophisticated methods. For one thing, the quality of the evidence related to pain treatments was poor: only about 5% of the acute pain literature we screened, and less than 1% of that on cancer pain, was of sufficiently high quality for inclusion in our systematic reviews. The majority of the studies were descriptive case reports, case series, or lacked fundamental data such as variances to permit comparison of results from experimental and control groups. The heterogeneity of study designs and endpoints made it inevitable that information would be lost when pooling outcomes across studies, a process that inevitably resulted in the lowest common denominator of outcomes available for pooling, e.g., in assessing the efficacy of neurolytic celiac plexus blockade for cancer pain.61 This problem reached major proportions when conducting a subsequent literature review for the successor agency, the Agency for Healthcare Research and Quality (AHRQ).4 The 218 retrieved trials on cancer pain (of 24,000 screened) employed 125 different pain assessment instruments! Similar problems related to low quality, heterogeneous primary trials of interventional pain management have been noted by Merrill62 and their implications commented upon.63 The inexorable increase in numbers of assessment instruments has been paralleled by concurrent increases in the numbers of systems to rate the strength of scientific evidence such as RCTs, observational studies, and systematic reviews. An AHRQ report in 2002 identified 49 such systems for RCTs, 19 for observational studies, and 20 for systematic reviews.64 That the proliferation of instruments intended at some level to streamline and simplify research and care could produce paralyzing viscosity was a dismaying paradox.
Yet bearing in mind that a general feature of all complex dynamic systems is to evolve towards endlessly increasing levels of complexity, this paradox is likely inevitable.65

Table 2. Evidence Hierarchies

Lest we think that codifying and simplifying assessment tools for pain and trial quality is all that stands in the way of the triumph of EBM, a deeper paradox lurks. Twentieth-century advances in the field of pure mathematics that deals with logic have produced unsettling results.66 Since Gödel’s theorem, as extended by Gregory Chaitin, it has become clear that even when all the postulates, operators, and variables of a finite logical system are stipulated in vacuo with perfect precision, it is always possible to generate a limitless number of statements whose complexity exceeds the ability of that logical system to prove them true or false. As Chaitin has expressed it, “[When young] I learned that math is more real than the world of everyday appearances … [however] it is not the case that simple clear questions have simple clear answers, not even in the world of pure ideas, and much less so in the messy real world of everyday life.”67 In other words, quantitative attempts to describe and predict reality face intrinsic limits of imprecision and indeterminacy, even when patient variation, heterogeneity of assessment methods, and incompleteness of data capture are not concerns.68

Other, more technical considerations introduce imprecision into the practice of EBM. Specifically, when retrieving and deciding which articles to include in a meta-analysis, there is commonly a degree of judgment, sometimes to the point of arbitrariness. If I am performing a systematic review of methods to control procedure-related pain in patients with cancer, what do I do if I cannot find a paper dealing with pain control for patients with peritoneal metastases undergoing endoscopic biopsy, but do find papers dealing with pain control for endoscopic abdominal surgery? Do I include these studies or not? What if I am trying to decide how best to control pain in children during a certain procedure, but only find RCTs enrolling adults? What if I can’t find a clear answer to the efficacy of a certain anticonvulsant for 1 form of neuropathic pain in a given population, but do find a study of a similar, nonidentical anticonvulsant? In addition to fuzziness of inclusion, Gøtzsche and colleagues have recently identified a surprisingly large number of mechanical errors of data extraction in meta-analyses of pooled trials.69 Related to the tension between compression of a large amount of data so as to better grasp it, and the loss of information in doing so, are 2 problems that in a sense are the inverse of each other. Firstly, McQuay, Moore, and colleagues from the Oxford Pain Relief Unit have ingeniously pooled data from postoperative pain trials in diverse models, using diverse outcome measures, by dichotomizing a vast amount of data so as to derive a single number-needed-to-treat (NNT) for each of numerous analgesics and a few combinations.32 In this context, NNT represents the number of patients needed to be exposed to the particular analgesic and dose, so that 1 additional patient experiences a 50% decline in pain intensity beyond that expected from a placebo.
However, other European clinical researchers have criticized this approach because distinct NNTs, sometimes even with nonoverlapping confidence intervals, result when examining the same dose of the same drug used after different types of operations.70 The latter researchers have proposed a structured process by which to synthesize evidence-guided, procedure-specific clinical knowledge so as not to discard the experience of clinicians.71 A similar approach was recently taken by a distinguished group of American clinical researchers to elaborate upon a survey of current practice72 in a process that might be termed practice-based evidence.
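
The number-needed-to-treat discussed above is conventionally computed as the reciprocal of the difference between the proportion of responders on active treatment and the proportion on placebo. The sketch below assumes that conventional definition. The function name `number_needed_to_treat` and all patient counts are hypothetical, chosen only to show how the same drug and dose might yield different NNTs in two surgical settings.

```python
def number_needed_to_treat(responders_active, n_active,
                           responders_placebo, n_placebo):
    """NNT = 1 / (responder proportion on active - proportion on placebo).

    A "responder" here is a patient achieving the dichotomized outcome,
    e.g. a 50% decline in pain intensity.
    """
    p_active = responders_active / n_active
    p_placebo = responders_placebo / n_placebo
    absolute_benefit = p_active - p_placebo
    if absolute_benefit <= 0:
        raise ValueError("no benefit over placebo; NNT is undefined")
    return 1.0 / absolute_benefit


# Hypothetical counts for the same drug and dose after two operations.
after_operation_a = number_needed_to_treat(60, 100, 20, 100)
after_operation_b = number_needed_to_treat(45, 100, 25, 100)

print(f"Operation A: NNT = {after_operation_a:.1f}")  # prints 2.5
print(f"Operation B: NNT = {after_operation_b:.1f}")  # prints 5.0
```

Collapsing both settings into a single pooled NNT would hide the two-fold difference between them, which is the loss of information the procedure-specific approach seeks to avoid.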

Secondly, just as it is hazardous to over-distill heterogeneous data, it is equally hazardous to assume a good outcome will result when applying the results of even the most exciting RCT under real-world conditions over a prolonged duration. Outcomes in such contexts are termed “effectiveness” while those observed in RCTs, or meta-analyses of RCTs, are termed “efficacy.”7 Glasgow has calculated the progressive dilution of impact if a 40% success rate for a hypothetical therapy is tracked along each step of translation from efficacy to effectiveness.73 Thus a therapy that is 100% efficacious in an RCT with enriched accrual (of subjects likely to benefit), but that works only in a subpopulation, is only adopted by some clinics, is only advocated by some clinicians within those clinics, is only accepted by some of their patients and adhered to by some of those who initially accept it, and produces a sustained benefit in only some adherent patients, has a long term impact of merely 0.16%. As with the point made above concerning complexity, the inverse hazards of over-compression and overgeneralization echo lessons from mathematical physics,74,75 that during the twentieth century clearly proved that causal models valid at 1 scale (e.g., a billiard ball) generally fail when applied to another scale (e.g., a subatomic particle or a galaxy).
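
The dilution described above is simple compounding: the long-term impact is the product of the success rates at each step of translation. The sketch below assumes a uniform 40% rate at every step, as in the passage. The function name `long_term_impact` is hypothetical, and the step counts are an inference from the arithmetic, not a statement of how Glasgow tallied them: the quoted 0.16% matches seven successive 40% steps, whereas the six steps enumerated in the sentence would give roughly 0.41%.

```python
def long_term_impact(step_rates):
    """Multiply the success rate of each translation step together."""
    impact = 1.0
    for rate in step_rates:
        impact *= rate
    return impact


# Assumption: every step succeeds 40% of the time.
for n_steps in (6, 7):
    impact = long_term_impact([0.40] * n_steps)
    print(f"{n_steps} steps at 40%: {impact * 100:.2f}% long-term impact")
```

Either way the qualitative lesson is the same: a handful of individually plausible losses compounds into an impact well below 1%.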

We may summarize the above concerns that bad evidence may befall good treatment in 2 categories. First are the technical ones, which would disappear if we could just organize better quality trials and do a better job of synthesizing them (Table 3). Second, at a deeper level, are those paradoxes that make it likely that EBM, a tool intended to ease clinical decision-making, will become progressively more complex; that this growing complexity will render it less and less able to offer simple, clear solutions to everyday, real-world pain problems; and that evidence accumulated in even the best RCTs will not provide answers to broader system-wide and population-based questions (Table 4).

Moving Beyond Reductionism

Recognition of the common sense fact that average results of large-scale trials may not be applicable on an individual basis arose early during the present-day EBM movement76 and has persisted.77 In contexts other than pain control, considerable work has already been accomplished on the importance of stratification of patients at risk of nonresponse at the start of clinical trials, to enhance the power of these trials.78 Conversely, individual risk stratification has been proposed as a necessary but often overlooked step in deciding whether to apply summary results of clinical trials to individual patients.79 Every dimension of pain and its response to therapy displays considerable interindividual variability (Table 5). Genetic variability in nociception and analgesic responses is now undeniable,80-82 if still not fully understood. Patient expectations are a powerful influence upon the analgesic efficacy of drug and nondrug interventions.83,84 The neurobiological substrates of placebo mechanisms that contribute to analgesia (and nocebo mechanisms to hyperalgesia) are becoming clarified, and display interindividual variability in part due to genetics.85 Faced with this pervasive heterogeneity, to deny patients trials of interventions, or payment for apparently successful interventions, because the average responses to such interventions did not differ from placebo in earlier published trials is now unconscionable. Indeed, a paper jointly written by leaders in the U.S. Food and Drug Administration and National Institutes of Health86 asked, and was titled, “Are Means Meaningless?” In that piece, the authors pointed out that the overall response rate to Herceptin in breast cancer is 6% but approaches 100% in the 15% to 20% who are HER-2 positive. Thus, “the mean group results are essentially meaningless.” Based upon this and other examples, these policy makers indicated that masking of important individual findings or differences by inordinate emphasis upon group means should likewise be considered in analgesic drug development. In that context, N-of-1 trials should be considered as a means to judge for an individual patient whether a contemplated therapy should be carried out.87
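The way a group mean can mask individual responders may be illustrated with a small sketch. The pain-reduction scores below are invented for illustration and come from no trial. The 30-point responder cutoff and the names `mean`, `responder_rate`, and `THRESHOLD` are likewise assumptions. The two hypothetical groups have identical mean improvement, yet only one contains patients who clearly respond.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def responder_rate(values, threshold):
    """Fraction of patients whose pain reduction meets or exceeds threshold."""
    return sum(v >= threshold for v in values) / len(values)


# Hypothetical pain reductions on a 0-100 scale, invented for illustration.
treatment = [50, 50, 0, 0, 0, 0, 0, 0, 0, 0]  # two clear responders, eight unchanged
placebo = [10] * 10                           # the same modest change in everyone

# Assumed cutoff for counting a patient as a responder.
THRESHOLD = 30

print("mean reduction:", mean(treatment), "vs", mean(placebo))
print("responder rate:",
      responder_rate(treatment, THRESHOLD), "vs",
      responder_rate(placebo, THRESHOLD))
```

A comparison of means alone would report no difference between these groups, whereas an individual responder analysis distinguishes them.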

Table 3. Bad Evidence and Bad Use of Evidence Can Impugn Good Treatments

Table 4. Paradoxes that Undermine the Applicability of EBM to Pain Medicine

Table 5. Reasons for Individual Variability of Responses to Nociceptive Stimuli and Analgesic Interventions

To ask “does the treatment work?” or to state “it doesn’t differ from placebo” is disingenuous when directed toward pain treatments in the abstract, divorced from assessment of individual patients. Such language falsely assumes a dichotomous response when the reality is one of continuous responses across multiple axes. Clinical reality is not perfectly captured by even the most sophisticated standardized instruments.88-91 But denying treatment based upon a false dichotomy that all who receive it have either a 0% or 100% response guarantees withholding beneficial treatment from a proportion of patients who may well benefit. Formulas are inadequate bases for the denial of care. The modern discovery that the complexity of medical practice may defy simple rules, and render compendia of evidence most appropriate for use merely as “tools not rules,” was foreseen nearly 200 years ago. The Prussian general and strategist von Clausewitz, having begun a comprehensive monograph to place operational military strategy on a mathematical foundation, abandoned this effort when his detailed analyses led him to conclude that this was impossible. He did not abandon writing his military classic, On War, although he substantially recast its thesis and conclusions. In the final version he wrote, “All principles, rules and methods increasingly lack universality and absolute truth the moment they become a positive doctrine. They are there to present themselves for use. Judgment must always be free to determine whether or not they are suitable. Criticism must never use these results of theory as laws and standards, but only as a person acting in war should also do: as aids to judgment.”92

Conclusion

There is a growing acceptance of the view that pain management is a basic human right.12 This trend is clearly at odds with the growing frequency of denial of payment for pain care based upon the misapplication of EBM to argue that, based upon historical outcomes showing no difference on average between groups given a treatment or placebo, individuals now and in the future may not be offered the benefit of possibly helpful treatments. No one would argue for the merit of a treatment never observed to benefit anyone more than placebo, nor deny the need to improve the quality and quantity of evidence for many pain therapies, particularly invasive ones.93 Yet when certain individuals clearly respond to a specific treatment more than to a placebo, although group mean responses are equivalent for the treatment and placebo, the latter cannot justify refusal to pay for therapeutic trials to identify responders whose quality of life will improve with that treatment. When arguments based upon group means are used by insurers unwilling to pay for pain treatment, they can and should be challenged as fundamentally unscientific and at odds with the best available evidence. John Bonica, a former professional wrestler and tireless battler on behalf of patients with pain and those who care for them, would certainly do so were he alive today.

Acknowledgment

Miss Evelyn Hall provided expert secretarial assistance.

References

1. Carr DB, Jacox AK, Chapman CR, Ferrell BR, Fields HL, Heidrich G III, Hester NO, Hill CS Jr, Lipman AG, McGarvey CL, Miaskowski CA, Mulder DS, Payne R, Schechter N, Shapiro BS, Smith RS, Tsou CV, Vecchiarelli L. Acute Pain Management: Operative or Medical Procedures and Trauma. Clinical Practice Guideline. Rockville, MD: Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health & Human Services; 1992. AHCPR publication 92-0032.

2. Jacox AK, Carr DB, Payne R, Berde CB, Breitbart W, Cain JM, Chapman CR, Cleeland CS, Ferrell BR, Finley RS, Hester NO, Hill CS Jr, Leak WD, Lipman AG, Logan CL, McGarvey CL, Miaskowski CA, Mulder DS, Paice JA, Shapiro BS, Silberstein EB, Smith RS, Stover J, Tsou CV, Vecchiarelli L, Weissman DE. Management of Cancer Pain. Clinical Practice Guideline No. 9. Rockville, MD: Agency for Health Care Policy and Research, Public Health Service, U.S. Department of Health & Human Services; 1994. AHCPR publication 94-0592.

3. Goudas LC, Carr DB, Bloch R, Balk E, Ioannidis JPA, Terrin N, Gialeli-Goudas M, Chew P, Lau J. Management of Cancer Pain. Vol. 1 & Vol. 2 Evidence Tables. Evidence Report/Technology Assessment No. 35. Rockville, MD: Agency for Healthcare Research and Quality; 2001. AHRQ publication 02-E002.

4. Carr DB, Goudas LC, Lawrence D, Pirl W, Lau J, DeVine D, Kupelnick B, Miller K. Management of Cancer Symptoms: Pain, Depression, and Fatigue. Evidence Report/Technology Assessment No. 61. Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ publication 02-E032.

5. Practice guidelines for acute pain management in the perioperative setting. A report by the American Society of Anesthesiologists Task Force on Pain Management, Acute Pain section. Anesthesiology 1995;82:1071-1081.

6. American Society of Anesthesiologists Task Force on Acute Pain Management. Practice guidelines for acute pain management in the perioperative setting: an updated report by the American Society of Anesthesiologists Task Force on Acute Pain Management. Anesthesiology 2004;100:1573-1581.

7. Wittink H, Wiffen P, Carr DB. Evidence-based medicine in pain management. In: Berman S, ed. Approaches to Pain Management: An Essential Guide for Clinical Leaders. Oakbrook Terrace, IL: Joint Commission on Accreditation of Healthcare Organizations; 2003:21-33.

8. Wittink H, Carr DB, eds. Pain Management: Evidence, Outcomes and Quality of Life. A Sourcebook. New York, NY: Elsevier; 2008.

9. Carr DB, Goudas LC. Evidence-based pain medicine: the good, the bad, and the ugly. Reg Anesth Pain Med 2001;26:389-393.

10. Bacon DR. Henry Kissinger and P4P. American Society of Anesthesiologists Newsletter 2006;70:1-2.

11. Walsh T. Personalized medicine is central to Massachusetts’ continued leadership in life sciences. Massachusetts Medical Society Vital Signs 2007;12:1-2.

12. Brennan F, Carr DB, Cousins MJ. Pain management: a fundamental human right. Anesth Analg 2007;105:205-221.

13. European Federation of IASP Chapters. Europe Against Pain: Declaration on Chronic Pain as a Major Healthcare Problem, a Disease in its Own Right. Brussels: EPIC, 2001.

14. Dahl JL. Working with regulators to improve the standard of care in pain management: the US experience. J Pain Symptom Manage 2002;24:136-147.

15. Scholten W, Nygren-Krug H, Zucker HA. The World Health Organization paves the way for action to free people from the shackles of pain. Anesth Analg 2007;105:1-4.

16. Strassels SA, Carr DB, Meldrum M, Cousins MJ. Toward a canon of the pain and analgesia literature: A citation analysis. Anesth Analg 1999;89:1528-1533.

17. Papper EM. Romance, Poetry and Surgical Sleep: Literature Influences Medicine. Westport, CT: Greenwood Press, 1995.

18. Gordon DB, Dahl JL, Miaskowski C, McCarberg B, Todd KH, Paice JA, Lipman AG, Bookbinder M, Sanders SH, Turk DC, Carr DB. American Pain Society recommendations for improving the quality of acute and cancer pain management. Arch Intern Med 2005;165:1574-1580.

19. Carr DB. The development of national guidelines for pain control: synopsis and commentary. Eur J Pain 2001;5 Suppl A:91-98.

20. Morris DB. The Culture of Pain. Berkeley: University of California Press; 1994.

21. Frank AW. How stories remake what pain unmakes. In: Dostrovsky JO, Carr DB, Koltzenburg M, eds. Proceedings of the 10th World Congress on Pain. Progress in Pain Research and Management Vol. 24. Seattle: IASP Press; 2003:619-630.

22. Carr DB. Letter to Forum commenting on “Iceman from the Copper Age.” National Geographic 1993;4:184.

23. Osler W. Physic and physicians as depicted in Plato. In: Aequanimitas, with Other Addresses to Medical Students, Nurses and Practitioners of Medicine. Philadelphia: Blakiston; 1905:45-76.

24. Holy Bible. Book of Daniel 1:3-20.

25. Cleary T. Living and Dying with Grace: Counsels of Hadrat Ali. Boston: Shambhala Press; 1996.

26. Maimonides M. Ethical Writings of Maimonides. Weiss RL, Butterworth C, eds. New York: Dover Publications; 1975.

27. White KL. Health care research: old wine in new bottles. Pharos Alpha Omega Alpha Honor Med Soc 1993;56:12-16.

28. Shaw GB. The Doctor’s Dilemma. London: Penguin Books; 1946.

29. Cochrane AL. Effectiveness and Efficiency: Random Reflections on Health Services. Cambridge, UK: Cambridge University Press; 1989.

30. Jadad AR. Randomized Controlled Trials. London: BMJ Press; 1998.

31. Jadad AR. Meta-analysis: a valuable but easily misused tool. Curr Opin Anaesthesiol 1996;9:426-429.

32. McQuay HJ, Moore RA. An Evidence-Based Resource for Pain Relief. Oxford: Oxford University Press; 1998.

33. Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decision. Ann Intern Med 1997;126:376-380.

34. Emerson RW. Natural History of Intellect. I. Powers and Laws of Thought. New York: Solar Press; 1995.

35. Krauss LM. Hiding in the Mirror: the Mysterious Allure of Extra Dimensions, from Plato to String Theory and Beyond. New York: Viking Penguin; 2005.

36. Randall L. Warped Passages: Unraveling the Mysteries of the Universe’s Hidden Dimensions. New York: Harper Collins; 2005.

37. Holland JH. Hidden Order: How Adaptation Builds Complexity. Reading, MA: Addison-Wesley Publication Company; 1995.

38. Tasic V. Mathematics and the Roots of Postmodern Thought. Oxford: Oxford University Press; 2001.

39. Smolin L. The Trouble with Physics. Boston: Houghton Mifflin; 2006.

40. Anonymous. The Compact Edition of the Oxford English Dictionary. Oxford: Oxford University Press; 1971.

41. Greene B. The Elegant Universe: Superstrings, Hidden Dimensions, and the Quest for the Ultimate Theory. New York: Norton; 1999.

42. Wolfram S. A New Kind of Science. Champaign, IL: Wolfram Media, Inc.; 2002.

43. Stewart I. Why Beauty is Truth: A History of Symmetry. New York: Basic Books; 2007.

44. Barrow JD. The Artful Universe. Boston: Little, Brown; 1995.

45. Hildebrandt S, Tromba A. The Parsimonious Universe: Shape and Form in the Natural World. New York: Springer Verlag; 1996.

46. Schimmel A. The Mystery of Numbers. New York: Oxford; 1993:3.

47. de Vries J. Pattern and precision: taking the measure of early Cycladic II Spedos variety figures. In: Getz-Gentle P, ed. Personal Styles in Early Cycladic Sculpture. Madison, WI: University of Wisconsin Press; 2001:109-124.

48. Renfrew C. The Cycladic Spirit. New York: Harry N. Abrams; 1989.

49. Sagan S. Introduction to the Calculus of Variations. New York: Dover; 1969.

50. Stillwell JC. Calculus of variations. Encyclopedia Britannica Online. Available at http://www.britannica.com/eb/article-9384665/Calculus-of-Variations#848342.hook. Accessed February 19, 2008.

51. Lebell S. Epictetus: A Manual for Living. New York: HarperCollins; 1994.

52. Weinstock R. Calculus of Variations, with Applications to Physics and Engineering. New York: Dover; 1974.

53. Goldstein H. Classical Mechanics. Reading, MA: Addison-Wesley; 1950.

54. O’Connor JJ, Robertson EF. Pierre Louis Moreau de Maupertuis. Available at http://www-history.mcs.st-andrews.ac.uk/Biographies/Maupertuis.html. Accessed February 19, 2008.

55. Dawkins R. The God Delusion. New York: Houghton Mifflin; 2006.

56. Dennett DC. Breaking the Spell: Religion as Natural Phenomenon. New York: Penguin; 2006.

57. Newberg A, Waldman MR. Why We Believe What We Believe: Uncovering our Biological Need for Meaning, Spirituality, and Truth. New York: Simon & Schuster; 2006.

58. Lau J, Antman EM, Jimenez-Silva J, Kupelnick B, Mosteller F, Chalmers TC. Cumulative meta-analysis of therapeutic trials for myocardial infarction. N Engl J Med 1992;327:248-254.

59. Ballantyne JC, Carr DB, deFerranti S, Suarez T, Lau J, Chalmers TC, Angelillo IF, Mosteller F. The comparative effects of postoperative analgesic therapies on pulmonary outcome: cumulative meta-analyses of randomized, controlled trials. Anesth Analg 1998;86:598-612.

60. Chaitin G. On the intelligibility of the universe and the notions of simplicity, complexity and irreducibility. Available at http://www.cs.auckland.ac.nz/CDMTCS/chaitin/bonn.html. Accessed February 19, 2008.

61. Eisenberg E, Carr DB, Chalmers TC. Neurolytic celiac plexus block for treatment of cancer pain: a meta-analysis [Erratum in 1995;81:213]. Anesth Analg 1995;80:290-295.

62. Merrill DG. Hoffman’s glasses: evidence-based medicine and the search for quality in the literature of interventional pain medicine. Reg Anesth Pain Med 2003;28:547-560.

63. Rathmell JP, Carr DB. The scientific method, evidence-based medicine, and rational use of interventional pain treatments. Reg Anesth Pain Med 2003;28:498-501.

64. West S, King V, Carey TS, Lohr KN, McKoy N, Sutton SF, Lux L. Systems to Rate the Strength of Scientific Evidence. Evidence Report/Technology Assessment No. 47. Rockville, MD: Agency for Healthcare Research and Quality; 2002. AHRQ publication 02-E016.

65. Bak P. How Nature Works: the Science of Self-Organized Criticality. New York: Springer-Verlag; 1996.

66. Casti JL. Searching for Certainty: What Scientists Can Know About the Future. New York: William Morrow and Company, Inc.; 1990.

67. Chaitin G. Paradoxes of randomness and the limitations of mathematical reasoning. Complexity 2002;7:14-21.

68. Prigogine I. The End of Certainty: Time, Chaos, and the New Laws of Nature. New York: Simon & Schuster; 1996.

69. Gotzsche PC, Hrobjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences [Erratum in 2007;298:2264]. JAMA 2007;298:430-437.

70. Gray A, Kehlet H, Bonnet F, Rawal N. Predicting postoperative analgesia outcomes: NNT league tables or procedure-specific evidence? Br J Anaesth 2005;94:710-714.

71. Neugebauer EA, Wilkinson RC, Kehlet H, Schug SA; PROSPECT Working Group. PROSPECT: a practical method for formulating evidence-based expert recommendations for the management of postoperative pain. Surg Endosc 2007;21:1047-1053.

72. Rathmell JP, Wu CL, Sinatra RS, Ballantyne JC, Ginsberg B, Gordon DB, Liu SS, Reuben SS, Rosenquist RW, Viscusi ER. Acute post-surgical pain management: a critical appraisal of current practice. Reg Anesth Pain Med 2006;31(4 Suppl 1):1-42.

73. Glasgow RE. Translating research to practice: lessons learned, areas for improvement, and future directions. Diabetes Care 2003;26:2451-2456.

74. Carr DB. Evidence, explanation – or “the power of myth”? Curr Opin Anaesthesiol 1996;9:415-420.

75. Holland JH. Emergence: From Chaos to Order. Reading, MA: Addison-Wesley Publication Company; 1998.

76. Rothwell PM. Can overall results of clinical trials be applied to all patients? Lancet 1995;345:1616-1619.

77. Kravitz RL, Duan N, Braslow J. Evidence-based medicine, heterogeneity of treatment-effects, and the trouble with averages [Erratum in 2006;84:759-760]. Milbank Q 2004;82:661-687.

78. Ioannidis JP, Lau J. Heterogeneity of baseline risk within patient populations of clinical trials: a proposed evaluation algorithm. Am J Epidemiol 1998;148:1117-1126.

79. Kent DM, Hayward RA. Limitations of applying summary results of clinical trials to individual patients: the need for risk stratification. JAMA 2007;298:1209-1212.

80. Coghill RC, Eisenach J. Individual differences in pain sensitivity: implications for treatment decisions. Anesthesiology 2003;98:1312-1314.

81. Kim HS, Neubert JK, Iadarola MJ, Miguel AS, Xu K, Goldman D, Dionne RA. Genetic influence on pain sensitivity: evidence of heritability associated single nucleotide polymorphisms in opioid receptor genes. In: Dostrovsky JO, Carr DB, Koltzenburg M, eds. Proceedings of the 10th World Congress on Pain. Progress in Pain Research and Management, Vol. 24. Seattle: IASP Press; 2003:513-520.

82. Zubieta JK, Heitzeg MM, Smith YR, Bueller JA, Xu K, Xu Y, Koeppe RA, Stohler CS, Goldman D. COMT val158met genotype affects mu-opioid neurotransmitter responses to a pain stressor. Science 2003;299:1240-1243.

83. Linde K, Witt CM, Streng A, Weidenhammer W, Wagenpfeil S, Brinkhaus B, Willich SN, Melchart D. The impact of patient expectations on outcomes in four randomized controlled trials of acupuncture in patients with chronic pain. Pain 2007;128:264-271.

84. Benedetti F. What do you expect from this treatment? Changing our mind about clinical trials. Pain 2007;128:193-194.

85. Finniss DG, Benedetti F. Placebo analgesia, nocebo hyperalgesia. Pain: Clinical Updates 2007;15:1-4.

86. Witter J, Simon LS, Dionne R. Are means meaningless? The application of individual responder analysis to analgesic drug development. APS Bulletin 2003;13:1-7.

87. Cepeda MS, Acevedo JC, Alvarez H, Miranda N, Cortes C, Carr DB. An N-of-1 trial as an aid to decision-making prior to implanting a permanent spinal cord stimulator. Pain Med 2008;9:235-239.

88. Greenhalgh T, Hurwitz B, eds. Narrative Based Medicine: Dialogue and Discourse in Clinical Practice. London: BMJ Press; 1998.

89. Charon R. Narrative medicine: a model for empathy, reflection, profession, and trust. JAMA 2001;286:1897-1902.

90. Carr DB. Memoir of a meta-analyst: on the silent “L” in “Quantitative.” In: Carr DB, Loeser JD, Morris DB, eds. Narrative, Pain, and Suffering. Progress in Pain Research and Management Vol. 34. Seattle: IASP Press; 2005:325-354.

91. Burgess FW. Pain scores: are the numbers adding up to quality patient care and improved pain control? Pain Med 2006;7:371-372.

92. Von Ghyczy T, von Oettinger B, Bassford C, eds. Clausewitz on Strategy: Inspiration and Insight from a Master Strategist. New York: Wiley; 2001. p. 74.

93. Carr DB, Eidelman A. Clinical study design. In: Krames E, Peckham PH, Rezai AR, eds. Textbook of Neuromodulation. Oxford: Blackwell Publishing; 2009; in press.

Daniel B. Carr, M.D., F.A.B.P.M., F.F.P.M.A.N.Z.C.A. (Hon.)

From the Department of Pain Research, Tufts-New England Medical Center, Department of Anesthesiology, Tufts University School of Medicine, Boston, and Javelin Pharmaceuticals, Inc., Cambridge, MA.

Accepted for publication January 18, 2008.

Preparation of this manuscript was supported by the Saltonstall Fund for Pain Research.

Reprint requests: Daniel B. Carr, M.D., Department of Anesthesia, Tufts-New England Medical Center, 750 Washington Street, Boston, MA 02111. E-mail: daniel.carr@tufts.edu

(c) 2008 by the American Society of Regional Anesthesia and Pain Medicine.

1098-7339/08/3303-0001$34.00/0

doi:10.1016/j.rapm.2008.01.005

Copyright Churchill Livingstone Inc., Medical Publishers May/Jun 2008

(c) 2008 Regional Anesthesia and Pain Medicine. Provided by ProQuest Information and Learning. All rights Reserved.