Evidence Synthesis in CER
To this point, this chapter has focused largely on primary research approaches in CER, in which a research team identifies a question; selects a study population, comparators, and study design; conducts the study; and reports its findings. In medical and clinical science, however, one study rarely produces results that are sufficiently definitive to change practice. In most cases, knowledge develops through a series of experiments or observations that cumulatively shape our shared understanding.
Secondary research, or evidence synthesis, is a body of methods that has emerged as part of evidence-based medicine and aims to extend knowledge by aggregating results or data from multiple individual studies. It seeks to determine what is known in a field of inquiry across all the available, relevant primary research studies and to estimate the variability or consistency of the evidence. These tools allow a summary of “what we know” (consistent conclusions) and “how surely we know it” (i.e., confidence that the conclusions are valid, precise, and unlikely to change with future research) to most appropriately inform decision-makers in health-care policy and practice, or to determine the needs for future research.66 As the methodology of evidence synthesis, including systematic reviews and other approaches, has developed, “secondary” research has gained a prominent and influential role in CER, health-care policy, and clinical practice worldwide, especially as a knowledge foundation for clinical guidelines.
There are several approaches under the umbrella of evidence synthesis. The most well-known of these, systematic reviews, was promoted in the early 1990s as an antidote to the selective citation of findings in support of an expert's opinion in a given clinical area.67,68 Systematic reviews have gained acceptance among decision-makers as an approach that uses comprehensive, rigorous, explicit, and ostensibly reproducible methods and includes critical appraisal of each study's design and conduct. As defined by the IOM, systematic reviews are “essential for clinicians who strive to integrate research findings into their daily practices” and are thus a critical component of trustworthy clinical practice guidelines.69
The production of systematic reviews is currently supported by the international Cochrane Collaboration, the AHRQ Evidence-based Practice Centers, and many others. Systematic reviews are distinguished from narrative reviews and characterized by clearly specified, objective methods for locating, critically appraising, summarizing, and reporting all research relevant to a particular question. In 2011, the IOM specified 21 standards with 82 performance elements for conducting systematic reviews of CER; these standards were intended to ensure that systematic reviews are objective, transparent, and scientifically valid.69 PCORI has adopted the IOM standards almost entirely into its own Methodology Standards for evidence synthesis (see Text Box).
Standards for Evidence Synthesis
1. Initiate a team with appropriate expertise and experience to conduct the systematic review, and ensure user and stakeholder input into the systematic review design and conduct, while appropriately managing conflicts of interest in all participants.
2. Formulate the systematic review topic, develop and peer-review the review protocol, and publish the final protocol with timely amendments as warranted.
3. Conduct and document a comprehensive, systematic search for evidence, with attention to addressing potential sources of bias in research results reporting.
4. For individual studies:
   a. Assess individual studies for inclusion/exclusion according to the protocol, and document the assessment; and
   b. Conduct and document critical appraisal of individual studies for bias, relevance, and fidelity of interventions using prespecified criteria.
5. Use standard and rigorous data collection and management approaches.
6. Synthesize the body of evidence qualitatively and, if warranted, quantitatively, using prespecified methods.
7. Evaluate the body of evidence on characteristics related to overall quality and confidence in the estimates of effect on prespecified outcomes.
8. Report the results using a structured format, peer-review the draft report (including a public comment period), and publish the final report with free public access.
These standards reflect current scientific consensus and are likely to be supplemented or revised periodically as major producers of systematic reviews, such as Cochrane, AHRQ, and international health technology assessment (HTA) agencies, continue to work with decision-makers and conduct additional empirical studies of the impact of these standards on the production of unbiased, relevant systematic reviews. New standards also will be developed and current standards revised through further empirical methods development for analytic approaches including mixed treatment comparisons, network meta-analysis, and individual patient data meta-analysis (see Chapter 22).
As with primary studies, systematic reviews were initially focused mostly on questions of efficacy rather than comparative effectiveness. However, the principles and methods of systematic review apply equally well to synthesizing results from CER studies. Care must be taken to ensure the equivalence of the comparisons being synthesized across studies. When the same (or reasonably similar) treatment-comparison contrasts are available in a series of studies, traditional meta-analytic techniques can be appropriate for combining results.
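As an illustration of the traditional approach described above, the following minimal sketch (with entirely hypothetical effect estimates) pools study-level log odds ratios by fixed-effect, inverse-variance weighting; real syntheses would also assess heterogeneity and often use random-effects models.

```python
import math

def pool_fixed_effect(effects, variances):
    """Inverse-variance (fixed-effect) pooling of study-level estimates.

    effects: per-study effects on an additive scale (e.g., log odds ratios)
    variances: the corresponding sampling variances
    Returns (pooled effect, variance of the pooled effect).
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_var = 1.0 / sum(weights)
    return pooled, pooled_var

# Three hypothetical trials of the same A-vs.-B contrast (log odds ratios):
log_ors = [-0.30, -0.15, -0.25]
variances = [0.04, 0.09, 0.06]
est, var = pool_fixed_effect(log_ors, variances)
ci = (est - 1.96 * math.sqrt(var), est + 1.96 * math.sqrt(var))
```

Note that the pooled variance is smaller than any single study's variance, which is the statistical payoff of combining consistent results.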
HTAs were early exemplars of the application of systematic review methods to comparing the benefits and harms of new technologies with existing alternatives. HTA research aims to provide evidence for decision-making on the incorporation of new health technologies and uses an interdisciplinary approach to evaluate the impact of these technologies in clinical practice.70 The “technologies” assessed include pharmaceuticals as well as devices, procedures, and other interventions. The United States had an Office of Technology Assessment from 1972 to 1995 that provided nonpartisan information on a wide range of scientific and technological issues, including health care, to Congress. Although the United States no longer has this national agency, HTA remains a robust enterprise internationally .71 Within the United States, HTA continues in some state initiatives, such as the Drug Effectiveness Review Project in Oregon, which synthesizes clinical evidence and was originally intended for drug-class decisions under Medicaid.72 HTA also has been supported by payers and health system decision-makers through such entities as the Blue Cross–Blue Shield Technology Evaluation Center.73 The AHRQ also funds some HTA through its Evidence-based Practice Centers, and these often are used by Medicare as part of their national coverage decisions.
As systematic review has gained traction, many innovations or adaptations have developed to address needs for more timely or robust summarized evidence that is applicable to and can comprehensively inform decision-making across a range of contexts. In many cases, there are not multiple CER studies featuring direct comparisons of two or more treatments. Indirect comparisons represent inferences about treatment A versus treatment B made by synthesizing results of studies that do not directly compare the two, such as by comparing results in studies of A versus no treatment and B versus no treatment. A number of caveats apply when making indirect comparisons, but quantitative methods known as mixed treatment comparisons have been developed to address the challenges of conducting statistically valid syntheses based on indirect comparisons. One of the better known methods, network meta-analysis, is increasingly applied in protocols and reports from CER reviews.74–76 Network meta-analysis leverages both the direct and indirect evidence available for comparisons of two or more interventions; this methodology maintains as much evidence as possible when conducting a systematic review across trials with a common comparator.77 There are multiple caveats that must be kept in mind, including the observational nature of the indirect comparisons.78 Nonetheless, this method holds promise, particularly when ideal evidence is not available to address urgent decision-making needs.
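The logic of an adjusted indirect comparison can be made concrete with a small sketch. The snippet below (hypothetical numbers; the helper name is ours) implements a Bucher-style contrast of A versus B through a common comparator C: the two effects are differenced on an additive scale, and because the two estimates come from independent sets of trials, their variances add, which is one reason indirect estimates are less precise than direct ones.

```python
import math

def bucher_indirect(effect_ac, var_ac, effect_bc, var_bc):
    """Adjusted indirect comparison of A vs. B through common comparator C.

    Inputs are effects on an additive scale (e.g., log odds ratios) for
    A vs. C and B vs. C, with their variances. Returns the indirect
    A-vs.-B estimate and its variance (variances of independent
    estimates add).
    """
    effect_ab = effect_ac - effect_bc
    var_ab = var_ac + var_bc
    return effect_ab, var_ab

# Hypothetical pooled results: A vs. placebo and B vs. placebo (log odds ratios)
est, var = bucher_indirect(-0.40, 0.05, -0.10, 0.04)
se = math.sqrt(var)
```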
Most meta-analyses and systematic reviews are based on aggregate data, i.e., the summary results from multiple studies. These data are readily accessed from the published medical literature, with caveats to protect against publication bias favoring overrepresentation of positive findings in published results.79 Despite their relative accessibility, aggregate data are not as robust as individual participant data for synthesizing research results. Individual participant data (IPD) meta-analysis has been called the gold standard since it allows for characterization of results according to individual participant characteristics (such as age, sex, disease risk, or comorbidity), which cannot be adequately investigated using aggregated study results. IPD meta-analysis is a particularly powerful technique for addressing heterogeneous treatment effects and for targeting treatments to those most likely to receive benefit (or least likely to be harmed). IPD meta-analysis has not been applied as often as might be expected for a “gold standard” research technique, largely due to the considerable challenges in gaining access to original trial data. However, application of this method is growing and should be expected to accelerate further as the ethos of data sharing and open science gains momentum.79–81
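To illustrate why IPD is more powerful than aggregate data, consider a two-stage sketch (the data layout and helper names here are illustrative, not a standard API): patient-level records allow the treatment effect to be estimated within a prespecified subgroup, such as older adults, which aggregate results alone typically cannot support; the per-study subgroup estimates are then pooled by inverse variance.

```python
def risk_difference(records):
    """Risk difference (treated minus control) and its large-sample variance,
    computed from patient-level records."""
    treated = [r for r in records if r["treated"]]
    control = [r for r in records if not r["treated"]]
    p1 = sum(r["event"] for r in treated) / len(treated)
    p0 = sum(r["event"] for r in control) / len(control)
    var = p1 * (1 - p1) / len(treated) + p0 * (1 - p0) / len(control)
    return p1 - p0, var

def pool(estimates):
    """Fixed-effect inverse-variance pooling of (estimate, variance) pairs."""
    weights = [1.0 / v for _, v in estimates]
    est = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    return est, 1.0 / sum(weights)

def pooled_subgroup_effect(studies, min_age=65):
    """Stage 1: per-study effect within the age >= min_age subgroup;
    Stage 2: pool across studies. `studies` is a list of per-study lists
    of records like {"treated": True, "event": 0, "age": 71}."""
    stage1 = [risk_difference([r for r in study if r["age"] >= min_age])
              for study in studies]
    return pool(stage1)
```

The same records could be refiltered by sex, comorbidity, or baseline risk, which is exactly the flexibility aggregate results forfeit.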
Synthesis of existing evidence is always limited by the scope and rigor of the existing evidence, and limitations are most severe for newly emerging, often expensive treatments. There may be little or no evidence on comparisons with existing, less expensive alternatives, due in part to a lack of requirements or incentives for this type of research prior to approval. In such cases, traditional methods of evidence synthesis will not be able to quickly support decision-makers' needs, although attempts at synthesis may be useful for identifying the relevant gaps in the CER literature.82
There is growing advocacy and appreciation for a broader range of approaches that can be considered part of the family of evidence synthesis methods. One such approach is decision modeling. In the absence of empiric evidence on important comparative effectiveness questions, decision-makers may determine that mathematical models incorporating estimates from available empiric evidence and reasonable assumptions for missing parameters provide a worthwhile substitute. Such models can estimate the likely comparative effectiveness results needed by decision-makers. In some instances, these models extend to incorporate cost-effectiveness analyses, to more conveniently assess the relative value of various alternatives. Cost-effectiveness models typically incorporate outcome metrics that facilitate estimation of relative value across diseases or conditions, including life-years saved, quality- or disability-adjusted life-years saved, and others. While the advantages and disadvantages of such approaches are beyond the scope of this chapter, it is important to acknowledge that these are active and important areas that those interested in policy-relevant CER may be called upon to understand and address.
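As a simple illustration of the kind of summary metric such models produce, the following sketch (with entirely hypothetical figures) computes an incremental cost-effectiveness ratio, the extra cost per quality-adjusted life-year (QALY) gained when moving from one strategy to another.

```python
def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost-effectiveness ratio: additional cost per QALY gained."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

# Hypothetical decision-model outputs for a new treatment vs. standard care:
ratio = icer(cost_new=52_000.0, qaly_new=6.5, cost_old=40_000.0, qaly_old=6.1)
# an extra $12,000 buys an extra 0.4 QALYs, i.e., $30,000 per QALY gained
```

Decision-makers typically compare such a ratio against a willingness-to-pay threshold when judging whether the added benefit justifies the added cost.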