Reshaping Recommendations
Making clinical practice guidelines work for pathologists
At a Glance
- Many clinical practice guidelines (CPGs) are written with minimal input from pathologists
- Adhering to these guidelines can be difficult because the level of evidence supporting laboratory tests often differs from evidence in other areas of medical practice
- CPGs must not only be carefully written, but also rigorously evaluated to ensure that they’re meaningful
- To make the best use of CPGs, pathologists should involve themselves in guideline creation and assessment
Most clinical and laboratory medical professionals are aware that there’s considerable interdependence between their areas of expertise – for instance, in disease diagnosis, screening, prognosis and treatment. Clinicians use clinical practice guidelines (CPGs) to provide the best possible care to their patients, but it isn’t only clinical practitioners who encounter these guidelines in their work; pathologists, too, must consider the recommendations made. Although some CPGs are authored by specialty pathology societies, many others contain recommendations that directly affect pathologists – and many of those that include recommendations for lab-based testing are written by clinicians with minimal or no input from pathologists.
But CPGs are not the final word in medicine. The evidence that supports laboratory tests often differs from the evidence supporting other areas of medical practice – and this is one of the main challenges CPGs face in making recommendations that relate to pathology (1). To make these guidelines work for everyone, it’s important for pathologists not only to have insight into how CPGs are developed and how they should be evaluated by the people who are actually using them, but also to get involved in putting them together.
Getting started with standards
Because of the huge volume of CPGs out there and the wide variety of ways to report and assess evidence, many pathologists trying to unravel the details have found themselves frustrated. One way of evaluating CPGs is to measure them against the standards (see Supplementary Table 1) published by the Institute of Medicine (IOM) (2). Though many national organizations and specialist societies have developed their own handbooks for writing CPGs (3, 4), some of those handbooks have been criticized for not meeting IOM standards (5, 6). As the IOM’s suggestions become more widely adopted and are incorporated into more agencies’ procedures, it is hoped that CPGs will become more consistent and easier to apply.
The first step to good guideline development is posing a relevant clinical question. It’s a delicate balance – the question needs to be specific enough to allow a focused review of the evidence, but also broad enough to allow application in a variety of clinical settings. A good example is the United Kingdom’s National Institute for Health and Care Excellence (NICE) diagnostic guideline for genetic testing in adults with locally advanced or metastatic non-small cell lung cancer (NSCLC). It sets out to “identify which test and test strategies for EGFR-TK mutation testing in adults with previously untreated, locally advanced or metastatic NSCLC are clinically and cost effective for informing first-line treatment decisions…” (7). It’s a useful guideline because it evaluates all of the possible testing strategies and only then makes recommendations for use in the defined populations.
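One common way to strike that balance is the PICO framework (population, intervention, comparison, outcome). As a purely illustrative sketch – the field names and the reading of the comparison are ours, not NICE’s – the question above decomposes roughly like this:

```python
from dataclasses import dataclass

@dataclass
class ClinicalQuestion:
    """A PICO-style decomposition of a guideline question (illustrative only)."""
    population: str    # who the recommendation applies to
    intervention: str  # the test or test strategy under review
    comparison: str    # the alternatives being weighed against it
    outcome: str       # what the test result is meant to inform

# The NICE EGFR-TK question (7), restated in PICO terms:
nice_egfr = ClinicalQuestion(
    population="Adults with previously untreated, locally advanced or metastatic NSCLC",
    intervention="EGFR-TK mutation testing",
    comparison="Alternative tests and testing strategies (clinical and cost effectiveness)",
    outcome="Informing first-line treatment decisions",
)
```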
Evaluating evidence
Clinical studies are classified as case reports, case series, retrospective or prospective observational cohorts, case-control studies, cross-sectional studies, and non-randomized or randomized clinical trials. Any of these can be used to support medical recommendations, but there is a perceived hierarchy in levels of evidence, shown in Figure 1, which depicts the hierarchy developed by the Oxford Centre for Evidence-Based Medicine (CEBM) alongside the 6S model.
The lowest level of evidence is mechanistic reasoning, which rarely applies to laboratory medicine. For diagnostic tests, there are typically very few randomized controlled trials, so clinical researchers most often rely on cross-sectional studies to provide evidence. The 6S model for applying evidence to clinical practice extends the levels of evidence beyond systematic review to include even higher levels: synopses of syntheses, such as the Database of Abstracts of Reviews of Effects (DARE); summaries, such as evidence-based practice guidelines; and systems, such as computerized decision support systems (8). The 6S model requires CPGs to be evidence-based, rather than opinion-based, and considers well-conducted systematic reviews essential to guideline development. The IOM standards, too, require systematic review – which can be challenging, though not impossible, to accomplish, especially in the context of pathology.
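For readers who find the stacked terminology hard to hold in mind, the 6S ordering can be written out as a simple ranked list – a minimal sketch, with level names following DiCenso et al. (8):

```python
# The six levels of the 6S model, ranked lowest (0) to highest (5),
# following DiCenso et al. (8).
SIX_S_LEVELS = [
    "studies",                # single original studies
    "synopses of studies",    # structured abstracts of single studies
    "syntheses",              # systematic reviews
    "synopses of syntheses",  # e.g. DARE entries
    "summaries",              # evidence-based practice guidelines (CPGs)
    "systems",                # computerized decision support
]

def rank(level: str) -> int:
    """Higher rank = more pre-appraised, more directly applicable evidence."""
    return SIX_S_LEVELS.index(level)

assert rank("summaries") > rank("syntheses")  # CPGs sit above systematic reviews
```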
Many of the organizations commissioning CPGs have developed their own procedures for interpreting the evidence used to make recommendations. These procedures are usually based on similar principles, but they often aren’t consistent between organizations. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) project strives to standardize the way CPG recommendations are evaluated. Initiated in 2007, GRADE is seeing increasing adoption, but it still isn’t the only system in use, particularly as there remains some debate about its appropriateness for diagnostic tests (1, 9). For instance, one common recommendation is the use of the B-type natriuretic peptide (BNP) test to exclude heart failure in a primary care setting; it’s graded differently in different CPGs (see Table 1).
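The arithmetic behind a rule-out test helps explain why such grading matters: exclusion rests on sensitivity, because a highly sensitive test yields few false negatives. A back-of-envelope sketch – the numbers below are invented for illustration, not drawn from the BNP literature:

```python
def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """NPV = true negatives / all negative results, via Bayes' rule."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Invented numbers for illustration only:
print(negative_predictive_value(sensitivity=0.95, specificity=0.60, prevalence=0.30))
# ~0.97 -- a negative result from a highly sensitive test makes disease unlikely,
# even when specificity is modest.
```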
The GRADE system aims to evaluate risk of bias, inconsistency, indirectness, imprecision, and publication bias. The best means of achieving these goals is a systematic review that addresses the clinical question posed by the guideline; if good review technique is followed, the review team should be able to evaluate most of the GRADE components and develop an overall impression of the strength of the evidence – which can be reported as high, moderate, low, or very low (see Table 2). For example, a CPG relating to high blood triglyceride levels reads, “The Task Force recommends basing the diagnosis of hypertriglyceridemia on fasting triglyceride levels and not on non-fasting triglyceride levels (1|+++o)” (10). Tools like data collection tables can be used to estimate precision and consistency, whereas others like the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) can be used to evaluate risk of bias and directness (11).
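In the notation quoted above, the Endocrine Society’s “1” marks a strong recommendation and “+++o” moderate-quality evidence. At its core, the quality rating starts evidence at a grade (high for randomized trials) and steps it down once for each serious concern in the five domains listed. The sketch below is our own simplification for illustration – real GRADE also allows upgrading observational evidence, which we omit:

```python
# Simplified GRADE-style quality rating: start high (as for randomized trials)
# and downgrade one step per serious concern. (Our own simplification -- real
# GRADE also permits upgrading observational evidence, e.g. for large effects.)
LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = ("risk_of_bias", "inconsistency", "indirectness",
           "imprecision", "publication_bias")

def grade_quality(start: str, concerns: dict) -> str:
    """concerns maps a domain name to 0 (none), 1 (serious) or 2 (very serious)."""
    level = LEVELS.index(start) - sum(concerns.get(d, 0) for d in DOMAINS)
    return LEVELS[max(level, 0)]

# One serious concern (say, indirectness) drops randomized evidence to
# "moderate" -- written "+++o" in the Endocrine Society notation of (10).
print(grade_quality("high", {"indirectness": 1}))  # -> "moderate"
```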
With regard to the BNP test, we’ve recently applied the Agency for Healthcare Research and Quality (AHRQ) grading process to the questions set for systematic reviews of BNP in heart failure (12, 13). We used the AHRQ process for both diagnostic and prognostic questions, and were able to satisfy most of the principles set out by the GRADE project. We found that, for diagnostic tests, the types of evidence available usually raise concerns about risk of bias and directness – concerns that become even more challenging when factoring in screening, prognosis, and treatment monitoring.
Considering CPG quality
The Appraisal of Guidelines for Research and Evaluation II (AGREE II) instrument assesses the quality of CPGs, helping both with developing new guidelines and with reporting their recommendations (14). Already validated for laboratory tests (5), AGREE II covers the considerations involved in preparing or presenting CPGs – especially the domain on rigor of development, which has historically been the most challenging to address. The AGREE II domains (available online at www.agreetrust.org along with the tool itself) include:
- Domain 1: Scope and Purpose
- Domain 2: Stakeholder Involvement
- Domain 3: Rigor of Development
- Domain 4: Clarity of Presentation
- Domain 5: Applicability
- Domain 6: Editorial Independence
The AGREE II tool works well for CPGs in general, but to address the specific considerations that medical tests raise, a European working group has suggested a comprehensive checklist of items (15). This includes important factors in the post-analytical phase, such as reference intervals, cutoff values and turnaround time. Additional web-based resources for creating and evaluating guidelines are available (see Table 3).
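AGREE II itself is scored numerically: appraisers rate the instrument’s 23 items on a seven-point scale, and each domain is reported as a percentage scaled between the minimum and maximum possible scores. A minimal sketch of that calculation, using hypothetical ratings:

```python
def scaled_domain_score(ratings):
    """AGREE II scaled domain score, as a percentage.

    ratings[appraiser][item] holds scores on the 1-7 AGREE II scale.
    Scaled score = (obtained - min possible) / (max possible - min possible).
    """
    n_appraisers, n_items = len(ratings), len(ratings[0])
    obtained = sum(sum(r) for r in ratings)
    minimum = 1 * n_items * n_appraisers
    maximum = 7 * n_items * n_appraisers
    return 100 * (obtained - minimum) / (maximum - minimum)

# Two hypothetical appraisers rating the three items of Domain 1 (Scope and Purpose):
print(scaled_domain_score([[5, 6, 6], [4, 5, 6]]))  # ~72%
```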
Only after a full evaluation can we begin implementing CPGs in our practices. Complete evaluations should include an examination of the clinical question posed by the guideline, the population encompassed by the question, the intervention being applied, and the expected outcome. If all of these factors match a specific practice, then the guideline can be considered and adapted for that practice’s use – but even with all of these checks, there’s still no way of knowing how stringent the CPG’s development was. That’s where the appraisal tools come in – to provide an idea of the rigor behind the development of a given guideline. To be sure that a guideline is appropriate for the laboratory as well as the clinic, we need to ensure that it’s been evaluated, that the evaluation was rigorous, and that it included enough of the specifics of laboratory medicine to be meaningful.
Hopefully, a better understanding of CPG development and appraisal will encourage more pathologists to contribute to the process. Lab medicine is an important component of CPGs, and the methods and evidence used in the laboratory don’t always look like those found in other specialties. If we want these guidelines to be just as useful for pathologists as for clinical practitioners, we need to get involved in creating and assessing them, so that they truly serve the laboratory professionals who rely on them.
Janet Simons is a third year resident in medical biochemistry at McMaster University in Hamilton, Ontario, Canada.
Andrew Don-Wauchope is a medical biochemist with the Hamilton Regional Laboratory Medicine Program and associate professor in pathology and molecular medicine at McMaster University, Ontario, Canada.
- AR Horvath, et al., “From biomarkers to medical tests: The changing landscape of test evaluation”, Clin Chim Acta, 427, 49–57 (2014). PMID: 24076255.
- R Graham, et al., Clinical Practice Guidelines We Can Trust, 1–2. The National Academies Press: 2011.
- D Davis, et al., Canadian Medical Association handbook on clinical practice guidelines. Canadian Medical Association: 1997.
- Scottish Intercollegiate Guidelines Network, SIGN 50: a guideline developer’s handbook. Scottish Intercollegiate Guidelines Network: 2014.
- AC Don-Wauchope, et al., “Applicability of the AGREE II instrument in evaluating the development process and quality of current National Academy of Clinical Biochemistry guidelines”, Clin Chem, 58, 1426–1437 (2012). PMID: 22879395.
- S Sabharwal, et al., “Guidelines in cardiac clinical practice: evaluation of their methodological quality using the AGREE II instrument”, J R Soc Med, 106, 315–322 (2013). PMID: 23759888.
- National Institute for Health and Care Excellence, “EGFR-TK mutation testing in adults with locally advanced or metastatic non-small-cell lung cancer”, (2013). Available at: bit.ly/14KRXhr. Accessed January 12, 2015.
- A DiCenso, et al., “Accessing preappraised evidence: fine-tuning the 5S model into a 6S model”, Ann Intern Med, 151, JC3-2–JC3-3 (2009). PMID: 19755349.
- JL Brozek, et al., “Grading quality of evidence and strength of recommendations in clinical practice guidelines. Part 1 of 3. An overview of the GRADE approach and grading quality of evidence about interventions”, Allergy, 64, 669–677 (2009). PMID: 19210357.
- L Berglund, et al., “Evaluation and treatment of hypertriglyceridemia: an Endocrine Society clinical practice guideline”, J Clin Endocrinol Metab, 97, 2969–2989 (2012). PMID: 22962670.
- PF Whiting, et al., “QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies”, Ann Intern Med, 155, 529–536 (2011). PMID: 22007046.
- SA Hill, et al., “Use of BNP and NT-proBNP for the diagnosis of heart failure in the emergency department: a systematic review of the evidence”, Heart Fail Rev, 19, 421–438 (2014). PMID: 24957908.
- RA Booth, et al., “Performance of BNP and NT-proBNP for diagnosis of heart failure in primary care patients: a systematic review”, Heart Fail Rev, 19, 439–451 (2014). PMID: 24969534.
- MC Brouwers, et al., “AGREE II: advancing guideline development, reporting and evaluation in health care”, CMAJ, 182, E839–E842 (2010). PMID: 20603348.
- KM Aakre, et al., “Critical review of laboratory investigations in clinical practice guidelines: proposals for the description of investigation”, Clin Chem Lab Med, 51, 1–10 (2013). PMID: 23037517.