Developing and testing a tool for the classification of study designs in systematic reviews of interventions and exposures

Bibliographic Details
Main Author: Hartling, Lisa
Corporate Authors: United States Agency for Healthcare Research and Quality, University of Alberta Evidence-based Practice Center
Format: eBook
Language: English
Published: Rockville, MD Agency for Healthcare Research and Quality 2010, [2010]
Series: Methods research report
Collection: National Center for Biotechnology Information - Collection details see MPG.ReNa
Description
Summary: BACKGROUND: Classification of study design can help provide a common language for researchers. Within a systematic review, definition of specific study designs can help guide inclusion, assess the risk of bias, pool studies, interpret results, and grade the body of evidence. However, recent research demonstrated poor reliability for an existing classification scheme. OBJECTIVES: To review tools used to classify study designs; to select a tool for evaluation; to develop instructions for application of the tool to intervention/exposure studies; and to test the tool for accuracy and interrater reliability.
METHODS: We contacted representatives from all AHRQ Evidence-based Practice Centers (EPCs), other relevant organizations, and experts in the field to identify tools used to classify study designs. Twenty-three tools were identified; 10 were relevant to our objectives. The Steering Committee ranked the 10 tools using predefined criteria. The highest-ranked tool was a design algorithm for studies of health care interventions developed, but no longer advocated, by the Cochrane Non-Randomised Studies Methods Group. This tool was used as the basis for our classification tool and was revised to encompass more study designs and to incorporate elements of other tools. A sample of 30 studies was used to test the tool. Three members of the Steering Committee developed a reference standard (i.e., the "true" classification for each study); 6 testers applied the revised tool to the studies. Interrater reliability was measured using Fleiss' kappa (κ), and accuracy of the testers' classification was assessed against the reference standard. Based on feedback from the testers and the reference standard committee, the tool was further revised and tested by another 6 testers using 15 studies randomly selected from the original sample.
RESULTS: In the first round of testing, interrater reliability was fair among the testers (κ = 0.26) and the reference standard committee (κ = 0.33). Disagreements occurred at all decision points in the algorithm; revisions were made based on the feedback. The second round of testing showed improved interrater reliability (κ = 0.45, moderate agreement) with improved, but still low, accuracy. The most common disagreements were whether the study was "experimental" (5/15 studies) and whether there was a comparison (4/15 studies). In both rounds of testing, the level of agreement for testers who had completed graduate-level training was higher than for testers who had not completed training.
CONCLUSION: Potential reasons for the observed low reliability and accuracy include the lack of clarity and comprehensiveness of the tool, inadequate reporting of the studies, and variability in user characteristics. Application of a tool to classify study designs in the context of a systematic review should be accompanied by adequate training, pilot testing, and documented decision rules.
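For readers unfamiliar with the statistic reported in the summary, the following is a minimal sketch of how Fleiss' kappa can be computed when several raters each assign one study design label to every study. It is not the report's own code: the function names `fleiss_kappa` and `agreement_label` and the example labels are illustrative assumptions. The verbal bands ("fair", "moderate") are assumed to follow the commonly used Landis and Koch scale, which the summary does not name explicitly.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of subjects, each a list of labels (one per rater).

    Assumes every subject was rated by the same number of raters (at least 2).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = {label for row in ratings for label in row}

    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    per_subject = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement: sum of squared overall category proportions.
    total = n_subjects * n_raters
    p_expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    if p_expected == 1.0:
        # Every rating fell in a single category; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


def agreement_label(kappa):
    """Verbal band for kappa, assuming the Landis and Koch (1977) cut-offs."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical example: two studies, each classified by three testers.
example = [["RCT", "RCT", "cohort"], ["cohort", "cohort", "cohort"]]
print(round(fleiss_kappa(example), 2), agreement_label(fleiss_kappa(example)))
```

Under the assumed scale, the values reported in the summary map to the same bands it uses: 0.26 and 0.33 fall in "fair" and 0.45 in "moderate".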
Item Description: Title from PDF title page. - "Contract no. 290-02-0023." - "December 2010." - Mode of access: Internet