Title: Utility of the 5-Minute Apgar Score as a Research Endpoint.
Publication Type: Journal Article
Year of Publication: 2019
Authors: Bovbjerg, ML, Dissanayake, MV, Cheyney, M, Brown, J, Snowden, JM
Journal: Am J Epidemiol
Date Published: 09/2019
Keywords: Apgar Score; Area Under Curve; Biomedical Research; Datasets as Topic; Epidemiologic Methods; Humans; Infant, Newborn; Infant, Newborn, Diseases; Predictive Value of Tests; Risk Factors; ROC Curve; Sensitivity and Specificity

Although Apgar scores are commonly used as proxy outcomes, little evidence exists in support of the most common cutpoints (<7, <4). We used 2 data sets to explore this issue: one contained planned community births from across the United States (n = 52,877; 2012-2016), and the other contained hospital births from California (n = 428,877; 2010). We treated 5-minute Apgars as clinical "tests," compared against 18 known outcomes; we calculated sensitivity, specificity, positive and negative predictive values, and the area under the receiver operating characteristic curve for each. We used 3 different criteria to determine optimal cutpoints. Results were very consistent across data sets, outcomes, and all subgroups: The cutpoint that maximizes the trade-off between sensitivity and specificity is universally <9. However, extremely low positive predictive values for all outcomes at <9 indicate more misclassification than is acceptable for research. The areas under the receiver operating characteristic curves (which treat Apgars as quasicontinuous) were generally indicative of adequate discrimination between infants destined to experience poor outcomes and those not; comparing median Apgars between groups might be an analytical alternative to dichotomizing. Nonetheless, because Apgar scores are not clearly on any causal pathway of interest, we discourage researchers from using them unless the motivation for doing so is clear.
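
To make the "Apgar as a clinical test" framing concrete, here is a minimal sketch of how the reported quantities can be computed for one outcome at one cutpoint. It is not the authors' code. The toy scores and outcomes are invented for illustration, all function names are hypothetical, and Youden's J is assumed as the cutpoint criterion because the abstract does not name its three criteria.

```python
from typing import Dict, List

# Invented toy data (NOT from the study): 5-minute Apgar scores and a
# binary adverse outcome (1 = outcome occurred) for ten hypothetical infants.
TOY_SCORES: List[int] = [10, 9, 9, 9, 8, 8, 7, 6, 4, 3]
TOY_OUTCOMES: List[int] = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]


def diagnostic_metrics(scores: List[int], outcomes: List[int],
                       cutpoint: int) -> Dict[str, float]:
    """Treat `score < cutpoint` as a positive test and tabulate the 2x2 table."""
    tp = fp = fn = tn = 0
    for score, outcome in zip(scores, outcomes):
        test_positive = score < cutpoint
        if test_positive and outcome:
            tp += 1
        elif test_positive:
            fp += 1
        elif outcome:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        # An empty denominator leaves the metric undefined.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(scores: List[int], outcomes: List[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney formulation.

    Probability that a randomly chosen infant with the outcome has a LOWER
    score than a randomly chosen infant without it; ties count one half.
    """
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    wins = 0.0
    for case in cases:
        for control in controls:
            if case < control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def best_cutpoint_youden(scores: List[int], outcomes: List[int]) -> int:
    """Cutpoint c in 1..10 maximizing Youden's J = sensitivity + specificity - 1.

    Youden's J is an assumed criterion, chosen only for illustration.
    """
    def youden_j(cutpoint: int) -> float:
        m = diagnostic_metrics(scores, outcomes, cutpoint)
        return m["sensitivity"] + m["specificity"] - 1.0

    return max(range(1, 11), key=youden_j)


if __name__ == "__main__":
    print(diagnostic_metrics(TOY_SCORES, TOY_OUTCOMES, cutpoint=7))
    print(auc(TOY_SCORES, TOY_OUTCOMES))
    print(best_cutpoint_youden(TOY_SCORES, TOY_OUTCOMES))
```

The toy data say nothing about the study's findings. In real cohorts with rare outcomes, the positive predictive value can be very low even at a cutpoint that balances sensitivity and specificity well, which is the misclassification concern the abstract raises.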

Alternate Journal: Am J Epidemiol
PubMed ID: 31145428
PubMed Central ID: PMC6736341