Development of a Machine Learning Model to Estimate US Firearm Homicides in Near Real Time

GVPedia Study Database

Category: Homicide | Journal: JAMA Network Open | Authors: A Alic, E Swedo, R Law | Year: 2023


Importance

Firearm homicides are a major public health concern; lack of timely mortality data presents considerable challenges to effective response. Near real-time data sources offer potential for more timely estimation of firearm homicides.


Objective

To estimate the near real-time burden of weekly and annual firearm homicides in the US.

Design, Setting, and Participants

In this prognostic study, anonymous, longitudinal time series data were obtained from multiple data sources, including Google and YouTube search trends related to firearms (2014-2019), emergency department visits for firearm injuries (National Syndromic Surveillance Program, 2014-2019), emergency medical service activations for firearm-related injuries (biospatial, 2014-2019), and National Domestic Violence Hotline contacts flagged with the keyword firearm (2016-2019). Data analysis was performed from September 2021 to September 2022.

Main Outcomes and Measures

Weekly estimates of US firearm homicides were calculated using a 2-phase pipeline, first fitting optimal machine learning models for each data stream and then combining the best individual models into a stacked ensemble model. Model accuracy was assessed by comparing predictions of firearm homicides in 2019 to actual firearm homicides identified by National Vital Statistics System death certificates. Results were also compared with a SARIMA (seasonal autoregressive integrated moving average) model, a common method to forecast injury mortality.
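The two-phase idea described above (fit one model per data stream, then combine their outputs in a stacked ensemble) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the weekly series are synthetic, the two "streams" stand in for already-fitted per-source model predictions, and the meta-learner is a plain least-squares combination, which the abstract does not specify. In practice a stacker is usually fit on out-of-sample base predictions rather than in-sample ones.

```python
import numpy as np

def fit_stacker(base_preds, target):
    """Phase 2: learn an intercept plus one weight per base model by least squares."""
    design = np.column_stack([np.ones(len(target)), base_preds])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return weights

def predict_stacker(weights, base_preds):
    """Combine base-model predictions using the learned weights."""
    design = np.column_stack([np.ones(len(base_preds)), base_preds])
    return design @ weights

def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    diff = np.asarray(actual) - np.asarray(predicted)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic weekly counts (made-up numbers, not study data).
rng = np.random.default_rng(0)
weeks = np.arange(104)
truth = 280 + 25 * np.sin(2 * np.pi * weeks / 52)

# Phase 1 stand-ins: predictions from two hypothetical per-stream models.
stream_a = truth + rng.normal(0, 20, size=weeks.size)
stream_b = truth + rng.normal(0, 30, size=weeks.size)
base = np.column_stack([stream_a, stream_b])

# Fit the stacker on earlier weeks, then estimate the held-out final weeks.
train, holdout = slice(0, 78), slice(78, 104)
weights = fit_stacker(base[train], truth[train])
ensemble_train = predict_stacker(weights, base[train])
ensemble_holdout = predict_stacker(weights, base[holdout])

print("ensemble holdout RMSE:", rmse(truth[holdout], ensemble_holdout))
print("stream A holdout RMSE:", rmse(truth[holdout], stream_a[holdout]))
```

On the training weeks, a least-squares stacker with an intercept can never do worse than any single base model, because each base model is itself one of the combinations it can choose. Held-out performance carries no such guarantee, which is why the study evaluates on a separate year.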


Results

Both individual and ensemble models yielded highly accurate estimates of firearm homicides. Individual models’ mean error for weekly estimates of firearm homicides (root mean square error) varied from 24.95 for emergency department visits to 31.29 for SARIMA forecasting. Ensemble models combining data sources had lower weekly mean error and higher annual accuracy than individual data sources: the all-source ensemble model had a weekly root mean square error of 24.46 deaths and full-year accuracy of 99.74%, predicting the total number of firearm homicides in 2019 within 38 deaths for the entire year (compared with 95.48% accuracy and 652 deaths for the SARIMA model). The model decreased the time lag of reporting weekly firearm homicides from 7 to 8 months to approximately 6 weeks.
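The two accuracy measures reported above can be sketched as follows. The weekly root mean square error is the standard definition. The full-year accuracy formula is an assumption, taken here as one minus the absolute error in the annual total divided by the actual annual total, since the abstract does not state how it was computed. The weekly counts are made up for illustration.

```python
import math

def weekly_rmse(actual, predicted):
    """Root mean square error across weekly counts."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def annual_accuracy(actual, predicted):
    """Assumed definition: 1 - |predicted total - actual total| / actual total."""
    total_actual = sum(actual)
    total_predicted = sum(predicted)
    return 1 - abs(total_predicted - total_actual) / total_actual

# Made-up weekly counts, not study data.
actual_weekly = [280, 300, 310, 290]
predicted_weekly = [270, 310, 300, 295]

print(f"weekly RMSE: {weekly_rmse(actual_weekly, predicted_weekly):.2f}")
print(f"annual accuracy: {annual_accuracy(actual_weekly, predicted_weekly):.2%}")
```

The two measures answer different questions: weekly errors of opposite sign cancel in the annual total, so a model can have a sizeable weekly RMSE and still land very close to the yearly count.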

Conclusions and Relevance

In this prognostic study applying machine learning to diverse secondary data sources, ensemble modeling produced accurate near real-time estimates of weekly and annual firearm homicides and substantially decreased data source time lags. Ensemble model forecasts can accelerate public health practitioners’ and policy makers’ ability to respond to unanticipated shifts in firearm homicides.
