The purpose of this study was to develop a human–machine system that could be relatively easily tailored to routinely and accurately classify injury narratives from large administrative databases such as workers compensation. We used a semi-automated approach based on two Naïve Bayesian algorithms to classify 15,000 workers compensation narratives into two-digit Bureau of Labor Statistics (BLS) event (leading to injury) codes. Narratives were filtered out for manual review if the algorithms disagreed or made weak predictions.
This approach resulted in an overall accuracy of 87%, with consistently high positive predictive values across all two-digit BLS event categories including the very small categories (e.g., exposure to noise, needle sticks). The Naïve Bayes algorithms were able to identify and accurately machine code most narratives leaving only 32% (4853) for manual review. This strategy substantially reduces the need for resources compared with manual review alone.
© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号 地址:北京市海淀区学院路29号 邮编:100083 电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700 |