Our client offers a digital platform that helps its customers manage the risk associated with injury claims, and it holds a sizable dataset of past claims. Risk managers at those customer companies are largely working in the dark when facing a dashboard of sometimes hundreds of claims, consisting primarily of unstructured text describing accidents and injuries. They needed an application that shows which claims are likely to be the most serious in terms of cost and time to resolve, and that gives more insight into the underlying causes and patterns behind the injuries.
We used natural language processing (NLP) to transform large, unstructured datasets of claim injury descriptions into usable, structured data describing the injuries. Combining these NLP features with historical claim information and past outcomes, we developed a high-accuracy model to predict claim cost and duration for new injury events. We incorporated this model into a production application so that resources can be prioritized for new claims. The NLP-derived features also enabled descriptive analytics to better understand injury patterns and trends, supporting a stronger focus on injury prevention and risk mitigation.
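To make the general idea concrete, here is a minimal sketch of the text-to-features-to-prediction flow. It is not the model or pipeline used on the project: the keyword vocabularies, the function names (`extract_features`, `fit_baseline`, `predict_cost`, `prioritize`), the group-average baseline, and the toy claims with their costs are all assumptions invented for illustration. A production system would rely on a far richer NLP pipeline and a trained predictive model.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical vocabularies, invented for this sketch.
BODY_PARTS = {"back", "knee", "shoulder", "hand", "ankle", "wrist"}
INJURY_TYPES = {"strain", "sprain", "fracture", "laceration", "burn"}


def extract_features(description):
    """Turn a free-text description into structured fields by keyword matching."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    body = sorted(tokens & BODY_PARTS)
    injury = sorted(tokens & INJURY_TYPES)
    return {
        "body_part": body[0] if body else "unknown",
        "injury_type": injury[0] if injury else "unknown",
    }


def fit_baseline(history):
    """Average past cost per (body_part, injury_type) group.

    `history` is a list of (description, cost) pairs. Returns the per-group
    averages and the overall average, which serves as a fallback.
    """
    groups = defaultdict(list)
    for description, cost in history:
        f = extract_features(description)
        groups[(f["body_part"], f["injury_type"])].append(cost)
    overall = mean(cost for _, cost in history)
    return {key: mean(costs) for key, costs in groups.items()}, overall


def predict_cost(model, description):
    """Predict cost for a new claim from its group average, else the overall average."""
    table, overall = model
    f = extract_features(description)
    return table.get((f["body_part"], f["injury_type"]), overall)


def prioritize(model, new_claims):
    """Order new claims from highest to lowest predicted cost."""
    return sorted(new_claims, key=lambda d: predict_cost(model, d), reverse=True)


# Toy, made-up history purely for illustration (not real claim data).
history = [
    ("Employee slipped on wet floor, fracture to left ankle", 42000.0),
    ("Fell from ladder, ankle fracture", 38000.0),
    ("Cut hand on box cutter, minor laceration", 1500.0),
    ("Laceration to right hand while opening crate", 2500.0),
    ("Lower back strain lifting pallets", 12000.0),
]
model = fit_baseline(history)

new_claims = [
    "Small laceration on hand from broken glass",
    "Tripped over cable, suspected ankle fracture",
    "Back strain after moving furniture",
]
ranked = prioritize(model, new_claims)
```

With this toy data, `ranked` places the suspected ankle fracture first and the hand laceration last. The same structured fields could also feed simple descriptive counts, such as the number of claims per body part.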