OrthoFi Machine Learning Project
This is a good example of a large-scale data science project. OrthoFi Inc. is a software services company based in Denver, Colorado. Its flagship product is an advanced third-party billing and practice management system for orthodontic offices. OrthoFi sought to improve the efficiency of its insurance eligibility check process, and OutsideInnovation was hired to analyze the company’s historical data and propose data-driven solutions to the problem.
The orthodontic insurance space is very complex because it lacks regulation and standardized systems. The insurance companies that report insurance data to OrthoFi are under no obligation to report complete or even correct data. This presents a great challenge for OrthoFi, since data accuracy is a key part of its business. OrthoFi relied on a large number of staff to perform insurance eligibility checks manually and was looking for a way to make that process more efficient. OutsideInnovation was hired to analyze hundreds of fields’ worth of historical data and propose data-driven solutions to the problem.
A small three-person team was selected to work hand in hand with OrthoFi for four months. An initial round of basic statistical analysis was performed on the historical data provided by OrthoFi. Based on the characteristics of the data, it was determined that a machine learning solution would likely be effective. The OutsideInnovation team spent the next several months cleaning and organizing the data in preparation for analysis by several types of machine learning algorithms. A random tree classifier was ultimately used, with some customizations specific to the OrthoFi data set. This classifier was coupled with several other traditional algorithms to significantly increase the efficiency of OrthoFi’s insurance eligibility checks. The entire project was completed in Python, and the delivered package was prepared to interface with the appropriate OrthoFi API.
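To illustrate what coupling a classifier with traditional algorithms can look like, here is a minimal sketch of one possible arrangement: deterministic rules run first, and the model's confidence decides whether a record is handled automatically or sent to a person. It is not the delivered OrthoFi package. The field names, the `rule_check` and `triage` functions, the decision labels, and the 0.9 threshold are all assumptions made for the example, and the classifier is represented by any callable that returns a probability of eligibility.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, Optional[str]]

# Hypothetical field names: the real OrthoFi schema is not described here.
REQUIRED_FIELDS = ("payer_id", "member_id", "plan_status")


def rule_check(record: Record) -> Optional[str]:
    """Deterministic pre-check. Returns a decision, or None to defer to the model."""
    # Records missing a required field are never decided automatically.
    if any(not record.get(field) for field in REQUIRED_FIELDS):
        return "manual_review"
    # Assumed business rule: a terminated plan is ineligible regardless of the model.
    if record["plan_status"] == "terminated":
        return "ineligible"
    return None


def triage(
    record: Record,
    predict_proba: Callable[[Record], float],
    threshold: float = 0.9,
) -> str:
    """Combine the rules with a classifier's estimated probability of eligibility."""
    decision = rule_check(record)
    if decision is not None:
        return decision
    p_eligible = predict_proba(record)
    if p_eligible >= threshold:
        return "eligible"
    if p_eligible <= 1 - threshold:
        return "ineligible"
    # The model is not confident either way, so a person checks the record.
    return "manual_review"
```

Under this arrangement only the uncertain or incomplete records reach a human reviewer, which is one way an efficiency gain of the kind described above could be realized while keeping people in the loop for risky quotes.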
By far the most difficult part of this project was the incompleteness and incorrectness of the historical data set. Because OrthoFi is financially responsible for the quotes it provides, the accuracy of this data and of this process is critical. Data preparation required significant deprecation of unreliable fields, imputation of missing values, and similar steps, and was time-consuming even for basic analysis. Additionally, the random nature of the data made it difficult to properly train and test any statistical classifier applied to the problem, which again threatened the project schedule. With careful planning and clever technical development procedures, the OutsideInnovation team was able to find ways around these problems and deliver an effective product on time and on budget.
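The kind of data preparation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual procedure: it assumes that deprecation means dropping fields that are missing too often, and that imputation uses the median for numeric fields and the most common value for everything else. The `prepare` function, the 50% cutoff, and the sample field names are hypothetical.

```python
from collections import Counter
from statistics import median


def prepare(rows, max_missing=0.5):
    """Drop sparse fields, then fill the remaining gaps.

    rows is a list of dicts in which None marks a missing value.
    Returns the cleaned rows and the fill value chosen for each kept field.
    """
    fields = sorted({key for row in rows for key in row})
    kept, fill = [], {}
    for field in fields:
        present = [row[field] for row in rows if row.get(field) is not None]
        missing_rate = 1 - len(present) / len(rows)
        # Deprecate fields that are empty or missing too often to be useful.
        if not present or missing_rate > max_missing:
            continue
        kept.append(field)
        if all(isinstance(value, (int, float)) for value in present):
            fill[field] = median(present)  # numeric field: median
        else:
            fill[field] = Counter(present).most_common(1)[0][0]  # otherwise: mode
    cleaned = [
        {f: (row.get(f) if row.get(f) is not None else fill[f]) for f in kept}
        for row in rows
    ]
    return cleaned, fill


# Made-up sample records, for illustration only.
sample = [
    {"copay": 20, "plan": "PPO", "notes": None},
    {"copay": None, "plan": "PPO", "notes": None},
    {"copay": 40, "plan": None, "notes": "call back"},
    {"copay": 30, "plan": "HMO", "notes": None},
]
cleaned, fill = prepare(sample)
```

In this sample the `notes` field is missing in three of four rows, so it is dropped, while the gaps in `copay` and `plan` are filled. Computing the fill values once and returning them matters in practice: they should be derived from the training data only and then reused on test data, so that the evaluation is not contaminated.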