Our client needed to train predictive models on large sets of MRI scans sourced from many clinical sites and public data sets. Manual quality assurance to determine which images met the study criteria was not scalable. Our client needed a fully automated way to decide which images met the quality threshold.
We created an automated quality control classifier, built on a Balanced Random Forest classifier, that determines which MRI scans may proceed to the next level of analysis and requires no human intervention. Cross-study performance analysis confirmed the generalizability of the model. In addition, our data pipeline extracts features from the scan process logs, using the Boruta feature selection algorithm, so that they are available during the predictive modeling phase.
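To make two of the ideas above concrete, here is a minimal sketch, not the project's actual code. It shows (1) the class-balanced bootstrap that distinguishes a balanced random forest from a plain one, and (2) a leave-one-study-out split, which is one common way to run a cross-study performance analysis. The function names, the toy labels, and the study identifiers are all hypothetical, and the sketch stops short of training any model. In practice a library implementation, such as `BalancedRandomForestClassifier` from imbalanced-learn, would likely replace the hand-written sampler.

```python
import random
from collections import defaultdict


def balanced_bootstrap(labels, rng):
    """Return sample indices with an equal count for every class.

    Each class is sampled with replacement down to the size of the
    rarest class. A balanced random forest draws one such sample
    for every tree it grows.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    n_per_class = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choices(indices, k=n_per_class))
    return sample


def leave_one_study_out(study_ids):
    """Yield (held_out_study, train_indices, test_indices) per study."""
    for held_out in sorted(set(study_ids)):
        train = [i for i, s in enumerate(study_ids) if s != held_out]
        test = [i for i, s in enumerate(study_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical toy data: 1 = scan passes QC, 0 = scan fails QC.
labels = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
study_ids = ["A", "A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
splits = list(leave_one_study_out(study_ids))
for held_out, train, test in splits:
    train_labels = [labels[i] for i in train]
    sample = balanced_bootstrap(train_labels, rng)
    counts = defaultdict(int)
    for i in sample:
        counts[train_labels[i]] += 1
    # Per-class counts in the bootstrap are equal by construction.
    print(held_out, dict(counts), "test scans:", len(test))
```

Holding out an entire study at a time, rather than a random subset of scans, tests whether the classifier transfers to a site or data set it has never seen, which is the sense of generalizability described above.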