Polygenic Scores Help Reduce Racial Disparities in Predictive Accuracy of Automated Type 1 Diabetes Classification Algorithms
Objectives
Automated algorithms to identify individuals with type 1 diabetes using electronic health records (EHRs) are increasingly used in biomedical research. It is not known whether the accuracy of these algorithms differs by self-reported race. We investigated whether polygenic scores improve identification of individuals with type 1 diabetes.
Research Design and Methods
We investigated two large hospital-based biobanks (Mass General Brigham [MGB] and BioMe) and identified individuals with type 1 diabetes using an established automated algorithm. We performed chart reviews to validate the diagnosis of type 1 diabetes. We implemented two published polygenic scores for type 1 diabetes (developed in individuals of European or African ancestry). We assessed the performance of the classification algorithm before and after incorporating polygenic scores.
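As a minimal sketch of what a before-and-after assessment of this kind can look like, the Python below computes positive predictive value (PPV) among algorithm-positive individuals, first on its own and then after additionally requiring a polygenic score at or above a cutoff. The record layout, the cutoff rule, the threshold value, and all data are hypothetical illustrations. The abstract does not specify how the polygenic scores were incorporated, so this should not be read as the study's method.

```python
from dataclasses import dataclass


@dataclass
class Record:
    algorithm_positive: bool  # flagged as type 1 diabetes by the EHR algorithm
    chart_confirmed: bool     # type 1 diabetes confirmed on chart review
    polygenic_score: float    # hypothetical standardized polygenic score


def ppv(records):
    """Fraction of predicted positives that are chart-confirmed (None if empty)."""
    if not records:
        return None
    return sum(r.chart_confirmed for r in records) / len(records)


def assess(records, threshold):
    """Return PPV for the algorithm alone and for algorithm plus score cutoff."""
    flagged = [r for r in records if r.algorithm_positive]
    # Assumed rule: keep only flagged individuals whose score meets the cutoff.
    flagged_high = [r for r in flagged if r.polygenic_score >= threshold]
    return ppv(flagged), ppv(flagged_high)


# Toy data, invented for illustration only (not from either biobank).
toy = [
    Record(True, True, 1.8),
    Record(True, True, 1.2),
    Record(True, False, -0.4),
    Record(True, False, 0.1),
    Record(True, True, 0.9),
    Record(False, False, -1.0),
]

ppv_before, ppv_after = assess(toy, threshold=0.5)
print(f"PPV, algorithm alone: {ppv_before:.2f}")
print(f"PPV, algorithm plus score cutoff: {ppv_after:.2f}")
```

A stricter cutoff generally trades sensitivity for PPV, so in practice the threshold would need to be chosen and validated against chart-reviewed cases.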
Results
The automated algorithm was more likely to incorrectly assign a diagnosis of type 1 diabetes to self-reported non-White individuals than to self-reported White individuals (odds ratio = 3.45 [95% confidence interval 1.54-7.69], P = 0.0026). After incorporating polygenic scores in MGB Biobank, the positive predictive value of the type 1 diabetes algorithm increased from 70% to 97% for self-reported White individuals (meaning that 97% of those predicted to have type 1 diabetes indeed had type 1 diabetes) and from 53% to 100% for self-reported non-White individuals. Similar results were found in BioMe.
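For readers less familiar with the odds ratio reported above, the following Python sketch shows how an odds ratio and an approximate 95% confidence interval can be derived from a 2x2 table of misclassified versus correctly classified individuals in two groups. It uses the standard Wald interval on the log scale. The abstract does not state which estimation method was used, so the Wald approach, the function name, and the counts are assumptions for illustration and do not reproduce the study's data.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: misclassified / correctly classified counts in group 1
    c, d: misclassified / correctly classified counts in group 2
    All four counts must be positive for this approximation.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, invented for illustration only.
or_est, ci_low, ci_high = odds_ratio_wald(a=10, b=20, c=5, d=40)
print(f"OR = {or_est:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as the reported interval of 1.54-7.69 does, indicates that the difference in misclassification odds between the groups is unlikely to be due to chance alone.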
Conclusions
Automated phenotyping algorithms may exacerbate health disparities due to an increased risk of misclassification of individuals from underrepresented populations. Polygenic scores may be used to improve the performance of phenotyping algorithms and potentially reduce this disparity.