Why European Genetic Risk Scores Fall Short in Predicting Multiple Sclerosis Risk Among South Asian Populations
The article, “Polygenic risk score prediction of multiple sclerosis in individuals of South Asian ancestry,” addresses a critical problem in precision medicine: whether genetic risk prediction tools developed primarily in European populations can be reliably applied to individuals from other ancestral backgrounds. Multiple sclerosis is a complex immune-mediated neurological disease influenced by many common genetic variants, each contributing a small increment of risk. Polygenic risk scores aggregate these variants into a single measure of inherited susceptibility. However, because most genome-wide association studies have been conducted in European-ancestry cohorts, it remains uncertain how well these scores transfer to South Asian populations, a question with ethical as well as scientific weight.
Study Design and Population Cohorts
The authors compared polygenic risk score performance in two large longitudinal genetic resources: Genes & Health and UK Biobank. Genes & Health included British–Bangladeshi and British–Pakistani participants, offering a rare opportunity to evaluate multiple sclerosis genetics in a South Asian-ancestry population. After quality control, the analysis retained 40,532 individuals, including 42 people with coded multiple sclerosis diagnoses and 40,490 controls. UK Biobank served as the European-ancestry comparator, with 2,091 multiple sclerosis cases and 374,866 controls. This contrast allowed the investigators to ask whether a European-derived genetic score predicts disease risk equally well across ancestral groups.
Construction of Polygenic Risk Scores
The study calculated multiple sclerosis polygenic risk scores using effect sizes from the largest available multiple sclerosis genome-wide association study. The authors applied a clumping-and-thresholding method through PRSice-2, generating multiple scores under different linkage disequilibrium and association-threshold assumptions. Importantly, they evaluated scores both including and excluding the major histocompatibility complex region. This distinction is biologically significant because the MHC, particularly human leukocyte antigen variation, is the strongest known genetic contributor to multiple sclerosis risk. The graphical abstract on page 2 clearly summarizes this workflow: European-derived genetic associations were used to construct risk scores and then tested in a South Asian-ancestry cohort.
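To make the clumping-and-thresholding idea concrete, here is a minimal Python sketch of the general technique. It is not the authors' pipeline and not how PRSice-2 is implemented: the variant names, effect sizes, p-values, LD values, and thresholds are all hypothetical, and real clumping also applies a physical distance window that is omitted here.

```python
from typing import Dict, FrozenSet, List


def clump(pvalues: Dict[str, float],
          r2: Dict[FrozenSet[str], float],
          r2_threshold: float) -> List[str]:
    """Greedy LD clumping: walk variants from most to least significant and
    keep a variant only if it is not in strong LD with one already kept."""
    kept: List[str] = []
    for variant in sorted(pvalues, key=pvalues.get):
        # Pairs missing from `r2` are assumed to be unlinked (r2 = 0).
        if all(r2.get(frozenset((variant, k)), 0.0) < r2_threshold for k in kept):
            kept.append(variant)
    return kept


def polygenic_score(dosages: Dict[str, float],
                    betas: Dict[str, float],
                    pvalues: Dict[str, float],
                    variants: List[str],
                    p_threshold: float) -> float:
    """Sum of (effect size x risk-allele dosage) over variants that pass
    the association p-value threshold."""
    return sum(betas[v] * dosages[v] for v in variants if pvalues[v] <= p_threshold)


# Hypothetical summary statistics for four made-up variants.
pvalues = {"var_a": 1e-10, "var_b": 1e-6, "var_c": 0.03, "var_d": 0.4}
betas = {"var_a": 0.2, "var_b": 0.15, "var_c": 0.1, "var_d": 0.05}
r2 = {frozenset(("var_a", "var_b")): 0.8}  # var_a and var_b are in strong LD

# Hypothetical risk-allele counts (0, 1, or 2) for one individual.
dosages = {"var_a": 2, "var_b": 1, "var_c": 1, "var_d": 0}

index_variants = clump(pvalues, r2, r2_threshold=0.1)
score = polygenic_score(dosages, betas, pvalues, index_variants, p_threshold=0.05)
print(index_variants, score)
```

Varying `r2_threshold` and `p_threshold` in a sketch like this yields a family of candidate scores, which is the sense in which the study generated multiple scores under different assumptions.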
Main Findings in the South Asian-Ancestry Cohort
The central finding was that European-derived polygenic risk scores had limited predictive performance in the Genes & Health cohort. The score including the MHC explained approximately 1.1% of liability to multiple sclerosis, while the score excluding the MHC explained approximately 1.5%. Both scores showed some association with multiple sclerosis status, but the effect was modest. The quartile analysis suggested that individuals in the highest score quartile had greater risk than those in the lowest quartile, but confidence intervals were wide because only 42 cases were available. The receiver operating characteristic curves on page 5 also show that covariates such as age, sex, and genetic principal components accounted for much of the model’s discrimination.
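Why do so few cases produce such wide intervals? The Python sketch below illustrates the arithmetic with a simple odds ratio and Wald confidence interval. The counts are hypothetical and are not the study's quartile table, and the Wald interval is a simplification chosen for illustration; the article may have used a different model.

```python
import math


def odds_ratio_with_ci(cases_top, controls_top, cases_bottom, controls_bottom, z=1.96):
    """Odds ratio (top vs bottom score group) with a Wald interval
    computed on the log-odds scale; z=1.96 gives roughly 95% coverage."""
    odds_ratio = (cases_top * controls_bottom) / (cases_bottom * controls_top)
    se_log_or = math.sqrt(
        1 / cases_top + 1 / controls_top + 1 / cases_bottom + 1 / controls_bottom
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for a cohort with very few cases per quartile.
small = odds_ratio_with_ci(17, 10116, 5, 10128)

# The same proportions in a hypothetical cohort 50 times larger.
large = odds_ratio_with_ci(17 * 50, 10116 * 50, 5 * 50, 10128 * 50)

for label, (or_, lo, hi) in (("small", small), ("large", large)):
    print(f"{label}: OR={or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The point estimate is identical in both cohorts, but the standard error is dominated by the `1 / cases` terms, so the interval only tightens when the number of cases grows.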
Comparison with European-Ancestry Participants
When the same general analytical framework was applied to European-ancestry UK Biobank participants, the polygenic risk scores performed substantially better. In the full UK Biobank European-ancestry sample, the MHC-inclusive score explained approximately 4.4% of liability, and the non-MHC score explained approximately 2.3%. To reduce bias from the much larger number of European cases, the authors repeatedly subsampled UK Biobank to match the Genes & Health case-control structure. Even then, the MHC-inclusive score explained more liability in European-ancestry participants than in South Asian-ancestry participants. Figure 3 on page 6 visually demonstrates this ancestry-related reduction in predictive power.
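The subsampling step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the integer IDs are placeholders for participants, the number of repeats (5) and the seed are arbitrary, and only the case and control counts come from the figures quoted above.

```python
import random


def matched_subsamples(case_ids, control_ids, n_cases, n_controls, n_repeats, seed=0):
    """Yield repeated random draws (without replacement within a draw) that
    match a target case/control structure."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield rng.sample(case_ids, n_cases), rng.sample(control_ids, n_controls)


# Placeholder IDs standing in for UK Biobank participants.
ukb_cases = range(2091)
ukb_controls = range(2091, 2091 + 374866)

# Draw subsamples matching the Genes & Health structure (42 cases, 40,490 controls).
subsamples = list(
    matched_subsamples(ukb_cases, ukb_controls, n_cases=42, n_controls=40490, n_repeats=5)
)
print(len(subsamples), len(subsamples[0][0]), len(subsamples[0][1]))
```

In a real analysis, the score would be evaluated within each draw and the resulting distribution of variance explained compared with the single Genes & Health estimate, so that differences in sample size are not mistaken for differences due to ancestry.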
Biological and Methodological Interpretation
The reduced performance of the European-derived score in South Asian individuals is most plausibly explained by differences in allele frequencies and linkage disequilibrium structure between populations. A variant that tags a causal risk allele effectively in Europeans may not tag the same causal signal with equal accuracy in South Asians. The result does not necessarily imply that multiple sclerosis has fundamentally different biology across populations. Rather, it suggests that the statistical architecture captured by European genome-wide association studies is not fully portable. The unexpected lack of improvement from including the MHC region may reflect limited case numbers, imperfect tagging of relevant HLA alleles, or ancestry-specific differences in local genomic structure.
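The tagging argument can be made concrete with the standard linkage disequilibrium statistic r², computed from allele and haplotype frequencies. In the Python sketch below, the two populations and all frequencies are hypothetical and do not represent real European or South Asian values. Under a simple single-causal-variant model, the signal seen at a tag variant shrinks roughly in proportion to its r² with the causal variant.

```python
def ld_r2(p_tag: float, p_causal: float, p_both: float) -> float:
    """Squared correlation (r^2) between a tag allele and a causal allele.

    p_tag    : frequency of the tag allele
    p_causal : frequency of the causal allele
    p_both   : frequency of the haplotype carrying both alleles
    """
    d = p_both - p_tag * p_causal  # LD coefficient D
    return d * d / (p_tag * (1 - p_tag) * p_causal * (1 - p_causal))


# Hypothetical discovery population: tag and causal alleles travel together.
r2_discovery = ld_r2(p_tag=0.30, p_causal=0.30, p_both=0.28)

# Hypothetical target population: different frequencies and weaker coupling.
r2_target = ld_r2(p_tag=0.30, p_causal=0.40, p_both=0.18)

print(f"discovery r2 = {r2_discovery:.2f}, target r2 = {r2_target:.2f}")
```

In this toy example the same tag variant captures most of the causal signal in one population and very little in the other, even though the causal allele and its biological effect are unchanged.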
Implications for Precision Medicine and Equity
This article provides an important warning for the future clinical use of genetic risk prediction. Polygenic risk scores may eventually help identify individuals at elevated risk of multiple sclerosis, enrich preventive trials, or support earlier intervention during prodromal disease phases. However, if such tools are trained mainly on European datasets, they may provide less accurate predictions for underrepresented populations and thereby reinforce existing health inequities. The study therefore supports a clear scientific and ethical conclusion: ancestrally diverse genome-wide association studies are essential before polygenic risk scores can be deployed responsibly in clinical or public health contexts.
Disclaimer: This blog post is based on the provided research article and is intended for informational purposes only. It is not intended to provide medical advice. Please consult with a healthcare professional for any health concerns.
References:
Breedon, J. R., Marshall, C. R., Giovannoni, G., van Heel, D. A., Dobson, R., & Jacobs, B. M. (2023). Polygenic risk score prediction of multiple sclerosis in individuals of South Asian ancestry. Brain Communications, 5(2), fcad041.
