Polygenic risk scores: what are they good for?

Polygenic risk scores (PRS) are among the most popular methods for predicting complex traits and diseases from high-dimensional genome-wide association study (GWAS) data, where the sample size n is typically much smaller than the number of SNPs p. They are poised to improve biomedical outcomes via precision medicine. The motivations behind PRS are that 1) only summary statistics are needed to construct them, rather than raw data, which may not be readily available because of privacy concerns; and 2) most complex traits are affected by many genes with small effects, that is, they follow a polygenic (or the newly emerging omnigenic) model.


Such a score provides a quantitative index of the genomic burden of risk variants in an individual, which relates to the likelihood that the person has a particular disorder. Our major findings are that 1) when PRS is constructed with all p SNPs (referred to as GWAS-PRS), its prediction accuracy is determined solely by the p/n ratio; and 2) when PRS is built from a list of top-ranked SNPs that pass a pre-specified P-value threshold (referred to as threshold-PRS), its accuracy can vary dramatically depending on how sparse the true genetic signals are.

Figure: Ancestry of GWAS participants over time compared to the global population.
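The two constructions named above, GWAS-PRS and threshold-PRS, can be sketched as a weighted sum of effect-allele counts. This is a minimal illustration, not the method of any particular study: the function name `polygenic_score`, the SNP identifiers, and all effect sizes and P-values are hypothetical values made up for the example.

```python
def polygenic_score(genotypes, summary_stats, p_threshold=1.0):
    """Weighted sum of effect-allele counts over SNPs passing p_threshold.

    genotypes:     dict mapping SNP id -> effect-allele count (0, 1 or 2)
    summary_stats: dict mapping SNP id -> (effect size estimate, P-value)
    p_threshold:   1.0 keeps every SNP (GWAS-PRS); a smaller value keeps
                   only top-ranked SNPs (threshold-PRS)
    """
    score = 0.0
    for snp, (beta, p_value) in summary_stats.items():
        # Skip SNPs that fail the threshold or were not genotyped.
        if p_value <= p_threshold and snp in genotypes:
            score += beta * genotypes[snp]
    return score


# Hypothetical summary statistics, invented for illustration only.
summary_stats = {
    "snp_a": (0.5, 1e-9),
    "snp_b": (-0.25, 0.03),
    "snp_c": (0.125, 0.6),
}
# Hypothetical effect-allele counts for one individual.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

gwas_prs = polygenic_score(person, summary_stats)                    # all SNPs
threshold_prs = polygenic_score(person, summary_stats, p_threshold=5e-8)

print(gwas_prs)       # 0.5*2 - 0.25*1 + 0.125*0 = 0.75
print(threshold_prs)  # only snp_a passes: 0.5*2 = 1.0
```

Note that only the per-SNP summary statistics and the individual's own genotypes are needed; no raw data from the original GWAS participants enters the calculation.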

The major ethical and scientific challenge surrounding clinical implementation of PRS is that those available today are several times more accurate in individuals of European ancestry than in individuals of other ancestries. To realize the full and equitable potential of PRS, greater diversity must be prioritized in genetic studies, and summary statistics must be publicly disseminated to ensure that health disparities are not increased for those individuals already most underserved.

Figure: Potential clinical uses of polygenic risk scores.

Our results demystify the poor performance of PRS and demonstrate that the original purpose of PRS, to aggregate effects from a large number of causal SNPs for polygenic traits, is wishful thinking and can lead to a practical paradox for polygenic/omnigenic traits.

Major psychiatric disorders are heritable but genetically complex. In the future, as the datasets supporting the development of such scores become larger and more diverse, and as methodological developments improve predictive capacity, we expect that PRS will have substantial clinical utility in the assessment of risk for disease, subtypes of disease, and even treatment response.

Research has shown that, although individual risk variants have small effects, taken together they can provide significant predictive ability, and this approach has been applied to several common diseases in the form of genetic or polygenic risk scores (PRS). Threshold-PRS can perform well only when m (the number of causal SNPs) is orders of magnitude smaller than n, or when genetic signals are sparse.
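The dependence of threshold-PRS on signal sparsity can be explored with a toy simulation. The sketch below rests on simplifying assumptions that are not taken from the source: genotypes are drawn as independent standard normal variables (no linkage disequilibrium), all causal SNPs have effects of equal magnitude, heritability is fixed at 0.5, and the sample sizes, threshold, and function names are arbitrary choices for illustration.

```python
import random
from math import erfc, sqrt


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def threshold_prs_accuracy(n_train, n_test, p, m, h2=0.5,
                           p_threshold=1e-3, seed=1):
    """Correlation between threshold-PRS and phenotype in held-out data."""
    rng = random.Random(seed)
    # The first m SNPs are causal with equal-magnitude effects (assumption),
    # scaled so that the total genetic variance equals h2.
    beta = [rng.choice([-1.0, 1.0]) * sqrt(h2 / m) for _ in range(m)]
    beta += [0.0] * (p - m)

    def draw(n):
        # Standardized, independent genotypes (assumption: no LD).
        X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
        y = [sum(b * x for b, x in zip(beta, row))
             + rng.gauss(0.0, sqrt(1.0 - h2)) for row in X]
        return X, y

    X_train, y_train = draw(n_train)
    X_test, y_test = draw(n_test)

    # Marginal (one SNP at a time) effect estimates. Because the phenotype
    # has variance close to 1 by construction, z is approximately standard
    # normal for a null SNP, giving a two-sided P-value via erfc.
    weights = {}
    for j in range(p):
        b_hat = sum(row[j] * y for row, y in zip(X_train, y_train)) / n_train
        z = b_hat * sqrt(n_train)
        if erfc(abs(z) / sqrt(2.0)) <= p_threshold:
            weights[j] = b_hat

    # Score held-out individuals with the selected SNPs only.
    scores = [sum(w * row[j] for j, w in weights.items()) for row in X_test]
    return pearson(scores, y_test)


sparse = threshold_prs_accuracy(n_train=400, n_test=200, p=500, m=5)
dense = threshold_prs_accuracy(n_train=400, n_test=200, p=500, m=500)
print(f"sparse signals (m=5):   r = {sparse:.2f}")
print(f"dense signals  (m=500): r = {dense:.2f}")
```

In the sparse setting each causal SNP carries a large share of the heritability and is easy to detect at this sample size, whereas in the dense setting the same heritability is spread so thinly that few SNPs pass the threshold. Real data, with correlated SNPs and unequal effects, will behave less cleanly than this sketch.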

