RegionalP: A region-based meta-analysis of genome-wide association studies in genetically diverse populations

This C++ program quantifies the degree of over-representation of associated SNPs in a pre-defined genomic region, given a specific definition of statistical significance. For example, under the null hypothesis that the region is independent of the phenotype, we expect 5% of the SNPs to be statistically significant by chance at a P-value threshold of 5%, provided that all the SNPs in the region are mutually independent. An over-representation of statistically significant SNPs in the region constitutes evidence that the region is associated with the phenotype, with the extent of over-representation indicating the strength of the evidence. Because SNPs within a region are typically correlated through linkage disequilibrium (LD), the effective number of independent SNPs and the number of independent SNPs exhibiting evidence of phenotypic association are evaluated using eigen-decomposition of the matrix measuring the LD between every possible pair of SNPs in the region.
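
To make the over-representation idea concrete, the following is a minimal sketch, not taken from the RegionalP source. It assumes the SNPs are mutually independent (or that the counts have already been reduced to effective numbers of independent SNPs and rounded to integers) and measures over-representation as the upper tail of a binomial distribution. The function name binomialUpperTail and its parameters are hypothetical, and the statistic RegionalP actually computes may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of RegionalP): probability of observing at
// least k significant SNPs among n independent SNPs when each SNP is
// significant by chance with probability alpha (the P-value threshold).
// Assumption: independence, so the count follows Binomial(n, alpha).
double binomialUpperTail(int n, int k, double alpha) {
    assert(n >= 0 && alpha > 0.0 && alpha < 1.0);
    if (k <= 0) return 1.0;  // at least zero successes is certain
    if (k > n) return 0.0;   // more successes than trials is impossible

    double tail = 0.0;
    for (int i = k; i <= n; ++i) {
        // log of C(n, i) * alpha^i * (1 - alpha)^(n - i), computed in log
        // space with lgamma to avoid overflow for large n.
        double logTerm = std::lgamma(n + 1.0) - std::lgamma(i + 1.0)
                       - std::lgamma(n - i + 1.0)
                       + i * std::log(alpha)
                       + (n - i) * std::log1p(-alpha);
        tail += std::exp(logTerm);
    }
    return tail < 1.0 ? tail : 1.0;  // guard against rounding above 1
}
```

For instance, binomialUpperTail(20, 5, 0.05) gives the chance of seeing five or more significant SNPs among 20 independent SNPs at a 5% threshold, where only one is expected under the null hypothesis. A small value indicates over-representation.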

This region-based analysis can be generalized to perform gene-based or pathway-based analyses.


