RegionalP: A region-based meta-analysis of genome-wide association studies in genetically diverse populations

This C++ program quantifies the degree of over-representation of associated SNPs in a pre-defined genomic region, given a specific definition of statistical significance. For example, under the null hypothesis that the region is independent of the phenotype, we expect 5% of the SNPs to be statistically significant by chance when adopting a P-value threshold of 5%, provided that all the SNPs in this region are mutually independent. An over-representation of statistically significant SNPs in this region constitutes evidence that this region is associated with the phenotype, with the extent of over-representation indicating the strength of the evidence. The effective number of independent SNPs and the number of independent SNPs exhibiting evidence of phenotypic association are evaluated using eigen-decomposition of the matrix measuring the linkage disequilibrium (LD) between every possible pair of SNPs in the region.
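The over-representation idea can be illustrated with a minimal sketch. It assumes that, under the null hypothesis, the number of significant SNPs among n independent SNPs follows a Binomial(n, alpha) distribution, where alpha is the P-value threshold (0 < alpha < 1). This binomial model and the function name binomialUpperTail are illustrative assumptions, not necessarily the statistic that RegionalP computes. The inputs n and k stand in for the effective number of independent SNPs and the number of those showing evidence of association, which the program derives from the LD matrix.

```cpp
#include <cmath>

// Hypothetical helper (not part of RegionalP): upper-tail probability
// P(X >= k) for X ~ Binomial(n, alpha), assuming 0 < alpha < 1.
// n     : effective number of independent SNPs in the region
// k     : number of independent SNPs that pass the threshold
// alpha : P-value threshold used to call a SNP significant
double binomialUpperTail(int n, int k, double alpha) {
    if (k <= 0) return 1.0;  // at least zero successes is certain
    if (k > n) return 0.0;   // more successes than trials is impossible
    double tail = 0.0;
    for (int i = k; i <= n; ++i) {
        // log of C(n, i) * alpha^i * (1 - alpha)^(n - i); working on the
        // log scale avoids overflow in the binomial coefficient.
        double logTerm = std::lgamma(n + 1.0) - std::lgamma(i + 1.0)
                       - std::lgamma(n - i + 1.0)
                       + i * std::log(alpha)
                       + (n - i) * std::log1p(-alpha);
        tail += std::exp(logTerm);
    }
    return tail;
}
```

For instance, binomialUpperTail(20, 5, 0.05) gives the chance of seeing five or more significant SNPs among 20 independent SNPs when only one is expected; a small value would indicate over-representation under this simplified model.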

This region-based analysis can be generalized to perform gene-based or pathway-based analyses.


Program Download

We have just released updated software with minor bugs fixed and new versions for different platforms, referred to as RegionalP [Beta-version]. The original software is referred to as [Alpha-version]. The format of the input and output files has not changed.


Example Dataset

The following zipped file contains example datasets for RegionalP.



Please cite the following publication if you use the program in any published work.

  • X. Wang et al.: A statistical method for region-based meta-analysis of genome-wide association studies in genetically diverse populations (submitted)



If you have any questions regarding the use of the program, please send an e-mail to both of the following people:

  • Wang Xu ( )
  • Dr. Yik Ying Teo ( )