HapFinder: Finding common haplotype blocks in a database
There are currently three main types of program execution:
- Type 1: Identifies the longest haplotype block of a pre-defined frequency that carries a specific allele at a focal SNP. An example is finding the longest haplotype block for allele A of the sickle cell locus (rs334) in the HapMap YRI population.
- Type 2: Identifies the longest haplotype block of a pre-defined frequency at an approximate location on the chromosome. This is most useful when the focal SNP and allele are unknown but evidence points to the region undergoing positive selection. Examples include the sickle cell locus (positions around 5204808 bp on chr11) in the HapMap YRI population, the lactase gene locus (positions around 136442378 bp on chr2) in the HapMap CEU population, and the EDAR gene (positions around 108942119 bp on chr2) in JPT and CHB.
- Type 3: Finds the haplotype form that carries most, if not all, of the implicated alleles associated with disease onset or increased severity, using SNPs previously identified from genome-wide association studies (GWAS). The data shown are a simulation of a ‘causal’ SNP (Chr6:rs2206734), where 2000 cases and 2000 controls are simulated with HAPGEN in the three HapMap populations with a multiplicative effect of RR = 1.5 at allele T.
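To make the Type 1 idea concrete, the following is a toy Python sketch of what "the longest haplotype block of a pre-defined frequency that carries a specific allele at a focal SNP" could mean. It is not HapFinder's algorithm: it is a brute-force search over a handful of made-up haplotypes. The function name `longest_block`, the definition of frequency as a fraction of all haplotypes in the panel, and the sample data are all assumptions for illustration. The program's score parameter is not modelled.

```python
from collections import Counter


def longest_block(haplotypes, focal, allele, min_freq):
    """Toy search for the widest window [start, end) around `focal`.

    A window qualifies if a single haplotype pattern carrying `allele`
    at the focal SNP is seen in at least `min_freq` of all haplotypes
    (assumed definition of frequency). Returns (start, end, pattern,
    freq), or None if no window qualifies.
    """
    n_haps = len(haplotypes)
    n_snps = len(haplotypes[0])
    carriers = [h for h in haplotypes if h[focal] == allele]
    if not carriers:
        return None

    best = None
    # Enumerate every window that contains the focal SNP.
    for start in range(focal + 1):
        for end in range(focal + 1, n_snps + 1):
            counts = Counter(h[start:end] for h in carriers)
            pattern, count = counts.most_common(1)[0]
            freq = count / n_haps
            longer = best is None or (end - start) > (best[1] - best[0])
            if freq >= min_freq and longer:
                best = (start, end, pattern, freq)
    return best


# Made-up phased haplotypes, one string per chromosome, one character per SNP.
haps = ["ACGTA", "ACGTA", "ACGTC", "TCGAA", "ACATA", "TTGAC"]

result = longest_block(haps, focal=2, allele="G", min_freq=0.5)
print(result)  # (0, 4, 'ACGT', 0.5)
```

Here the pattern `ACGT` spans the first four SNPs and is carried by three of the six haplotypes, so it is the longest block reaching the 0.5 threshold. Extending it to the fifth SNP drops the frequency to 2/6.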
Program Download
A compressed archive containing the main software and its required Java libraries can be downloaded: hapfinder-1.0-20101122.tar.gz
Please refer to the README for program instructions.
Example dataset
The following zipped file contains an example dataset for input to HapFinder. The files comprise genotype data from two HapMap populations: YRI on chromosome 11 and CEU on chromosome 6, release 22, build 36.
- Example Type 1 Program Execution : java -jar hapfinder.jar --type 1 --legend genotypes_chr11_YRI_r22_nr.b36_fwd_legend.txt --haplotype genotypes_chr11_YRI_r22_nr.b36_fwd.phase --chromosome 11 --focals 5204808 --outputFilename YRI_rs334_type1_alleleA --score 0.98 --freqs 0.10 --alleles A --samples genotypes_YRI_r22_nr.b36_fwd_sample.txt
- Example Type 2 Program Execution : java -jar hapfinder.jar --type 2 --legend genotypes_chr11_YRI_r22_nr.b36_fwd_legend.txt --haplotype genotypes_chr11_YRI_r22_nr.b36_fwd.phase --chromosome 11 --focals 5204800 --outputFilename YRI_rs334_type2 --score 0.98 --freqs 0.10 --samples genotypes_YRI_r22_nr.b36_fwd_sample.txt
- Example Type 3 Program Execution : java -jar hapfinder.jar --type 3 --legend genotypes_chr6_CEU_r22_nr.b36_fwd_legend.txt --haplotype genotypes_chr6_CEU_r22_nr.b36_fwd_phased --chromosome 6 --focals 20535448 20739932 20749315 20775667 20825383 20828258 20836492 --alleles A A T T G C C --log10p 6.784 6.882 7.189 7.189 8.073 7.363 7.484 --start 20530000 --end 20840000 --recombination genetic_map_chr6_b36.txt --outputFilename CEU_rs2206734_type3 --score 0.98 --samples genotypes_CEU_r22_nr.b36_fwd_sample.txt
Note that each of the commands above must be entered on a single line; they are wrapped across multiple lines here for aesthetic purposes.
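Because the Type 3 command is long and pairs each focal position with one allele and one log10p value, it can be easier to assemble it programmatically than to type it on one line. The Python sketch below is one possible way to do so and is not part of HapFinder. The helper names `build_command` and `check_aligned` are hypothetical, the double-hyphen flag prefix is an assumption (check the README for the exact flag syntax), and the requirement that the three per-SNP lists have equal length is inferred from the example rather than documented.

```python
import shlex

# Assumption: options are introduced by a double hyphen; confirm in the README.
FLAG_PREFIX = "--"


def build_command(options, jar="hapfinder.jar"):
    """Turn an ordered {option: value or list of values} dict into an argv list."""
    cmd = ["java", "-jar", jar]
    for name, value in options.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        cmd.append(FLAG_PREFIX + name)
        cmd.extend(str(v) for v in values)
    return cmd


def check_aligned(**lists):
    """Raise ValueError if the per-SNP lists do not all have the same length."""
    lengths = {name: len(values) for name, values in lists.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"per-SNP lists differ in length: {lengths}")


# Values copied from the Type 3 example command.
focals = [20535448, 20739932, 20749315, 20775667, 20825383, 20828258, 20836492]
alleles = ["A", "A", "T", "T", "G", "C", "C"]
log10p = [6.784, 6.882, 7.189, 7.189, 8.073, 7.363, 7.484]
check_aligned(focals=focals, alleles=alleles, log10p=log10p)

cmd = build_command({
    "type": 3,
    "legend": "genotypes_chr6_CEU_r22_nr.b36_fwd_legend.txt",
    "haplotype": "genotypes_chr6_CEU_r22_nr.b36_fwd_phased",
    "chromosome": 6,
    "focals": focals,
    "alleles": alleles,
    "log10p": log10p,
    "start": 20530000,
    "end": 20840000,
    "recombination": "genetic_map_chr6_b36.txt",
    "outputFilename": "CEU_rs2206734_type3",
    "score": 0.98,
    "samples": "genotypes_CEU_r22_nr.b36_fwd_sample.txt",
})

# Print the single-line command; pass `cmd` to subprocess.run(cmd, check=True)
# to execute it once hapfinder.jar and the data files are in the working directory.
print(shlex.join(cmd))
```

Building the command as a list of arguments avoids accidental line breaks and lets a mismatch between positions, alleles, and p-values be caught before the program is launched.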
Plotting and interpretation of HapFinder results
The following zipped file contains R scripts and examples for each of the three types of analysis.
References
Please cite the following publication(s) if you are using the program in any publication.
- RTH Ong et al. HapFinder: Finding haplotype blocks in a database (submitted)
Contact
If you have any questions regarding the use of the program, please send an e-mail to both of the following people:
- Rick Ong ( g0801900.nus.edu.sg)
- Dr. Yik Ying Teo ( statyy.nus.edu.sg)