iCall: An improved genotype-calling algorithm for rare and common variants on the Illumina exome array.
Next-generation genotyping microarrays have been designed with insights from the 1000 Genomes Project and whole-exome sequencing studies. These arrays are designed to increase genome coverage and to additionally include low-frequency and rare variants, which are often ancestry-specific.
iCall is an improved genotype-calling algorithm for rare and common variants on the Illumina exome array. It does not rely on prior training data, and it can assign genotypes to hybridization data from thousands of individuals simultaneously, producing accurate genotypes for variants across the whole spectrum of allele frequencies. iCall adopts the same three-component Gaussian mixture model framework as Illuminus, but focuses on deriving appropriate penalties for finding the best seeding parameters with which to initialize the EM (expectation-maximization) procedure. These penalties are intended to recognize the variety of situations in which calling becomes difficult, such as when: (i) the minor allele frequencies (MAFs) are low; (ii) the total number of samples for joint calling is small; or (iii) the hybridization intensities deviate substantially from the usual pattern.
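To make the mixture-model framework concrete, the following is a minimal sketch of plain EM for a three-component Gaussian mixture, seeded with one mean per genotype cluster (AA, AB, BB). It is not the iCall implementation: it omits the penalties iCall derives for choosing seeds, and it assumes, purely for illustration, that each sample is summarized by a single one-dimensional allelic contrast value. The function names, the seed means, the starting variance and the simulated data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_three_cluster_em(xs, seed_means, n_iter=100, min_var=1e-4):
    """Fit a 3-component 1-D Gaussian mixture by EM, starting from seed_means.

    Returns (weights, means, variances, calls), where calls[i] is the index
    (0, 1 or 2) of the most probable component for xs[i].
    """
    k = 3
    n = len(xs)
    means = list(seed_means)
    variances = [0.05] * k      # assumed starting spread (hypothetical value)
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in xs:
            p = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against division by zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            if nj < 1e-8:
                continue  # effectively empty cluster: keep its previous mean/variance
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed cluster
    # Hard genotype calls from the responsibilities of the final E-step.
    calls = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, means, variances, calls


# Simulated common variant: three well-separated clusters (hypothetical data).
rng = random.Random(0)
centers = (-0.8, 0.0, 0.8)
truth = [0] * 30 + [1] * 40 + [2] * 30
xs = [rng.gauss(centers[g], 0.05) for g in truth]

weights, means, variances, calls = fit_three_cluster_em(xs, seed_means=centers)
print("estimated means:", [round(m, 3) for m in means])
```

With all three clusters well populated, this unpenalized EM is expected to recover them from reasonable seeds. When one cluster is nearly or completely empty, as with low-MAF variants or small sample sizes, an unpenalized fit like this can let a spare component drift onto a neighbouring cluster, which is the kind of situation the seeding penalties described above are meant to address.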
A compressed archive containing the executable program, the main source code and an example dataset can be downloaded here: iCall.rar
If you need to compile the program yourself, the Boost library is required. The version we use is 1.53. Please download the boost_1_53_0 library and save it in the directory ~/other_libraries before compiling. A makefile is included in the zipped file.
Please refer to the README in the zipped file for program instructions.
If you have any questions regarding the use of the program, please send an e-mail to both of the following people: