Team leader: Xiaodong Fang
Our group mainly focuses on detecting genetic variations through resequencing. We study three types of variation: single nucleotide polymorphisms (SNPs), short insertions and deletions (short indels), and structural variations (SVs).
Solexa provides a high-throughput, low-cost sequencing technology that speeds up studies of human genetic variation. BGI has developed a pipeline for variation detection based on Solexa sequencing, which has been evaluated in the Yanhuang project.
SNPs are the most common type of genetic variation in the genome. The SNP detection program developed by BGI takes into account the characteristics of Solexa sequencing, such as sequencing quality, alignment errors and experimental error dependency. The program calculates a quality score for each SNP candidate.
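As a rough illustration of what a per-candidate quality score can look like, the sketch below combines the base qualities of the reads supporting a variant allele on the Phred scale. It is deliberately naive and is not BGI's program: the function name, the cap of 99 and the assumption that sequencing errors are independent are all ours. A real caller would also have to model alignment errors and the error dependency mentioned above.

```python
def naive_snp_quality(alt_base_quals, cap=99):
    """Phred-scaled score for a SNP candidate (hypothetical helper).

    alt_base_quals: Phred base qualities of the reads that support the
    variant allele. A Phred quality Q means an error probability of
    10 ** (-Q / 10).

    Assumption: errors are independent, so the probability that every
    supporting base is wrong is the product of the per-base error
    probabilities. On the log scale this is a plain sum.
    """
    log10_all_errors = sum(-q / 10 for q in alt_base_quals)
    score = round(-10 * log10_all_errors)
    # Cap the score so that deep coverage does not give unbounded values.
    return min(cap, score)


if __name__ == "__main__":
    print(naive_snp_quality([20, 30]))
    print(naive_snp_quality([30, 30, 30, 30]))
```

Because correlated errors are ignored, this score overstates confidence whenever several reads share the same systematic error.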
Indel is a collective term for two types of genetic mutation, insertions and deletions, which are often considered together when comparing two sequences. Because Solexa sequencing reads are short, gapped alignments of these reads have low confidence. We therefore focus on detecting only short indels (1 to 3 bp) based on gapped alignment.
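A minimal sketch of pulling short indels out of gapped alignments is shown below. It assumes the alignments are reported with SAM-style CIGAR strings, which is our assumption and not necessarily the format used by the BGI pipeline; the function name and the `max_len` parameter are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def short_indels(pos, cigar, max_len=3):
    """List the indels of 1..max_len bp in one gapped alignment.

    pos:   1-based leftmost reference position of the alignment.
    cigar: SAM-style CIGAR string, e.g. "20M2D15M".
    Returns tuples of (type, reference position, length). A deletion is
    reported at its first deleted base, an insertion at the reference
    base that follows the inserted sequence.
    """
    events = []
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "ID" and 1 <= length <= max_len:
            kind = "insertion" if op == "I" else "deletion"
            events.append((kind, ref_pos, length))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return events


if __name__ == "__main__":
    print(short_indels(100, "20M2D15M1I10M"))
```

Longer gaps are skipped on purpose, which mirrors the 1 to 3 bp limit described above.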
SVs include duplications, inversions, translocations, insertions, deletions and complex events, in which several rearrangements occurred in the same region and it is hard to distinguish exactly what happened. Paired-end sequencing is useful for detecting structural variations. When paired-end reads map to the reference abnormally (with an improper insert size or orientation), the cause may be a structural variation in the sequenced individual. We can therefore detect SVs based on abnormal paired-end alignments.
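The idea of flagging abnormal paired-end alignments can be sketched as follows. This is an illustration under stated assumptions, not the BGI pipeline: the `Pair` record, the label names, the library parameters (mean insert size 500 bp, standard deviation 50 bp, read length 35 bp) and the three-standard-deviation cutoff are all hypothetical, and the expected orientation is taken to be forward then reverse.

```python
from dataclasses import dataclass


@dataclass
class Pair:
    """Mapped positions of the two mates of one read pair (hypothetical)."""
    chrom1: str
    pos1: int      # leftmost mapped position of mate 1
    strand1: str   # "+" or "-"
    chrom2: str
    pos2: int      # leftmost mapped position of mate 2
    strand2: str


def classify_pair(pair, mean_insert=500, sd=50, n_sd=3, read_len=35):
    """Label a read pair as normal or as one kind of abnormal mapping."""
    if pair.chrom1 != pair.chrom2:
        return "different_chromosomes"   # translocation candidate
    # Order the mates by position so the expected layout is "+" then "-".
    mates = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    (left_pos, left_strand), (right_pos, right_strand) = mates
    if left_strand == right_strand:
        return "same_strand"             # inversion candidate
    if left_strand == "-":
        return "everted"                 # duplication candidate
    # Outer distance between the two mates on the reference.
    span = right_pos + read_len - left_pos
    if span > mean_insert + n_sd * sd:
        return "insert_too_large"        # deletion candidate
    if span < mean_insert - n_sd * sd:
        return "insert_too_small"        # insertion candidate
    return "normal"


if __name__ == "__main__":
    print(classify_pair(Pair("chr1", 1000, "+", "chr1", 3000, "-")))
```

A single abnormal pair is weak evidence; in practice one would require several pairs with the same label clustering in the same region before reporting a candidate.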
Any two individuals differ by only about 0.1%, or slightly more, at the genome level. It is important to understand these differences because they help explain why individuals differ in susceptibility to disease and in response to drugs and the environment. They also help us to understand ourselves better.
1 Yanhuang project:
The project will sequence the genomes of at least 100 Chinese people to study the genetic characteristics of the Chinese population and to construct a Chinese genetic polymorphism map. The findings will then be applied to medical science, in the hope of solving problems related to Chinese-specific genetic diseases.
2 1000 Genomes Project:
This is an international collaboration that aims to sequence at least 1000 people from all over the world to create a detailed and medically useful map of human genetic variation. The scientific goals of the 1000 Genomes Project are to produce a catalog of variants that are present at 1% or greater frequency in the human population across most of the genome, and down to 0.5% or lower within genes. The project will provide a deep understanding of human genetic variation and may contribute greatly to human health.