LD Pruning | Huan Fan

In the era of whole-genome sequencing of thousands of individuals, the problem is no longer too few genetic variants but too many. A major task is filtering those variants. Hemstrom et al. recently published a very good review paper on this topic. One way of filtering is based on linkage disequilibrium (LD).

1. What is Linkage Disequilibrium (LD)?

LD measures the statistical association between alleles at different loci.
If two loci are in LD, the allele combinations at these loci occur more or less frequently than expected based on their individual allele frequencies.
LD can be quantified using metrics such as D, D', or r^2:
- D: The deviation of the observed haplotype frequency from the frequency expected under linkage equilibrium.
- D': D scaled by its maximum possible magnitude given the allele frequencies, so that values are comparable across pairs of loci.
- r^2: The squared correlation between alleles at two loci, ranging from 0 (no LD) to 1 (complete LD).
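
To make these metrics concrete, here is a minimal sketch for two biallelic loci, assuming phased haplotype frequencies are already known. The function name `ld_stats` and its argument names are hypothetical, and D' is reported as an absolute value, which is one common convention.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D: observed haplotype frequency minus the expectation under equilibrium
    d = p_ab - p_a * p_b

    # Largest |D| that these allele frequencies allow, depending on the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: D squared, scaled by the product of the allele frequencies at both loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Illustrative values only: allele A is always found together with allele B
d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.6)
print(f"D={d:.3f}  D'={d_prime:.3f}  r^2={r2:.3f}")
```

In this made-up example D' reaches its maximum while r^2 stays well below 1, because the two loci have different allele frequencies. That difference is one reason r^2 is the metric usually thresholded when pruning.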

2. Real-World Implications

Mapping Genes:
- LD is used in genome-wide association studies (GWAS) to link genetic markers to traits.
- High LD between a marker and a causal variant usually reflects their physical proximity, which is why an associated marker can tag a nearby causal variant.
Population Genomics:
- LD patterns provide insights into recombination landscapes, population structure, and demographic history.
Selective Sweeps:
- Strong positive selection can maintain high LD around a beneficial allele, even in regions with moderate recombination rates.

Why do we want/need to ca

The relationship between recombination rate and linkage disequilibrium (LD) is a key concept in population genetics. LD describes the non-random association of alleles at two or more loci, while recombination rate determines how frequently genetic material is exchanged between loci during meiosis. Here’s how they are related:
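
One standard textbook relationship can illustrate this, sketched here under simplifying assumptions (random mating, a large population, no selection): the expected value of D between two loci shrinks by a factor of (1 - c) each generation, where c is the recombination fraction between them. The function name and the example values are hypothetical.

```python
def ld_after_generations(d0, c, t):
    """Expected D after t generations of random mating.

    d0: initial LD coefficient D
    c:  recombination fraction between the two loci (0 to 0.5)
    t:  number of generations
    """
    return d0 * (1 - c) ** t


# Tightly linked loci (c = 0.001) versus unlinked loci (c = 0.5)
for c in (0.001, 0.5):
    decayed = ld_after_generations(d0=0.25, c=c, t=10)
    print(f"c={c}: D after 10 generations = {decayed:.5f}")
```

With a low recombination fraction LD persists for many generations, while for unlinked loci it halves every generation. This is why strong LD is mostly seen between physically close variants.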

I’d like to run a PCA on a genome sequencing dataset. I have a filtered VCF and all is good, but there is one more thing I need to do before the PCA.

LD pruning.
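
As a rough idea of what pruning does, here is a minimal greedy sketch in Python with NumPy. It assumes the genotypes have already been loaded from the VCF into a samples-by-variants matrix of 0/1/2 allele counts, ordered by position and without missing values. The function name `ld_prune`, the window size, and the r^2 threshold are all hypothetical choices for illustration. Dedicated tools such as PLINK (`--indep-pairwise`) use a more careful sliding-window procedure and scale far better.

```python
import numpy as np


def ld_prune(genotypes, window=50, r2_threshold=0.2):
    """Greedy LD pruning sketch; returns the indices of retained variants.

    genotypes: array of shape (n_samples, n_variants) holding 0/1/2 allele
               counts, with variants ordered by genomic position.
    window:    number of consecutive variants compared with each kept variant.
    """
    g = np.asarray(genotypes, dtype=float)
    n_variants = g.shape[1]

    # Monomorphic variants carry no information and have undefined correlation
    keep = g.std(axis=0) > 0

    for i in range(n_variants):
        if not keep[i]:
            continue
        # Compare variant i with the later variants inside its window
        for j in range(i + 1, min(i + window, n_variants)):
            if not keep[j]:
                continue
            r = np.corrcoef(g[:, i], g[:, j])[0, 1]
            if r * r > r2_threshold:
                keep[j] = False  # drop the later variant of a correlated pair
    return np.flatnonzero(keep)


# Toy data: variant 1 duplicates variant 0, variant 3 is monomorphic
toy = np.array([
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [2, 2, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 2, 1],
    [2, 2, 1, 1],
])
print(ld_prune(toy))
```

The retained columns can then be passed to the PCA, so that long blocks of correlated variants do not dominate the leading components.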

Previously we explained what LD is and

Huan Fan / 2024-12-02
Published under (CC) BY-NC-SA in categories notes tagged with stats