In the .fam file prepared for plink, there are two columns for specifying each individual's father (PID) and mother (MID) within the dataset; use 0 if a parent is unknown. Individuals with both PID and MID set to 0 are considered founders. Note that “By default, if parental IDs are provided for a sample, they are not treated as a founder even if neither parent is in the dataset.” In that case you need to make them founders manually via --make-founders.
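To make the distinction concrete, here is a minimal Python sketch that sorts the samples of a .fam file into three groups: founders (both parental IDs are 0), samples whose parental IDs are given although neither parent is in the dataset (the case the quoted sentence describes), and ordinary non-founders. The function names and the toy data are made up for illustration, and the sketch assumes the usual whitespace-delimited layout with family ID, individual ID, father ID and mother ID in the first four columns. It is not a reimplementation of plink's own logic.

```python
from typing import NamedTuple


class FamRow(NamedTuple):
    fid: str  # family ID
    iid: str  # individual ID
    pid: str  # father's ID, "0" if unknown
    mid: str  # mother's ID, "0" if unknown


def parse_fam(text):
    """Read the first four columns of each non-empty .fam line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        rows.append(FamRow(*fields[:4]))
    return rows


def classify(rows):
    """Return (founders, parents_absent, nonfounders) as lists of IIDs."""
    present = {(r.fid, r.iid) for r in rows}
    founders, parents_absent, nonfounders = [], [], []
    for r in rows:
        if r.pid == "0" and r.mid == "0":
            founders.append(r.iid)
        elif (r.fid, r.pid) not in present and (r.fid, r.mid) not in present:
            # Parental IDs are given, but neither parent is a sample here.
            parents_absent.append(r.iid)
        else:
            nonfounders.append(r.iid)
    return founders, parents_absent, nonfounders


# Hypothetical toy pedigree, invented for this example.
SAMPLE_FAM = """\
FAM1 dad 0 0 1 -9
FAM1 mom 0 0 2 -9
FAM1 kid dad mom 1 -9
FAM2 solo gp1 gp2 2 -9
"""

founders, parents_absent, nonfounders = classify(parse_fam(SAMPLE_FAM))
print(founders, parents_absent, nonfounders)
```

In this toy pedigree, solo has parental IDs but neither parent is in the file, so it is the kind of sample you would want to check before deciding whether to run --make-founders.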
Why do we need founders? Because only founders are included in some calculations, such as minor allele frequencies/counts or Hardy-Weinberg equilibrium tests, both of which relate to the concept of a base population.
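As a rough illustration of why the restriction matters, the sketch below computes a minor allele frequency from founders only and compares it with the value obtained from all samples. The function name, the genotype encoding (number of copies of one allele, 0/1/2, with None for missing) and the toy data are assumptions made for this example, and it only covers a diploid autosomal variant. It is not how plink is implemented.

```python
def minor_allele_freq(genotypes, sample_ids):
    """MAF over the given samples.

    genotypes: dict mapping individual ID -> copies of allele A1 (0, 1, 2),
               or None when the genotype is missing.
    sample_ids: the individuals to include (e.g. founders only).
    Returns None if no included sample has a called genotype.
    """
    a1_count = 0
    allele_total = 0
    for iid in sample_ids:
        g = genotypes.get(iid)
        if g is None:
            continue  # missing genotype contributes no alleles
        a1_count += g
        allele_total += 2  # diploid: two alleles per called genotype
    if allele_total == 0:
        return None
    freq = a1_count / allele_total
    return min(freq, 1 - freq)


# Hypothetical trio: the child's genotype is not independent of its parents.
genotypes = {"dad": 0, "mom": 1, "kid": 2}
founders = ["dad", "mom"]

maf_founders = minor_allele_freq(genotypes, founders)
maf_everyone = minor_allele_freq(genotypes, list(genotypes))
print(maf_founders, maf_everyone)
```

Counting related non-founders means counting the same ancestral alleles more than once, which is why the founder-only estimate is the one tied to the base population.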
Traditionally, the probability that two alleles are IBD was most often calculated from a known pedigree and so the individuals at the top of the pedigree (the founders) form a natural base population, where the founders themselves are unrelated.
The probability that two alleles are IBD has to be defined with respect to a base (reference) population; that is, the two alleles are descended from the same ancestral allele in the base population.
The point of coalescence is the most recent common ancestor of the two alleles. The allele at that point is in the ancestral state.