pyp_metrics contains a set of procedures for calculating metrics on PyPedal pedigree objects. These metrics include coefficients of inbreeding and relationship as well as effective founder number, effective population size, and effective ancestor number.
a_coefficients() writes population average coefficients of inbreeding and relationship to a file, as well as individual animal IDs and coefficients of inbreeding. Some pedigrees are too large for fast_a_matrix() or fast_a_matrix_r() because an array of the required size cannot be allocated in memory; in that case all outputs are set to -999.9.
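To make the quantities concrete, the sketch below builds an additive relationship matrix with the tabular method and reads the coefficients of inbreeding from its diagonal. It is plain Python, not PyPedal's implementation; the function names and the (animal, sire, dam) tuple layout are assumptions made for this example.

```python
def tabular_a_matrix(pedigree):
    """Build the additive relationship matrix with the tabular method.

    pedigree: list of (animal, sire, dam) tuples with parents listed before
    their offspring and None marking an unknown parent (assumed layout).
    """
    index = {animal: i for i, (animal, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    a = [[0.0] * n for _ in range(n)]
    for i, (_, sire, dam) in enumerate(pedigree):
        s, d = index.get(sire), index.get(dam)
        # Off-diagonals: half the relationship with each known parent.
        for j in range(i):
            value = 0.0
            if s is not None:
                value += 0.5 * a[j][s]
            if d is not None:
                value += 0.5 * a[j][d]
            a[i][j] = a[j][i] = value
        # Diagonal: 1 + F, where F is half the relationship between parents.
        both_known = s is not None and d is not None
        a[i][i] = 1.0 + (0.5 * a[s][d] if both_known else 0.0)
    return index, a


def inbreeding_coefficients(pedigree):
    """Return a dict mapping each animal to its coefficient of inbreeding."""
    index, a = tabular_a_matrix(pedigree)
    return {animal: a[i][i] - 1.0 for animal, i in index.items()}


# Toy pedigree: animal 5 is the offspring of full sibs 3 and 4.
toy = [(1, None, None), (2, None, None), (3, 1, 2), (4, 1, 2), (5, 3, 4)]
coi = inbreeding_coefficients(toy)
average_coi = sum(coi.values()) / len(coi)
print(coi[5], average_coi)
```

The matrix has one row and column per animal, which is why memory becomes the limiting factor for large pedigrees.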
a_effective_ancestors_definite() uses the algorithm in Appendix B of Boichard et al. (1996) to compute the effective ancestor number for the pedigree myped. Boichard's algorithm requires information about the generation of each animal. NOTE: if you pass a pedigree without generations, no error is thrown; you simply end up with a list of generations that contains only the default value for Animal() objects, 0, and the computation may not work as intended. By default the most recent generation -- the generation with the largest generation ID -- is used as the reference population.
a_effective_ancestors_indefinite() uses the approach outlined on pages 9 and 10 of Boichard et al. (1996) to compute approximate upper and lower bounds for f_a. This is much more tractable for large pedigrees than the exact computation provided in a_effective_ancestors_definite(). Boichard's algorithm requires information about the generation of each animal. NOTE: if you pass a pedigree without generations, no error is thrown; you simply end up with a list of generations that contains only the default value for Animal() objects, 0, and the computation may not work as intended. NOTE: if you pass a value of n that is greater than the actual number of ancestors in the pedigree, strange things happen. As a stop-gap, a_effective_ancestors_indefinite() detects that case and replaces n with the number of founders - 1. By default the most recent generation -- the generation with the largest generation ID -- is used as the reference population.
a_effective_founders_boichard() uses the algorithm in Appendix A of Boichard et al. (1996) to compute the effective founder number for myped. Note that results from this function will not necessarily match those from a_effective_founders_lacy(). Boichard's algorithm requires information about the generation of each animal; if the input pedigree does not include generations, the computation may not work as intended. By default the most recent generation -- the generation with the largest generation ID -- is used as the reference population.
a_effective_founders_lacy() calculates the number of effective founders in a pedigree using the exact method of Lacy.
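As an illustration of the quantity involved, the sketch below uses the commonly cited definition of the founder equivalent, f_e = 1 / sum(p_i^2), where p_i is the expected proportion of the reference population's genes that descend from founder i. This is not PyPedal's code. The function names, the dict layout, and the assumption that every animal has either both parents known or both unknown are all made up for this example.

```python
def founder_contributions(pedigree, reference):
    """Expected genetic contribution of each founder to a reference group.

    pedigree: dict mapping animal -> (sire, dam); founders map to (None, None).
    reference: list of animals that make up the reference population.
    """
    cache = {}

    def contributions(animal):
        if animal in cache:
            return cache[animal]
        sire, dam = pedigree[animal]
        if sire is None and dam is None:
            result = {animal: 1.0}  # a founder carries only its own genes
        else:
            result = {}
            for parent in (sire, dam):
                if parent is None:
                    continue
                # Each parent passes on half of what it received.
                for founder, share in contributions(parent).items():
                    result[founder] = result.get(founder, 0.0) + 0.5 * share
        cache[animal] = result
        return result

    totals = {}
    for animal in reference:
        for founder, share in contributions(animal).items():
            totals[founder] = totals.get(founder, 0.0) + share / len(reference)
    return totals


def effective_founders(pedigree, reference):
    """Founder equivalents: the reciprocal of the sum of squared contributions."""
    p = founder_contributions(pedigree, reference)
    return 1.0 / sum(share ** 2 for share in p.values())


toy = {1: (None, None), 2: (None, None), 3: (None, None), 4: (1, 2), 5: (4, 3)}
f_e = effective_founders(toy, [5])
print(f_e)
```

In the toy pedigree the three founders contribute 0.25, 0.25, and 0.5 to animal 5, so the unequal contributions give fewer than three effective founders.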
common_ancestors() returns a list of the ancestors that two animals share in common.
descendants() uses pedigree metadata to walk a pedigree and return a list of all of the descendants of a given animal.
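Both lookups amount to graph traversals over the pedigree. The sketch below shows one way to do them in plain Python; it does not use PyPedal objects or metadata, and the names all_ancestors(), shared_ancestors(), and all_descendants() as well as the dict layout are hypothetical.

```python
def all_ancestors(pedigree, animal):
    """Return the set of all known ancestors of an animal.

    pedigree: dict mapping animal -> (sire, dam), None for unknown parents.
    """
    found = set()
    stack = [animal]
    while stack:
        current = stack.pop()
        for parent in pedigree.get(current, (None, None)):
            if parent is not None and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def shared_ancestors(pedigree, anim_a, anim_b):
    """Ancestors that two animals have in common, as a sorted list."""
    return sorted(all_ancestors(pedigree, anim_a) & all_ancestors(pedigree, anim_b))


def all_descendants(pedigree, animal):
    """Walk from parent to offspring and collect every descendant."""
    # Invert the pedigree once: parent -> list of offspring.
    offspring = {}
    for child, parents in pedigree.items():
        for parent in parents:
            if parent is not None:
                offspring.setdefault(parent, []).append(child)
    found = set()
    stack = [animal]
    while stack:
        current = stack.pop()
        for child in offspring.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return sorted(found)


toy = {1: (None, None), 2: (None, None), 3: (1, 2),
       4: (1, 2), 5: (3, 4), 6: (3, None)}
print(shared_ancestors(toy, 5, 6))  # [1, 2, 3]
print(all_descendants(toy, 1))      # [3, 4, 5, 6]
```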
effective_founder_genomes() simulates the random segregation of founder alleles through a pedigree. At present only two alleles are simulated for each founder. Summary statistics are computed on the most recent generation.
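A gene-dropping simulation of this kind can be sketched as follows. This is an illustrative stand-in rather than PyPedal's routine: the function names, the allele labels, the choice of summary statistic (mean number of distinct founder alleles surviving in a reference group), and the assumption that every non-founder has both parents known are all made up for the example.

```python
import random


def gene_drop(pedigree, rng):
    """Drop two unique alleles per founder through the pedigree once.

    pedigree: list of (animal, sire, dam) tuples, parents before offspring;
    founders have (None, None) and all other animals have both parents known.
    """
    genotypes = {}
    for animal, sire, dam in pedigree:
        if sire is None and dam is None:
            # Label the founder's two alleles so they can be traced.
            genotypes[animal] = (f"{animal}_1", f"{animal}_2")
        else:
            # Mendelian sampling: one random allele from each parent.
            genotypes[animal] = (rng.choice(genotypes[sire]),
                                 rng.choice(genotypes[dam]))
    return genotypes


def mean_surviving_alleles(pedigree, reference, replicates=1000, seed=0):
    """Average number of distinct founder alleles present in the reference group."""
    rng = random.Random(seed)
    total = 0
    for _ in range(replicates):
        genotypes = gene_drop(pedigree, rng)
        alive = {allele for animal in reference for allele in genotypes[animal]}
        total += len(alive)
    return total / replicates


toy = [(1, None, None), (2, None, None), (3, 1, 2), (4, 1, 2)]
survivors = mean_surviving_alleles(toy, reference=[3, 4])
print(survivors)
```

Passing a seeded random.Random instance keeps the replicates reproducible without touching the global random state.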
effective_founders_lacy() calculates the number of effective founders in a pedigree using the exact method of Lacy. This variant of a_effective_founders_lacy() is designed for larger pedigrees: it forms "familywise" relationship matrices rather than a single "populationwise" relationship matrix.
a_fast_coefficients() writes population average coefficients of inbreeding and relationship to a file, as well as individual animal IDs and coefficients of inbreeding. It returns a list of non-zero individual CoI.
founder_descendants() returns a dictionary containing a list of descendants of each founder in the pedigree.
generation_intervals() computes the average age of parents at the time of birth of their first (oldest) offspring. This implies that selection decisions are made at the time of birth of the first offspring. Average ages are computed for each of four paths: sire-son, sire-daughter, dam-son, and dam-daughter. An overall mean is computed as well. IMPORTANT: if you do not provide birthyears in your pedigree file, the returned dictionary will contain only zeroes, because a default birthyear (1900) is assigned to every animal in the pedigree when none is provided.
generation_intervals_all() computes the average age of parents at the time of birth of their offspring. The computation is made using birth years for all known offspring of sires and dams, which implies discrete generations. Average ages are computed for each of four paths: sire-son, sire-daughter, dam-son, and dam-daughter. An overall mean is computed as well. IMPORTANT: if you do not provide birthyears in your pedigree file, the returned dictionary will contain only zeroes, because a default birthyear (1900) is assigned to every animal in the pedigree when none is provided.
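The four-path averaging over all offspring can be sketched in a few lines. The record layout (keys sire, dam, sex, birthyear), the path labels, and the function name are assumptions for this example and do not mirror PyPedal's data structures. The second call shows why a pedigree without birthyears yields only zeroes.

```python
from collections import defaultdict


def parent_ages_by_path(animals):
    """Mean age of parents at the birth of their offspring, by path.

    animals: dict mapping id -> dict with keys 'sire', 'dam',
    'sex' ('m' or 'f') and 'birthyear' (assumed layout).
    """
    ages = defaultdict(list)
    for record in animals.values():
        child = "son" if record["sex"] == "m" else "daughter"
        for role in ("sire", "dam"):
            parent = animals.get(record[role])  # None when the parent is unknown
            if parent is not None:
                age = record["birthyear"] - parent["birthyear"]
                ages[f"{role}-{child}"].append(age)
    means = {path: sum(values) / len(values) for path, values in ages.items()}
    all_ages = [age for values in ages.values() for age in values]
    means["overall"] = sum(all_ages) / len(all_ages) if all_ages else 0.0
    return means


herd = {
    1: {"sire": None, "dam": None, "sex": "m", "birthyear": 2000},
    2: {"sire": None, "dam": None, "sex": "f", "birthyear": 2001},
    3: {"sire": 1, "dam": 2, "sex": "m", "birthyear": 2004},
    4: {"sire": 1, "dam": 2, "sex": "f", "birthyear": 2006},
}
intervals = parent_ages_by_path(herd)
print(intervals)

# With the default birthyear on every animal, every difference is zero.
no_years = {k: dict(v, birthyear=1900) for k, v in herd.items()}
print(parent_ages_by_path(no_years))
```

Restricting each parent to its oldest offspring before averaging would give the first-offspring variant described for generation_intervals().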
mating_coi() returns the coefficient of inbreeding of offspring of a mating between two animals, anim_a and anim_b.
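The inbreeding coefficient of a prospective offspring equals the coefficient of kinship between its parents, which can be computed recursively. The sketch below illustrates that identity and is not PyPedal's implementation; the function names and the assumption that animal IDs are integers with parents always numbered lower than their offspring are made up for the example.

```python
def kinship(pedigree, a, b):
    """Coefficient of kinship between animals a and b.

    pedigree: dict mapping animal -> (sire, dam), None for unknown parents.
    Assumes integer IDs where parents always have smaller IDs than offspring.
    """
    if a is None or b is None:
        return 0.0  # an unknown animal is treated as unrelated
    if a == b:
        sire, dam = pedigree[a]
        return 0.5 * (1.0 + kinship(pedigree, sire, dam))
    if a < b:
        a, b = b, a  # make a the younger animal so recursion goes back in time
    sire, dam = pedigree[a]
    return 0.5 * (kinship(pedigree, sire, b) + kinship(pedigree, dam, b))


def offspring_inbreeding(pedigree, anim_a, anim_b):
    """Expected coefficient of inbreeding of an offspring of anim_a and anim_b."""
    return kinship(pedigree, anim_a, anim_b)


toy = {1: (None, None), 2: (None, None), 3: (1, 2), 4: (1, 2)}
print(offspring_inbreeding(toy, 3, 4))  # full sibs: 0.25
print(offspring_inbreeding(toy, 1, 3))  # parent and offspring: 0.25
print(offspring_inbreeding(toy, 1, 2))  # unrelated founders: 0.0
```

The plain recursion revisits the same pairs repeatedly, so memoization would be needed for anything beyond small pedigrees.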
min_max_f() takes a pedigree and returns a list of the individuals with the n largest and n smallest coefficients of inbreeding. Individuals with CoI of zero are not included.
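The selection step can be sketched with the standard library's heapq module. The function name, the input dict of coefficients, and the return layout are assumptions for this example, not the signature of min_max_f().

```python
import heapq


def extreme_coi(coi, n):
    """Return the n largest and n smallest non-zero coefficients of inbreeding.

    coi: dict mapping animal -> coefficient of inbreeding (assumed layout).
    Returns two lists of (animal, coefficient) pairs.
    """
    # Animals with a coefficient of zero are excluded from both lists.
    inbred = [(animal, f) for animal, f in coi.items() if f > 0.0]
    largest = heapq.nlargest(n, inbred, key=lambda item: item[1])
    smallest = heapq.nsmallest(n, inbred, key=lambda item: item[1])
    return largest, smallest


coi = {1: 0.0, 2: 0.25, 3: 0.125, 4: 0.0625, 5: 0.0}
largest, smallest = extreme_coi(coi, 2)
print(largest)   # [(2, 0.25), (3, 0.125)]
print(smallest)  # [(4, 0.0625), (3, 0.125)]
```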
num_equiv_gens() computes the number of equivalent generations as the sum of (1/2)^n, where n is the number of generations separating an individual and each of its known ancestors.
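The sum of (1/2)^n over known ancestors has a simple recursive form: each known parent contributes 1/2 plus half of its own total. The sketch below shows that recursion on a plain dict pedigree; the function name and data layout are assumptions for this example.

```python
def equivalent_generations(pedigree, animal):
    """Sum of (1/2)**n over all known ancestors, n generations back.

    pedigree: dict mapping animal -> (sire, dam), None for unknown parents.
    """
    total = 0.0
    for parent in pedigree.get(animal, (None, None)):
        if parent is not None:
            # The parent itself counts (1/2)**1; its ancestors are one
            # generation further from this animal, hence the extra 1/2.
            total += 0.5 * (1.0 + equivalent_generations(pedigree, parent))
    return total


# Animal 7 has both parents and all four grandparents known.
toy = {1: (None, None), 2: (None, None), 3: (None, None), 4: (None, None),
       5: (1, 2), 6: (3, 4), 7: (5, 6), 8: (5, None)}
print(equivalent_generations(toy, 7))  # 2 * 1/2 + 4 * 1/4 = 2.0
print(equivalent_generations(toy, 8))  # 1/2 + 2 * 1/4 = 1.0
```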
num_traced_gens() computes the number of generations separating offspring from the oldest known ancestor in each selection path. Ancestors with unknown parents are assigned to generation 0. See: Valera, M., Molina, A., Gutierrez, J. P., Gomez, J., and Goyache, F. 2005. Pedigree analysis in the Andalusian horse: population structure, genetic variability and the influence of the Carthusian strain. Livestock Production Science. 95:57-66.
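One reading of this definition is the length of the longest chain of known parents behind an animal, with animals of unknown parentage at generation 0. The sketch below implements that reading only; it takes the longest path overall rather than reporting each selection path separately, and the function name and dict layout are assumptions.

```python
def max_traced_generations(pedigree, animal):
    """Generations between an animal and its most distant known ancestor.

    pedigree: dict mapping animal -> (sire, dam), None for unknown parents.
    """
    known = [p for p in pedigree.get(animal, (None, None)) if p is not None]
    if not known:
        return 0  # no known parents: generation 0
    return 1 + max(max_traced_generations(pedigree, p) for p in known)


toy = {1: (None, None), 2: (None, None), 3: (1, 2), 4: (3, None), 5: (4, 1)}
print(max_traced_generations(toy, 1))  # 0
print(max_traced_generations(toy, 5))  # 3, via 5 -> 4 -> 3 -> 1
```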
partial_inbreeding() computes the number of equivalent generations as the sum of (1/2)^n, where n is the number of generations separating an individual and each of its known ancestors.
pedigree_completeness() computes the proportion of known ancestors in the pedigree of each animal in the population for a user-determined number of generations. The mean pedigree completeness over all animals, and over all animals that are not founders, is also computed as a summary statistic.
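As a sketch of the idea, completeness over g generations can be taken as the number of filled ancestor slots divided by the 2^(g+1) - 2 slots a complete pedigree would have. The exact formula PyPedal uses may differ; the function name, the slot-counting convention (an ancestor that appears twice is counted twice), and the dict layout are assumptions for this example.

```python
def completeness(pedigree, animal, generations):
    """Proportion of ancestor slots that are filled within a number of generations.

    pedigree: dict mapping animal -> (sire, dam), None for unknown parents.
    """
    def known_slots(current, depth):
        if depth == 0:
            return 0
        count = 0
        for parent in pedigree.get(current, (None, None)):
            if parent is not None:
                count += 1 + known_slots(parent, depth - 1)
        return count

    # 2 parents + 4 grandparents + ... + 2**generations ancestors.
    possible = 2 ** (generations + 1) - 2
    return known_slots(animal, generations) / possible if possible else 0.0


toy = {1: (None, None), 2: (None, None), 3: (None, None), 4: (None, None),
       5: (1, 2), 6: (3, 4), 7: (5, 6)}
print(completeness(toy, 7, 2))  # 1.0: both parents and all grandparents known
print(completeness(toy, 5, 2))  # 2 of 6 slots filled

non_founders = [a for a, (s, d) in toy.items() if s is not None or d is not None]
mean_non_founders = sum(completeness(toy, a, 2) for a in non_founders) / len(non_founders)
```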
related_animals() returns a list of the ancestors of an animal.
relationship() returns the coefficient of relationship for two animals, anim_a and anim_b.
theoretical_ne_from_metadata() computes the theoretical effective population size based on the number of sires and dams contained in a pedigree metadata object. Writes results to an output file.
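A common textbook formula for effective population size from the numbers of breeding males and females is Wright's N_e = 4 * N_m * N_f / (N_m + N_f). The sketch below assumes that formula; the source does not state which formula theoretical_ne_from_metadata() applies, and the function name and arguments here are hypothetical.

```python
def theoretical_ne(n_sires, n_dams):
    """Wright's effective population size for unequal numbers of sires and dams."""
    if n_sires + n_dams == 0:
        return 0.0  # empty pedigree: avoid dividing by zero
    return 4.0 * n_sires * n_dams / (n_sires + n_dams)


print(theoretical_ne(50, 50))   # 100.0
print(theoretical_ne(10, 100))  # about 36.4: few sires limit N_e
```

With equal numbers of sires and dams N_e equals the census number of parents; any imbalance pulls it toward four times the size of the rarer sex.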