What do their genes contribute? | Animal Genetics Training Resources

Decisions about which breeds to conserve should be based on objective criteria and should consider both current utility and the need to maintain maximum genetic diversity in the gene pool of the species (Simianer et al. 2003). The latter can be achieved by conserving the subset of breeds in a species that shows the most genetic differentiation, including breeds that contain unique alleles or allele combinations, while at the same time meeting the production needs of farmers. Genetic analysis can help to identify genetic duplicates and to separate breeds on the basis of genetic distinctiveness. Pair-wise genetic distances estimated among all the breeds/strains/populations of a species, and the single phylogeny constructed from these distances that best represents all the relationships among the breeds, will aid objective and rational decision making in the choice of breeds for conservation and breed improvement, including evaluation studies to determine comparative genetic merit.
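The idea of pair-wise genetic distances can be illustrated with a minimal sketch. It assumes Nei's standard genetic distance as the distance measure (the text above does not prescribe one), and the breed names, loci and allele frequencies are hypothetical values chosen only for illustration.

```python
import math
from itertools import combinations

# Hypothetical allele frequencies: breed -> locus -> {allele: frequency}.
FREQS = {
    "BreedA": {"loc1": {"a": 0.8, "b": 0.2}, "loc2": {"a": 0.5, "b": 0.5}},
    "BreedB": {"loc1": {"a": 0.8, "b": 0.2}, "loc2": {"a": 0.5, "b": 0.5}},
    "BreedC": {"loc1": {"a": 0.1, "b": 0.9}, "loc2": {"a": 0.9, "b": 0.1}},
}

def nei_distance(x, y):
    """Nei's standard genetic distance: D = -ln(Jxy / sqrt(Jx * Jy))."""
    jx = jy = jxy = 0.0
    for locus in x:
        for allele in set(x[locus]) | set(y[locus]):
            px = x[locus].get(allele, 0.0)
            py = y[locus].get(allele, 0.0)
            jx += px * px
            jy += py * py
            jxy += px * py
    # Sums over loci are used instead of means; the 1/L factors cancel in the ratio.
    return -math.log(jxy / math.sqrt(jx * jy))

def pairwise_distances(freqs):
    """Distance for every unordered pair of breeds."""
    return {(a, b): nei_distance(freqs[a], freqs[b])
            for a, b in combinations(sorted(freqs), 2)}

distances = pairwise_distances(FREQS)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this made-up data set, BreedA and BreedB have identical allele frequencies, so their distance is zero and they would be flagged as candidate genetic duplicates, while BreedC is clearly distinct from both. A matrix of such distances is the usual input for building a phylogeny.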

How important are breeds/strains within species?

The process of domestication of animals involved a selection of only some 40 out of the estimated 40 thousand or so species of vertebrates. These represent the ancestors of the domestic species available today. The selected species accompanied human populations across the earth into a variety of new environments, gradually evolving to adapt to a wide range of environmental conditions.

The next stage in the evolution of domestic animal breeding was the development of controlled mating and human selection of preferred animal types. Breeds began to be formally recognised in Europe in the 18th century. Superior animals were identified and then herd registers and herd books were created for them (see Section 1.1, this module).

Although no compelling quantitative figures are available, it is estimated that 50% of the total genetic variation among domestic AnGR is at species level and the remaining 50% is accounted for by variation among breeds within species. There is no estimate of the extent of genetic diversity within breeds due to variation among strains. This is likely to vary considerably given the wide range in the number of strains available for the different breeds of all species. From a standpoint of utilisation and conservation, breeds and possibly strains are generally more important than species. It is the differentiation of species into breeds that has allowed livestock production to exist in many of the unique/special environments of the world (see also Module 1, Section 5). Moreover, livestock species (e.g. cattle, sheep, goats and pigs) are not likely to become extinct, but certain breeds are. For example, the swamp buffalo breeds that used to be important for providing draft power in Thailand, Lao and many other South-East Asian countries have declined and are now increasingly threatened with extinction due to increased mechanisation of paddy rice cultivation. Losing a particular breed of a species adapted to live in an environment in which no other breed of that species can live could have serious implications for human food and livelihood security [CS 1.1 by Mpofu and Rege]; [CS 1.24 by Dempfle and Jaitner]; [CS 1.37 by Kharel et al]. Also important is the fact that it is the differentiation of species into breeds that has produced a wide range of populations, each serving a specific set of purposes (milk, meat, traction, pack, eggs, wool etc.) for society. In the recent past, however, crossbreeding programmes in which total breed replacement was not the goal have created rather than reduced genetic variation (Madalena 2005). Such new variations have been exploited to varying extents.
There are examples where positive impacts have been realised from breed combinations [Sunandini-Kerala]; [Boer goats] and [Dorper sheep]; [Mafriwal cattle].

Measuring diversity within breeds

As descriptive measures of genetic and environmental variation, it is more convenient to use what are called genetic parameters or, strictly speaking, phenotypic, genetic and environmental parameters, which are all ratios of variances and covariances. Genetic diversity has been defined above as the heritable variation within and between populations. Heritability, a quantitative measure of heredity, is thus an important parameter not only in understanding genetic diversity in a population but also in utilising that diversity.

Heritability is defined as the ratio of (additive) genetic variance to the phenotypic variance and is an indicator of the proportion of the observable variation in a trait in the parental generation which will be passed to the offspring generation. Other important parameters in this context are repeatability and the phenotypic, genetic and environmental correlations (see Module 4, Section 5.1).
Repeatability is defined as the ratio of the variance due to animal effects (both genetic and non-genetic) to the total phenotypic (observable) variance. It can also be interpreted as the fraction of an animal's deviation from the mean in one record that is expected to reappear in another record of the same animal. Repeatability is an important measure in that it indicates the reliability of an existing record as an indicator of possible future records on the same individual.
Phenotypic correlation measures observable association between two traits and is calculated as the ratio of the phenotypic covariance to the product of the phenotypic standard deviations of the two traits.
Genetic correlation is a quantitative measure of the association, at the genetic level, between two traits. It arises from the fact that some genes affect more than one trait. It is estimated as the ratio of the genetic covariance between two traits to the product of the genetic standard deviations of the two traits.
Environmental correlation. Just as there are two possible causes (environmental and genetic) of differences between individuals in expressed phenotype of one trait, there are also two causes of correlation between two traits or characters. Environmental correlation measures the association between traits due to environmental factors and is calculated in a similar manner as the genetic correlation but using environmental covariance and standard deviations.
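
The ratios defined above can be written out as a short sketch. The function names and all variance and covariance values are hypothetical and serve only to show how each parameter is formed; in practice the components would be estimated from performance and pedigree records.

```python
import math

def heritability(var_additive, var_phenotypic):
    """Additive genetic variance as a fraction of phenotypic variance."""
    return var_additive / var_phenotypic

def repeatability(var_animal, var_phenotypic):
    """Variance due to animal effects (genetic and non-genetic) over phenotypic variance."""
    return var_animal / var_phenotypic

def correlation(covariance, sd_trait1, sd_trait2):
    """Covariance divided by the product of the two standard deviations."""
    return covariance / (sd_trait1 * sd_trait2)

# Hypothetical variance components for one trait (units squared).
var_additive = 300.0        # additive genetic variance
var_other_animal = 150.0    # other animal effects (non-additive genetic, permanent environment)
var_temporary_env = 550.0   # temporary environmental variance
var_phenotypic = var_additive + var_other_animal + var_temporary_env

h2 = heritability(var_additive, var_phenotypic)
rep = repeatability(var_additive + var_other_animal, var_phenotypic)

# Hypothetical (co)variances between two traits.
r_phenotypic = correlation(120.0, math.sqrt(1000.0), math.sqrt(400.0))
r_genetic = correlation(90.0, math.sqrt(300.0), math.sqrt(100.0))

print(h2, rep, round(r_phenotypic, 3), round(r_genetic, 3))
```

Because repeatability includes all animal effects and heritability only the additive genetic part, repeatability sets an upper limit on heritability for the same trait.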

These parameters are very important in animal breeding [CS 1.6 by Mpofu and Rege]; [CS 1.9 by Aboagye]. Indeed, they underpin both the understanding and utilisation of genetic diversity in populations [Manual exercises - Quantitative characters]; [Manual exercises - Genetic gain].

Measuring genetic diversity within and between breeds/strains

The field of molecular biology, particularly the application of molecular markers to study genetic diversity, has evolved very rapidly since the mid-1960s. The dominance of protein electrophoretic approaches to population genetics and evolutionary biology was, in the late 1970s, replaced by DNA analysis, primarily through the use of restriction enzymes, and in the 1980s by mitochondrial DNA analyses and DNA fingerprinting approaches [CS 1.38 by Ofelia]. More recently, the introduction of PCR-mediated (polymerase chain reaction-mediated) DNA genotyping/sequencing has provided the first rapid and easy access to the ultimate genetic data.

The various state-of-the-art analytical (statistical) methodologies available for assessment of genetic diversity using molecular data are described in Module 4, Section 7. Although DNA-based technologies are now the methods of choice, it would be a mistake to conclude that DNA markers provide the ultimate solution. Several alternative assays, such as protein/allozyme polymorphisms, remain tremendously useful, especially in developing countries, because of their utility, ease of use, cost, the amount of genetic information accessed and the simplicity of data interpretation. The role or potential of these alternative approaches in animal genetic diversity studies should not be underplayed.

What biochemical or DNA-based molecular techniques are presently available?

Protein polymorphisms: Variation in proteins reflects changes in the genes that code for them. This has been widely used in studying genetic diversity (Hames and Rickwood 1990). The two approaches applied are protein electrophoresis and, to a small extent, protein immunology. The principle behind studies of electrophoretic mobility of enzymes (and other proteins) is that mobility across gels can be related to differences in allelic groups responsible for amino acid changes in the protein. Such amino acid substitutions are, in turn, a direct consequence of gene mutations. Thus, we can quantify the amount of variation within and between populations by measuring frequencies of different variants in groups of individuals.
Protein immunology methods rely on the antigenic properties of proteins. When a protein from population 'A' is injected into a suitable host population 'B', this antigen elicits the production of antibodies with high specificity for antigenic sites on the injected protein. The difference in antigen-antibody re-activities in tests involving homologous versus heterologous antigens provides a measure of the genetic relationship, usually expressed as immunological distance (ID) units between these proteins for the two animal populations. Protein immunology methods are not used as routine procedures for diversity assessment. An example of its use is the blood group protein immunology studies by Baker and Manwell (1991) which demonstrated close genetic relationships among breeds of European Bos taurus cattle and their genetic separation from the humped B. indicus cattle of Asia and Africa.
Restriction fragment length polymorphism (RFLP) analyses involve cutting double-stranded DNA with one or more restriction endonucleases, enzymes that cut DNA at sites containing specific base sequences, i.e. restriction sites. The cutting process produces DNA fragments, the restriction fragments, which are then separated, according to molecular weight, by electrophoresis. Differences among individuals in 'digestion profiles' (the banding pattern on the gel) are generated by the presence or absence of restriction sites resulting from mutations (e.g. base substitutions, deletions/insertions and rearrangements within the restriction site). Avise (1994) described the analyses in detail. The resulting bands are then scored for individuals and these generate the 'frequency data' analysed for genetic diversity. The major limitation of nuclear DNA RFLPs as genetic markers is their low degree of heterozygosity. Most tend to be diallelic and hence not highly informative for diversity assessment. Another disadvantage of these markers is their lack of resolving power when dealing with closely related populations such as breeds or strains. This is because the polymorphisms result from mutation events at the restriction sites, and the mutation rates are extremely low (10^-7 to 10^-9 per generation).
Random amplified polymorphic DNA (RAPDs) are considered the easiest group of DNA polymorphisms to detect and are based on the PCR amplification of random DNA segments with single, short (10 base) primers of arbitrary sequence (Williams et al. 1990). The resulting highly polymorphic pattern of bands is revealed by agarose electrophoresis, with each random primer producing a different pattern of bands. The bands are scored to generate data on individuals [CS 1.11 by Gwakisa]. In addition to its potential application in genetic distance estimation, RAPD can be used as a tool for mapping a trait (Michelmore et al. 1991). This can be important in searching for markers associated with quantitative characters (see Module 4, Section 5). The limitations of RAPD are the ambiguity of the resulting fingerprint patterns and the fact that heterozygotes cannot be distinguished from homozygotes because the markers are dominantly inherited. In addition, how the observed genetic variation is generated is not fully understood, making reconstruction of evolutionary histories from RAPD data difficult.
Mitochondrial DNA (mtDNA) is a highly conserved molecule whose genes are organised in a very compact manner, with some genes actually overlapping, e.g. in the bovine (Anderson et al. 1982). The only non-coding region, apart from small numbers of interspersed bases, is the D-loop or control region. Previous studies (e.g. Hausworth et al. 1984) indicated that the D-loop evolves at a higher rate than the rest of mtDNA. This has been used to support the argument that a sequence comparison of this region should be most efficient at detecting differences between individuals at the mtDNA level (e.g. Cann et al. 1987). In addition, mtDNA has a high copy number per cell and a high mutation rate. Moreover, the fact that mtDNA is maternally inherited is of practical importance in the field: to ensure that only non-related mtDNA are sampled, one needs to only worry about the female side of the pedigree. Fortunately, information on the dam side is relatively easily available at field level and is usually reliable.
Y-chromosome specific markers. The Y-chromosome is a large linear molecule whose sequence is still largely unknown. Unlike mitochondrial DNA, it is located in the nucleus and is paternally inherited. There are two major types of Y-chromosome in cattle. The typical B. taurus type is sub-metacentric and the B. indicus type is acrocentric (Kieffer and Cartwright 1968). Just as mitochondrial DNA is useful in tracing female-mediated genetic relationships between populations, markers on the Y-chromosome provide a means of studying male-mediated genetic introgression. For example, Hanotte et al. (1997) identified a polymorphic microsatellite marker in cattle. This locus has two alleles, one specific to taurine cattle and the other specific to indicine cattle. This specificity has been used to investigate the history of and genetic relationships among African cattle breeds (Hanotte et al. 2000).
Microsatellites are segments of genomic DNA which contain short tandem repeats of 2-6 bp nucleotides. They are now considered to be the markers of choice when trying to discriminate between closely related populations, e.g. breeds or strains (MacHugh et al. 1994). Microsatellite markers have several additional advantages which make them ideal for genetic characterisation. Microsatellite polymorphism refers to the differences in allele sizes due to variation in the number of repeats of base sequences that are detected by gel electrophoresis. These are scored on individual samples and provide the frequency data that are analysed to assess genetic diversity [CS 1.10 by Okomo]. The advantages include: ease with which they can be identified and sequences of flanking regions determined as a prelude to primer design; the analysis procedure requires only a very small amount of DNA; microsatellite polymorphisms can be described numerically, facilitating computerised data handling and hence automation; and ease of sharing information on the relatively short primers between collaborating laboratories.
Amplified fragment length polymorphisms (AFLPs) (Vos et al. 1995) are based on the detection of restriction fragments by PCR amplification. Genomic DNA is restricted with two different restriction endonucleases, and a subset of the resulting fragments is then amplified using a modified PCR and visualised using radioactivity, silver staining or fluorescent dyes for use with an automated sequencer. The main advantages of AFLPs for genetic diversity studies are that only small quantities of DNA are required, because the technique is based on the PCR, and that the fingerprint traces are highly reproducible and consist of many markers, allowing for greater discernment between closely related individuals than other techniques including RAPDs and microsatellites. AFLPs are reliable, informative multilocus probes and provide high levels of resolution that allow delineation of complex genetic structures (Powell et al. 1996).
Single nucleotide polymorphisms (SNPs) are DNA sequence variations that occur when a single nucleotide (A, T, C or G) in the genome sequence is altered (Collins et al. 1997). For example a SNP might change the DNA sequence AAGGCTAA to ATGGCTAA. For a variation to be considered a SNP it must occur in at least 1% of the population. SNPs, which make up about 90% of all human genetic variation, occur every 100 to 300 bases along the 3 billion-base human genome (Taillon-Miller et al. 1998; Wang et al. 1998). Two of every three SNPs involve the replacement of cytosine (C) with thymine (T). SNPs can occur in both coding (gene) and non-coding regions of the genome.

SNPs offer several advantages over other types of DNA marker systems and are rapidly becoming the markers of choice for many applications in genome analysis due to their abundance (especially important in linkage disequilibrium based mapping approaches) and also because high throughput genotyping methods are being developed for their analysis. The additional advantage offered by this approach lies in the phylogenetic information gathered through sequence variation analysis that allows drawing inferences on allele and population history that cannot be gathered with any of the other marker systems available. SNPs are also evolutionarily stable (i.e. do not change much from generation to generation) making them easier to follow in population studies.
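
The SNP definition given above can be made concrete with a small sketch. It compares the two example sequences from the text and applies the 1% frequency criterion to a set of allele calls. The function names and the sample of 200 allele calls are hypothetical, and the sequences are assumed to be already aligned and of equal length.

```python
from collections import Counter

def snp_positions(ref, alt):
    """Return (0-based position, ref base, alt base) for each single-base difference."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt)) if r != a]

def minor_allele_frequency(alleles):
    """Frequency of the second most common allele (0.0 if the site is monomorphic)."""
    counts = sorted(Counter(alleles).values(), reverse=True)
    if len(counts) < 2:
        return 0.0
    return counts[1] / sum(counts)

def is_snp(alleles, threshold=0.01):
    """Apply the 'at least 1% of the population' criterion from the text."""
    return minor_allele_frequency(alleles) >= threshold

# The example sequences from the text differ at one position.
differences = snp_positions("AAGGCTAA", "ATGGCTAA")
print(differences)

# Hypothetical allele calls at that position in 100 diploid animals (200 alleles).
calls = ["A"] * 197 + ["T"] * 3
print(minor_allele_frequency(calls), is_snp(calls))
```

With three T alleles out of 200, the minor allele frequency is 1.5%, so this hypothetical site would qualify as a SNP under the 1% criterion.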

Module 4, Section 7 describes how data obtained from biochemical and DNA-level molecular genetic studies can be analysed to provide estimates of diversity, including relationships among breeds or strains, and within populations (e.g. measures of heterozygosity and extent of inbreeding).
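
As a rough illustration of the within-population measures mentioned above, the sketch below computes allele frequencies, observed and expected heterozygosity and a simple inbreeding coefficient, F = 1 - Ho/He, at a single locus. The genotypes are hypothetical microsatellite allele sizes (in bp), and the estimator is a standard textbook form assumed here for illustration, not necessarily the method described in Module 4.

```python
from collections import Counter

# Hypothetical genotypes at one microsatellite locus: allele sizes in bp, one pair per animal.
GENOTYPES = [
    (120, 120), (120, 124), (124, 124), (120, 124),
    (120, 120), (124, 128), (128, 128), (120, 128),
]

def allele_frequencies(genotypes):
    """Frequency of each allele among all gene copies sampled."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Proportion of animals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)

def expected_heterozygosity(genotypes):
    """Expected heterozygosity under random mating: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())

def inbreeding_coefficient(genotypes):
    """Deficit of heterozygotes relative to expectation: F = 1 - Ho / He."""
    return 1.0 - observed_heterozygosity(genotypes) / expected_heterozygosity(genotypes)

ho = observed_heterozygosity(GENOTYPES)
he = expected_heterozygosity(GENOTYPES)
f = inbreeding_coefficient(GENOTYPES)
print(ho, he, round(f, 3))
```

A positive F indicates fewer heterozygotes than expected under random mating, which is one possible sign of inbreeding. Real studies average over many loci and much larger samples.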

Measuring the influence of the environment

Most of the economically important traits in livestock species are under the control of many genes (at many loci). Such traits are combined expressions of many different physiological systems, each contributing to the metric value additively or through interaction with other physiological mechanisms. If we take milk production as an example, the observable value is the overall expression of several 'macro-functions' such as appetite, feed intake, digestion efficiency, efficiency of utilisation of body reserves, udder function and volume, health status, ability to handle other environmental stresses etc. The list can be long. In addition, behind each macro-function, there are chains of enzymatic, hormonal and other biochemical reactions, each regulated by gene products. Thus, the number of genes involved in one trait is usually or likely to be very large. For such complex quantitative traits, the different genotypes cannot be distinguished on the basis of the phenotype (production record, measurement or appearance) of the individual. An important complicating factor is that environmental effects modify the expression of such characters and therefore contribute to the phenotypic variation among individuals. For example, the milk production by an individual is influenced by such factors as quality and quantity of feed, housing, effect of disease etc.

To the extent that environmental conditions are affected by climatic conditions, season becomes an important factor influencing animal performance. In the tropics, both quality and quantity of feeds and disease and parasite burdens can fluctuate considerably between seasons in response to differences in rainfall, temperature, humidity etc. These have important implications for housing and overall animal management and for herd/flock structures. In turn, management (housing, feeding, health care etc.) considerably influences the expression of quantitative traits.

To handle the complexity of these traits, quantitative genetic theory provides us with powerful tools (see Module 4, Sections 4 and 5) for analysing quantitative variation so that the results can be used in practical animal breeding. There are several analytical methods available. All of these are based on the fact that, no matter how complex the underlying causal mechanisms are for any trait, the expressed phenotype (P) can be attributed to two main sources, the genetic (G) and the environmental (E) components. (In complex models, these components are divided into sub-components [Manual exercises - Quantitative characters]; [Computer exercises - Prediction of breeding values] and interactions among components are also included.)
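
The decomposition described above can be written out as follows. The notation is the standard textbook form and is an assumption here, since the text itself gives only the symbols P, G and E. The split of G into additive (A), dominance (D) and epistatic (I) effects is shown as one common example of sub-components.

```latex
% Basic model: phenotype as the sum of genetic and environmental components
P = G + E

% Corresponding variances (the covariance term vanishes if G and E are uncorrelated)
\sigma^2_P = \sigma^2_G + \sigma^2_E + 2\,\mathrm{cov}(G, E)

% One common subdivision of the genetic component
G = A + D + I

% Model extended with a genotype by environment interaction term
P = G + E + (G \times E)
```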

For a single trait in one environment, quantitative estimates of the causal (G and E) components, which are usually expressed in terms of variances, provide a good indication of the contribution of the environment relative to the total phenotype. The situation is more complicated for multiple traits and for one trait being evaluated in multiple environments. For quantitative estimates of a single trait in one environment, environmental correlation (Section 3.2, this module) provides a useful parameter, whereas for multiple traits and one trait being evaluated in multiple environments, the concept of genotype by environment interaction (G × E) is used. Where G × E exists, the breed or genotype with the best performance in a given trait in one environment will not give the best performance in another environment, or the extent of superiority will differ between environments. Such differences provide a framework for quantitative analysis [CS 1.39 by Okeyo and Baker]. An instructive approach to the analysis of G × E is to treat records of the same trait taken in different environments as representing different traits and to proceed to estimate genetic correlations between these traits (see Falconer and Mackay 1996). Existence of G × E will be indicated if the genetic correlation is low (Module 4, Section 5.3).
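
The approach of treating the same trait in two environments as two different traits can be sketched as follows. The sire names and genetic values are hypothetical, and the simple correlation between them is used only as a stand-in for a properly estimated genetic correlation, which in practice requires a multi-trait analysis of records and pedigree.

```python
import math

# Hypothetical genetic values (e.g. kg milk) of five sires in two environments.
SIRES = ["S1", "S2", "S3", "S4", "S5"]
GOOD_ENV = [500.0, 300.0, 100.0, -200.0, -400.0]
POOR_ENV = [-50.0, 120.0, 80.0, 20.0, -100.0]

def pearson(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranking(values, names):
    """Names ordered from the highest to the lowest value."""
    return [name for _, name in sorted(zip(values, names), reverse=True)]

r = pearson(GOOD_ENV, POOR_ENV)
rank_good = ranking(GOOD_ENV, SIRES)
rank_poor = ranking(POOR_ENV, SIRES)

print(round(r, 2))
print(rank_good)
print(rank_poor)
```

In this made-up example the correlation is well below 1 and the best sire in the good environment ranks fourth in the poor one, which is the kind of re-ranking that signals G × E.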

What is the significance of G × E? It has important implications in the development of breeding programmes (Module 3, Section 3.4); [CS 1.39 by Okeyo and Baker]; (see Module 4, Section 5.3). If selection is undertaken in good environments (in terms of feeding, health, housing or climatic stress), we need to know whether the genetic improvement achieved will be realised in a poor environment. Or should selection for adaptation to poor environments be made in similar environments? This is of direct relevance in stratified breeding systems where, for example, selection decisions are made on the basis of animal evaluations carried out on a few well-managed (commercial) farms or, in extreme cases, where selection is based on evaluations carried out in different countries and under different production systems (Ojango and Pollot 2002), [www.interbull.org].

On a more practical level, selection (performance tests) of breeding stock should be undertaken in environments similar to those in which their offspring are expected to be raised. Often, the failure to realise the full genetic potential of exotic temperate breeds exported to more stressful tropical environments or production systems is due to the failure of farmers and technical staff to fully recognise the importance of G × E. The most common example is the continued use of semen from high-producing North American Holstein-Friesian bulls to produce daughters in tropical farmers' herds where husbandry is generally inadequate. Many such examples exist in ill-designed exotic breed-based livestock development programmes in the tropics that are too sophisticated for the existing infrastructure.