After my last article, a friend pointed out that I had oversimplified the implications of genetic analysis. I think there’s still a lot of confusion on this topic, so I’d like to get right back at it!
You’ve probably read in a pop science magazine at some point that humans are about 98% similar in genetic make-up to chimpanzees. But what does this mean? To say that two organisms are “98% genetically similar” is not really saying anything at all. A statement like that raises the question: “how are they similar?” The article you are reading could be talking about chromosomal banding patterns, open reading frames, or expressed DNA, but most people take it to mean the whole shebang, a base pair by base pair account of similarity.
The whole shebang, what most people are interested in, is a lot to swallow. When we look at DNA in general, there are only four options as far as nucleotides go, and only two different base pairings: A-T and C-G. Like computer code written in only 1s and 0s, two stretches of DNA can look remarkably similar even if they serve different functions. The likelihood of running into some similar sequences between organisms is pretty good, even if they are not closely related. Since there are about 3 billion base pairs in the human genome, one person could be only 0.1% different from another and still have about 3 million dissimilarities in their sequence!
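To make the scale concrete, here is a quick back-of-the-envelope sketch in Python. The 3 billion figure is the rough genome size used above; the function name and the list of percentages are just illustrative choices.

```python
# Rough human genome size in base pairs (approximate figure from the text).
GENOME_SIZE_BP = 3_000_000_000

def differing_base_pairs(genome_size_bp, percent_different):
    """Number of base pairs that differ for a given percent difference."""
    return round(genome_size_bp * percent_different / 100)

# 0.1% is the person-to-person figure; 2% corresponds to "98% similar".
for pct in (0.1, 1.0, 2.0):
    count = differing_base_pairs(GENOME_SIZE_BP, pct)
    print(f"{pct}% different -> {count:,} base pairs")
```

Even a difference that sounds tiny as a percentage works out to millions of individual positions.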
So what do we count as a dissimilarity? If two organisms have a similar stretch of DNA but it falls on two different chromosomes in one species and on only one chromosome in another species (a common case of chromosomes splitting), do we count it as a difference? How about when a chromosome is duplicated? What about base pair deletions? How about when one part of a chromosome is inserted into another? These are all questions researchers must ask as they compare and contrast the genetic make-up of species.
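A toy example shows how much the answer depends on the counting rule. The sketch below assumes two short, made-up sequences that have already been aligned, with "-" marking a deleted stretch; the function name and the `count_gaps` switch are hypothetical, not taken from any particular genomics tool.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=True):
    """Percent of compared positions where the two sequences match.

    '-' marks a gap (an insertion or deletion). With count_gaps=True a
    gap counts as a difference; with count_gaps=False gapped positions
    are left out of the comparison entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        has_gap = "-" in (a, b)
        if has_gap and not count_gaps:
            continue  # ignore this position altogether
        compared += 1
        if not has_gap and a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Made-up aligned sequences: two deleted bases and one substitution.
seq_a = "ACGTACGTAC"
seq_b = "ACGT--GTAT"

print(percent_identity(seq_a, seq_b, count_gaps=True))   # 70.0
print(percent_identity(seq_a, seq_b, count_gaps=False))  # 87.5
```

The same pair of sequences is "70% similar" or "87.5% similar" depending only on whether the deletion counts as a difference.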
One of the main things we think about when we hear the words “genetically similar” is the genes themselves in the Mendelian sense, the expressed alleles (versions of a gene). A brother and sister may be 99.9% similar in terms of their total DNA, base pair by base pair, but, cross-over during meiosis aside, they are still likely to share only about 50% of their coding alleles. Often the differences that make multiple alleles of a gene possible are single nucleotide polymorphisms (SNPs), where just a small change can result in a novel phenotype. A change in one lil’ base pair may not seem like much in a genome of 3 billion base pairs, but it can alter a protein and change the phenotype. Don’t tell a person with color blindness or cystic fibrosis that a tiny change among billions of base pairs doesn’t matter! These diseases and others are often a result of small genetic mutations.
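Here is a minimal sketch of why one base can matter so much. It uses a four-entry excerpt of the standard genetic code (DNA codons on the coding strand); the `substitute` helper and the chosen codon are illustrative assumptions, not a model of any specific gene.

```python
# Tiny excerpt of the standard genetic code (DNA codons, coding strand).
CODON_TABLE = {
    "GAG": "Glu",   # glutamic acid
    "GAA": "Glu",   # glutamic acid (synonymous with GAG)
    "GTG": "Val",   # valine
    "TAG": "Stop",  # stop codon
}

def substitute(codon, position, new_base):
    """Return the codon with one base swapped at the given 0-based position."""
    return codon[:position] + new_base + codon[position + 1:]

original = "GAG"
variants = {
    "silent": substitute(original, 2, "A"),    # GAA, same amino acid
    "missense": substitute(original, 1, "T"),  # GTG, different amino acid
    "nonsense": substitute(original, 0, "T"),  # TAG, protein cut short
}

print(f"original {original} -> {CODON_TABLE[original]}")
for kind, codon in variants.items():
    print(f"{kind}: {codon} -> {CODON_TABLE[codon]}")
```

Three different single-base changes to the same codon give three very different outcomes: no change at all, a swapped amino acid, or a protein that stops early.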
The majority of DNA is non-coding, and it can harbor both similarities and dissimilarities. In areas like telomeres at the tips of chromosomes or centromeres in the middle, most animals are likely to be similar, as these are highly repetitive regions that exist more for the sake of the stability of the chromosome than anything else. Other areas of non-coding DNA, such as microsatellites and other VNTRs (variable number tandem repeats), are likely to be highly variable not just between species but sometimes even from person to person. Since the DNA in microsatellites is non-coding and not a structural necessity, there is little pressure to conserve these short repeating segments. Their differences add up over time as unchecked mutations build up. These unique segments can be detected through simple gel electrophoresis methods and so are often used in forensic cases or to help determine heredity.
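As a rough sketch of what makes microsatellites useful for telling people apart, the snippet below counts back-to-back copies of a repeat unit in two invented sequences. The sequences, the GATA repeat unit, and the function name are all hypothetical; real forensic work uses validated marker sets and lab protocols.

```python
import re

def longest_repeat_run(sequence, unit):
    """Largest number of back-to-back copies of `unit` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Invented example: same flanking DNA, different number of GATA repeats.
person_a = "TTGC" + "GATA" * 7 + "CCTA"
person_b = "TTGC" + "GATA" * 11 + "CCTA"

for name, seq in (("person A", person_a), ("person B", person_b)):
    repeats = longest_repeat_run(seq, "GATA")
    print(f"{name}: {repeats} repeats, fragment length {len(seq)} bp")
```

Because the repeat count changes the overall fragment length, the two samples would travel different distances on a gel, which is what makes the difference visible.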
When geneticists look at DNA to analyze the evolution of species and their relatedness, all of these factors (and many more!) are considered in the holistic field of genomics. From whole chromosomal structures to single nucleotide polymorphisms, genomics encompasses all of these features and tries to make sense of speciation.
Sources: Gibson and Muse. A Primer of Genome Science, 2nd Edition. 2004