Endogenous retroviruses (ERVs) are the remnants of retroviruses (RNA viruses) that infected a host and became integrated into its nuclear genome. When this infection and integration occurs in the germ line of an organism, the retroviral insertion is passed down to the organism's descendants. Using the theory of evolution, we can therefore predict that two species descended from a common ancestor should share identical ERV insertions, because those insertions would have been passed down from that ancestor. This prediction gives us a hypothesis we can test. We know that chimpanzees and humans are very closely related, and that they share a common ancestor. So if we compare the ERV insertions in the two species' genomes, we should see that most, if not all, of their ERV insertions are identical, which would necessitate common ancestry.
Scientists have actually conducted this exact study, and here are the results.
1) 99.9% of ERVs in humans are shared at identical loci (a locus is a specific location in a genome) with ERVs in the chimpanzee genome. This means fewer than 100 of the roughly 200,000 ERVs in humans are lineage-specific, and the rest must have been passed down from our common ancestor with chimpanzees.
2) When the ERV insertions at identical loci are examined, even the mutations within the shared ERVs are found to be identical, and just as with the distribution of ERVs, some shared mutations within a single shared ERV fall into nested hierarchies. (http://www.pnas.org/content/96/18/10254.full)
To take this data and add a little explanation to it:
We know how many ERVs humans and chimpanzees share from two lines of evidence: examination of indel variation, and whole-genome analysis.
Indels are insertions or deletions in the genome of an organism. When genome sequences are lined up for comparison, we can measure the amount of indel variation between them. We can measure this because, when the genomes of multiple species are aligned, an indel that appears in only one species' genome produces a gap in the alignment, whereas indels that are in identical locations in BOTH species' genomes leave no gaps.
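To make the gap idea concrete, here is a minimal sketch in Python. It assumes the two sequences have already been aligned by some other tool and that gaps are written as "-". The function name lineage_specific_gaps and the toy sequences are made up for illustration, not taken from any of the studies mentioned here. Note that a gap by itself only says the two genomes differ at that spot; it does not say whether one lineage gained the sequence or the other lost it.

```python
from itertools import groupby


def lineage_specific_gaps(aligned_a, aligned_b, gap="-"):
    """Return (start, end, label) runs where exactly one sequence has a gap.

    start and end are alignment column indices (end is exclusive).
    label is "a_only" when A has bases where B has a gap, "b_only" for the reverse.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    def column_state(pair):
        a, b = pair
        if a != gap and b == gap:
            return "a_only"
        if a == gap and b != gap:
            return "b_only"
        return None  # both have bases (or both have gaps): no lineage-specific indel

    runs = []
    position = 0
    # Group consecutive alignment columns that share the same state.
    for state, group in groupby(zip(aligned_a, aligned_b), key=column_state):
        length = sum(1 for _ in group)
        if state is not None:
            runs.append((position, position + length, state))
        position += length
    return runs


# Toy example: the first sequence carries three bases the second one lacks.
human = "ACGTTTGACCA"
chimp = "ACG---GACCA"
print(lineage_specific_gaps(human, chimp))  # [(3, 6, 'a_only')]
```

An indel present at the same place in both sequences produces no gap column at all, so it never shows up in the returned list, which is the point made above.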
A total indel comparison gives the number of indels that are shared at identical loci between chimpanzees and humans, but closer examination is needed to determine how many of those indels are ERV insertions, and further analysis still to find how many of those ERV insertions are identical in both species. One way to do this is to isolate only the indels that are the right size to potentially be ERVs. The sequences in that size range can then be examined individually, and the indels that contain LTRs and the gag, pol and env genes (hallmarks of retroviral insertions) are identified as ERVs and examined further. After isolating all the ERV indels, we can find the number of ERVs at identical loci by taking the total number of ERVs and subtracting the number that form gaps when aligned. Doing this, we find that there are ~200,000 ERVs in the human genome and that fewer than 100 of them form gaps when aligned, meaning that more than 99.9% of ERVs in the human genome sit at the same loci as ERVs in the chimpanzee genome, which necessitates common ancestry.
Having used indel variation to affirm our initial prediction, we can corroborate that evidence with a genome-wide sequence comparison (rather than just an indel comparison).
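The filtering and subtraction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the Indel record, the looks_like_erv helper, the 5,000 to 10,000 bp size window and the "at least one viral hallmark" rule are all hypothetical choices of mine, not the criteria the researchers used. Only the final arithmetic (~200,000 ERVs, fewer than 100 forming gaps) comes from the figures quoted above.

```python
from dataclasses import dataclass


@dataclass
class Indel:
    length_bp: int        # size of the indel in base pairs
    features: frozenset   # annotated features, e.g. {"LTR", "gag", "pol", "env"}
    gapped: bool          # True if it forms a gap in the human/chimp alignment


# Hallmarks of a retroviral insertion named in the text.
ERV_FEATURES = frozenset({"LTR", "gag", "pol", "env"})


def looks_like_erv(indel, min_bp, max_bp):
    """Hypothetical filter: right size AND at least one viral hallmark."""
    in_size_range = min_bp <= indel.length_bp <= max_bp
    has_viral_feature = bool(indel.features & ERV_FEATURES)
    return in_size_range and has_viral_feature


def shared_fraction(total, lineage_specific):
    """Fraction of ERVs at identical loci = (total - gapped) / total."""
    return (total - lineage_specific) / total


# Toy candidates (made-up data, with an assumed 5,000-10,000 bp size window).
candidates = [
    Indel(7500, frozenset({"LTR", "gag", "pol", "env"}), gapped=False),
    Indel(7200, frozenset({"LTR", "pol"}), gapped=True),
    Indel(300, frozenset(), gapped=False),    # too small, no viral features
    Indel(8000, frozenset(), gapped=False),   # right size, no viral features
]
ervs = [c for c in candidates if looks_like_erv(c, 5000, 10000)]
gapped = sum(c.gapped for c in ervs)
print(len(ervs), gapped, shared_fraction(len(ervs), gapped))  # 2 1 0.5

# The figures quoted in the text: ~200,000 ERVs, fewer than 100 forming gaps.
print(shared_fraction(200_000, 100))  # 0.9995, i.e. more than 99.9% shared
```

With 100 taken as the upper bound on gapped ERVs, the shared fraction comes out at 99.95% or higher, which is consistent with the "more than 99.9%" figure.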
In 2005, the available sequence of the chimpanzee genome was aligned with that of the human genome, and an extensive comparative analysis was performed. As part of this analysis, the researchers looked at every available solo LTR (LTRs are long terminal repeats, which are byproducts of ERV insertions) and every full-length ERV in the chimpanzee genome, and checked whether there was also one at the corresponding locus in the human genome. Just as with the examination of indel variation, the results were that fewer than 100 ERVs are human-specific and fewer than 300 ERVs are chimpanzee-specific; the rest were at identical loci and, amazingly, shared identical mutations as well. (Chimpanzee Sequencing and Analysis Consortium, 2005; R. Waterston, personal communication, April 22, 2010).
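The locus-by-locus check can be pictured as a simple set comparison. The sketch below assumes every ERV or solo LTR has already been assigned a locus identifier in a coordinate system shared by both aligned genomes. The identifiers and the compare_loci function are hypothetical and are not the consortium's actual pipeline.

```python
def compare_loci(human_loci, chimp_loci):
    """Split ERV loci into shared, human-specific and chimpanzee-specific sets."""
    human, chimp = set(human_loci), set(chimp_loci)
    return {
        "shared": human & chimp,          # present at the same locus in both
        "human_specific": human - chimp,  # present only in the human genome
        "chimp_specific": chimp - human,  # present only in the chimpanzee genome
    }


# Toy data: made-up locus identifiers in a shared alignment coordinate system.
human_loci = ["chr1:1000", "chr1:5000", "chr2:300", "chr7:42"]
chimp_loci = ["chr1:1000", "chr1:5000", "chr2:300", "chr9:77", "chr9:99"]

result = compare_loci(human_loci, chimp_loci)
for label, loci in result.items():
    print(label, sorted(loci))
```

In the real comparison the shared set would hold nearly all of the ERVs, with fewer than 100 in the human-specific set and fewer than 300 in the chimpanzee-specific set.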
In conclusion, indel variation shows that most indels are not lineage-specific; they sit at identical loci in both genomes. When the indels are examined further and the ERV indels are isolated, the initial prediction is affirmed: less than 0.1% of ERVs are found to be lineage-specific, necessitating that the other 99.9% come from our common ancestor with chimpanzees.
Finally, definitive confirmation is obtained by genome-wide comparison, where virtually all ERVs and their accompanying LTRs are directly observed to be at identical loci in both genomes. And amazingly, even the mutations in these identical ERVs are found to be identical.
All of this evidence necessitates that humans and chimpanzees are derived from a common ancestor.