Meeting Abstract
DNA sequence data are commonly used to hypothesize phylogenetic relationships among species. Degraded DNA can be recovered from fossils, scat, and soil, but its fragmentary nature often means that only partial gene sequences are captured. It remains to be determined, however, whether relationships inferred from partial sequences are equivalent to those inferred from complete genes. Using complete cytochrome b (cytb) sequences and mitochondrial genomes from GenBank, we sought to determine how many base pairs (bp) of a gene are required to recover accurate phylogenetic relationships and whether this length differs across clades. We included 34 species from Alligatoridae, Dactyloidae, Plethodontidae, Bovidae, Felidae, Mustelidae, and Phasianidae. For both cytb and the mitochondrial genomes, we aligned sequences from each family independently and then cropped the alignments in 150 bp increments from the 5’ end until ≤ 150 bp remained. We chose a 150 bp increment because it is a common read length on next-generation sequencing platforms. We then inferred a maximum likelihood phylogeny from each subsampled alignment. To determine which phylogenies differed statistically from the topology produced by the complete gene or genome sequence, we ran a Shimodaira-Hasegawa test. At a p-value threshold of 0.05, we found that statistically equivalent tree topologies can be produced from cytb sequences longer than 300-450 bp, depending on the clade. We also found that at least 4650 bp are required from the mitochondrial genomes of mammals, which is approximately one third of the genome length. These results indicate that partial gene sequences above these length thresholds can yield tree topologies statistically equivalent to those from complete genes. This is encouraging for studies using degraded DNA, as such short sequences have a higher likelihood of capture than full genes or genomes.
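The stepwise cropping described above can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `crop_from_five_prime`, the representation of an alignment as a dictionary of equal-length strings, and the toy sequences are all hypothetical. The sketch also assumes that the final fragment of ≤ 150 bp is kept as a subsample, which the abstract does not state.

```python
def crop_from_five_prime(alignment, step=150):
    """Crop an alignment from the 5' end in fixed increments.

    alignment: dict mapping a species name to its aligned sequence
               (hypothetical representation; all sequences equal length).
    step:      number of alignment columns removed per round (150 here).

    Returns a dict mapping the remaining length to the cropped alignment,
    ordered from longest to shortest. Cropping stops once the remaining
    length is <= step; that last fragment is included (an assumption).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    total = lengths.pop()

    subsets = {}
    removed = step
    while removed < total:
        remaining = total - removed
        # Drop the first `removed` columns, i.e. crop from the 5' end.
        subsets[remaining] = {
            name: seq[removed:] for name, seq in alignment.items()
        }
        if remaining <= step:
            break
        removed += step
    return subsets


# Toy alignment with 1140 columns (made-up sequences, not real cytb data).
toy_alignment = {
    "species_A": "ACGT" * 285,
    "species_B": "ACGA" * 285,
}
subsets = crop_from_five_prime(toy_alignment)
print(list(subsets))
```

For a 1140-column alignment, the sketch yields subsamples of 990, 840, 690, 540, 390, 240, and 90 columns. Each subsample would then be passed to a separate maximum likelihood analysis, and the resulting topology compared against the full-length topology.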