Mind the numt: Finding informative mitochondrial markers in a giant grasshopper genome
Pereira RJ, Ruiz-Ruano FJ, Thomas CJE, et al. Mind the numt: Finding informative mitochondrial markers in a giant grasshopper genome. J Zool Syst Evol Res. 2020;00:1–11. [https://doi.org/10.1111/jzs.12446]
Sponsorship: H2020 Marie Sklodowska-Curie Actions 658706; Ministerio de Ciencia, Innovacion y Universidades PID2019-104952GB-I00/AEI/10.13039/501100011033
The barcoding of the mitochondrial COX1 gene has been instrumental in cataloguing the tree of life, and in providing insights into the phylogeographic history of species. Yet, this strategy has encountered difficulties in major clades characterized by large genomes, which contain a high frequency of nuclear pseudogenes originating from the mitochondrial genome (numts). Here, we use the meadow grasshopper (Chorthippus parallelus), which possesses a giant genome of ~13 Gb, to identify mitochondrial genes that are underrepresented as numts, and test their use as informative phylogeographic markers. We recover the same full mitochondrial sequence using both whole genome and transcriptome sequencing, including functional protein-coding genes and tRNAs. We show that a region of the mitogenome containing the COX1 gene, typically used in DNA barcoding, has disproportionately higher diversity and coverage than the rest of the mitogenome, consistent with multiple insertions of that region into the nuclear genome. By designing new markers in regions of less elevated diversity and coverage, we identify two mitochondrial genes that are less likely to be duplicated as numts. We show that, while these markers show high levels of incomplete lineage sorting between subspecies, as expected for mitochondrial genes, genetic variation reflects their phylogeographic history accurately. These findings allow us to identify useful mitochondrial markers for future studies in C. parallelus, an important biological system for evolutionary biology. More generally, this study exemplifies how non-PCR-based methods using next-generation sequencing can be used to avoid numts in species characterized by large genomes, which have remained challenging to study in taxonomy and evolution.
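The screening idea in the abstract, that mitogenome regions with unusually high read coverage and sequence diversity are likely to be duplicated as numts, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the non-overlapping window scheme, the window size, and the `fold` threshold relative to the median are all assumptions chosen for illustration. The inputs are assumed to be per-position read depth and a per-position diversity measure (for example, minor allele frequency) along the assembled mitogenome.

```python
from statistics import median


def window_means(values, window):
    """Mean of each non-overlapping window; a shorter final window is kept."""
    means = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means


def flag_numt_prone_windows(coverage, diversity, window=500, fold=1.5):
    """Return start positions of windows where BOTH mean coverage and mean
    diversity exceed `fold` times the median across all windows.

    `fold` is a hypothetical threshold, not a value taken from the paper.
    """
    cov = window_means(coverage, window)
    div = window_means(diversity, window)
    cov_cut = fold * median(cov)
    div_cut = fold * median(div)
    return [
        i * window
        for i, (c, d) in enumerate(zip(cov, div))
        if c > cov_cut and d > div_cut
    ]


# Toy data: positions 10-19 mimic a region inflated by numt-derived reads.
coverage = [100] * 10 + [300] * 10 + [100] * 20
diversity = [0.01] * 10 + [0.05] * 10 + [0.01] * 20
flagged = flag_numt_prone_windows(coverage, diversity, window=10, fold=1.5)
```

Markers would then be designed in windows that are not flagged. Requiring both signals at once is a conservative design choice in this sketch; a real analysis would need to calibrate the window size and threshold against the data.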