Genome Divergence Based on Entropic Segmentation of DNA
Metadata
Author
Bernaola Galván, Pedro A.; Carpena, Pedro; Gómez Martín, Cristina; Oliver Jiménez, José Lutgardo
Publisher
MDPI
Subject
entropic segmentation; Jensen-Shannon divergence; genome signatures
Date
2025-09-28
Bibliographic reference
Bernaola-Galván, P.A.; Carpena, P.; Gómez-Martín, C.; Oliver, J.L. Genome Divergence Based on Entropic Segmentation of DNA. Entropy 2025, 27, 1019. https://doi.org/10.3390/e27101019
Sponsor
Junta de Andalucía (Grant no. FQM-362)
Abstract
The concept of a genome signature broadly refers to characteristic patterns in DNA sequences that enable the identification and comparison of species or individuals, often without requiring sequence alignment. Such signatures have applications ranging from forensic identification of individuals to cancer genomics. In comparative genomics and evolutionary biology, genome signatures typically rely on statistical properties of DNA that are species-specific and carry phylogenetic information reflecting evolutionary relationships. We propose a novel genome signature based on the compositional structure of DNA, defined by the distributions of strong/weak, purine/pyrimidine, and keto/amino ratios across DNA segments identified through entropic segmentation. We observe that these ratio distributions are similar among closely related species but differ markedly between distant ones. To quantify these differences, we employ the Jensen–Shannon distance, a symmetric and robust measure of distributional dissimilarity, to define a genome-to-genome distance metric, termed Segment Compositional Distance (D). Our results demonstrate a clear correlation between D and species divergence times, and also that this metric captures a strong phylogenetic signal. Our method employs a genome-wide approach rather than tracking specific mutations; thus, D offers a coarse-grained perspective on genome compositional evolution, contributing to the ongoing discussion surrounding the molecular clock hypothesis.
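As a rough illustration of the comparison step described in the abstract, the following Python sketch computes a Jensen–Shannon distance between the distributions of the strong (G+C) fraction in two sets of DNA segments. It is not the authors' implementation. It assumes the segments have already been produced by some segmentation procedure, and the function names (`strong_fraction`, `histogram`, `js_distance`), the 10-bin histogram, and the unweighted treatment of segments are hypothetical choices made only for this example. The abstract does not say how the three ratios (strong/weak, purine/pyrimidine, keto/amino) are combined into D, so only one ratio is shown.

```python
import math
from collections import Counter


def strong_fraction(segment):
    """Fraction of strong bases (G or C) among the A, C, G, T in a segment."""
    counts = Counter(segment.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("segment contains no A, C, G or T")
    return (counts["G"] + counts["C"]) / total


def histogram(values, bins=10):
    """Normalized histogram of values in [0, 1] (bin count is an assumption)."""
    hist = [0.0] * bins
    for value in values:
        index = min(int(value * bins), bins - 1)  # put 1.0 in the last bin
        hist[index] += 1.0
    return [count / len(values) for count in hist]


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    divergence = 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)
    return math.sqrt(max(divergence, 0.0))  # guard against rounding below 0


# Toy segments standing in for the output of a segmentation step.
segments_a = ["GGCCGCAT", "ATATATGC", "GCGCGGCC"]
segments_b = ["ATATTTAA", "AATTGCAT", "TTTAAATA"]

p = histogram([strong_fraction(s) for s in segments_a])
q = histogram([strong_fraction(s) for s in segments_b])
print(js_distance(p, q))
```

With base-2 logarithms the distance lies between 0 (identical distributions) and 1 (distributions with no overlap), which makes values comparable across genome pairs. The purine/pyrimidine and keto/amino ratios could be handled the same way by changing which bases are counted in the numerator.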





