Show simple item record

dc.contributor.author: Hackenberg, Michael
dc.contributor.author: Previti, Christopher
dc.contributor.author: Luque-Escamilla, Pedro Luis
dc.contributor.author: Carpena, Pedro
dc.contributor.author: Martínez-Aroza, José
dc.contributor.author: Oliver, José Luis
dc.date.accessioned: 2014-04-01T11:35:14Z
dc.date.available: 2014-04-01T11:35:14Z
dc.date.issued: 2006
dc.identifier.citation: Hackenberg, M.; et al. CpGcluster: a distance-based algorithm for CpG-island detection. BMC Bioinformatics, 7: 446 (2006). [http://hdl.handle.net/10481/31175]
dc.identifier.issn: 1471-2105
dc.identifier.other: doi: 10.1186/1471-2105-7-446
dc.identifier.uri: http://hdl.handle.net/10481/31175
dc.description.abstract: [Background] Despite their involvement in the regulation of gene expression and their importance as genomic markers for promoter prediction, no objective standard exists for defining CpG islands (CGIs), since all current approaches rely on a large parameter space formed by the thresholds of length, CpG fraction and G+C content.
[Results] Given the higher frequency of CpG dinucleotides at CGIs, as compared to bulk DNA, the distance distributions between neighboring CpGs should differ for bulk and island CpGs. A new algorithm (CpGcluster) is presented, based on the physical distance between neighboring CpGs on the chromosome and able to predict directly clusters of CpGs, while not depending on the subjective criteria mentioned above. By assigning a p-value to each of these clusters, the most statistically significant ones can be predicted as CGIs. CpGcluster was benchmarked against five other CGI finders by using a test sequence set assembled from an experimental CGI library. CpGcluster reached the highest overall accuracy values, while showing the lowest rate of false-positive predictions. Since a minimum-length threshold is not required, CpGcluster can find short but fully functional CGIs usually missed by other algorithms. The CGIs predicted by CpGcluster present the lowest degree of overlap with Alu retrotransposons and, simultaneously, the highest overlap with vertebrate Phylogenetic Conserved Elements (PhastCons). CpGcluster's CGIs overlapping with the Transcription Start Site (TSS) show the highest statistical significance, as compared to the islands in other genome locations, thus qualifying CpGcluster as a valuable tool in discriminating functional CGIs from the remaining islands in the bulk genome.
[Conclusion] CpGcluster uses only integer arithmetic, thus being a fast and computationally efficient algorithm able to predict statistically significant clusters of CpG dinucleotides. Another outstanding feature is that all predicted CGIs start and end with a CpG dinucleotide, which should be appropriate for a genomic feature whose functionality is based precisely on CpG dinucleotides. The only search parameter in CpGcluster is the distance between two consecutive CpGs, in contrast to previous algorithms. Therefore, none of the main statistical properties of CpG islands (neither G+C content, CpG fraction nor length threshold) are needed as search parameters, which may lead to the high specificity and low overlap with spurious Alu elements observed for CpGcluster predictions.
dc.description.sponsorship: This work was supported by the Spanish Government (BIO2005-09116-C03-01 to JLO, MH, PC and CP and BIO2002-04014-C03-03 to PLE and JMA) and Plan Andaluz de Investigación (CVI-162 and FQM-322).
dc.language.iso: eng
dc.publisher: BioMed Central
dc.rights: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/
dc.subject: Algorithms
dc.subject: Animals
dc.subject: CpG islands
dc.subject: Genome
dc.subject: Humans
dc.subject: Mice
dc.title: CpGcluster: a distance-based algorithm for CpG-island detection
dc.type: journal article
dc.rights.accessRights: open access
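
The distance-based idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_dist` and `min_cpgs` parameters, and the choice to measure distance as the difference between CpG start positions are all assumptions made for illustration, and the p-value step mentioned in the abstract is omitted because the record does not describe how it is computed. The sketch only shows that consecutive CpGs can be grouped using integer arithmetic and a single distance threshold, and that every resulting cluster starts and ends with a CpG.

```python
def cpg_positions(seq):
    """Return the 0-based start index of every CG dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


def cluster_cpgs(seq, max_dist, min_cpgs=2):
    """Group consecutive CpGs whose start positions differ by at most max_dist.

    max_dist and min_cpgs are hypothetical parameters for this sketch.
    Returns (start, end) pairs, 0-based with end exclusive, so each cluster
    begins at the C of its first CpG and stops after the G of its last CpG.
    """
    clusters = []
    run = []  # start positions of the CpGs in the cluster being built
    for pos in cpg_positions(seq):
        if run and pos - run[-1] > max_dist:
            # Gap too large: close the current cluster and start a new one.
            if len(run) >= min_cpgs:
                clusters.append((run[0], run[-1] + 2))
            run = []
        run.append(pos)
    if len(run) >= min_cpgs:
        clusters.append((run[0], run[-1] + 2))
    return clusters


# Toy sequence: three close CpGs, a 12-base spacer, then two more CpGs.
toy = "ACGTCGACG" + "T" * 12 + "CGCGA"
print(cluster_cpgs(toy, max_dist=5))  # [(1, 9), (21, 25)]
```

With the toy sequence, a threshold of 5 splits the CpGs into two clusters, while a larger threshold such as 20 would merge them into one. How the threshold should be chosen for real genomes is not stated in this record.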


Files in this item

[PDF]

This item appears in the following collection(s)

Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License