
Using cited references to improve the retrieval of related biomedical documents

[PDF] 1471-2105-14-113.pdf (521.4Kb)
Identifiers
URI: http://hdl.handle.net/10481/28449
DOI: 10.1186/1471-2105-14-113
ISSN: 1471-2105
Author
Ortuño, Francisco M.; Rojas Ruiz, Ignacio; Andrade-Navarro, Miguel A.; Fontaine, Jean-Fred
Publisher
BioMed Central
Subject
Information retrieval
Text categorization
Citations
Full-text documents
Biomedical literature
Query expansion
Document classification
Date
2013
Bibliographic reference
Ortuño, F.M.; et al. Using cited references to improve the retrieval of related biomedical documents. BMC Bioinformatics, 14: 113 (2013). [http://hdl.handle.net/10481/28449]
Sponsorship
This study was funded by the Helmholtz Alliance for Systems Biology (Germany), grant to MAA-N, and the Government of Andalusia (Spain), Project P09-TIC-175476.
Abstract
Background Scientists reading a biomedical abstract often want to search bibliographic databases for documents on the same topic. Such a query is challenging because a single abstract carries little information, whereas classification-based retrieval algorithms are best trained on large sets of relevant documents. To address this problem, we propose a query expansion method that extends the information associated with a manuscript by using its cited references.
 
Results Data on cited references and text sections in 249,108 full-text biomedical articles were extracted from the Open Access subset of the PubMed Central® database (PMC-OA). Of the five standard sections of a scientific article, the Introduction and Discussion sections contained the most citations (mean = 10.2 and 9.9 citations, respectively). A large proportion of the articles (98.4%) and of their cited references (79.5%) were indexed in the PubMed® database. Using the MedlineRanker abstract classification tool, cited references allowed accurate retrieval of the citing document in a test set of 10,000 documents, and also of documents related to six biomedical topics defined by particular MeSH® terms from the entire PMC-OA (p-value < 0.01). Classification performance was sensitive to the topic and to the text sections from which the references were selected. Classifiers trained on the baseline (i.e., only text from the query document and not from the references) were outperformed in almost all cases. The best performance was often obtained when using all cited references, though using only the references from the Introduction and Discussion sections led to similarly good results. This query expansion method performed significantly better than pseudo relevance feedback in 4 out of 6 topics.
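
As a rough illustration of selecting references by the section that cites them, the following Python sketch groups citations by section and then picks either all references or only those from the Introduction and Discussion. The record layout, section labels, and reference IDs are hypothetical and do not reflect the PMC-OA format or the authors' extraction pipeline.

```python
from collections import defaultdict

# Hypothetical citation records: (section name, cited reference ID).
# Labels and IDs are invented for illustration only.
citations = [
    ("Introduction", "ref1"),
    ("Introduction", "ref2"),
    ("Methods", "ref3"),
    ("Discussion", "ref2"),
    ("Discussion", "ref4"),
]


def references_by_section(citations):
    """Group cited reference IDs by the section that cites them."""
    grouped = defaultdict(set)
    for section, ref_id in citations:
        grouped[section].add(ref_id)
    return grouped


def select_references(citations, sections=None):
    """Return sorted reference IDs cited in `sections` (all sections if None)."""
    grouped = references_by_section(citations)
    if sections is None:
        sections = list(grouped.keys())
    selected = set()
    for section in sections:
        selected |= grouped.get(section, set())
    return sorted(selected)


# Two of the selection strategies compared in the abstract.
all_refs = select_references(citations)
intro_discussion_refs = select_references(citations, ["Introduction", "Discussion"])
```

A reference cited in several sections is counted once per selection, so the reduced set is never larger than the full one.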
 
Conclusions The retrieval of documents related to a single document can be significantly improved by using the references cited by that document (p-value < 0.01). Using references from the Introduction and Discussion sections performs almost as well as using all references, which might be useful for methods that require reduced datasets due to computational limitations. Cited references from particular sections might not be appropriate for all topics. Our method could be a better alternative to pseudo relevance feedback, though it is limited by full-text availability.
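
The idea of expanding a single-document query with its cited references can be sketched in a few lines of Python. This is a minimal illustration only: it ranks candidates by cosine similarity over raw term counts, which is a stand-in and not the MedlineRanker classifier used in the study. The texts, document IDs, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(texts):
    """Build a single term-count vector from a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(query_abstract, cited_abstracts, candidate_text):
    """Score a candidate against the query plus its cited references."""
    profile = term_vector([query_abstract] + list(cited_abstracts))
    return cosine(profile, term_vector([candidate_text]))


def rank_candidates(query_abstract, cited_abstracts, candidates):
    """Return candidate IDs ordered from most to least similar."""
    scored = [
        (score(query_abstract, cited_abstracts, text), doc_id)
        for doc_id, text in candidates.items()
    ]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Invented toy data (not taken from the study).
query = "protein kinase signaling in yeast"
cited = [
    "kinase phosphorylation cascade regulates cell cycle",
    "yeast cell cycle checkpoint proteins",
]
candidates = {
    "A": "cell cycle checkpoint kinase phosphorylation",
    "B": "deep learning image segmentation",
}

baseline_score = score(query, [], candidates["A"])     # query text only
expanded_score = score(query, cited, candidates["A"])  # query plus cited references
ranking = rank_candidates(query, cited, candidates)
```

Passing an empty list of cited abstracts corresponds to the baseline described above, where only the text of the query document is used.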
 
Collections
  • DICAR - Artículos
