Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies
Metadata
Publisher
Medical Library Association
Subject
Biomedical databases; Spanish personal name; Information retrieval; Bibliometric studies; Science Citation Index; Author; Quality; Errors; Medline
Date
2002
Citation
Ruiz-Pérez, R.; Delgado López-Cózar, E.; Jiménez-Contreras, E. Spanish personal name variations in national and international biomedical databases: implications for information retrieval and bibliometric studies. Journal of the Medical Library Association, 90(4): 411-430 (2002). [http://hdl.handle.net/10481/32465]
Abstract
Objectives: The study sought to investigate how Spanish names are handled by national and international databases and to identify mistakes that can undermine the usefulness of these databases for locating and retrieving works by Spanish authors.
Methods: The authors sampled 172 articles published by authors from the University of Granada Medical School between 1987 and 1996 and analyzed the variations in how each of their names was indexed in Science Citation Index (SCI), MEDLINE, and Índice Médico Español (IME). The number and types of variants that appeared for each author's name were recorded and compared across databases to identify inconsistencies in indexing practices. We analyzed the relationship between variability (number of variants of an author's name) and productivity (number of items the name was associated with as an author), the consequences for retrieval of information, and the most frequent indexing structures used for Spanish names.
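The two measures defined above, variability and productivity, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the record layout (author identifier, indexed name, item identifier), the function names, and the sample records are all assumptions invented for illustration.

```python
from collections import defaultdict


def variability_and_productivity(records):
    """Return {author_id: (variability, productivity)}.

    records: iterable of (author_id, indexed_name, item_id) tuples.
    This layout is a hypothetical one; it presumes each record has
    already been attributed to the right person by hand.
    """
    variants = defaultdict(set)  # distinct indexed forms per author
    items = defaultdict(set)     # distinct items per author
    for author_id, indexed_name, item_id in records:
        variants[author_id].add(indexed_name)
        items[author_id].add(item_id)
    return {a: (len(variants[a]), len(items[a])) for a in variants}


def share_with_multiple_variants(stats):
    """Proportion of authors who appear under more than one name."""
    if not stats:
        return 0.0
    multi = sum(1 for variability, _ in stats.values() if variability > 1)
    return multi / len(stats)


# Invented records for one database; not data from the study.
records = [
    ("a1", "GARCIA LOPEZ J", "item1"),
    ("a1", "GARCIA J", "item2"),
    ("a1", "GARCIA J", "item3"),
    ("a2", "MARTIN RUIZ A", "item1"),
]
stats = variability_and_productivity(records)
share = share_with_multiple_variants(stats)
```

In this invented sample, author `a1` has two variants across three items and author `a2` has one variant on one item, so half of the authors appear under more than one name.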
Results: The proportion of authors who appeared under more than one name was 48.1% in SCI, 50.7% in MEDLINE, and 69.0% in IME. Productivity correlated directly with variability: more than 50% of the authors listed on five to ten items appeared under more than one name in any given database, and close to 100% of the authors listed on more than ten items appeared under two or more variants. Productivity correlated inversely with retrievability: as the number of variants for a name increased, the number of items retrieved under each variant decreased. For the most highly productive authors, the number of items retrieved under each variant tended toward one. The most frequent indexing methods varied between databases. In MEDLINE and IME, names were indexed correctly as "first surname second surname, first name initial middle name initial" (if present) in 41.7% and 49.5% of the records, respectively. However, in SCI, the most frequent methods were "first surname, first name initial second name initial" (48.0% of the records) and "first surname and second surname run together, first name initial" (18.3%).
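The three indexing structures named above can be made concrete with a short Python sketch. It shows how one person yields three different index entries. The function names, the output punctuation, and the sample name are hypothetical, and "second name initial" is read here as the initial of the second given name, which is an assumption about the wording. The sketch also emits both given-name initials for the run-together form, although the results describe that form with a first name initial only.

```python
def initials(given_names):
    """Initials of each given name, e.g. 'Juan Antonio' -> 'JA'."""
    return "".join(part[0].upper() for part in given_names.split())


def index_forms(first_surname, second_surname, given_names):
    """Build the three index structures described in the results.

    The dictionary keys are illustrative labels, not database field names.
    """
    ini = initials(given_names)
    return {
        # Both surnames kept: the structure described as correct
        "both_surnames": f"{first_surname} {second_surname}, {ini}",
        # Second surname dropped: most frequent structure in SCI
        "first_surname_only": f"{first_surname}, {ini}",
        # Surnames fused into a single string: second SCI structure
        "surnames_run_together": f"{first_surname}{second_surname}, {ini}",
    }


# Hypothetical author name, used only for illustration.
forms = index_forms("Garcia", "Lopez", "Juan Antonio")
```

Because these three strings do not match one another, a search on any one form misses the records indexed under the other two, which is the retrieval problem the results describe.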
Conclusions: Retrievability on the basis of author's name was poor in all three databases. Each database uses accurate indexing methods, but these methods fail to result in consistency or coherence for specific entries. The likely causes of inconsistency are: (1) use by authors of variants of their names during their publication careers, (2) lack of authority control in all three databases, (3) the use of an inappropriate indexing method for Spanish names in SCI, (4) authors' inconsistent behaviors, and (5) possible editorial interventions by some journals. We offer some suggestions as to how to avert the proliferation of author name variants in the databases.