Comparing different machine learning and mathematical regression models to evaluate multiple sequence alignments
Ortuño Guzmán, Francisco Manuel; Valenzuela Cansino, Olga; Prieto Campos, Beatriz; Sáez Lara, María José; Torres Perales, Carolina; Pomares Cintas, Héctor Emilio; Rojas Ruiz, Ignacio
multiple sequence alignments; regression models; lssvm; anova
The evaluation of multiple sequence alignments (MSAs) is still an open task in bioinformatics. Current MSA scores do not agree on how alignments should be evaluated. Consequently, it is not trivial to know the quality of MSAs when reference alignments are not provided. Recent scores tend to use more complex evaluations that add supplementary biological features. In this work, a set of novel regression approaches is proposed for MSA evaluation, comparing several supervised learning and mathematical methodologies. To this end, the following models specifically designed for regression are applied: regression trees, a bootstrap aggregation of regression trees (bagging trees), least-squares support vector machines (LS-SVMs) and Gaussian processes. These algorithms consider a heterogeneous set of biological features together with other standard MSA scores in order to predict the quality of alignments. The most relevant features are then applied to build novel score schemes for the evaluation of alignments. The proposed algorithms are validated using the BAliBASE benchmark. Additionally, a statistical ANOVA test is performed to study the relevance of these scores considering three alignment factors. According to the obtained results, the four regression models provide accurate evaluations, even outperforming other standard scores such as BLOSUM, PAM or STRIKE.
2025-01-30T08:00:01Z
2025-01-30T08:00:01Z
2015-09-21
journal article
Ortuño, F.M., Valenzuela, O., Prieto, B., Saez-Lara, M.J., Torres, C., Pomares, H. and Rojas, I., 2015. Comparing different machine learning and mathematical regression models to evaluate multiple sequence alignments. Neurocomputing, 164, pp.123-136.
https://hdl.handle.net/10481/101051
10.1016/j.neucom.2015.01.080
eng
164;123-136
http://creativecommons.org/licenses/by-nc-nd/3.0/
embargoed access
Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License
Elsevier