Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization
Almorox, Lucía; Antequera, Laura; Rojas Ruiz, Ignacio; Herrera Maldonado, Luis Javier; Ortuño Guzmán, Francisco Manuel
Uterine corpus cancer; Cervical cancer; Cervical adenocarcinoma
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes15030312/s1, Figure S1: Classification of CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY uterine samples: k-NN test macro F1 score using MRMR as feature selection method. The values are presented as a function of the number of biomarkers used; Figure S2: Classification of CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY uterine samples: sum of the test confusion matrices of the 5-fold cross-validation when using the first 10 MRMR selected miRNAs; Figure S3: Classification of CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY uterine samples: sum of the test confusion matrices of the 5-fold cross-validation when using the two-miRNA signature; Figure S4: Boxplots showing the expression of hsa-mir-21 and hsa-mir-10b miRNAs in each uterine sample class (CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY) using all quality samples; Table S1: Downloaded and filtered samples of each class; Table S2: Classification of CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY uterine samples: top 10 MRMR selected miRNAs for each fold (train set) of the 5-fold cross-validation; Table S3: Classification of CORPUS_TUMOR, CERVIX_TUMOR and HEALTHY uterine samples based on a set of miRNAs associated with the identified gene signature: top 10 MRMR selected miRNAs for each fold (train set) of the 5-fold cross-validation. References [14,33–35,40] are cited in the Supplementary Materials.
The analysis of gene expression quantification data is a powerful and widely used approach in cancer research.
This work provides new insights into the transcriptomic changes that distinguish healthy uterine tissue from cancerous tissue and explores the differences associated with uterine cancer localizations and histological subtypes. To achieve this, RNA-Seq data from the TCGA database were preprocessed and analyzed using the KnowSeq package. First, a kNN model was applied to classify uterine cervix cancer, uterine corpus cancer, and healthy uterine samples. Through variable selection, a three-gene signature was identified (VWCE, CLDN15, ADCYAP1R1), achieving consistent 100% test accuracy across 20 repetitions of a 5-fold cross-validation. A similar supplementary analysis using miRNA-Seq data from the same samples identified an optimal two-gene miRNA-coding signature that potentially regulates the three-gene signature and which attained a macro F1 score of 82%. Subsequently, a kNN model was implemented to classify cervical cancer samples into their two main histological subtypes (adenocarcinoma and squamous cell carcinoma). A single-gene signature (ICA1L) was identified, achieving 100% test accuracy across 20 repetitions of a 5-fold cross-validation, and was externally validated through the CGCI program. Finally, an examination of six cervical adenosquamous carcinoma (mixed) samples revealed that the gene expression value in the mixed class aligned more closely with the histological subtype with lower expression, prompting a reconsideration of the diagnosis for these mixed samples. In summary, this study provides valuable insights into the molecular mechanisms of uterine cervix and corpus cancers. The newly identified gene signatures demonstrate robust predictive capabilities and may guide future research in cancer diagnosis and treatment methodologies.
2024-05-10T11:56:38Z
2024-02-28
journal article
Almorox, L.; Antequera, L.; Rojas, I.; Herrera, L.J.; Ortuño, F.M. Gene Expression Analysis for Uterine Cervix and Corpus Cancer Characterization. Genes 2024, 15, 312. [https://doi.org/10.3390/genes15030312]
https://hdl.handle.net/10481/91657
10.3390/genes15030312
eng
http://creativecommons.org/licenses/by/4.0/
open access
Attribution 4.0 International
MDPI
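
Illustrative sketch (not part of the original record): the evaluation protocol named in the abstract, a kNN classifier scored over 20 repetitions of a 5-fold cross-validation, can be outlined roughly as below. This is a minimal standard-library Python sketch under stated assumptions, not the authors' KnowSeq (R) pipeline. The data are synthetic, the function names (`knn_predict`, `stratified_folds`, `repeated_cv_accuracy`, `make_toy_data`) and the choice of k = 3 are hypothetical, and the preprocessing and MRMR feature-selection steps are omitted. Only the class labels are taken from the supplementary description.

```python
import random
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3):
    """Return the majority label among the k nearest training samples
    (squared Euclidean distance; ties go to the nearer neighbour's label)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def stratified_folds(y, n_folds, rng):
    """Split sample indices into n_folds folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        idx = by_class[label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def repeated_cv_accuracy(X, y, k=3, n_folds=5, n_repeats=20, seed=0):
    """Mean test accuracy over n_repeats repetitions of n_folds-fold cross-validation."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        for test_idx in stratified_folds(y, n_folds, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            train_X = [X[i] for i in train_idx]
            train_y = [y[i] for i in train_idx]
            correct = sum(
                knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
            )
            scores.append(correct / len(test_idx))
    return sum(scores) / len(scores), len(scores)


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic three-feature 'expression' values; NOT real TCGA data."""
    rng = random.Random(seed)
    centers = {
        "CORPUS_TUMOR": (0.0, 0.0, 0.0),
        "CERVIX_TUMOR": (10.0, 0.0, 10.0),
        "HEALTHY": (0.0, 10.0, 10.0),
    }
    X, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            X.append([c + rng.uniform(-1.0, 1.0) for c in center])
            y.append(label)
    return X, y


X, y = make_toy_data()
mean_acc, n_scores = repeated_cv_accuracy(X, y)
print(f"mean test accuracy over {n_scores} folds: {mean_acc:.3f}")
```

The toy classes are deliberately well separated, so a perfect score here reflects how the data were generated and says nothing about the study's results. In a real analysis, any feature selection would need to be fitted on the training portion of each fold only, as the per-fold MRMR tables in the Supplementary Materials suggest was done.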