On the use of the observation-wise k-fold operation in PCA cross-validation

Saccenti, Edoardo; Camacho Páez, José

Cross-validation; Principal component analysis; Dimensionality assessment

Cross-validation (CV) is a common approach for determining the optimal number of components in a principal component analysis model. To guarantee independence between model testing and calibration, the observation-wise k-fold operation is commonly implemented in each cross-validation step. This operation makes the CV algorithm computationally intensive and is the main obstacle to applying CV to very large data sets. In this paper we carry out an empirical and theoretical investigation of the use of this operation in the element-wise k-fold (ekf) algorithm, the state-of-the-art CV algorithm. We show that when very large data sets need to be cross-validated and computational time is a concern, the observation-wise k-fold operation can be skipped. The theoretical properties of the resulting modified algorithm, referred to as the column-wise k-fold (ckf) algorithm, are derived, and its performance is evaluated on several artificial and real data sets. We suggest the ckf algorithm as a valid alternative to the standard ekf for reducing the computational time needed to cross-validate a data set.

2019-04-01T08:02:37Z
2019-04-01T08:02:37Z
2015-07
info:eu-repo/semantics/article

Saccenti, E., and Camacho, J. (2015), On the use of the observation-wise k-fold operation in PCA cross-validation. J. Chemometrics, 29, 467–478. doi: 10.1002/cem.2726.

http://hdl.handle.net/10481/55302
https://doi.org/10.1002/cem.2726
eng
http://creativecommons.org/licenses/by-nc-nd/3.0/es/
info:eu-repo/semantics/openAccess
Atribución-NoComercial-SinDerivadas 3.0 España