Please use this identifier to cite or link to this item: http://repositorio.unicamp.br/jspui/handle/REPOSIP/71959
Type: Artigo de periódico
Title: Sorting variables by using informative vectors as a strategy for feature selection in multivariate regression
Author: Teofilo, RF
Martins, JPA
Ferreira, MMC
Abstract: A new procedure with high ability to enhance prediction of multivariate calibration models with a small number of interpretable variables is presented. The core of this methodology is to sort the variables from an informative vector, followed by a systematic investigation of PLS regression models with the aim of finding the most relevant set of variables by comparing the cross-validation parameters of the models obtained. In this work, seven main informative vectors i.e. regression vector, correlation vector, residual vector, variable influence on projection (VIP), net analyte signal (NAS), covariance procedures vector (CovProc), signal-to-noise ratios vector (StN) and their combinations were automated and tested with the main purpose of feature selection. Six data sets from different sources were employed to validate this methodology. They originated from: near-infrared (NIR) spectroscopy, Raman spectroscopy, gas chromatography (GC), fluorescence spectroscopy, quantitative structure-activity relationships (QSAR) and computer simulation. The results indicate that all vectors and their combinations were able to enhance prediction capability with respect to the full data sets. However, regression and NAS informative vectors from partial least squares (PLS) regression, both built using more latent variables than when building the model presented in most of tested data sets, were the best informative vectors for variable selection. In all the applications, the selected variables were quite effective and useful for interpretation. Copyright (C) 2008 John Wiley & Sons, Ltd.
Subject: variable selection
informative vectors
OPS
partial least squares
chemometrics
Country: Inglaterra
Editor: John Wiley & Sons Ltd
Rights: fechado
Identifier DOI: 10.1002/cem.1192
Date Issue: 2009
Appears in Collections:Artigos e Materiais de Revistas Científicas - Unicamp

Files in This Item:
File Description SizeFormat 
WOS000264037400004.pdf1.12 MBAdobe PDFView/Open


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.