If you cannot find anything more, look for something else (Bridget Fountain)
|
|
|
|
TL;DR: dual-sPLS [paper] [python code] [R code] [video1] [video2] [data]
When facing high-dimensional problems in analytical chemistry, researchers often turn to two common families of techniques: projection methods, such as partial least squares (PLS), and variable selection algorithms, such as the lasso. Integrating both approaches can yield more accurate and interpretable results; this is the idea behind sparse PLS, which combines variable selection with PLS dimension reduction.

To further improve this technique, we have developed a new algorithm, dual-sPLS. It aims to produce a sparser representation of the data while still maintaining accurate predictions, and it can handle multiple sets of predictors related to the same response variable.

Dual-sPLS is based on the dual of a chosen norm: the dual norm is obtained by maximizing the inner product between the data and the weight vector, under a constraint on the norm of the weight vector. The dual of the classical Euclidean norm is the Euclidean norm itself; as a consequence, maximizing the inner product between X^T y and the weight vector under a Euclidean constraint recovers the first weight vector of the PLS1 algorithm for the first component. By considering other types of norms, including adaptive penalizations, dual-sPLS provides a more versatile approach to solving high-dimensional problems in analytical chemistry.
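The Euclidean case can be made concrete: maximizing the inner product with X^T y over unit-norm weight vectors gives X^T y normalized, which is the first PLS1 weight vector. A minimal Python sketch of this, plus a lasso-style soft-thresholding variant that merely mimics how a sparsity-inducing norm zeroes weak variables (illustrative only, not the dual.spls implementation):

```python
import numpy as np

def pls1_first_weight(X, y):
    """Euclidean case: maximizing <X.T @ y, w> over ||w||_2 <= 1
    gives w = X.T @ y / ||X.T @ y||_2, the first PLS1 weight vector."""
    z = X.T @ y
    return z / np.linalg.norm(z)

def sparse_first_weight(X, y, nu):
    """Illustrative lasso-style variant (not the dual.spls code):
    soft-threshold X.T @ y at level nu >= 0 before normalizing,
    so weakly correlated variables get exactly zero weight."""
    z = X.T @ y
    s = np.sign(z) * np.maximum(np.abs(z) - nu, 0.0)
    n = np.linalg.norm(s)
    return s / n if n > 0 else s

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))            # 50 samples, 8 predictors
y = X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=50)

w = pls1_first_weight(X, y)             # dense: all coordinates non-zero
ws = sparse_first_weight(X, y, nu=0.5 * np.max(np.abs(X.T @ y)))
# ws keeps only the strongly correlated predictors
```

Both vectors have unit Euclidean norm; the thresholded one is sparse, which is the interpretability gain the text describes.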
"Functional data regression with prediction and interpretability: property inference in chemometrics with sparse Partial Least Squares (PLS)". Analytical chemistry plays a crucial role in many fields, as it covers the identification, quantification, and characterization of chemical substances. It is essential for understanding the composition and behavior of matter and for developing new materials and technologies.

The thesis also focuses on the interpretability of the results, detecting information through parsimony indicators, i.e. a relatively small number of non-zero coefficients in the model. Dimension reduction techniques in data analysis include projection methods (like PLS) and penalized methods (like the lasso). A new approach, dual sparse Partial Least Squares (dual-sPLS), was developed to combine the advantages of both techniques for improved interpretability and accuracy of prediction models. The method uses the dual norm of selected penalties, and our studies propose four types of norms. A comparative benchmark showed that the approach provides a better interpretation of the trained prediction model while keeping accurate predictions. It is implemented in an R package, dual.spls, which also includes real data, a data simulation algorithm, a calibration and validation method, and evaluation tools.
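The benchmark above balances two quantities: prediction accuracy on a validation set and parsimony of the fitted coefficients. A minimal Python sketch of such evaluation tools (the function names are hypothetical, not the dual.spls API; RMSEP is the standard chemometrics accuracy measure):

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, the usual accuracy
    measure on a validation set in chemometrics."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def sparsity(coefs, tol=1e-12):
    """Parsimony indicator: fraction of regression coefficients
    that are (numerically) zero."""
    coefs = np.asarray(coefs)
    return float(np.mean(np.abs(coefs) <= tol))

beta = np.array([0.0, 1.5, 0.0, 0.0, -0.7, 0.0])
print(sparsity(beta))                    # 4 of the 6 coefficients are zero
print(rmsep([1.0, 2.0, 3.0], [1.1, 1.9, 3.0]))
```

A sparser beta at comparable RMSEP is exactly the "interpretable yet accurate" trade-off the benchmark assesses.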