If you cannot find anything more, look for something else (Bridget Fountain)
|
|
|
|
TL;DR: dual-sPLS [paper] [python code] [R code] [video1] [video2] [data]
When facing high-dimensional problems in analytical chemistry, researchers often turn to two common families of techniques: projection methods, such as partial least squares (PLS), and variable selection algorithms, such as the lasso. Integrating both approaches can yield more accurate and interpretable results; this is the idea behind sparse PLS, which combines variable selection with PLS dimension reduction.

To further improve this technique, we have developed a new algorithm, dual-sPLS. It aims to produce a sparser representation of the data while still maintaining accurate predictions, and it can handle multiple sets of predictors related to the same response variable.

Dual-sPLS is based on the dual of a chosen norm: the dual norm is obtained by maximizing the inner product between the data and the weight vector, under a constraint on the norm of the weight vector. The dual of the classical Euclidean norm is the Euclidean norm itself; as a consequence, maximizing the inner product between X^T y and the weight vector under a Euclidean constraint recovers the first weight vector of the PLS1 algorithm for the first component. By considering other types of norms, including adaptive penalizations, dual-sPLS provides a more versatile approach to solving high-dimensional problems in analytical chemistry.
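The Euclidean case can be made concrete: maximizing the inner product with X^T y over unit-norm weight vectors gives X^T y normalized, which is the first PLS1 weight vector. A minimal Python sketch of this, plus a lasso-style soft-thresholding variant that merely mimics how a sparsity-inducing norm zeroes weak variables (illustrative only, not the dual.spls implementation):

```python
import numpy as np

def pls1_first_weight(X, y):
    """Euclidean case: maximizing <X.T @ y, w> over ||w||_2 <= 1
    gives w = X.T @ y / ||X.T @ y||_2, the first PLS1 weight vector."""
    z = X.T @ y
    return z / np.linalg.norm(z)

def sparse_first_weight(X, y, nu):
    """Illustrative lasso-style variant (not the dual.spls code):
    soft-threshold X.T @ y at level nu >= 0 before normalizing,
    so weakly correlated variables get exactly zero weight."""
    z = X.T @ y
    s = np.sign(z) * np.maximum(np.abs(z) - nu, 0.0)
    n = np.linalg.norm(s)
    return s / n if n > 0 else s

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))            # 50 samples, 8 predictors
y = X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=50)

w = pls1_first_weight(X, y)             # dense: all coordinates non-zero
ws = sparse_first_weight(X, y, nu=0.5 * np.max(np.abs(X.T @ y)))
# ws keeps only the strongly correlated predictors
```

Both vectors have unit Euclidean norm; the thresholded one is sparse, which is the interpretability gain the text describes.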
"Functional data regression with prediction and interpretability: property inference in chemometrics with sparse Partial Least Squares (PLS)". Analytical chemistry plays a crucial role in many fields, as it covers the identification, quantification, and characterization of chemical substances. It is essential for understanding the composition and behavior of matter and for developing new materials and technologies.

The thesis also focuses on the interpretability of the results, detecting information through parsimony indicators, i.e. a relatively small number of non-zero coefficients in the model. Dimension reduction techniques in data analysis include projection methods (like PLS) and penalized methods (like the lasso). A new approach, dual sparse Partial Least Squares (dual-sPLS), was developed to combine the advantages of both techniques for improved interpretability and accuracy of prediction models. The method uses the dual norm of selected penalties, and our studies propose four types of norms. A comparative benchmark showed that the approach provides a better interpretation of the trained prediction model while keeping accurate predictions. It is implemented in an R package, dual.spls, which also includes real data, a data simulation algorithm, a calibration and validation method, and evaluation tools.
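The benchmark above balances two quantities: prediction accuracy on a validation set and parsimony of the fitted coefficients. A minimal Python sketch of such evaluation tools (the function names are hypothetical, not the dual.spls API; RMSEP is the standard chemometrics accuracy measure):

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, the usual accuracy
    measure on a validation set in chemometrics."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def sparsity(coefs, tol=1e-12):
    """Parsimony indicator: fraction of regression coefficients
    that are (numerically) zero."""
    coefs = np.asarray(coefs)
    return float(np.mean(np.abs(coefs) <= tol))

beta = np.array([0.0, 1.5, 0.0, 0.0, -0.7, 0.0])
print(sparsity(beta))                    # 4 of the 6 coefficients are zero
print(rmsep([1.0, 2.0, 3.0], [1.1, 1.9, 3.0]))
```

A sparser beta at comparable RMSEP is exactly the "interpretable yet accurate" trade-off the benchmark assesses.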