TY - JOUR
T1 - A stacked regression ensemble approach for the quantitative determination of biomass feedstock compositions using near infrared spectroscopy
AU - Dumancas, Gerard
AU - Adrianto, Indra
PY - 2022/8/5
Y1 - 2022/8/5
N2 - Rapid, robust, and accurate compositional analyses are required in the bioenergy industry to determine the chemical composition of biomass feedstocks. A stacked regression ensemble approach based on near infrared spectroscopy was developed for the quantitative determination of glucan, xylan, lignin, ash, and extract in biomass feedstocks. The performance of various machine learning techniques, including support vector regression (linear and radial), least absolute shrinkage and selection operator (LASSO), ridge regression, elastic net, partial least squares (PLS), random forests, recursive partitioning and regression trees, gradient boosting, and Gaussian process regression, was compared on the training set data (n = 188). The predictive performance of these techniques was then compared with that of stacked regression, an ensemble learning algorithm that combines the predictions of the abovementioned regression techniques. Results show that stacked regression primarily outperformed the other machine learning techniques (average root mean square error of prediction (RMSEP) = 1.660% wt, R² = 0.907) across all five constituents in the validation set data (n = 81). Further results show that the RMSEP of the stacked ensemble technique is significantly different from that of the PLS approach in predicting the glucan, ash, lignin, and extract components in biomass samples. The stacked ensemble learning approach offers an alternative method for more accurate prediction of biomass compositions than the traditional PLS technique.
AB - Rapid, robust, and accurate compositional analyses are required in the bioenergy industry to determine the chemical composition of biomass feedstocks. A stacked regression ensemble approach based on near infrared spectroscopy was developed for the quantitative determination of glucan, xylan, lignin, ash, and extract in biomass feedstocks. The performance of various machine learning techniques, including support vector regression (linear and radial), least absolute shrinkage and selection operator (LASSO), ridge regression, elastic net, partial least squares (PLS), random forests, recursive partitioning and regression trees, gradient boosting, and Gaussian process regression, was compared on the training set data (n = 188). The predictive performance of these techniques was then compared with that of stacked regression, an ensemble learning algorithm that combines the predictions of the abovementioned regression techniques. Results show that stacked regression primarily outperformed the other machine learning techniques (average root mean square error of prediction (RMSEP) = 1.660% wt, R² = 0.907) across all five constituents in the validation set data (n = 81). Further results show that the RMSEP of the stacked ensemble technique is significantly different from that of the PLS approach in predicting the glucan, ash, lignin, and extract components in biomass samples. The stacked ensemble learning approach offers an alternative method for more accurate prediction of biomass compositions than the traditional PLS technique.
KW - Biomass
KW - Chemometrics
KW - Near infrared spectroscopy
KW - Partial least squares
KW - Stacking
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85128180785&origin=inward
UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=85128180785&origin=inward
U2 - 10.1016/j.saa.2022.121231
DO - 10.1016/j.saa.2022.121231
M3 - Article
SN - 1386-1425
VL - 276
JO - Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy
JF - Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy
M1 - 121231
ER -