Estimating biomass major chemical constituents from ultimate analysis using a random forest model.


State Key Laboratory of Clean Energy Utilization, Zhejiang University, Hangzhou 310027, China.


Chemical constituents are important properties for biomass utilization, but the experimental approaches used to determine them are expensive and time-consuming. Here, a novel random forest (RF) model is developed to accurately predict the major chemical constituents of biomass from the much more readily available ultimate analysis, and its predictions are compared with those of a previous correlation as well as with experimental data. Two databases, one for training and one for application of the RF model, are constructed from the available literature. The training results show that the coefficients of determination (R²) of the RF model predictions are 0.954, 0.933 and 0.968 for cellulose, hemicellulose and lignin, respectively. The application results show that the present RF model gives accurate predictions of chemical constituents for various biomasses, with a mean absolute percentage error (MAPE) below 20% and R² values of 0.862, 0.904 and 0.962 for cellulose, hemicellulose and lignin, respectively. In contrast, the previous correlation works only within the narrow range of data used to develop it, and gives unrealistic negative predictions with MAPE > 500% for samples outside that range.
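For readers unfamiliar with the two accuracy metrics quoted above, the following minimal sketch shows one common way to compute them. It assumes the standard definitions, R² = 1 - SS_res/SS_tot and MAPE as the mean of |measured - predicted| / |measured| expressed in percent; the exact formulas used in the study are not stated here, and all sample values below are hypothetical and serve only to illustrate the arithmetic.

```python
def r_squared(measured, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def mape(measured, predicted):
    """Mean absolute percentage error, in percent (measured values must be non-zero)."""
    ratios = [abs((y - p) / y) for y, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical cellulose contents (wt%), not taken from the study's databases.
measured = [40.0, 35.0, 45.0, 30.0]

# Hypothetical predictions that stay close to the measurements.
close_predictions = [42.0, 33.0, 44.0, 31.0]

# Hypothetical predictions in which one sample receives a negative value,
# as can happen when a correlation is applied outside its fitted range.
negative_predictions = [-200.0, 35.0, 45.0, 30.0]

print(f"R2 (close):      {r_squared(measured, close_predictions):.3f}")
print(f"MAPE (close):    {mape(measured, close_predictions):.1f}%")
print(f"MAPE (negative): {mape(measured, negative_predictions):.1f}%")
```

In this toy case a single physically impossible negative prediction raises the MAPE from a few percent to well above 100%, which illustrates why out-of-range use of a fitted correlation can produce very large percentage errors.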


Biomass, Chemical constituents, Random forest, Ultimate analysis