Predicting groundwater arsenic contamination: Regions at risk in highest populated state of India.


Analytical and Geochemistry Laboratory, Dept. of Energy and Environment, TERI School of Advanced Studies, New Delhi, India.


Arsenic (As) contamination of groundwater is a public health concern affecting approximately 100 million people in India. Chronic exposure to As significantly increases mortality from several types of cancer and from respiratory and cardiac diseases. Uttar Pradesh, part of the middle Indo-Gangetic plains, is severely affected by As contamination of groundwater, as several small-scale studies have established. The current study uses a hybrid method that combines a random forest ensemble algorithm with univariate feature selection, using 1473 data points, to predict As in the region. Twenty direct or proxy predictor variables were considered to describe the geochemical environment, aquifer conditions and topography responsible for As enrichment in groundwater. The As map predicted by the hybrid random forest ensemble model has an overall accuracy of 84.67%. The hybrid random forest model performs better than the univariate, logistic, fuzzy, adaptive fuzzy and adaptive neuro-fuzzy inference systems, which have been widely used for As prediction. The rural population projected to be at risk from high As exposure is 23.48 million people, or 12% of the region's total population. The predictive map indicates the regions where policymakers should prioritize future testing campaigns and mitigation interventions.
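The hybrid approach described above (univariate feature selection followed by a random forest ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the scoring statistic, the number of features retained, the number of trees, or the As threshold. The sketch therefore assumes a binary "high As" label, a simple class-mean-gap score, k = 5 retained features out of 20, 25 shallow trees, and purely synthetic data in which only features 0 and 1 carry signal. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def univariate_scores(X, y):
    """Score each feature on its own: class-mean gap scaled by the feature's spread."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        pos = [v for v, label in zip(col, y) if label == 1]
        neg = [v for v, label in zip(col, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)) / (pstdev(col) + 1e-12))
    return scores

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def build_tree(X, y, feats, rng, depth=0, max_depth=5):
    # Leaf: depth limit, pure node, or tiny node -> majority class (ties go to 1).
    if depth == max_depth or len(set(y)) == 1 or len(y) < 4:
        return int(sum(y) * 2 >= len(y))
    best = None
    # A random forest considers only a random subset of features at each split.
    m = max(1, round(len(feats) ** 0.5))
    for f in rng.sample(feats, m):
        values = [row[f] for row in X]
        # Assumption: thresholds are subsampled to keep this sketch fast.
        for t in rng.sample(values, min(10, len(values))):
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            cost = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or cost < best[0]:
                best = (cost, f, t)
    if best is None:
        return int(sum(y) * 2 >= len(y))
    _, f, t = best
    li = [i for i, row in enumerate(X) if row[f] <= t]
    ri = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in li], [y[i] for i in li], feats, rng, depth + 1, max_depth),
            build_tree([X[i] for i in ri], [y[i] for i in ri], feats, rng, depth + 1, max_depth))

def predict_tree(node, row):
    # Internal nodes are tuples (feature, threshold, left, right); leaves are 0 or 1.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, feats, n_trees, rng):
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], feats, rng))
    return forest

def predict_forest(forest, row):
    votes = sum(predict_tree(tree, row) for tree in forest)  # majority vote
    return int(votes * 2 >= len(forest))

# Synthetic stand-in data (NOT the study's 1473 samples): 20 predictors,
# and a hypothetical ground truth in which only features 0 and 1 matter.
rng = random.Random(0)
n, n_features, k = 400, 20, 5
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
y = [int(row[0] + row[1] > 0) for row in X]
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Step 1: univariate selection keeps the k best-scoring predictors.
scores = univariate_scores(X_train, y_train)
selected = sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]
# Step 2: the random forest is trained on the selected predictors only.
forest = fit_forest(X_train, y_train, selected, n_trees=25, rng=rng)
hits = sum(predict_forest(forest, r) == t for r, t in zip(X_test, y_test))
accuracy = hits / len(y_test)
print("selected features:", sorted(selected))
print("held-out accuracy:", round(accuracy, 3))
```

Filtering predictors before fitting the ensemble keeps uninformative variables out of the random split candidates, which is one plausible reason a hybrid of this kind could outperform either step alone. The accuracy printed here describes the synthetic data only and has no bearing on the 84.67% reported above.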


Arsenic, Hybrid random forest model, India, Prediction, Regression