Details
Enhancing public water management through interpretable machine learning: A random forest-SHAP analysis of spatio-temporal pH prediction (indexed in SCI-EXPANDED)
Document type: Journal article
English title: Enhancing public water management through interpretable machine learning: A random forest-SHAP analysis of spatio-temporal pH prediction
Authors: Zhang, Huanyu[1]; Jiang, Bei[2]; Chen, Ming[3]
First author: 张鸿燕
Corresponding author: Chen, M[1]
Affiliations: [1] Beijing Union Univ, Coll Tourism, Beijing 100101, Peoples R China; [2] Shanghai Maritime Univ, Shanghai 201306, Peoples R China; [3] Beijing Union Univ, Smart City Coll, Beijing 100101, Peoples R China
First affiliation: College of Tourism, Beijing Union University
Corresponding affiliation: [1] Beijing Union Univ, Smart City Coll, Beijing 100101, Peoples R China; College of Continuing Education, Beijing Union University; Beijing Union University
Year: 2026
Volume: 326
Journal: DESALINATION AND WATER TREATMENT
Indexed in: WOS: SCI-EXPANDED (accession number: WOS:001752979800001)
Language: English
Keywords: Water quality prediction; Random forest; SHAP interpretability; pH modeling; Machine learning; Cross-validation
Abstract: This study developed an interpretable machine learning framework based on Random Forest and SHAP analysis for spatiotemporal prediction of pH in public water systems. A systematic comparison of eight machine learning algorithms was conducted, including Linear Regression, Ridge Regression, Lasso Regression, K-Nearest Neighbors, Support Vector Regression, Decision Tree, Gradient Boosting, and Random Forest. Using multi-site monitoring data encompassing dissolved oxygen, specific conductance, temperature, and historical pH values, the Random Forest model achieved competitive and stable performance with test R² of 0.8449, RMSE of 0.0116, and MAE of 0.0066, demonstrating the lowest cross-validation variance among ensemble methods with CV R² of 0.8386 ± 0.0066 among all evaluated models. SHAP interpretability analysis revealed that maximum dissolved oxygen dominates pH prediction with a mean absolute SHAP value of 0.0119, followed by maximum pH and specific conductance indicators. Feature importance validation through a reduced model containing only six key features retained 97.0% of the predictive performance, providing empirical support for monitoring network optimization. Temporal performance evaluation confirmed that the model maintains stable accuracy under diverse water quality conditions with minimal systematic bias. The results confirm that the Random Forest-SHAP framework can provide accurate predictions and mechanistic insights, offering guidance for water quality management and improved understanding of hydrochemical dynamics.
