Abstract

Pentachlorophenol (PCP) is a common, recalcitrant, and toxic groundwater contaminant that resists degradation, bioaccumulates, and has potential for long-range environmental transport. Taking appropriate action against the pollutant while accounting for life cycle consequences requires a better understanding of its behavior in the subsurface. We recognize the considerable potential of machine learning (ML) techniques, now arriving in environmental applications, for enhancing decision-making at contaminated groundwater sites. We used ML to improve the understanding of PCP transport dynamics in the subsurface and to determine the key hydrochemical and hydrogeological drivers affecting its transport and fate. We demonstrate how this complementary knowledge, provided by data-driven methods, may enable more targeted planning of monitoring and remediation at two highly contaminated Swedish groundwater sites, where the method was validated. We evaluated six interpretable ML methods, three linear regressors and three non-linear (i.e., tree-based) regressors, to predict PCP concentration in the groundwater. The modeling results indicate that simple linear ML models were useful for predicting observations in datasets without missing values, while tree-based regressors were more suitable for datasets containing missing values. Because missing values are common in datasets collected during contaminated site investigations, this finding could be of significant importance for contaminated site planners and managers, ultimately reducing site investigation and monitoring costs. Furthermore, we interpreted the proposed models using the SHAP (SHapley Additive exPlanations) approach to decipher the importance of different drivers in the prediction and simulation of critical hydrogeochemical variables. Among these, the sum of chlorophenols was of highest significance in the analyses. When that variable was set aside from the model, tetrachlorophenols, dissolved organic carbon, and conductivity were found to be of highest importance. Accordingly, ML methods could potentially be used to improve the understanding of groundwater contamination transport dynamics, filling gaps in knowledge that remain when using more sophisticated deterministic modeling approaches.
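
The idea behind the SHAP interpretation mentioned above can be illustrated with a minimal sketch that computes exact Shapley values by enumerating feature coalitions. This is not the SHAP library and not the authors' models: the function `shapley_values`, the `toy_model`, its coefficients, and the sample and baseline values are all hypothetical and chosen only for illustration. The feature names merely echo drivers named in the abstract.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude `target`
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                without = {
                    f: (x[f] if f in coalition else baseline[f]) for f in names
                }
                with_target = dict(without)
                with_target[target] = x[target]
                # Marginal contribution of `target` to this coalition
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_model(s):
    # Hypothetical response surface, NOT fitted to any real data
    return (
        0.8 * s["sum_chlorophenols"]
        + 0.3 * s["doc"]
        + 0.1 * s["conductivity"]
        + 0.05 * s["doc"] * s["conductivity"]
    )


# Hypothetical sample and reference point (arbitrary units)
sample = {"sum_chlorophenols": 5.0, "doc": 2.0, "conductivity": 4.0}
baseline = {"sum_chlorophenols": 1.0, "doc": 1.0, "conductivity": 1.0}

phi = shapley_values(toy_model, sample, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that SHAP-style explanations rely on when ranking drivers.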

Original language: English
Article number: 123449
Journal: Environmental Pollution
Volume: 345
Early online date: 2024 Jan 24
Publication status: Published - 2024 Mar 15

Subject classification (UKÄ)

  • Environmental Sciences

Free keywords

  • Contaminated sites
  • Explainable artificial intelligence
  • SHAP value
  • Sustainable remediation
  • Tree-based regression

Article title: Interpretable machine learning for predicting the fate and transport of pentachlorophenol in groundwater
