Forecasting groundwater levels using machine learning methods: The case of California’s Central Valley
Groundwater, the second largest stock of freshwater on the planet, is an important water source used for municipal water supply, irrigation, or industrial needs. For instance, California’s arid Central Valley relies on groundwater resources to produce a quarter of the United States’ food demand as f...
| Published in: | Journal of Hydrology X |
|---|---|
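| Main Authors: | Gabriela May-Lagunes, Valerie Chau, Eric Ellestad, Leyla Greengard, Paolo D'Odorico, Puya Vahabi, Alberto Todeschini, Manuela Girotto |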
| Format: | Article |
| Language: | English |
| Published: | Elsevier 2023-12-01 |
| Subjects: | Groundwater, Weather, Wells, California, Xgboost, Supervised learning |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2589915523000147 |
| _version_ | 1850111377443651584 |
|---|---|
| author | Gabriela May-Lagunes; Valerie Chau; Eric Ellestad; Leyla Greengard; Paolo D'Odorico; Puya Vahabi; Alberto Todeschini; Manuela Girotto |
| author_sort | Gabriela May-Lagunes |
| collection | DOAJ |
| container_title | Journal of Hydrology X |
| description | Groundwater, the second largest stock of freshwater on the planet, is an important water source used for municipal water supply, irrigation, or industrial needs. For instance, California’s arid Central Valley relies on groundwater resources to produce a quarter of the United States’ food demand, as farmers rely on this precious resource when surface water is scarce. Despite its importance, the nexus between groundwater dynamics and climate drivers remains difficult to quantify, model, and predict because of the lack of a comprehensive observation network. In this study, machine learning techniques were used to predict groundwater levels with a 3-month forecasting horizon for the Sacramento River Basin. For this, publicly available meteorological and hydrological datasets and in-situ well-level measurements were used. Time series, ensemble-based, and deep-learning models, including transformers, were all tested, with an ensemble-based XGBoost model producing the best mean standard deviation percent error (MSPE) of 32.23% and a root mean squared error (RMSE) of 1.05 m when using a 3-month forecasting horizon and when tested using a monthly rolling window over the years 2017–2020. The model proved to be better at predicting into wet months than into the dry summer months and was found to be better at extracting seasonality than at explaining well-level residuals, with well-specific features, as opposed to exogenous meteorological features specific to the hydrological unit of the well, ranking as the most important features to the model. Though other forecasting horizons were tested, a 3-month look-ahead window resulted in the best balance of precision and accuracy: smaller forecasting horizons resulted in smaller RMSE but larger MSPE scores, and vice versa for larger forecasting horizons. |
| format | Article |
| id | doaj-art-68cc3ff43fb14e9a94fbd4ab6ea41a4e |
| institution | Directory of Open Access Journals |
| issn | 2589-9155 |
| language | English |
| publishDate | 2023-12-01 |
| publisher | Elsevier |
| record_format | Article |
| spelling | doaj-art-68cc3ff43fb14e9a94fbd4ab6ea41a4e; 2025-08-19T23:59:33Z; eng; Elsevier; Journal of Hydrology X; 2589-9155; 2023-12-01; 21; 100161; 10.1016/j.hydroa.2023.100161; Forecasting groundwater levels using machine learning methods: The case of California’s Central Valley. Authors and affiliations: Gabriela May-Lagunes [0]: University of California, School of Information, Berkeley, CA 94720, USA; University of California, Department of Environmental Science, Policy and Management, Berkeley, CA 94720, USA; Corresponding authors at: University of California, School of Information, Berkeley, CA 94720, USA (Gabriela May-Lagunes). Valerie Chau [1]: University of California, School of Information, Berkeley, CA 94720, USA. Eric Ellestad [2]: University of California, School of Information, Berkeley, CA 94720, USA. Leyla Greengard [3]: University of California, School of Information, Berkeley, CA 94720, USA; Corresponding authors at: University of California, School of Information, Berkeley, CA 94720, USA (Gabriela May-Lagunes). Paolo D'Odorico [4]: University of California, Department of Environmental Science, Policy and Management, Berkeley, CA 94720, USA. Puya Vahabi [5]: University of California, School of Information, Berkeley, CA 94720, USA. Alberto Todeschini [6]: University of California, School of Information, Berkeley, CA 94720, USA. Manuela Girotto [7]: University of California, Department of Environmental Science, Policy and Management, Berkeley, CA 94720, USA. Online access: http://www.sciencedirect.com/science/article/pii/S2589915523000147. Keywords: Groundwater; Weather; Wells; California; Xgboost; Supervised learning |
| title | Forecasting groundwater levels using machine learning methods: The case of California’s Central Valley |
| topic | Groundwater; Weather; Wells; California; Xgboost; Supervised learning |
| url | http://www.sciencedirect.com/science/article/pii/S2589915523000147 |
