Impact Statement
Extreme climate hazards and the increased scarcity of and demand for fresh water necessitate improvements in reservoir storage prediction. The research goal is to give water resource managers an improved ability to make informed decisions about water usage that minimize impacts on local communities and businesses. This paper demonstrates 14-day short-term reservoir predictions as a building block for developing better models that incorporate weather predictions, soil moisture, and statistical weather data for reservoir prediction out to 90 days or longer. This project leverages expertise in data science, software engineering, signal processing, meteorology, and climate science to produce promising results.
1. Introduction
Water resource management plays an important role in a community's climate resilience. Reservoirs can supply water to nearby communities, generate clean hydroelectric energy, allow aquatic recreation in inland areas, and provide habitats for aquatic life. Reservoirs also buffer weather-induced changes in the local water balance. During dry periods, reservoirs provide water that sustains nearby agricultural practices. During heavy rain events, reservoirs with excess storage can accept extra runoff, thereby mitigating the effects of flash floods.
Resource managers must be able to predict near-term reservoir levels to maintain optimal operating conditions and prevent unnecessary risks, such as surface water scarcity, in their communities. The complexity of reservoir prediction makes development of predictive tools difficult, so few systems exist to help these managers make such decisions. Non-linearities associated with processing data from episodic natural phenomena, time lags caused by the flow of water through drainage basins, and uncertainties introduced by the inclusion of weather forecast data all contribute to the difficulty of prediction. As a result, most reservoir management strategies tend to be reactive to local weather conditions (Tounsi et al., 2022). Forecasting solutions, such as those developed by Tounsi et al. (2022) and Shiri et al. (2016), allow for more proactive reservoir management strategies. Better reservoir level forecasts also improve resource managers' abilities to plan for extreme climate events such as droughts and floods. Most applications of artificial intelligence (AI) to water resource management focus on water demand forecasting and are typically catered toward water utility companies (such as Antunes et al., 2018). Applications of AI-based technologies in water infrastructure and water management systems are growing and are expected to continue to grow as technology develops (Mehmood et al., 2020; Niknam et al., 2022).
Existing reservoir prediction models use statistical techniques to predict future reservoir levels. A common implementation involves the autoregressive integrated moving average (ARIMA) family of models, which analyzes a time series by comparing it against time-lagged versions of itself. Sabzi et al. (2016) created a set of ARIMA models to predict reservoir inflow and develop operations strategies for reservoirs in southern New Mexico, USA. Similarly, Valipour et al. (2012) developed models to predict inflow to Iranian reservoirs. Patle et al. (2015) developed models to analyze groundwater usage in Haryana, India. The work of Musarat et al. (2021) forecasts discharges on the Kabul River in Pakistan.
Technological advancements have contributed to the development of more intricate computational techniques for analyzing complex problems. Developments in fields such as machine learning (ML) have shifted the burden of data analysis from manual, human-centric techniques toward automated, computerized techniques. These ML methods allow for improved data analysis, particularly in situations where data are high-dimensional and show few correlation patterns meaningful to the human eye. ML also allows for faster model prototyping and development. Niu et al. (2019) took advantage of these advances in computation by showing that multiple ML techniques outperformed standard multiple linear regression (MLR) when predicting reservoir levels in China. Similarly, Shamim et al. (2016) showed that localized linear ML models are capable of predicting reservoir levels in Pakistan. Qie et al. (2022) analyzed reservoir outflow for two sites in Illinois, USA and showed promising results using multiple statistical techniques. ML models are also frequently used in water quality research, including recent studies in Vietnam (Nguyen et al., 2021), Hong Kong (Deng et al., 2021), and Ghana (Ewusi et al., 2021).
One particular ML algorithm used in hydrological domains is the Artificial Neural Network (ANN). Originally devised by McCulloch and Pitts (1943), the ANN is designed to mimic human brain functionality by implementing a series of logical decision gates, known as neurons, to perform data analysis. Different types of ANNs can be formed by altering the decision function at each gate and/or the internal architecture of the network. Das et al. (2016) applied Bayesian probabilistic analysis at each logic gate to produce a model that outperformed both ARIMA and traditional ANNs for predictions at a reservoir in Jharkhand, India. Chang and Chang (2006) and Unes et al. (2017) implemented fuzzy logic at neural gates to achieve similar levels of success at predicting reservoir status in Taiwan and Turkey, respectively.
Continued research on ANNs has led to the creation of specialized network structures for particular applications. One such structure is the Recurrent Neural Network (RNN), which loops data through the network multiple times before "forgetting" the data. These loops allow for the analysis of recent history, making the RNN a particularly useful tool for analyzing sequential data, such as time series (Hewamalage et al., 2021). To combat the vanishing- and exploding-gradient problems that can arise during training, Hochreiter and Schmidhuber (1997) developed the long short-term memory (LSTM) extension of the RNN. Zhang et al. (2019) showed that RNN models enhanced with LSTM outperformed other neural models in modeling reservoir outflow at a hydropower station on the Jinsha River in China. Similarly, Liu et al. (2022) used LSTM to augment hydrological simulations and improved forecast accuracy by as much as 6% for streamflow predictions at a hydropower station in Guangxi, China.
A common thread linking previous research is that almost all projects focus on implementing one or more statistical techniques at a single station, drainage basin, or limited set of reservoirs (often just one or two). The work outlined here takes a broader predictive analytics approach, modeling reservoirs across multiple basins and climate variations in Texas, USA. Successful ML models would predict levels at selected sites in Texas, with 7-day forecasts having no more than 5% error. The reservoirs span a wide range of climate divisions in the state and are selected based on continuous data availability and length of period of record. Successful model development within the study area indicates potential for expansion to other reservoirs across the USA and the world. The novelty of the proposed technique lies in the applicability of deep learning (DL) models across a broad swath of reservoirs spanning varying climatological and hydrological conditions.
2. Methodology
2.1. Study area and time period
This project focuses on 17 reservoirs in the state of Texas, USA. The 17 sites are listed in Table 1 and mapped in Figure 1. The 17 reservoirs are located in 16 different watersheds, as defined by 8-digit United States Geological Survey (USGS) Hydrologic Unit Codes. Joe Pool Lake and Lake Weatherford share a watershed. The reservoirs also span nine of the 10 climate divisions in Texas. Some reservoirs lie on the boundary between climate divisions.
The selected reservoirs are spread across most of Texas east of the Pecos River. Reservoir data for sites west of the Pecos River are not available through the chosen data sources. The geographic diversity of the study area requires the models to be robust against a wide range of potential weather inputs and operational use cases, including warmer and wetter conditions near the Gulf of Mexico and drier and colder reservoir conditions in the Texas panhandle. Reservoir level and/or storage data are publicly available through the USGS stream gage network. The Texas Water Development Board provides elevation-area-capacity (EAC) rating curve information in a machine-readable format, allowing for rapid conversion from gage height to reservoir storage capacities. For each reservoir, historical stream gage data and climatological information are collected.
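As an illustration of this conversion step, the sketch below interpolates storage from gage readings against an EAC rating curve. The column names are hypothetical, any datum offset between gage height and pool elevation is ignored, and the sketch is not the authors' production code.

```python
# Minimal sketch: convert gage height to reservoir storage via an EAC rating
# curve. Assumes the machine-readable curve provides pool elevation versus
# cumulative capacity, sorted by elevation.
import numpy as np
import pandas as pd

def gage_height_to_storage(gage_height_ft: pd.Series, eac_curve: pd.DataFrame) -> pd.Series:
    """Interpolate storage (acre-feet) from gage height (feet).

    eac_curve is assumed to have columns 'elevation_ft' and 'capacity_af'.
    """
    storage = np.interp(gage_height_ft.to_numpy(),
                        eac_curve["elevation_ft"].to_numpy(),
                        eac_curve["capacity_af"].to_numpy())
    return pd.Series(storage, index=gage_height_ft.index, name="storage_af")
```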
2.2. Training/validation/test data splits
Data from January 1, 2010 to December 31, 2020 are used for training, while data from January 1, 2021 to December 31, 2022 are used for testing the trained models. The rationale for this split is that sequential division of the dataset preserves autocorrelation within the data to the greatest extent possible. Additionally, the last 30% of the training set, roughly December 2017 to December 2020, is used for validation during training to track model performance between epochs. The Parameter-elevation Regressions on Independent Slopes Model (PRISM) dataset (Daly and Bryant, 2013) provides gridded precipitation and temperature data during the study period. Values from multiple gridpoints were spatially averaged to produce a single value for each watershed. Linear interpolation was used to fill any missing values in the dataset. Factors leading to this division of data include uniformity and completeness. The date ranges were chosen such that the same date ranges apply to each reservoir in the study area. The studied reservoirs all have data for the same period of record, keeping the dates uniform across all models. Should the authors decide to add new reservoirs with different periods of record, the data division protocols may be amended. Additionally, the 70/30 split between model training and validation data was chosen such that 3 years of data would be captured in the validation set. This increases the likelihood of capturing extreme precipitation and drought events in the validation set. For similar reasons, the test set was chosen to include 2 years of data.
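A minimal sketch of this chronological split is shown below, assuming a daily pandas DataFrame per reservoir; the boundary dates follow the text, while the gap-filling call and column layout are assumptions.

```python
# Minimal sketch: chronological train/validation/test split that preserves
# autocorrelation by never shuffling the time series.
import pandas as pd

def split_series(df: pd.DataFrame):
    """df: daily DataFrame indexed by date with storage, temperature, and
    precipitation columns for one reservoir's watershed."""
    # Fill short gaps with linear interpolation, as described in the text.
    df = df.interpolate(method="linear")

    train = df.loc["2010-01-01":"2020-12-31"]
    test = df.loc["2021-01-01":"2022-12-31"]

    # Last 30% of the training span (roughly Dec 2017 to Dec 2020) serves as
    # the validation set for tracking performance between epochs.
    split_idx = int(len(train) * 0.7)
    train_fit, validation = train.iloc[:split_idx], train.iloc[split_idx:]
    return train_fit, validation, test
```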
2.3. Model construction
Based on the literature review and the applicability of DL for the task at hand, a RNN that incorporates LSTM was used for training a model for each reservoir. Each model uses historical temperature, precipitation, and reservoir storage data within the associated watershed. Reservoir levels exhibit time series behavior and correlate well with meteorological time series in the local watershed. Additionally, the LSTM element allows for the inclusion of recent weather phenomena in the analysis.
The trained model predicts daily changes in reservoir storage given a 14-day history of reservoir storage, temperature, and precipitation. Model performance is then evaluated using data in the test range. Multi-day forecasts are generated by iterating model outputs over the desired length of the forecast. Seven- and 14-day forecasts are generated for every date in the test range, and then compared to observed reservoir storage, to generate accuracy metrics such as the root mean squared error (RMSE) and mean absolute percent error (MAPE).
Figure 2 traces the flow of data from the individual datasets through the modeling effort.
The RNN consists of two main parts: an LSTM structure (where the recurrence occurs) and a densely connected structure. Figure 3 provides a visual representation of the RNN schema. The LSTM structure is a single layer of 50 nodes. Data are fed through the LSTM repeatedly until the nodes "forget" about the data. The models are defined such that data older than 14 days are "forgotten." The dense structure contains four layers of 50 nodes each. Nodes in neighboring layers all connect, though not all connections may be active during a calculation. The LSTM structure uses a sigmoidal activation function while the densely connected layers use hyperbolic tangent activation functions. A learning rate of $10^{-4}$ is used for training. Each model predicts the next day's change in reservoir levels for its associated reservoir. Performance is estimated after each epoch using a subset of the training data known as the validation set. The model with the smallest validation error is saved and considered to be the trained model for that site.
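A minimal sketch of this architecture in Keras is given below. The layer sizes, activations, 14-day lookback, and learning rate follow the description above; the optimizer (Adam) and loss function (mean squared error) are assumptions, as they are not specified in the text.

```python
# Minimal sketch of the per-reservoir network: one LSTM layer of 50 nodes with
# a sigmoidal activation, four dense layers of 50 nodes with tanh activations,
# and a single output giving the next day's change in reservoir storage.
import tensorflow as tf

LOOKBACK = 14   # days of history fed to the model
N_FEATURES = 3  # storage, temperature, precipitation

def build_model() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(LOOKBACK, N_FEATURES)),
        tf.keras.layers.LSTM(50, activation="sigmoid"),
        tf.keras.layers.Dense(50, activation="tanh"),
        tf.keras.layers.Dense(50, activation="tanh"),
        tf.keras.layers.Dense(50, activation="tanh"),
        tf.keras.layers.Dense(50, activation="tanh"),
        tf.keras.layers.Dense(1),  # next day's storage change
    ])
    # Adam and MSE are assumed; the 1e-4 learning rate follows the text.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss="mse")
    return model
```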
A separate model, using the same methodology, is trained for each reservoir in the study area. Each reservoir is influenced locally by its surrounding hydrological and weather conditions, so it makes sense to train individual models at a local level. In future work, the authors aim to show that the process applies to reservoirs outside of the study area. Implementation at other reservoirs requires long-term digital reservoir data records, additional data collection and model training, and additional EAC data, which are not always easy to find. Future studies will investigate model performance at new reservoirs that are not present in this initial study. Development of a single model for all reservoirs may also occur in future work.
2.4. Hindcasting and forecasting
To validate the models, an iterative hindcasting process is deployed. The reservoir storage for the first day is predicted using a training data matrix containing the previous 14 days of storage, temperature, and precipitation readings. Once a forecasted (or predicted) value is obtained, the oldest storage, temperature, and precipitation values are removed from the data matrix. Then, the forecasted storage, next temperature value, and next precipitation value are appended to the data matrix. This maintains 14 days of data in the data matrix. Thus, the data matrix for the second day’s forecast would contain the previous 13 days of observed data, the first day’s storage forecast, and the first day’s temperature and precipitation values. The temperature and precipitation values used during hindcasting are the historically observed data for future dates obtained from the PRISM dataset.
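The iterative procedure can be sketched as follows, assuming a trained model that maps a 14-day window of [storage, temperature, precipitation] to the next day's storage change. Helper names are hypothetical, and any input scaling used by the trained models is omitted.

```python
# Minimal sketch of the iterative hindcast loop: each predicted storage value
# is fed back into the rolling 14-day window along with the next observed
# temperature and precipitation values (PRISM during hindcasting; a forecast
# model such as GFS in a true forecasting setting).
import numpy as np

def iterative_hindcast(model, window, future_temp, future_precip, horizon=14):
    """window: array of shape (14, 3) holding [storage, temp, precip] history."""
    window = window.copy()
    forecasts = []
    for day in range(horizon):
        # Predict the next day's change in storage from the current window.
        delta = model.predict(window[np.newaxis, ...], verbose=0)[0, 0]
        next_storage = window[-1, 0] + delta
        forecasts.append(next_storage)
        # Drop the oldest row; append the forecasted storage plus the next
        # day's temperature and precipitation to keep 14 days in the window.
        new_row = np.array([next_storage, future_temp[day], future_precip[day]])
        window = np.vstack([window[1:], new_row])
    return np.array(forecasts)
```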
To convert to a forecasting environment, the same methodology can be applied. Data observations are no longer available, so use of a forecast model, such as the National Oceanic and Atmospheric Administration’s (NOAA) Global Forecast System (GFS) model is required for the meteorological component.
This iterative hindcast/forecast process provides an added benefit of highlighting models that are consistently biased toward over-prediction or under-prediction. Over the course of a 14-day output period, the prediction error for each day can compound. Therefore, any potential biases would accumulate and become clearly evident. For example, if a model for a particular site is biased toward overprediction, then the residual errors produced by the hindcast process would also be biased toward overprediction. With 730 samples (one for each day in the 2-year test set), biases would be clearly shown.
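A simple way to quantify such a bias, sketched below under the same assumed array conventions as above, is the mean residual of the hindcasts over the test period: a persistently positive value indicates over-prediction and a negative value indicates under-prediction.

```python
# Minimal sketch: mean residual over the ~730 hindcast samples in the test set.
import numpy as np

def hindcast_bias(predicted: np.ndarray, observed: np.ndarray) -> float:
    """Return the mean residual in acre-feet; sign indicates bias direction."""
    return float(np.mean(predicted - observed))
```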
2.5. Evaluation
Two metrics are used to evaluate the performance of the predictions: RMSE and MAPE. These metrics are computed separately for each reservoir and for each hindcast length over the test period.
RMSE generally is not comparable across reservoirs since the reservoirs vary in capacity and operating range. For example, Lake Ray Hubbard is one of the larger reservoirs in this study, with a minimum storage during the study period of roughly 257,000 acre-feet and a storage capacity of roughly 452,000 acre-feet. Meanwhile, Lake Weatherford is one of the smallest reservoirs, with a minimum storage of roughly 9,300 acre-feet and a storage capacity of roughly 17,800 acre-feet. These differences in reservoir characteristics justify the use of MAPE to weight model performance relative to the characteristics of each reservoir. However, it is still useful to have absolute error statistics since these errors also represent changes in reservoir height.
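For reference, the two metrics can be computed as in the sketch below, where observed and predicted are arrays of storage values (acre-feet) for a single reservoir and hindcast length.

```python
# Minimal sketch of the evaluation metrics used in this study.
import numpy as np

def rmse(observed: np.ndarray, predicted: np.ndarray) -> float:
    """Root mean squared error, in the same units as the inputs (acre-feet)."""
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def mape(observed: np.ndarray, predicted: np.ndarray) -> float:
    """Mean absolute percent error, expressed as a percentage."""
    return float(np.mean(np.abs((observed - predicted) / observed)) * 100.0)
```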
3. Results
Once models were trained, hindcast outputs were calculated for each day within the 2-year test period. Tabulated MAPE and RMSE for 7-day and 14-day hindcasts are provided in Table 2. From these results, it is clear that the models are capable of predicting reservoir storage within the established benchmarks. Eight of the 17 reservoirs had MAPE rates below 1% for 7-day hindcasts. Additionally, eight of the 17 reservoirs had MAPE rates below 2% for 14-day hindcasts. Lake Weatherford achieved the 1% threshold for 7-day hindcasts but not the 2% threshold for 14-day hindcasts while Joe Pool Lake showed the opposite behavior.
Figures 4 and 5 plot the 7- and 14-day predictions, respectively, compared to the observed reservoir storage over a 2-year span at Lake Meredith, Lake Corpus Christi, Joe Pool Lake, and Twin Buttes Reservoir. For these graphs, the x-axis represents the date hindcasted, meaning that a 7-day hindcast initiated with observed storage up to April 1, 2022 would appear as the value for April 8, 2022. That value would be compared against the observed storage value for April 8, 2022.
Figures 4 and 5 also include the daily precipitation over the 2-year test period. These data help explain trends in the reservoir storage levels, especially extreme weather events that cause sharp changes in reservoir storage over the course of a few days.
4. Discussion
Among the 17 reservoirs, the best-performing site was Lake Meredith, which posted a 7-day MAPE of 0.32% and a 14-day MAPE of 0.54%. This success may be attributed to the consistency of Lake Meredith's storage levels. Its lowest storage values were around 192,000 acre-feet while the highest storage values were around 232,000 acre-feet, representing a 17.2% difference between the highest and lowest readings. Canyon Lake had the next best performing model, with a 7-day MAPE of 0.42% and a 14-day MAPE of 0.86%. Interestingly, over the period of study, that reservoir's storage fluctuated between 286,000 and 546,000 acre-feet, a 47.6% difference. However, over the test period, Canyon Lake only fluctuated between 302,000 and 380,000 acre-feet. Consistency in the test dataset appears to drive test set accuracy metrics.
On the other hand, the worst-case scenario for prediction occurred at Lake J.B. Thomas, which lagged behind its peers by posting a 7-day MAPE of 2.10% and a 14-day MAPE of 3.84%. These errors may be due to large fluctuations in reported reservoir levels. One such instance occurred prior to September 2014, when the reservoir storage remained below 16,000 acre-feet. In April 2013, the level went below the dead pool capacity of 673 acre-feet. During the test period, J.B. Thomas reached its minimum storage of approximately 22,700 acre-feet on May 15, 2021. By July 14, 2021, J.B. Thomas had reached a storage of approximately 98,570 acre-feet. This 77% difference in values is abnormal among the studied reservoirs. Despite these fluctuations, the prediction results are still well within the established 5% error benchmark. Visually, the hindcasted storage values align with the observed storage values. This is particularly evident for the 7-day hindcasts in Figure 4, where model predictions correspond to large fluctuations in storage caused by the presence or absence of precipitation within the watershed. This behavior is especially clear during May 2021, when Joe Pool Lake's storage increased by over 26.5% in response to large precipitation events. Similarly, Lake Corpus Christi increases from 238,000 acre-feet to 302,000 acre-feet over a span of 11 days from May 13 to 24, 2021. The reservoir then jumps another 48,000 acre-feet to 350,000 acre-feet by June 10, 2021. These events are hindcast accurately, with the model output responding appropriately to strong rain events.
In some cases, the 7- and 14-day forecasts in Figures 4 and 5 show a potential for over-sensitivity to precipitation. For example, observing Lake Corpus Christi over the same periods in May and June 2021, a cause-and-effect response to rain events is seen as sharp upward spikes in the hindcast. This is also seen for Joe Pool Lake in May and July 2022. For longer-term hindcasts, the models are less sensitive to rapid declines in reservoir levels. An example of this is June 10–27, 2021 for Joe Pool Lake, when the reservoir declined steadily by roughly 2,900 acre-feet per day. This occurs because the reservoir is well above its conservation storage capacity of 175,000 acre-feet, and reservoir managers significantly increase reservoir outflow to return the reservoir to capacity. The current models do not incorporate inflow and outflow data because of the lack of historical data availability. However, the models may improve significantly with the inclusion of these data, which the authors are currently investigating.
5. Conclusion and Future Work
A novel AI-based, climate-resilient methodology has been proposed for water planning. The trained models and hindcast results meet predictive accuracy benchmarks, allowing reservoir managers to accurately predict reservoir levels up to 14 days in the future. Potential future work includes incorporating additional data such as inflow, outflow, cloud cover, soil moisture, and other related meteorological parameters, as well as expansion to other reservoirs in the USA and other countries. Extending the prediction range to 30 days and beyond may be possible by training models to predict reservoir changes over multiple days instead of incrementing one day at a time. To make the trained models useful to water managers, the proposed models can be deployed in a forecasting environment. In that setting, the historical climate data used in the hindcasts are replaced with forecasted meteorological data from a global forecast model, such as the NOAA GFS model. This work demonstrates an important first step toward developing a water resource prediction product for water resource managers.
Acknowledgments
We are grateful for the technical assistance of other TRABUS software engineers—Andrew Tec, Andrew Smith, Cathy Hsieh, and Andy Van Pelt, in providing us a data infrastructure and the DL environment to conduct this research.
Author contribution
Conceptualization: E.R. and D.S.; Data curation: E.R. and N.W.; Methodology: E.R., N.W., and D.S.; Project administration: D.S.; Software: E.R. and N.W.; Visualization: E.R. and N.W.; Writing—original draft: E.R., N.W., and D.S.; Writing—review and editing: E.R., N.W., and D.S.
Competing interest
The authors declare no competing interests exist.
Data availability statement
Relevant training data for the proposed technique have been made available at https://zenodo.org/badge/latestdoi/651280043 and can be cited using Rohli et al. (2023).
Ethics statement
The research meets all ethical guidelines, including adherence to the legal requirements of the study country.
Funding statement
This research was supported by TRABUS’ internal research and development funding.
Provenance statement
This article is part of the Climate Informatics 2023 proceedings and was accepted in Environmental Data Science on the basis of the Climate Informatics peer-review process.