Quantitative Precipitation Estimates Using Machine Learning Approaches with Operational Dual-Polarization Radar Data

Traditional radar-based rainfall estimation is typically done by known functional relationships between the rainfall intensity (R) and radar measurables, such as R–Z<sub>h</sub>, R–(Z<sub>h</sub>, Z<sub>DR</sub>), etc. One of the biggest advantages of machine lear...

Full description

Bibliographic Details
Main Authors: Kyuhee Shin, Joon Jin Song, Wonbae Bang, GyuWon Lee
Format: Article
Language:English
Published: MDPI AG 2021-02-01
Series:Remote Sensing
Subjects:
Online Access:https://www.mdpi.com/2072-4292/13/4/694
Description
Summary:Traditional radar-based rainfall estimation is typically done by known functional relationships between the rainfall intensity (R) and radar measurables, such as R–Z<sub>h</sub>, R–(Z<sub>h</sub>, Z<sub>DR</sub>), etc. One of the biggest advantages of machine learning algorithms is the applicability to a non-linear relationship between a dependent variable and independent variables without any predefined relationships. We explored the potential use of two supervised machine learning methods (regression tree and random forest) in rainfall estimation using dual-polarization radar variables. The regression tree does not require normalization and scaling of data; however, this method is quite unstable since each split depends on the parent split. Since the random forest is an ensemble method of regression trees, it has less variability in prediction compared with regression trees, but consumes more computer resources. We considered several different configurations for machine learning algorithms with different sets of dependent and independent variables. The random forest model was appropriately tuned. In the test of variable importance, the specific differential phase (differential reflectivity) was the most important variable to predict the rainfall rate (residual that is the difference between the true rainfall rate and the one estimated from the R–Z relationship). The models were evaluated by 10-fold cross-validation. The best model was the random forest model using a residual with the non-classified training set. The results indicated that the machine learning algorithms outperformed the traditional R–Z relationship. Then, we applied the best machine learning model to an S-band dual-polarization radar (Mt. Myeonbong) and validated the result with ground rain gauges. The results of the application to radar data showed that the estimates of the residuals had spatial variability. The stratiform and weak rain areas had positive residuals while convective areas had negative residuals, indicating that the spatial error structure driven by the R–Z relationship was well captured by the model. The rainfall rates of all pixels over the study area were adjusted with the estimated residuals. The rainfall rates adjusted by residual showed excellent agreement with the rain gauge, especially at high rainfall rates.
ISSN:2072-4292