A Random Forest Machine Learning Approach for the Retrieval of Leaf Chlorophyll Content in Wheat

Developing rapid and non-destructive methods for chlorophyll estimation over large spatial areas is a topic of much interest, as it would provide an indirect measure of plant photosynthetic response, be useful in monitoring soil nitrogen content, and offer the capacity to assess vegetation structura...

Bibliographic Details
Main Authors:	Syed Haleem Shah, Yoseline Angel, Rasmus Houborg, Shawkat Ali, Matthew F. McCabe
Format:	Article
Language:	English
Published:	MDPI AG 2019-04-01
Series:	Remote Sensing
Subjects:	chlorophyll; wheat; photosynthetic pigment; linear regression; vegetation indices; hyperspectral; leaf; retrieval; prediction
Online Access:	https://www.mdpi.com/2072-4292/11/8/920

id	doaj-713e8c90c6e34a68843852ef6681b30d
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Syed Haleem Shah, Yoseline Angel, Rasmus Houborg, Shawkat Ali, Matthew F. McCabe
title	A Random Forest Machine Learning Approach for the Retrieval of Leaf Chlorophyll Content in Wheat
publisher	MDPI AG
series	Remote Sensing
issn	2072-4292
publishDate	2019-04-01
description	Developing rapid and non-destructive methods for chlorophyll estimation over large spatial areas is a topic of much interest, as it would provide an indirect measure of plant photosynthetic response, be useful in monitoring soil nitrogen content, and offer the capacity to assess vegetation structural and functional dynamics. Traditional methods of direct tissue analysis or the use of handheld meters are not able to capture chlorophyll variability at anything beyond point scales, so they are not particularly useful for informing decisions on plant health and status at the field scale. Examining the spectral response of plants via remote sensing has shown much promise as a means to capture variations in vegetation properties, while offering a non-destructive and scalable approach to monitoring. However, determining the optimum combination of spectra or spectral indices to inform plant response remains an active area of investigation. Here, we explore the use of a machine learning approach to enhance the estimation of leaf chlorophyll (Chl_t), defined as the sum of chlorophyll a and b, from spectral reflectance data. Using an ASD FieldSpec 4 Hi-Res spectroradiometer, 2700 individual leaf hyperspectral reflectance measurements were acquired from wheat plants grown across a gradient of soil salinity and nutrient levels in a greenhouse experiment. The extractable Chl_t was determined from laboratory analysis of 270 collocated samples, each composed of three leaf discs. A random forest regression algorithm was trained against these data, with input predictors based upon (1) reflectance values from 2102 bands across the 400–2500 nm spectral range; and (2) 45 established vegetation indices. As a benchmark, a standard univariate regression analysis was performed to model the relationship between measured Chl_t and the selected vegetation indices.
Results show that the root mean square error (RMSE) was significantly reduced when using the machine learning approach compared to standard linear regression. When exploiting the entire spectral range of individual bands as input variables, the random forest estimated Chl_t with an RMSE of 5.49 µg·cm⁻² and an R² of 0.89. Model accuracy improved when vegetation indices were used as input variables, producing an RMSE ranging from 3.62 to 3.91 µg·cm⁻², depending on the particular combination of indices selected. In further analysis, input predictors were ranked according to their importance level, and a step-wise reduction in the number of input features (from 45 down to 7) was performed. This reduction had no significant effect on the RMSE, and showed that much the same prediction accuracy could be obtained with a smaller subset of indices. Importantly, the random forest regression approach identified many important variables that were not good predictors according to their linear regression statistics. Overall, the research illustrates the promise of using established vegetation indices as input variables in a machine learning approach for the enhanced estimation of Chl_t from hyperspectral data.
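The univariate benchmark and the two accuracy measures quoted in the description (RMSE and R²) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the normalized-difference index, the choice of a near-infrared and a red band, and every numeric value below are assumptions made up for illustration, and the random forest step itself is not reproduced here.

```python
import math


def normalized_difference(band_a, band_b):
    """Generic normalized-difference index (a - b) / (a + b) from two reflectances."""
    return (band_a - band_b) / (band_a + band_b)


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(y_true, y_pred):
    """Root mean square error, in the units of y (here µg·cm⁻²)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical leaf reflectances and chlorophyll values (not data from the paper).
nir = [0.40, 0.45, 0.50, 0.55, 0.60]
red = [0.12, 0.10, 0.08, 0.06, 0.05]
chl_measured = [18.0, 25.0, 33.0, 42.0, 47.0]  # µg·cm⁻²

# One vegetation index per leaf, then a single-predictor linear model.
index_values = [normalized_difference(n, r) for n, r in zip(nir, red)]
intercept, slope = fit_line(index_values, chl_measured)
chl_predicted = [intercept + slope * v for v in index_values]

print("RMSE:", rmse(chl_measured, chl_predicted))
print("R^2:", r_squared(chl_measured, chl_predicted))
```

Repeating the fit once per index gives the kind of per-index regression statistics that the description compares against the random forest's importance ranking.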
topic	chlorophyll; wheat; photosynthetic pigment; linear regression; vegetation indices; hyperspectral; leaf; retrieval; prediction
url	https://www.mdpi.com/2072-4292/11/8/920
spelling	doaj-713e8c90c6e34a68843852ef6681b30d; 2020-11-24T20:46:16Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2019-04-01; vol. 11, no. 8, art. 920; 10.3390/rs11080920; rs11080920
affiliations	Syed Haleem Shah: Hydrology, Agriculture and Land Observation Group, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Yoseline Angel: Hydrology, Agriculture and Land Observation Group, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Rasmus Houborg: Planet, San Francisco, CA 94107, USA; Shawkat Ali: Kentville Research and Development Centre, Agriculture and Agri-Food Canada, 32 Main Street Kentville, Kentville, NS B4N 1J5, Canada; Matthew F. McCabe: Hydrology, Agriculture and Land Observation Group, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia

Similar Items