Optimum Combination of Spectral Variables for Crop Mapping in Heterogeneous Landscapes based on Sentinel-2 Time Series and Machine Learning

This article aimed to determine a workflow for more efficient large-scale crop mapping using a time series of images from the Sentinel-2 satellite, statistical methods of attribute selection, and machine learning. The proposed methodology explores the best possible combination of spectral variables...

Bibliographic Details
Published in: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Main Authors: J. G. de Oliveira Júnior, J. C. D. M. Esquerdo, R. A. C. Lamparelli
Format: Article
Language: English
Published: Copernicus Publications 2024-11-01
Online Access: https://isprs-annals.copernicus.org/articles/X-3-2024/85/2024/isprs-annals-X-3-2024-85-2024.pdf
collection DOAJ
description This article aimed to determine a workflow for more efficient large-scale crop mapping using a time series of images from the Sentinel-2 satellite, statistical methods of attribute selection, and machine learning. The proposed methodology explores the best possible combination of spectral variables related to vegetation (16 vegetation indices in the RGB, NIR, SWIR, and Red Edge regions) to characterize different spectro-temporal profiles of Land Use and Land Cover (LULC) in spatially heterogeneous landscapes. First, we applied a data dimensionality reduction analysis using Principal Component Analysis (PCA). Subsequently, the variables that showed the highest statistical correlation with one another were used in the spectro-temporal classification process with the Random Forest, TempCNN, and LightTAE algorithms, following three different strategies: C1 (ALL), C2 (BE + IV (Red Edge)), and C3 (BE + IV (without Red Edge)), where ALL = all variables, BE = spectral bands, and IV = vegetation indices. The C2 classification scenario (with bands B3, B4, B5, B6, B7, B8, and B8A and the NDRE1, RESI, and MSR indices) demonstrated the best LULC classification accuracy at the crop pattern level compared to the other scenarios, with average values of 0.91, 0.88, 0.91, 0.89, and 0.89 (Global Accuracy, Producer Accuracy, User Accuracy, Kappa index, and F1-Score, respectively, for the TempCNN model). This result emphasized the importance of both qualitative and quantitative variability of sampling data and of variables based on the Red Edge region for improving LULC classification processes in large-scale heterogeneous landscapes.
id doaj-art-0c044b74f5b346aa8bcdaafbd3bc8039
institution Directory of Open Access Journals
issn 2194-9042
2194-9050
doi 10.5194/isprs-annals-X-3-2024-85-2024
volume X-3-2024
pages 85-92
affiliations J. G. de Oliveira Júnior: UNICAMP - Universidade Estadual de Campinas, Brazil
J. C. D. M. Esquerdo: UNICAMP - Universidade Estadual de Campinas, Brazil; Embrapa Agricultura Digital, Brazil
R. A. C. Lamparelli: UNICAMP - Universidade Estadual de Campinas, Brazil; NIPE - Núcleo Interdisciplinar de Planejamento Energético, Brazil
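
The abstract reports five accuracy figures (Global Accuracy, Producer Accuracy, User Accuracy, Kappa index, and F1-Score). As a rough illustration of how such figures are commonly derived from a confusion matrix, the sketch below computes them in plain Python. It is not the authors' code: the function name accuracy_metrics, the rows = reference / columns = predicted convention, the unweighted averaging across classes, and the 2x2 matrix are all assumptions made for illustration, and the numbers are invented rather than taken from the article.

```python
from typing import Dict, List


def accuracy_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Summarise a square confusion matrix.

    Assumed convention: rows are reference (ground truth) classes and
    columns are predicted classes. Every class is assumed to have at
    least one reference sample and at least one correct prediction,
    so no division by zero occurs.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(k)]
    row_tot = [sum(cm[i]) for i in range(k)]                          # reference totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]     # predicted totals

    overall = sum(diag) / total
    producer = [diag[i] / row_tot[i] for i in range(k)]  # per-class recall
    user = [diag[i] / col_tot[i] for i in range(k)]      # per-class precision
    f1 = [2 * p * u / (p + u) for p, u in zip(producer, user)]

    # Cohen's kappa: agreement beyond what the marginals predict by chance.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall_accuracy": overall,
        "producer_accuracy": sum(producer) / k,  # unweighted mean over classes
        "user_accuracy": sum(user) / k,
        "kappa": kappa,
        "f1_score": sum(f1) / k,
    }


if __name__ == "__main__":
    # Hypothetical two-class matrix, invented for illustration only.
    example = [[45, 5],
               [10, 40]]
    for name, value in accuracy_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Whether the article averages per-class values with or without class weights is not stated in this record, so the unweighted mean used here should be read as one plausible choice.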