Exploration of machine learning methods for the classification of infrared limb spectra of polar stratospheric clouds
Main Authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Copernicus Publications, 2020-07-01 |
Series: | Atmospheric Measurement Techniques |
Online Access: | https://www.atmos-meas-tech.net/13/3661/2020/amt-13-3661-2020.pdf |
Summary: | Polar stratospheric clouds (PSCs) play a key role in polar ozone depletion in the
stratosphere. Improved observations and continuous monitoring of PSCs can help to validate and
improve chemistry–climate models that are used to predict the evolution of the polar ozone
hole. In this paper, we explore the potential of applying machine learning (ML) methods to
classify PSC observations from infrared limb sounders. Two datasets were considered in this
study. The first dataset is a collection of infrared spectra captured in Northern Hemisphere
winter 2006/2007 and Southern Hemisphere winter 2009 by the Michelson Interferometer for Passive
Atmospheric Sounding (MIPAS) instrument on board the European Space Agency's (ESA) Envisat
satellite. The second dataset is the cloud scenario database (CSDB) of simulated MIPAS spectra.
We first performed an initial analysis to assess the basic characteristics of the CSDB and to
decide which features to extract from it. Here, we focused on an approach using brightness
temperature differences (BTDs). From both the measured and the simulated infrared spectra, more
than 10 000 BTD features were generated. Next, we assessed ML methods for reducing the
dimensionality of this large feature space, using principal component analysis (PCA) and kernel
principal component analysis (KPCA), followed by classification with a support vector machine
(SVM). The random forest (RF) technique, which embeds the feature selection step, was also used
as a classifier. All methods were found to be suitable for retrieving information on the
composition of PSCs. Of these, RF seems to be the most promising method, being less prone to
overfitting and producing results that agree well with established results based on conventional
classification methods. |
ISSN: | 1867-1381 1867-8548 |
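
The summary describes brightness temperature differences (BTDs) as the features extracted from each spectrum. As a rough illustration of the idea, and not the authors' implementation, the sketch below converts radiances to brightness temperatures with the inverse Planck function and forms a BTD for every pair of spectral windows. The function names, the window wavenumbers and the 200 K test temperature are hypothetical; the record does not say how the windows were chosen. Forming all pairs of n windows yields n(n-1)/2 features, which is one way a feature space can grow past 10 000 entries.

```python
import math
from itertools import combinations

# Physical constants (SI, exact values from the 2019 SI definition).
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K

# Radiation constants for wavenumbers in cm^-1 and
# radiances in W / (m^2 sr cm^-1).
C1 = 2.0 * H * C**2 * 1.0e8   # W / (m^2 sr cm^-4)
C2 = H * C / K_B * 100.0      # cm K


def planck_radiance(wavenumber, temperature):
    """Blackbody radiance at a wavenumber (cm^-1) and temperature (K)."""
    return C1 * wavenumber**3 / math.expm1(C2 * wavenumber / temperature)


def brightness_temperature(wavenumber, radiance):
    """Invert the Planck function: radiance -> brightness temperature (K)."""
    return C2 * wavenumber / math.log1p(C1 * wavenumber**3 / radiance)


def btd_features(wavenumbers, radiances):
    """Return a BTD for every unordered pair of spectral windows.

    Keys are (wavenumber_i, wavenumber_j); values are BT_i - BT_j in K.
    """
    bts = [brightness_temperature(nu, rad)
           for nu, rad in zip(wavenumbers, radiances)]
    return {
        (wavenumbers[i], wavenumbers[j]): bts[i] - bts[j]
        for i, j in combinations(range(len(bts)), 2)
    }


if __name__ == "__main__":
    # Hypothetical window centres (cm^-1), chosen only for illustration.
    windows = [820.0, 833.0, 949.0, 1224.0]
    # A perfect 200 K blackbody has the same brightness temperature in
    # every window, so all BTDs should be close to zero.
    radiances = [planck_radiance(nu, 200.0) for nu in windows]
    for pair, btd in btd_features(windows, radiances).items():
        print(pair, round(btd, 6))
```

In real spectra the BTDs differ from zero because cloud particles emit and scatter differently at different wavenumbers, and that spectral contrast is what a classifier such as an SVM or RF could then exploit.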