Classification of lidar measurements using supervised and unsupervised machine learning methods

While it is relatively straightforward to automate the processing of lidar signals, it is more difficult to choose periods of “good” measurements to process. Groups use various ad hoc procedures involving either very simple (e.g. signal-to-noise ratio) or more complex procedures ...

Bibliographic Details
Main Authors:	G. Farhani, R. J. Sica, M. J. Daley
Format:	Article
Language:	English
Published:	Copernicus Publications 2021-01-01
Series:	Atmospheric Measurement Techniques
Online Access:	https://amt.copernicus.org/articles/14/391/2021/amt-14-391-2021.pdf

id	doaj-dfc3268180b540d6ae2e04d59c1469c3
record_format	Article
spelling	doaj-dfc3268180b540d6ae2e04d59c1469c3 | 2021-01-18T08:55:07Z | eng | Copernicus Publications | Atmospheric Measurement Techniques | 1867-1381 | 1867-8548 | 2021-01-01 | 14 | 391 | 402 | 10.5194/amt-14-391-2021 | Classification of lidar measurements using supervised and unsupervised machine learning methods | G. Farhani (0) | R. J. Sica (1) | M. J. Daley (2) | Department of Physics and Astronomy, The University of Western Ontario, 1151 Richmond St., London, ON, N6A 3K7, Canada | Department of Physics and Astronomy, The University of Western Ontario, 1151 Richmond St., London, ON, N6A 3K7, Canada | Department of Computer Science, The Vector Institute for Artificial Intelligence, The University of Western Ontario, 1151 Richmond St., London, ON, N6A 3K7, Canada | https://amt.copernicus.org/articles/14/391/2021/amt-14-391-2021.pdf
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	G. Farhani R. J. Sica M. J. Daley
author_sort	G. Farhani
title	Classification of lidar measurements using supervised and unsupervised machine learning methods
title_sort	classification of lidar measurements using supervised and unsupervised machine learning methods
publisher	Copernicus Publications
series	Atmospheric Measurement Techniques
issn	1867-1381 1867-8548
publishDate	2021-01-01
description	While it is relatively straightforward to automate the processing of lidar signals, it is more difficult to choose periods of “good” measurements to process. Groups use various ad hoc procedures, either very simple (e.g. signal-to-noise ratio) or more complex (e.g. Wing et al., 2018), to perform a task that humans are easily trained to do but that is time-consuming. Here, we use machine learning techniques to train the machine to sort the measurements before processing. The presented method is generic and can be applied to most lidars. We test the techniques using measurements from the Purple Crow Lidar (PCL) system located in London, Canada. The PCL has over 200 000 raw profiles in Rayleigh and Raman channels available for classification. We classify raw (level-0) lidar measurements as “clear” sky profiles with strong lidar returns, “bad” profiles, and profiles which are significantly influenced by clouds or aerosol loads. We examined different supervised machine learning algorithms, including the random forest, the support vector machine, and gradient boosting trees, all of which can successfully classify profiles. The algorithms were trained using about 1500 profiles for each PCL channel, selected randomly from different nights of measurements in different years. The success rate of identification for all the channels is above 95 %. We also used the t-distributed stochastic neighbour embedding (t-SNE) method, which is an unsupervised algorithm, to cluster our lidar profiles. Because t-SNE is a data-driven method in which no labelling of the training set is needed, it is an attractive algorithm for finding anomalies in lidar profiles. The method has been tested on several nights of PCL measurements. The t-SNE can successfully cluster the PCL data profiles into meaningful categories. To demonstrate the use of the technique, we have used the algorithm to identify stratospheric aerosol layers due to wildfires.
url	https://amt.copernicus.org/articles/14/391/2021/amt-14-391-2021.pdf

Similar Items