Summary: | Master's === Yuan Ze University === Department of Industrial Engineering and Management === 107 === The world we live in has changed significantly over recent decades: natural habitats such as forests have been replaced by densely populated settlements, factories, commercial centers, and busy roads crowded with vehicles. As a result, few green spaces remain to filter out dust, smoke, and other hazardous substances, which contributes to air pollution. According to a WHO report, air pollution accounts for 1.3 million deaths annually, which underscores the urgency of the issue. Many researchers have attempted to predict episodes of poor air quality, but most of these studies relied on datasets spanning only a couple of years. Such a short dataset is not sufficient to capture all of the seasonal patterns present in real air pollution data. To fill this gap, this study proposes several prediction models built on an eleven-year dataset gathered from the Environmental Protection Administration (EPA) of Taiwan. Machine learning methods, including random forest, AdaBoost, SVM, ANN, and stacking ensemble learning, are trained on the 11 years of data. The results are promising and indicate that machine learning is suitable for predicting the AQI level, particularly in Taiwan. Across 9 experiments covering 3 different datasets and prediction targets, the top 3 algorithms are always stacking, AdaBoost, and random forest. Stacking and AdaBoost compete closely: the stacking model consistently achieves the best R2 and RMSE scores, while the best MAE is usually obtained by AdaBoost. Additionally, the EPA data serve a second purpose: together with data from the Central Weather Bureau (CWB), they are compared against our own dataset, obtained from an air pollution monitoring device that we deployed.
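The three scores used above to rank the models (R2, RMSE, and MAE) can be computed as in the following minimal sketch. The function name `regression_scores` and the sample AQI values are illustrative assumptions and are not taken from the thesis.

```python
import math


def regression_scores(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute error, every miss weighted linearly.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the average squared error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R2: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 is undefined when the observed values are all identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, rmse, mae


# Hypothetical observed and predicted AQI values, for illustration only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]
r2, rmse, mae = regression_scores(observed, predicted)
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE does, which is one way a model can lead on RMSE while another leads on MAE.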
To ensure the reliability of the data the device generates, the readings of its temperature-humidity sensor (DHT-11) and PM10/dust sensor (GPY2Y1010AU0F) were calibrated. Machine learning algorithms were also adopted for the calibration. The resulting calibration models successfully corrected both the temperature and humidity readings, although the results for humidity were only mediocre. In contrast, the PM10 sensor readings appear to be largely unrelated to the benchmark values. Combining field observations with the data summary of the dust readings, the PM10 calibration outcome suggests that the sensor suffers from either random error or a technical limitation; the preferable step is therefore to replace it with a more reliable sensor. The whole scheme, including the preparation of the AQI forecasting model and the deployment of the air monitoring device, is part of an effort to develop an Air Pollution Early Warning and Monitoring System (EWMS). The ultimate goal is to promote a low-cost air pollution EWMS that complements or even replaces the current expensive monitoring system.
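As a simplified illustration of the calibration idea, the sketch below fits a straight line that maps raw sensor readings to benchmark values by ordinary least squares. This is a stand-in assumption: the thesis reports using machine learning algorithms for calibration and does not state that this particular model was used. The function names and the sample temperature readings are hypothetical.

```python
def fit_linear_calibration(raw, reference):
    """Fit reference ~ slope * raw + intercept by ordinary least squares."""
    n = len(raw)
    if n != len(reference) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_x = sum(raw) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in raw)
    if sxx == 0:
        raise ValueError("raw readings must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(raw, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(raw_value, slope, intercept):
    """Map one raw sensor reading onto the benchmark scale."""
    return slope * raw_value + intercept


# Hypothetical paired readings (device vs. benchmark station), in deg C.
raw_temp = [20.0, 25.0, 30.0, 35.0]
ref_temp = [21.0, 25.5, 30.0, 34.5]
slope, intercept = fit_linear_calibration(raw_temp, ref_temp)
corrected = apply_calibration(28.0, slope, intercept)
```

A mapping of this kind only helps when the raw readings track the benchmark; if they are largely unrelated, as reported for the PM10 sensor, no fitted correction is likely to recover reliable values.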
|