Prediction of PM2.5 pollutant concentration using a hybrid network (ANN-GA), case study: Urmia city

Article type: Research article

Authors

Department of Civil Engineering, Faculty of Engineering, Urmia University, Urmia, Iran

10.22034/jess.2023.417455.2132

Abstract

Because of the environmental and health problems caused by air pollution, pollutant forecasting methods have been regarded as an important tool in air pollution research. Among the various pollutants affecting air quality, particles with an aerodynamic diameter of less than 2.5 micrometers (PM2.5) are one of the main issues in air pollution control management. In this study, artificial neural networks (ANN) combined with a genetic algorithm (GA) were used to predict PM2.5 over a short-term period in the city of Urmia. The Savitzky-Golay (SG) filter was used to preprocess and smooth the data from the PM2.5 monitoring station. Two data gap-filling methods (KNN and SPLINE) were applied to minimize training bias and improve network accuracy. PM10, PM2.5, nitrogen dioxide, sulfur dioxide, carbon monoxide, and meteorological data were also used for these predictions. According to the results, the ANN-GA method (a combination of artificial neural networks and a genetic algorithm) delivered a 40 percent improvement in the correlation of the predictions compared with the artificial neural network method. An MSE of 0.001 (on a 0-1 scale) and a correlation coefficient R of 0.91 were observed in the prediction.

Article title [English]

Prediction of PM2.5 using a hybrid network (ANN-GA) Case study: Urmia city

Authors [English]

  • Mohammad Teyefeh Taherloo
  • Amir Asadi Vaighan
Department of Civil Engineering, Faculty of Engineering, Urmia University, Urmia, Iran
Abstract [English]

Abstract

Introduction
Over the last 50 years, urbanization, industrialization, and population growth have made air quality an inseparable concern of daily life. Air pollution can be defined as the presence of chemicals or toxic compounds in the air to the extent that they pose a health risk. Emissions from cars, plant chemicals, dust, pollen, and mold spores are classed as particulate matter (PM). The World Health Organization reported that ambient air pollution causes 4.2 million deaths from strokes, heart disease, lung cancer, and chronic respiratory diseases. Of the various pollutants affecting air quality, particulate matter smaller than 2.5 microns is the major air pollution problem (Ścibor et al., 2020). There is also growing evidence of the effects of PM10 and PM2.5 on cardiovascular disease (CVD) and respiratory disease (RD).
Forecasting air pollutants provides an opportunity to determine the intensity of air pollution in different areas and to prevent irreversible impacts. These models also allow decision-makers to make informed decisions and prepare for the prevention or control of PM in the future. Models used in air pollution forecasting studies include the Autoregressive Integrated Moving Average (ARIMA) model, artificial neural networks (ANN), the Community Multiscale Air Quality (CMAQ) model, the Weather Research and Forecasting model coupled with Chemistry (WRF-CHEM), fuzzy models, grey models, and hybrid models. ANN has been used extensively by scientists to provide rapid and parsimonious solutions for mitigating the negative impacts of air pollution worldwide. Neural networks have been used successfully in air pollution forecasting and have produced accurate results on time series data containing different types of noise and nonlinear structure. Hybrid modeling approaches, in which several methods or attributes are merged to create a more sophisticated model with superior performance in certain scenarios, have a wide variety of applications.
Urmia is one of Iran's most polluted cities, owing to continuous traffic and traffic congestion, growing CO2 and PM levels, and a lack of knowledge on regulating and locating industrial manufacturing units. Dust from Iraq and inversion, which occurs 90 days a year, are examples of region-specific sources of air pollution. In addition, the drying of Urmia Lake, which can result in salt storms, is one of the critical concerns that will lead to significant pollution in the near future.
In this study, ANN-GA with missing data imputation was used to predict PM2.5 in Urmia, Iran, over the short term, to demonstrate how data gap-filling and preprocessing methods can improve the performance of hybrid models.

Methodology
The concentrations of air pollutants (carbon monoxide, nitrogen dioxide, and sulfur dioxide) as well as meteorological data (temperature, relative humidity, and wind velocity) were used as inputs in this research to predict PM2.5. Air pollution concentrations and meteorological data over a two-year period were obtained from Monitoring Station No. 3, Urmia municipality, and Iran's meteorology website (Data.irimo.ir).
The data was then preprocessed with the Savitzky-Golay filter before being fed into the ANN and ANN-GA networks. Both the data with gaps and the imputed data (KNN/SPLINE methods) were used as input to each network, and the results were compared.
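The preprocessing described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes Python with NumPy and SciPy, fills gaps with a cubic spline before smoothing (the ordering is an assumption, as are the window length of 7 and polynomial order of 2), and uses a synthetic series in place of the station data. KNN imputation is not shown.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import savgol_filter


def fill_gaps_spline(values):
    """Fill NaN gaps by cubic-spline interpolation through the observed samples."""
    values = np.asarray(values, dtype=float)
    t = np.arange(values.size)
    observed = ~np.isnan(values)
    spline = CubicSpline(t[observed], values[observed])
    filled = values.copy()
    filled[~observed] = spline(t[~observed])
    return filled


def smooth_sg(values, window_length=7, polyorder=2):
    """Smooth a gap-free series with a Savitzky-Golay filter."""
    return savgol_filter(values, window_length=window_length, polyorder=polyorder)


# Synthetic stand-in for a PM2.5 record (not real measurements).
t = np.arange(20)
clean = 0.05 * t**2 + 10.0
gappy = clean.copy()
gappy[[5, 6, 12]] = np.nan          # simulate missing readings

filled = fill_gaps_spline(gappy)    # step 1: impute the gaps
smoothed = smooth_sg(filled)        # step 2: smooth the completed series
```

Because the toy series is a quadratic, both the spline and the second-order filter reproduce it exactly, which makes the sketch easy to check; real measurements would of course be altered by the smoothing.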
In this study, the network contains two hidden layers and one output layer. The time series method was used to introduce the data to the network. The data was divided into three parts: 70% for training, 15% for validation, and 15% for testing. Data input scenarios were defined in two ways. The first scenario used no imputation, while the second used SPLINE and KNN to fill in data gaps. As transfer functions, a sigmoid (logsig) function was used for the hidden layers and a linear function (purelin) for the output layer. The Levenberg-Marquardt algorithm was chosen as the learning algorithm based on the type of problem and the speed of convergence. To improve the results, the number of neurons, iteration parameters, number of permitted evaluations, Levenberg-Marquardt algorithm parameters, and reliability were all adjusted through a trial-and-error process.
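The time series presentation and the 70/15/15 split can be illustrated with a short sketch. It assumes Python with NumPy; the lag count of 5, the chronological (rather than random) split, and the helper names are all assumptions made for illustration, and the series is synthetic. Scaling to the 0-1 range follows the scale on which the MSE is reported.

```python
import numpy as np


def minmax_scale(series):
    """Rescale a 1-D series to the 0-1 range."""
    series = np.asarray(series, dtype=float)
    lo, hi = series.min(), series.max()
    return (series - lo) / (hi - lo)


def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds n_lags past values, y is the next value."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + n_lags] for i in range(series.size - n_lags)])
    y = series[n_lags:]
    return X, y


def split_70_15_15(X, y):
    """Split samples in time order into training, validation, and test sets."""
    n = len(X)
    n_train = n * 70 // 100
    n_val = n * 15 // 100
    train = (X[:n_train], y[:n_train])
    val = (X[n_train:n_train + n_val], y[n_train:n_train + n_val])
    test = (X[n_train + n_val:], y[n_train + n_val:])
    return train, val, test


# Synthetic stand-in for an hourly PM2.5 record (not real measurements).
rng = np.random.default_rng(0)
pm25 = 30 + 10 * np.sin(np.arange(205) / 12) + rng.normal(0, 1, 205)

scaled = minmax_scale(pm25)
X, y = make_lagged(scaled, n_lags=5)
train, val, test = split_70_15_15(X, y)
print(len(train[0]), len(val[0]), len(test[0]))
```

For simplicity the scaling here uses the minimum and maximum of the whole series; in practice one might compute them from the training portion only.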
A new ANN-GA network was used in this study, with the GA serving as the training function. After introducing the data as a time series and selecting the amount of data for each of the training, validation, and testing stages, the structure and number of network layers were created with the "newff" function. The main difference is that the genetic learning process was used instead of the "train" function. It is worth noting that the network layer characteristics were the same in both methods. To carry out training, the new learning function requires several auxiliary steps, including cost function creation, selection, crossover, and mutation. Three selection methods were used: roulette wheel, tournament, and random selection. To define the cost function, the weights were taken from those created by the "newff" function. Different values were assigned to the initial population variables, maximum mutation number, and selection pressure coefficient by trial and error. As before, two data input scenarios were defined.
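The idea of replacing gradient-based training with a genetic search over the network weights can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: it is written in Python with NumPy rather than with "newff", starts from a random population instead of the "newff" weights, uses only tournament selection with uniform crossover, Gaussian mutation, and elitism, and all hyperparameter values and function names are hypothetical.

```python
import numpy as np


def logsig(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))


def unpack(chrom, sizes):
    """Slice a flat chromosome into (W, b) pairs for consecutive layers."""
    params, pos = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = chrom[pos:pos + n_in * n_out].reshape(n_in, n_out)
        pos += n_in * n_out
        b = chrom[pos:pos + n_out]
        pos += n_out
        params.append((W, b))
    return params


def predict(chrom, sizes, X):
    """Forward pass: logsig hidden layers, linear output layer."""
    layers = unpack(chrom, sizes)
    a = X
    for W, b in layers[:-1]:
        a = logsig(a @ W + b)
    W, b = layers[-1]
    return (a @ W + b).ravel()


def mse(chrom, sizes, X, y):
    """Cost function: mean squared error of the network encoded by chrom."""
    return float(np.mean((predict(chrom, sizes, X) - y) ** 2))


def tournament(costs, rng, k=3):
    """Return the index of the lowest-cost individual among k random picks."""
    idx = rng.choice(len(costs), size=k, replace=False)
    return idx[np.argmin(costs[idx])]


def train_ga(X, y, sizes, pop_size=40, generations=60,
             mut_rate=0.1, mut_scale=0.3, seed=0):
    rng = np.random.default_rng(seed)
    n_genes = sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))
    pop = rng.normal(0.0, 1.0, size=(pop_size, n_genes))
    history = []
    for _ in range(generations):
        costs = np.array([mse(c, sizes, X, y) for c in pop])
        history.append(costs.min())
        children = [pop[np.argmin(costs)].copy()]      # elitism: keep the best
        while len(children) < pop_size:
            p1 = pop[tournament(costs, rng)]
            p2 = pop[tournament(costs, rng)]
            mask = rng.random(n_genes) < 0.5           # uniform crossover
            child = np.where(mask, p1, p2)
            mutate = rng.random(n_genes) < mut_rate    # genes to perturb
            child = child + mutate * rng.normal(0.0, mut_scale, n_genes)
            children.append(child)
        pop = np.array(children)
    costs = np.array([mse(c, sizes, X, y) for c in pop])
    history.append(costs.min())
    return pop[np.argmin(costs)], history


# Toy regression data (synthetic, not the Urmia measurements).
rng = np.random.default_rng(1)
X = rng.random((80, 3))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2
sizes = [3, 4, 4, 1]                 # inputs, two hidden layers, one output
best, history = train_ga(X, y, sizes)
print(history[0], history[-1])
```

Because the best individual is always carried into the next generation, the best cost recorded in `history` never increases from one generation to the next.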

Conclusion
Forecasting methods have been considered an important tool in research on air pollution. Among the various pollutants that influence air quality, particles with an aerodynamic diameter of less than 2.5 micrometers (PM2.5) are one of the key issues in air pollution control management. In this study, a model for predicting future concentrations of PM2.5 was developed using the hybrid network (ANN-GA). Two methods of data imputation (KNN and SPLINE) were used to minimize training issues and improve network accuracy. PM10, PM2.5, nitrogen dioxide, sulfur dioxide, carbon monoxide, and weather data were used for predictions. The results show that multi-layer neural networks are relatively efficient for predictive purposes but lack sufficient accuracy on their own. Using the data gap-filling methods alone, the ANN network produced an MSE of 0.023 and a correlation coefficient R of 0.543. In order to improve R and reduce network errors, a genetic algorithm was used in combination with a multi-layer neural network (ANN-GA). As the results showed, MSE and R for the hybrid network (ANN-GA) were improved (R=0.91 and MSE=0.001). In addition, compared to ANN, the R increased by 40 percent and the MSE improved by 95 percent. Thus, it can be concluded that ANN-GA can be used as a powerful and reliable tool for predicting air pollution.
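As a quick check, the reported improvements can be worked out from the MSE and R values quoted above. The abstract does not state how the percentages were computed, so the formulas below are assumptions: the MSE figure is taken as a relative reduction, and for R both the absolute difference and the relative increase are shown.

```latex
\[
\frac{\mathrm{MSE}_{\mathrm{ANN}} - \mathrm{MSE}_{\mathrm{ANN\text{-}GA}}}{\mathrm{MSE}_{\mathrm{ANN}}}
  = \frac{0.023 - 0.001}{0.023} \approx 0.957
\]
\[
R_{\mathrm{ANN\text{-}GA}} - R_{\mathrm{ANN}} = 0.91 - 0.543 = 0.367,
\qquad
\frac{0.91 - 0.543}{0.543} \approx 0.676
\]
```

The relative MSE reduction of roughly 96 percent agrees with the stated 95 percent, and the stated 40 percent for R lies closer to the absolute difference of about 0.37 than to the relative increase.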
Keywords
Air pollution Prediction; Artificial Neural Network; Genetic Algorithm; PM2.5; Hybrid Network

Keywords [English]

  • Air pollution Prediction
  • Artificial Neural Network
  • Genetic Algorithm
  • PM2.5