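Abstract (translated from Chinese): |
Over the past few decades, rapid economic development and accelerating industrialization and urbanization have placed growing pressure on China's resources and environment. As the primary pollutants affecting air quality, PM2.5 and PM10 (hereafter PM2.5/10) directly affect public health. Remote-sensing retrieval of PM2.5/10 concentrations is therefore of great significance for environmental monitoring and for controlling and improving air quality nationwide. In recent years, as research on near-surface PM2.5/10 concentrations has deepened, methods for retrieving PM2.5/10 concentrations from remote-sensing imagery have multiplied. This study used the Google Earth Engine (GEE) platform to acquire a large volume of Landsat 8 OLI imagery and, combining it with meteorological information, spatial features and other parameters, built a retrieval model relating band reflectance to PM2.5/10 concentration using a multilayer back-propagation neural network, a method widely used in machine learning, in order to obtain the continuous distribution of PM2.5/10 over the study area. To improve the accuracy of the baseline retrieval model, the optimal combination of input parameters was sought along two dimensions, influencing factors and retroactive time, ultimately achieving accurate retrieval of PM2.5/10 concentrations. Taking Beijing as an example, the retrieval accuracy (R2) reached 0.814 for PM2.5 and 0.796 for PM10, with root-mean-square errors (RMSE) of 19.21 μg·m−3 and 28.31 μg·m−3, respectively. Given the high accuracy and reliability of these results, the method provides a new approach for studying the spatially continuous distribution of PM2.5/10 and has considerable scientific significance and broad application value. |
Keywords: machine learning; satellite remote sensing; PM2.5/10 retrieval; spatially continuous distribution |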
DOI:10.7515/JEE192063 |
CSTR:32259.14.JEE192063 |
Classification number: |
Funding: National Natural Science Foundation of China (41871315) |
|
Research on the methods to retrieve continuous spatial distribution of PM2.5/10 based on machine learning and satellite imagery |
ZHANG Meng, ZHANG Bo
|
School of Human Settlements and Civil Engineering, Xi’an Jiaotong University, Xi’an 710049, China
|
Abstract: |
Background, aim, and scope Over the past decades, rapid industrialization and urbanization have increased the environmental burden in China. PM2.5 and PM10 (hereafter PM2.5/10), which significantly affect human health, have become the primary pollutants affecting ambient air quality in most large cities of North China. How to obtain the temporal and spatial distribution of PM2.5/10 concentrations efficiently and accurately has become a popular research topic. Because ground monitoring stations are sparsely and unevenly distributed, it is difficult to obtain accurate continuous distributions of PM2.5/10 concentration over a whole study area by spatial interpolation and/or numerical simulation from the discrete station values. Satellite imagery offers high timeliness, wide coverage, high data capacity and robustness, so retrieving PM2.5/10 concentrations from it has become increasingly valuable and popular. In the literature published so far, aerosol optical depth (AOD) data are often used to retrieve PM2.5/10 concentrations through linear and/or nonlinear regression analysis. However, calculating AOD requires special parameters that are hard to determine and may differ between study areas; that is, generic and robust methods/models for retrieving PM2.5/10 concentrations from satellite images are still lacking. To address these issues, this article develops a generic model, based on a multilayer back-propagation neural network, for obtaining continuous spatial distributions of PM2.5/10 concentration from satellite imagery.
Materials and methods The Google Earth Engine (GEE) platform was used to acquire a large volume of sample data comprising (1) Landsat 8 OLI remote sensing images with a spatial resolution of 30 m, (2) spatial parameters such as latitude, longitude and altitude, (3) meteorological parameters such as barometric pressure, relative humidity, temperature, wind direction and wind speed, and (4) PM2.5/10 concentrations from ground monitoring stations. Data sets (1), (2) and (3) serve as the ‘input’, while data set (4) is the ‘output’. To find the combination of input parameters that gives the best retrieval performance, the candidate input parameters were grouped along two aspects, ‘influence factor’ and ‘retroactive time’, and the groups were fed stepwise to the proposed model for neural-network training.
Results Taking Beijing as the study area, the retrieval accuracy, measured by R2, reached 0.814 for PM2.5 and 0.796 for PM10, with RMSE values of 19.21 μg·m−3 and 28.31 μg·m−3, respectively. The retrieval results were compared with those obtained by Kriging interpolation. The general spatial distribution characteristics agree: higher PM2.5/10 concentrations occur in the south of Beijing and lower values in the north.
Discussion The retrieval accuracy of the proposed model is satisfactory and higher than that of many established AOD-based models. For validation, the proposed model was used to produce continuous spatial distribution maps of PM2.5/10 concentrations in Beijing, which are significantly better than those from Kriging interpolation in terms of resolution and clarity. Furthermore, because Landsat 8 OLI images serve as the ‘basis’ for the retrievals, the distribution maps generated in this article have a much higher spatial resolution than those in many other works.
Conclusions The proposed model, based on a multilayer back-propagation neural network, achieves strong PM2.5/10 retrieval performance in terms of accuracy and reliability. Moreover, because all the data used by the model cover the whole of mainland China and are publicly available, the model is also robust and generic.
Recommendations and perspectives The proposed model provides a new and generic method for retrieving PM2.5/10 concentrations from Landsat 8 OLI images. Its high retrieval performance and generic nature indicate that it can be widely applied to calculate high-resolution continuous spatial distributions of PM2.5/10 in various areas. Rather than being merely a research prototype, the proposed model is capable of being widely used in real-world applications. |
Key words: machine learning; satellite imagery; PM2.5/10 retrieval; continuous spatial distribution |
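The abstract describes a back-propagation network that maps input features to station-measured concentrations and is scored by R2 and RMSE. The following is a minimal illustrative sketch of that idea, not the authors' implementation. Everything in it is an assumption: the single hidden layer with tanh units, the class name `TinyBPNet`, the helper names, the three scaled inputs (hypothetical stand-ins for a band reflectance, relative humidity and wind speed), and the invented synthetic target. No real Landsat, meteorological or monitoring data are used.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


class TinyBPNet:
    """Hypothetical one-hidden-layer network trained by plain backpropagation (SGD)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        # tanh activation of each hidden unit
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def train_step(self, x, y, lr):
        h = self._hidden(x)
        # Gradient of the loss 0.5 * err**2 with respect to the network output
        err = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2 - y
        for j, hj in enumerate(h):
            # Backpropagate through the output weight and the tanh derivative
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def make_samples(n, rng):
    """Synthetic stand-ins: three inputs scaled to [0, 1] and an invented target."""
    xs, ys = [], []
    for _ in range(n):
        x = [rng.random(), rng.random(), rng.random()]
        y = 0.6 * x[0] + 0.3 * x[1] - 0.2 * x[2] + 0.1 * math.sin(3.0 * x[0])
        xs.append(x)
        ys.append(y)
    return xs, ys


def demo(epochs=50, lr=0.05):
    """Train on synthetic samples; return (RMSE before, RMSE after, R2 after) on held-out data."""
    rng = random.Random(42)
    train_x, train_y = make_samples(200, rng)
    test_x, test_y = make_samples(50, rng)
    net = TinyBPNet(n_in=3, n_hidden=4)
    before = rmse(test_y, [net.predict(x) for x in test_x])
    for _ in range(epochs):
        for x, y in zip(train_x, train_y):
            net.train_step(x, y, lr)
    pred = [net.predict(x) for x in test_x]
    return before, rmse(test_y, pred), r2_score(test_y, pred)


if __name__ == "__main__":
    before, after, r2 = demo()
    print(f"RMSE before: {before:.3f}, after: {after:.3f}, R2 after: {r2:.3f}")
```

In a real workflow the inputs and the target concentrations (in μg·m−3) would need to be scaled before training and the predictions rescaled before computing RMSE. A library implementation would normally replace this hand-written network.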