Weather forecasting is a typical problem of coupling big data with physical-process models, according to Prof. Pingwen Zhang, an academician of the Chinese Academy of Sciences, Director of the National Engineering Laboratory for Big Data Analysis and Application Technology, and Director of the Center for Computational Science & Engineering at Peking University. Prof. Zhang is the corresponding author of a collaborative study by Peking University and the Institute of Atmospheric Physics, Chinese Academy of Sciences.
Generally speaking, weather forecasting is a largely successful practice in the geosciences and, nowadays, it is inseparable from numerical weather prediction (NWP). However, because the outputs of NWP and observations contain different systematic errors, a "weather consultation" is an indispensable part of the process towards further improving the accuracy of forecasts.
"In fact, the theory-driven physical model and data-driven machine learning are complementary tools. Combining these two approaches, an intelligent weather consultation system can be built to assist the current manual process of weather consultation," says Prof. Zhang. "One of the challenges linked with this is to design appropriate feature engineering for both types of information to make full use of the data."
To address this challenge, Prof. Zhang and his team have proposed the "model output machine learning" (MOML) method for simulating weather consultation. The research has recently been published in Advances in Atmospheric Sciences.
MOML is a machine-learning-based post-processing method that matches NWP forecasts against observations through a regression function. The new approach was tested on gridded forecasts of 2-m surface air temperature in the Beijing area. MOML, with different feature engineering schemes, was compared against the ECMWF model forecast and a modified model output statistics (MOS) method. MOML showed better numerical performance than both, especially in winter, when its accuracy was 27.91% higher than that of the ECMWF model and 15.52% higher than that of MOS.
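The idea of post-processing through a regression function can be illustrated with a minimal sketch. This is not the MOML implementation from the study: it assumes a plain least-squares linear regression as a stand-in for the machine learning model, a single hypothetical feature (the raw forecast temperature), and synthetic data generated only for illustration. The function names are likewise hypothetical.

```python
import numpy as np

def fit_postprocessor(nwp_features, observations):
    """Fit a linear map from NWP features to observations (least squares)."""
    # Prepend a column of ones so the regression can learn a constant bias.
    X = np.column_stack([np.ones(len(nwp_features)), nwp_features])
    coef, *_ = np.linalg.lstsq(X, observations, rcond=None)
    return coef

def apply_postprocessor(coef, nwp_features):
    """Apply the fitted linear map to new NWP features."""
    X = np.column_stack([np.ones(len(nwp_features)), nwp_features])
    return X @ coef

def rmse(a, b):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic data (assumption): an "observed" temperature series and a raw
# forecast that carries a systematic scale and offset error plus noise.
rng = np.random.default_rng(0)
observed = 10.0 + 8.0 * np.sin(np.linspace(0.0, 20.0, 400))
raw_forecast = 0.9 * observed + 2.0 + rng.normal(0.0, 0.5, 400)

# Fit on the first 300 samples, evaluate on the remaining 100.
train, test = slice(0, 300), slice(300, 400)
coef = fit_postprocessor(raw_forecast[train].reshape(-1, 1), observed[train])
corrected = apply_postprocessor(coef, raw_forecast[test].reshape(-1, 1))

raw_rmse = rmse(raw_forecast[test], observed[test])
corrected_rmse = rmse(corrected, observed[test])
print(f"raw RMSE: {raw_rmse:.3f}, corrected RMSE: {corrected_rmse:.3f}")
```

In this toy setting the regression removes the systematic part of the forecast error, so the corrected series is closer to the observations than the raw forecast. A real system would use richer features and a more flexible model than the linear fit assumed here.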
Weather consultation data are unique in that they combine information from both NWP model data and observational data. The two sources have different data structures and features, which makes feature engineering a complicated task, and the quality of the feature engineering directly affects the final result. Prof. Zhang's group has proposed several feature engineering schemes following extensive numerical experiments. These schemes ensure computational efficiency and were employed in meteorological studies for the first time. Prof. Zhang points out that the MOML method allows the observational data to participate directly in the calculation, and uses both the high- and low-frequency information in the data to make the forecast results more accurate. The MOML method proposed in this study could be applied to forecasting the weather during the upcoming 2022 Winter Olympics, hopefully providing more accurate, intelligent and efficient weather forecasting services for this international event.
Machine learning and deep learning offer diverse tools for weather forecasts in the era of big data, but there are also many challenges in practical applications.
"It is an important future research direction to incorporate weather forecast data and coupled models into a hybrid computing framework to explore and study the structure and features of observational and NWP data, and propose data-driven machine learning algorithms suitable for weather forecasting," Prof. Zhang concludes.