Forecasting air pollution is the task of predicting future values of a given sequence using either historical data from the same signal (univariate forecasting) or historical data from several interrelated signals (multivariate forecasting) (Martinez et al. 2013).
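The difference between the two settings can be sketched in a few lines of Python. This is only an illustration of how the inputs differ, not AirVisual's actual pipeline: the function names, the lag count, and the sample readings are all hypothetical.

```python
def make_windows(series, n_lags):
    """Univariate: predict series[t] from the previous n_lags values of the same series."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets


def make_multivariate_windows(target, others, n_lags):
    """Multivariate: also append the previous n_lags values of each related signal."""
    inputs, targets = [], []
    for t in range(n_lags, len(target)):
        row = list(target[t - n_lags:t])
        for signal in others:
            row.extend(signal[t - n_lags:t])
        inputs.append(row)
        targets.append(target[t])
    return inputs, targets


# Hypothetical hourly readings, for illustration only.
pm25 = [35, 40, 42, 38, 30, 28]
wind_speed = [2.0, 1.5, 1.0, 3.0, 4.5, 5.0]

uni_x, uni_y = make_windows(pm25, n_lags=2)
multi_x, multi_y = make_multivariate_windows(pm25, [wind_speed], n_lags=2)
```

In both cases the targets are identical; only the inputs change. The multivariate rows carry the lagged wind speed alongside the lagged pollution readings, which gives a model the chance to learn how the signals interact.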
AirVisual’s forecast data is produced using multivariate forecasting. The AirVisual system learns a deep architecture that looks for the signal in noisy data and processes the information it carries.
This method of deep architecture learning is favored over other methods for its many levels of nonlinearity; it has the theoretical ability to learn complex features while also achieving better generalization.
Deep learning is a combination of research areas involving neural networks, artificial intelligence, graphical modeling, optimization, pattern recognition, and signal processing. The approach is described as “deep” because the input has to pass through several non-linearities before generating the output. The advantage of having many layers is the ability to compactly represent highly nonlinear and highly varying functions (Dalto, M.).
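To make "several non-linearities" concrete, the following minimal sketch passes an input through three stacked layers, each applying a weighted sum followed by a tanh non-linearity. The layer sizes, weights, and the choice of tanh are assumptions made for illustration; they do not describe the network AirVisual uses.

```python
import math


def layer(inputs, weights, biases):
    """One layer: a weighted sum per unit, followed by a tanh non-linearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the input through each layer in turn; the depth is len(layers)."""
    activation = inputs
    for weights, biases in layers:
        activation = layer(activation, weights, biases)
    return activation


# Hypothetical, hand-picked weights; a real network would learn these from data.
network = [
    ([[0.5, -0.3], [0.8, 0.1], [-0.4, 0.9]], [0.0, 0.1, -0.1]),  # 2 inputs -> 3 units
    ([[0.2, -0.5, 0.3], [0.7, 0.6, -0.2]], [0.05, 0.0]),         # 3 units -> 2 units
    ([[1.0, -1.0]], [0.0]),                                      # 2 units -> 1 output
]

output = forward([0.6, -0.2], network)
```

Each additional entry in `network` adds one more non-linear transformation between input and output, which is what lets a deep model represent functions that a single layer cannot express compactly.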
When confronted with a large set of data, machine learning will “cluster” the data into categories based on similarity, often more accurately than a human would. The process then learns to categorize incrementally, starting with lower-level categories and ending with higher-level ones. This process is known as unsupervised learning.
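As a rough illustration of clustering without labels, the sketch below groups one-dimensional readings with a basic k-means loop. K-means is used here only as a simple, well-known stand-in for unsupervised grouping; it is not the deep, layered clustering described above, and the readings and starting centroids are hypothetical.

```python
def assign(values, centroids):
    """Place each value in the group of its nearest centroid."""
    groups = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        groups[nearest].append(v)
    return groups


def kmeans_1d(values, centroids, iterations=10):
    """Alternate between assigning values and moving centroids to group means."""
    centroids = list(centroids)
    for _ in range(iterations):
        groups = assign(values, centroids)
        centroids = [
            sum(g) / len(g) if g else c  # keep the old centroid if a group is empty
            for g, c in zip(groups, centroids)
        ]
    return centroids, assign(values, centroids)


# Hypothetical pollution readings and starting centroids, for illustration only.
readings = [12, 15, 14, 80, 85, 90, 160, 150]
centroids, groups = kmeans_1d(readings, centroids=[0, 100, 200])
```

No labels are supplied at any point: the three groups (low, moderate, and high readings) emerge purely from how close the values are to one another.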
Two key aspects characterize deep learning:
- Models consisting of multiple layers or stages of non-linear information processing.
- Methods for supervised or unsupervised learning of feature representation at successively higher, more abstract layers.
To forecast pollution levels, our method combines a deep learning architecture with an algorithm that helps to distinguish the relationships between categories and pollution levels. The second component is needed because deep learning lacks ways of representing relationships and often struggles to acquire them. Deep learning is therefore only part of the larger challenge of building intelligent machines, and it requires help to produce an accurate output.
Machine learning studies the patterns linking the current air quality and current weather conditions with the weather forecast and historical air quality. Naturally, the more data that is received, the more accurate the forecast will be.
The downside of multi-layer neural networks is that they are quite complicated. The setup is difficult, and there are many parameters to tune.
Air pollution itself is affected by many factors, both environmental and human; errors in forecasting can therefore occur due to the unpredictability of these factors.
Figure 1 illustrates the processes involved in computing the air quality forecast used by AirVisual. This is a closed-loop system, also known as a feedback control system, which enables the system to adjust its performance to meet the desired response. The system captures real-time data (current weather conditions, current air quality), historical data (air quality, weather conditions), and historical patterns (weather, air quality); these components make up the input for the engine. All the data, besides real-time, is controlled by an artificial intelligence system, which has its own method of learning and continues to learn while computing the data. The forecasting engine uses a series of formulae to determine the air quality forecast; this output is then assessed through the feedback loop to provide a more accurate prediction.
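The feedback idea can be sketched as a forecaster that compares each prediction with the value later observed and uses the error to correct its next prediction. This is a deliberately simplified assumption about how such a loop could work: the class name, the additive bias correction, the learning rate, and the toy model are all hypothetical and are not taken from the AirVisual engine.

```python
class FeedbackCorrectedForecaster:
    """Wraps a base model and nudges its output using past forecast errors."""

    def __init__(self, base_model, learning_rate=0.5):
        self.base_model = base_model
        self.learning_rate = learning_rate
        self.bias = 0.0  # running correction learned from feedback

    def predict(self, features):
        return self.base_model(features) + self.bias

    def observe(self, features, actual):
        """Feed the observed value back in and adjust the correction."""
        error = actual - self.predict(features)
        self.bias += self.learning_rate * error
        return error


def toy_model(features):
    # Hypothetical base model that always under-predicts by 20 AQI points.
    return features["current_aqi"] - 20


forecaster = FeedbackCorrectedForecaster(toy_model, learning_rate=0.5)
errors = [forecaster.observe({"current_aqi": 100}, actual=100) for _ in range(4)]
```

With these toy numbers the error halves on every pass through the loop, which shows the intended behaviour of a closed-loop system: the output is checked against reality and the discrepancy is fed back to improve later predictions.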
It is important to note that the AirVisual forecaster cannot take into account unpredictable events, including natural disasters and government-sanctioned clean-air measures (emission laws, car restrictions). As such, the AirVisual forecast during these events may be inaccurate. Other limitations include the quality of weather forecasts (developing countries often lack highly accurate weather forecasting) and location/geography.
This forecasting model, and all resulting air quality predictions, is intended to provide accurate information related to air quality. Steps have been implemented to ensure its quality and accuracy. However:
- AirVisual relies on Numerical Weather Prediction (NWP) models, especially the Global Forecast System (GFS). For some countries (such as China), the accuracy of the GFS is lower than for other countries, which can reduce the accuracy of the air quality forecasts.
- We do not assume any legal liability or responsibility for the accuracy, completeness, or correctness of the forecast information.
- We do not assume any legal liability for damage or losses that may have occurred either directly or indirectly as a result of any information obtained from the AirVisual forecast.
Cambria, E., Huang, G. B., Kasun, L. L. C., Zhou, H., Vong, C. M., Lin, J., ... & Liu, J. (2013). Extreme learning machines [trends & controversies]. Intelligent Systems, IEEE, 28(6), 30-59.
Dalto, M. Deep neural networks for time series prediction with applications in ultra-short-term wind forecasting. Rn (Θ1), 1, 2.
Deng, L., Li, J., Huang, J. T., Yao, K., Yu, D., Seide, F., ... & Acero, A. (2013, May). Recent advances in deep learning for speech research at Microsoft. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on (pp. 8604-8608). IEEE.
Romeu, P., Zamora-Martínez, F., Botella-Rocamora, P., & Pardo, J. (2013). Time-Series Forecasting of Indoor Temperature Using Pre-trained Deep Neural Networks. In Artificial Neural Networks and Machine Learning–ICANN 2013 (pp. 451-458). Springer Berlin Heidelberg.