How are the Street-by-Street Air Quality Maps created?
Welcome to the pollution maps of the future! Like all good scientists, I’m sure you’re thinking to yourself: They’re beautiful and informative (we hope) but how are they made? Where does the data come from?
If that’s the case then read on for the answers you’ve been looking for!
Illustration: Live pollution in New York, New York
Our mapping platform relies on various datasets, which are assimilated using state-of-the-art machine learning methods. The spatial granularity of the maps is adapted to the available data, from 10 meters in large urban areas to a few kilometers in poorly monitored regions. In a nutshell, the more data sources we have, the more detail we can provide.
Map input datasets
Our street-by-street maps are based on the following data sources:
- Official air quality monitoring networks
- Atmospheric models (run by atmospheric science labs)
- Weather forecasts
- Anthropogenic emissions datasets
Official air quality monitoring networks
We collect air quality measurements from 10k+ monitoring stations across the world. This is the backbone of our maps. This dataset provides real-time air quality estimates at each station location. However, there’s no way we would leave it at that! There are nowhere near enough of these stations to provide accurate air quality estimates at the level of detail we want. We need to add more information to fill in the gaps!
Illustration: monitoring stations in part of Europe
Atmospheric models
Atmospheric science labs across the world build large-scale atmospheric models producing air quality forecasts over a large area: North America, Europe, Asia and even worldwide for some of them. These models use atmospheric science and machine learning to make predictions about what the air pollution is like in any given area.
Air pollution models commonly rely on the following:
- Fine-grained assumptions on the main sources of emissions in the area (traffic or power plants for example)
- Models of the pollutant transport (how far the pollution travels) due to the wind and other weather conditions
- Models of the chemical reactions happening in the atmosphere. For example, when NO2 meets sunshine, you get the nasty pollutant ozone.
Illustration: PM2.5 output over the London area of an atmospheric model at a given time
Weather forecasts
Air quality is affected tremendously by meteorological conditions. For example, heavy rain actually cleans the air, and wind can move pollutants over large distances. We use real-time weather estimates and forecasts to model how air pollution changes over time.
Illustration: wind speed in the eastern direction
Anthropogenic emissions datasets
Anthropogenic (human-caused) emissions have a large impact on air quality, and come from various sources: road traffic, coal power plants, farming, etc.
We use various datasets to estimate all kinds of anthropogenic emissions across the world, including:
- Real-time traffic
- Urban datasets classifying the land into dozens of categories (residential areas, industrial areas, natural parks, etc.)
- Population density
Illustration: Traffic count estimates in Japan and S. Korea
Mapping methodology
At 1 billion+ grid points across the world, we have built dozens of variables characterizing what the air quality looks like at each location. There are two main types of variables:
- Variables giving an average estimate of air quality in the area, for example:
  - Average air quality measure provided by the air quality monitoring stations nearby
  - Average air quality forecasted by an atmospheric model
- Variables giving air pollution emissions estimates in the area, for example:
  - Number of cars driving in the area
  - Distance to the closest coal power plant
  - Population density
These variables are computed at several spatial resolutions (i.e. the size of the area considered around a grid point). This allows us to define high-resolution variables where there is enough available data, and low-resolution ones otherwise.
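To make the multi-resolution idea concrete, here is a minimal sketch in Python of one such variable: the average station reading within several radii of a grid point. The function names, the radii and the station values are all made up for illustration, and this is not our production code. When no station falls inside a radius, the variable is simply missing at that resolution.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def multi_resolution_means(point, stations, radii_km=(0.5, 5, 50)):
    """Average station value within each radius around `point`.

    Returns {radius_km: mean value, or None if no station is in range}.
    The radii are illustrative assumptions, not the ones used for the maps.
    """
    lat, lon = point
    features = {}
    for radius in radii_km:
        values = [
            value
            for (s_lat, s_lon, value) in stations
            if haversine_km(lat, lon, s_lat, s_lon) <= radius
        ]
        features[radius] = sum(values) / len(values) if values else None
    return features

# Hypothetical stations: (latitude, longitude, PM2.5 reading in ug/m3)
stations = [
    (48.8566, 2.3522, 12.0),  # at the grid point itself
    (48.8600, 2.3400, 18.0),  # roughly 1 km away
    (48.9000, 2.5000, 30.0),  # roughly 12 km away
]

features = multi_resolution_means((48.8566, 2.3522), stations)
```

Each wider radius pulls in more stations, so the low-resolution variables are almost always available, while the high-resolution ones only exist where stations are dense.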
Then, machine learning models are trained on several years of measurements from the monitoring stations. All the variables listed above are combined in many different ways to reproduce those measurements.
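As a rough illustration of this training step, the sketch below fits a plain linear regression that maps three variables to the value measured at a station. Linear least squares is only a simple stand-in chosen for readability; the actual models are more elaborate. The variable names and the numbers are synthetic assumptions.

```python
def fit_linear(rows, targets):
    """Least-squares fit of targets ~ w0 + w . row, via the normal equations."""
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    n = len(X[0])
    # Build A = X^T X and b = X^T y
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - rest) / A[i][i]
    return w

def predict(w, row):
    """Apply the fitted weights to one row of variables."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

# Synthetic training rows, one per station and time step (illustrative only):
# (mean of nearby stations, atmospheric model forecast, traffic count)
rows = [
    (10, 12, 100), (20, 15, 400), (15, 30, 250),
    (30, 22, 800), (8, 9, 50), (25, 28, 600),
]
# Synthetic "measured" values, built from a known formula so the fit can be checked
targets = [2 + 0.5 * a + 0.3 * b + 0.01 * c for a, b, c in rows]

weights = fit_linear(rows, targets)
estimate = predict(weights, (18, 20, 300))
```

Because the synthetic targets follow a known formula exactly, the fitted weights should come out close to 2, 0.5, 0.3 and 0.01. Real station data is noisy, which is why several years of measurements are needed.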
Finally, the mapping system can be applied at every grid point to produce a predicted air quality value. The grid density is adapted to the quantity of information available: there may be one point every 10 meters in large cities with very dense monitoring networks and fine-grained estimates of air pollutant emissions, and one point every few kilometers in regions with less available data.
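The effect of adapting the grid density can be sketched as follows. The thresholds in `choose_spacing_m` are hypothetical and chosen only to illustrate the idea; only the 10-meter and few-kilometer end points come from the description above.

```python
import math

def choose_spacing_m(stations_per_1000_km2, has_street_level_emissions):
    """Pick a grid spacing in meters from the amount of available data.

    The thresholds below are illustrative assumptions, not the real rules.
    """
    if stations_per_1000_km2 >= 10 and has_street_level_emissions:
        return 10      # dense network and fine-grained emissions: street level
    if stations_per_1000_km2 >= 1:
        return 1000    # some monitoring: roughly 1 km cells
    return 5000        # poorly monitored region: a few kilometers

def grid_point_count(width_m, height_m, spacing_m):
    """Number of grid points covering a rectangle, edges included."""
    columns = math.floor(width_m / spacing_m) + 1
    rows = math.floor(height_m / spacing_m) + 1
    return columns * rows

# The same 20 km x 20 km area, in a well-monitored city and in a rural region
city_points = grid_point_count(20_000, 20_000, choose_spacing_m(25, True))
rural_points = grid_point_count(20_000, 20_000, choose_spacing_m(0.2, False))
```

In this toy setup the city area needs about four million grid points and the rural area only 25, which is why the fine grid is reserved for places where the data supports it.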
Illustrations: Los Angeles high-resolution map (left) / US 1km-resolution heatmap (right)