Overture Maps Loader¶
OvertureMapsLoader can download Overture Maps data from the S3 bucket for a given region. It is a wrapper around the OvertureMaestro library, which can download the data in its original format and also offers some more advanced functions.
In the SRAI context, OvertureMapsLoader returns features in a so-called wide format, with columns representing the potential categories of an object. If you want to read more about this format, you can check out the OvertureMaestro docs page.
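To make the wide format concrete, here is a minimal pure-Python sketch that does not depend on SRAI or OvertureMaestro. The long_records list and the to_wide helper are hypothetical names used only for illustration, and the feature IDs are made up; the column names are taken from the loader output shown further down this page. The sketch turns one (feature, category) pair per row into one row per feature with a boolean column per category.

```python
# Hypothetical "long" records: one (feature_id, category) pair per row.
long_records = [
    ("feature_a", "transportation|segment|road"),
    ("feature_b", "places|place|retail"),
    ("feature_c", "base|infrastructure|bridge"),
    ("feature_c", "transportation|segment|road"),
]


def to_wide(records):
    """Pivot (feature_id, category) pairs into one boolean row per feature."""
    columns = sorted({category for _, category in records})
    wide = {}
    for feature_id, category in records:
        # Every feature starts with all category columns set to False.
        row = wide.setdefault(feature_id, dict.fromkeys(columns, False))
        row[category] = True
    return columns, wide


columns, wide_rows = to_wide(long_records)
for feature_id, row in wide_rows.items():
    print(feature_id, [row[column] for column in columns])
```

Each feature ends up with the same set of columns, which is why the real output tables consist mostly of False values.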
from shapely.geometry import box
from srai.constants import GEOMETRY_COLUMN
from srai.loaders import OvertureMapsLoader
from srai.regionalizers import geocode_to_region_gdf
Using OvertureMapsLoader to download data for a specific area¶
Download all available features in Paris, France¶
loader = OvertureMapsLoader()
paris = geocode_to_region_gdf("Paris")
paris_features_gdf = loader.load(paris)
paris_features_gdf
Finished operation in 0:00:27
feature_id | geometry | base|infrastructure|aerialway | base|infrastructure|airport | base|infrastructure|barrier | base|infrastructure|bridge | base|infrastructure|communication | base|infrastructure|emergency | base|infrastructure|manhole | base|infrastructure|pedestrian | base|infrastructure|pier | ... | places|place|professional_services | places|place|public_service_and_government | places|place|real_estate | places|place|religious_organization | places|place|retail | places|place|structure_and_geography | places|place|travel | transportation|segment|rail | transportation|segment|road | transportation|segment|water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
08b1fb466e895fff0001bd163118da4a | LINESTRING (2.41281 48.86676, 2.4135 48.8673, ... | False | False | False | True | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466e882fff0001b56bea9efdfc | LINESTRING (2.41396 48.86759, 2.41378 48.86744... | False | False | False | True | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466e882fff0001b74ff032275e | LINESTRING (2.41374 48.86723, 2.41377 48.86728... | False | False | False | True | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466e882fff0001af362bfc6a09 | POINT (2.41388 48.86739) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466e886fff0001a478e49ef9a6 | POINT (2.41378 48.86744) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
08b1fb4662850fff02002b554c456958 | POLYGON ((2.35874 48.86704, 2.35877 48.86704, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466284bfff02001eb6226a4658 | POLYGON ((2.36154 48.86714, 2.36157 48.86713, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb4662ba4fff0200b09be1618aee | POLYGON ((2.36149 48.86705, 2.36154 48.86713, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb4662b16fff020030f2b670d99e | POLYGON ((2.36233 48.86724, 2.3623 48.86718, 2... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb466284bfff0200b13fbcd2b6e9 | POLYGON ((2.36187 48.86699, 2.36195 48.86715, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
572036 rows × 115 columns
Plot features¶
Colours from this palette: https://colorhunt.co/palette/f8ededff8225b43f3f173b45
ax = paris.plot(color="#F8EDED", figsize=(16, 16))
# plot water
water_columns = [c for c in paris_features_gdf.columns if "water" in c]
water_data = paris_features_gdf[paris_features_gdf[water_columns].any(axis=1)]
water_data.plot(ax=ax, color="#FF8225", markersize=0)
# plot roads
roads_data = paris_features_gdf[paris_features_gdf["transportation|segment|road"]]
roads_data.plot(ax=ax, color="#B43F3F", markersize=0, linewidth=0.25)
# plot buildings
building_columns = [c for c in paris_features_gdf.columns if c.startswith("buildings")]
buildings_data = paris_features_gdf[paris_features_gdf[building_columns].any(axis=1)]
buildings_data.plot(ax=ax, color="#173B45", markersize=0)
paris.boundary.plot(ax=ax, color="#173B45", linewidth=2, alpha=0.5)
xmin, ymin, xmax, ymax = paris.total_bounds
ax.set_xlim(xmin - 0.001, xmax + 0.001)
ax.set_ylim(ymin - 0.001, ymax + 0.001)
ax.set_axis_off()
Download more detailed data with higher hierarchy value¶
By default, the hierarchy_depth value is equal to 1, but it can be set to None to get a full list of all possible columns.
manhattan_bbox = box(-73.994551, 40.762396, -73.936872, 40.804239)
loader = OvertureMapsLoader(hierarchy_depth=None)
new_york_features_gdf = loader.load(manhattan_bbox)
new_york_features_gdf
Finished operation in 0:00:31
feature_id | geometry | base|infrastructure|aerialway|aerialway_station | base|infrastructure|aerialway|cable_car | base|infrastructure|aerialway|chair_lift | base|infrastructure|aerialway|drag_lift | base|infrastructure|aerialway|gondola | base|infrastructure|aerialway|goods | base|infrastructure|aerialway|j-bar | base|infrastructure|aerialway|magic_carpet | base|infrastructure|aerialway|mixed_lift | ... | transportation|segment|road|service|parking_aisle | transportation|segment|road|steps | transportation|segment|road|tertiary | transportation|segment|road|tertiary|link | transportation|segment|road|track | transportation|segment|road|trunk | transportation|segment|road|trunk|link | transportation|segment|road|unclassified | transportation|segment|road|unknown | transportation|segment|water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
08b2a100f32f0fff0001b39a4f0003f6 | LINESTRING (-73.88819 40.77247, -73.88914 40.7... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a100d2ce9fff0001b50e5a03898d | LINESTRING (-73.97214 40.7268, -73.97341 40.72... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725843fff0001a5d4ee060f99 | POINT (-73.99301 40.7627) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725850fff0001a15cab262e5f | POINT (-73.99422 40.76333) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725850fff0001aa0db52328c3 | POINT (-73.99409 40.76328) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
08b2a100f2650fff02008c2684f54385 | POLYGON ((-73.93679 40.77279, -73.93683 40.772... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a100f2656fff0200f667ed909313 | POLYGON ((-73.93719 40.77324, -73.93714 40.773... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a100f260bfff0200ec356893da2f | POLYGON ((-73.93741 40.77379, -73.9375 40.7738... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a100f2604fff0200b0ba2bc875ff | POLYGON ((-73.93661 40.77495, -73.9363 40.7755... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a100f2622fff0200df47d5bd8207 | POLYGON ((-73.93694 40.77586, -73.93697 40.775... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
65080 rows × 2633 columns
As you can see, there are over 2600 columns available.
Let's look at the 20 most popular columns.
new_york_features_gdf.drop(columns=GEOMETRY_COLUMN).sum().sort_values(ascending=False).head(20)
buildings|building    12487
base|land|tree|tree    5665
base|infrastructure|transportation|crossing    3846
transportation|segment|road|footway    3373
places|place|health_and_medical    3063
base|infrastructure|barrier|kerb    3005
places|place|health_and_medical|doctor    2350
transportation|segment|road|footway|sidewalk    2145
places|place|eat_and_drink|restaurant    1795
buildings|building|residential|apartments    1706
transportation|segment|road|footway|crosswalk    1646
places|place|professional_services    1418
base|infrastructure|transit|bicycle_parking    994
base|infrastructure|transportation|traffic_signals    990
places|place|beauty_and_spa    875
places|place|health_and_medical|hospital    868
places|place|retail    866
transportation|segment|road|residential    832
places|place|beauty_and_spa|beauty_salon    829
places|place|attractions_and_activities|landmark_and_historical_building    793
dtype: int64
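The column names above encode the category path with | separators: the theme, the type, and then one or more category levels. As a rough illustration of how a depth limit relates to these names, the sketch below truncates full-depth names to a given number of category levels. The truncate_column helper is hypothetical and is not how OvertureMaestro builds its columns; it only assumes that the first two segments are always the theme and the type.

```python
def truncate_column(column, depth):
    """Keep theme|type plus at most `depth` category levels of a column name."""
    parts = column.split("|")
    # Assumption: the first two segments are always the theme and the type.
    return "|".join(parts[: 2 + depth])


full_depth_columns = [
    "transportation|segment|road|footway|sidewalk",
    "transportation|segment|road|footway|crosswalk",
    "places|place|health_and_medical|doctor",
    "buildings|building|residential|apartments",
    "transportation|segment|water",
]

# Deduplicate while preserving the order of first appearance.
depth_one_columns = list(
    dict.fromkeys(truncate_column(column, depth=1) for column in full_depth_columns)
)
print(depth_one_columns)
```

Names such as transportation|segment|road and transportation|segment|water produced this way match the columns seen in the Paris example, which was loaded with the default depth.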
Configure places dataset¶
The places schema is the only one that is treated differently from the other data types.
By default, places use both primary and alternate categories to define a feature.
Additionally, a filter is applied to keep only features with a confidence score >= 0.75.
Two dedicated parameters, places_minimal_confidence and places_use_primary_category_only, configure how the data should be transformed.
Let's look at an example that uses both of these parameters. We will also use the theme_type_pairs parameter to limit the scope of the downloaded data.
default_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")], places_use_primary_category_only=True
)
strict_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")],
    places_minimal_confidence=0.99,
    places_use_primary_category_only=True,
)
songpa = geocode_to_region_gdf("Songpa-gu, Seoul")
songpa_default_confidence_features_gdf = default_confidence_loader.load(songpa)
songpa_strict_confidence_features_gdf = strict_confidence_loader.load(songpa)
print(f"Default confidence score: {len(songpa_default_confidence_features_gdf)}")
print(f"Strict confidence score: {len(songpa_strict_confidence_features_gdf)}")
Finished operation in 0:00:03
Finished operation in 0:00:03
Default confidence score: 5306
Strict confidence score: 8
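The drop from 5306 to 8 features comes from the stricter confidence threshold alone, since both loaders use the primary category only. The following pure-Python sketch shows the kind of filtering these two parameters describe. It is not the library's implementation: the record layout (confidence, primary, alternate) and the select_place_categories helper are assumptions made for illustration, and the sample values are invented.

```python
# Invented sample records; the field names are an assumed, simplified layout.
places = [
    {
        "id": "a",
        "confidence": 0.995,
        "primary": "retail",
        "alternate": ["beauty_and_spa"],
    },
    {"id": "b", "confidence": 0.80, "primary": "eat_and_drink", "alternate": []},
    {"id": "c", "confidence": 0.40, "primary": "retail", "alternate": ["travel"]},
]


def select_place_categories(records, minimal_confidence=0.75, primary_only=False):
    """Return {id: categories} for records that pass the confidence threshold."""
    selected = {}
    for record in records:
        if record["confidence"] < minimal_confidence:
            continue  # Drop low-confidence places.
        categories = [record["primary"]]
        if not primary_only:
            # Alternate categories also describe the feature by default.
            categories += record["alternate"]
        selected[record["id"]] = categories
    return selected


default_selection = select_place_categories(places)
strict_selection = select_place_categories(
    places, minimal_confidence=0.99, primary_only=True
)
print(default_selection)
print(strict_selection)
```

In this toy data, raising the threshold removes place b, and switching to the primary category only removes the alternate category of place a.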
Let's see the count of categories in the places dataset with confidence score >= 0.99.
songpa_strict_confidence_features_df = songpa_strict_confidence_features_gdf.drop(
    columns=GEOMETRY_COLUMN
)
songpa_strict_confidence_features_df.sum().loc[lambda x: x > 0].sort_values(ascending=False)
places|place|retail    7
places|place|beauty_and_spa    1
dtype: int64
Plot features¶
Now we will see the difference between the default list of places (gray dots) and the strict ones (coloured circles).