Overture Maps Loader¶
OvertureMapsLoader can download Overture Maps data from the S3 bucket for a given region. It is a wrapper around the OvertureMaestro library, which can download the data in its original format and also provides some more advanced functions.
In the SRAI context, OvertureMapsLoader returns features in a so-called wide format, in which each column represents a potential category of the object. For a more in-depth description of this format, see the OvertureMaestro documentation.
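To make the idea of the wide format concrete, here is a minimal pure-Python sketch. It is a toy illustration only, not how OvertureMaestro builds the table; the `features` records and the `to_wide` helper are hypothetical names introduced for this example.

```python
# Toy input: each feature lists the categories it belongs to.
# The record layout is an assumption made for this illustration.
features = [
    {"feature_id": "a1", "categories": ["base|infrastructure|barrier"]},
    {"feature_id": "b2", "categories": ["places|place|retail", "places|place|travel"]},
    {"feature_id": "c3", "categories": ["transportation|segment|road"]},
]


def to_wide(features):
    """Return one row per feature with one boolean column per category."""
    columns = sorted({c for f in features for c in f["categories"]})
    return [
        {"feature_id": f["feature_id"], **{c: c in f["categories"] for c in columns}}
        for f in features
    ]


wide_rows = to_wide(features)
for row in wide_rows:
    print(row)
```

Every row has the same set of columns, and a feature is marked True only in the columns of the categories it belongs to. This is the shape of the tables shown in the rest of this page.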
from shapely.geometry import box
from srai.constants import GEOMETRY_COLUMN
from srai.loaders import OvertureMapsLoader
from srai.regionalizers import geocode_to_region_gdf
Using OvertureMapsLoader to download data for a specific area¶
Download all available features in Paris, France¶
loader = OvertureMapsLoader()
paris = geocode_to_region_gdf("Paris")
paris_features_gdf = loader.load(paris)
paris_features_gdf
Finished operation in 0:00:24
feature_id | geometry | base|infrastructure|aerialway | base|infrastructure|airport | base|infrastructure|barrier | base|infrastructure|bridge | base|infrastructure|communication | base|infrastructure|emergency | base|infrastructure|manhole | base|infrastructure|pedestrian | base|infrastructure|pier | ... | places|place|professional_services | places|place|public_service_and_government | places|place|real_estate | places|place|religious_organization | places|place|retail | places|place|structure_and_geography | places|place|travel | transportation|segment|rail | transportation|segment|road | transportation|segment|water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
08b1fb4641664fff0001bae506c96b07 | LINESTRING (2.43214 48.81815, 2.43222 48.81815... | False | False | True | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb4641746fff0001b8007d7f2a5c | LINESTRING (2.42812 48.82011, 2.42911 48.82012... | False | False | True | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb4641199fff0001a698116d18c6 | POINT (2.42325 48.82426) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb464118afff0001a3ea03fa8e41 | POINT (2.42329 48.82431) | False | False | False | False | False | False | False | True | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb464119dfff0001a131eef3a77e | POINT (2.42308 48.82432) | False | False | False | False | False | False | False | True | False | ... | False | False | False | False | False | False | False | False | False | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
08b1fb46666c3fff0200fc980dc0086e | POLYGON ((2.34719 48.86707, 2.34702 48.86707, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb46666d5fff02002ad3925f625a | POLYGON ((2.34549 48.86726, 2.34557 48.86714, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb46666d4fff0200a7a74d316f84 | POLYGON ((2.3453 48.86719, 2.34537 48.86709, 2... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb46666c3fff0200899070512c7d | POLYGON ((2.34721 48.86722, 2.34711 48.86721, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b1fb46666d4fff02009c620aee0a88 | POLYGON ((2.34537 48.86722, 2.34544 48.86711, ... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
567674 rows × 115 columns
Plot features¶
Colours from this palette: https://colorhunt.co/palette/f8ededff8225b43f3f173b45
ax = paris.plot(color="#F8EDED", figsize=(16, 16))
# plot water
water_columns = [c for c in paris_features_gdf.columns if "water" in c]
water_data = paris_features_gdf[paris_features_gdf[water_columns].any(axis=1)]
water_data.plot(ax=ax, color="#FF8225", markersize=0)
# plot roads
roads_data = paris_features_gdf[paris_features_gdf["transportation|segment|road"]]
roads_data.plot(ax=ax, color="#B43F3F", markersize=0, linewidth=0.25)
# plot buildings
building_columns = [c for c in paris_features_gdf.columns if c.startswith("buildings")]
buildings_data = paris_features_gdf[paris_features_gdf[building_columns].any(axis=1)]
buildings_data.plot(ax=ax, color="#173B45", markersize=0)
paris.boundary.plot(ax=ax, color="#173B45", linewidth=2, alpha=0.5)
xmin, ymin, xmax, ymax = paris.total_bounds
ax.set_xlim(xmin - 0.001, xmax + 0.001)
ax.set_ylim(ymin - 0.001, ymax + 0.001)
ax.set_axis_off()
Download more detailed data with higher hierarchy value¶
By default, hierarchy_depth is equal to 1, but it can be set to None to get the full list of all possible columns.
manhattan_bbox = box(-73.994551, 40.762396, -73.936872, 40.804239)
loader = OvertureMapsLoader(hierarchy_depth=None)
new_york_features_gdf = loader.load(manhattan_bbox)
new_york_features_gdf
Finished operation in 0:00:34
feature_id | geometry | base|infrastructure|aerialway|aerialway_station | base|infrastructure|aerialway|cable_car | base|infrastructure|aerialway|chair_lift | base|infrastructure|aerialway|drag_lift | base|infrastructure|aerialway|gondola | base|infrastructure|aerialway|goods | base|infrastructure|aerialway|j-bar | base|infrastructure|aerialway|magic_carpet | base|infrastructure|aerialway|mixed_lift | ... | transportation|segment|road|service|parking_aisle | transportation|segment|road|steps | transportation|segment|road|tertiary | transportation|segment|road|tertiary|link | transportation|segment|road|track | transportation|segment|road|trunk | transportation|segment|road|trunk|link | transportation|segment|road|unclassified | transportation|segment|road|unknown | transportation|segment|water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
08b2a100d2ce9fff0001b50e5a03898d | LINESTRING (-73.97214 40.7268, -73.97341 40.72... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725843fff0001a5d4ee060f99 | POINT (-73.99301 40.7627) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725850fff0001a15cab262e5f | POINT (-73.99422 40.76333) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725850fff0001aa0db52328c3 | POINT (-73.99409 40.76328) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a10725842fff0001aea4a2db9be8 | POINT (-73.9933 40.76271) | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
08b2a1008d813fff02005d5fc8ff050a | POLYGON ((-73.93783 40.80407, -73.9379 40.8041... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a1008d811fff0200004d8e3c14fe | POLYGON ((-73.93786 40.80399, -73.93781 40.804... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a1008d811fff02005e044a307913 | POLYGON ((-73.9379 40.8041, -73.93783 40.80407... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a1008d810fff0200e8f5d6d218e5 | POLYGON ((-73.93791 40.80412, -73.93783 40.804... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
08b2a1008d804fff020083505b4a5d0a | POLYGON ((-73.93618 40.80402, -73.93598 40.804... | False | False | False | False | False | False | False | False | False | ... | False | False | False | False | False | False | False | False | False | False |
65142 rows × 2632 columns
As you can see, there are over 2600 columns available.
Let's look at the 20 columns that contain the most features.
new_york_features_gdf.drop(columns=GEOMETRY_COLUMN).sum().sort_values(ascending=False).head(20)
buildings|building    12498
base|land|tree|tree    5665
base|infrastructure|transportation|crossing    3845
transportation|segment|road|footway    3369
places|place|health_and_medical    3084
base|infrastructure|barrier|kerb    3003
places|place|health_and_medical|doctor    2299
transportation|segment|road|footway|sidewalk    2145
places|place|eat_and_drink|restaurant    1715
buildings|building|residential|apartments    1703
transportation|segment|road|footway|crosswalk    1645
places|place|professional_services    1492
places|place|beauty_and_spa    1020
base|infrastructure|transit|bicycle_parking    994
base|infrastructure|transportation|traffic_signals    989
places|place|health_and_medical|hospital    886
places|place|retail    844
transportation|segment|road|residential    832
places|place|beauty_and_spa|beauty_salon    818
places|place|attractions_and_activities|landmark_and_historical_building    786
dtype: int64
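Judging by the column names in the two tables above, a column at depth 1 appears to keep the theme, the type and the first category level, while deeper columns append further levels separated by `|`. Under that assumption (inferred from the output, not taken from the library's code), the sketch below shows how full-depth columns relate to shallower ones. The helpers `truncate_column` and `collapse_row` are hypothetical and are not part of SRAI.

```python
def truncate_column(column: str, hierarchy_depth: int) -> str:
    """Keep theme, type and the first `hierarchy_depth` category levels."""
    parts = column.split("|")
    return "|".join(parts[: 2 + hierarchy_depth])


def collapse_row(row: dict, hierarchy_depth: int) -> dict:
    """OR together the boolean flags of columns sharing a truncated name."""
    collapsed = {}
    for column, flag in row.items():
        key = truncate_column(column, hierarchy_depth)
        collapsed[key] = collapsed.get(key, False) or flag
    return collapsed


# A toy full-depth row; the values are made up for the illustration.
row = {
    "transportation|segment|road|footway|sidewalk": True,
    "transportation|segment|road|residential": False,
    "places|place|eat_and_drink|restaurant": False,
    "transportation|segment|water": False,
}
collapsed_row = collapse_row(row, hierarchy_depth=1)
print(collapsed_row)
```

In practice it is simpler to create the loader with the hierarchy_depth you need; the sketch is only meant to show how the column names at different depths relate to each other.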
Configure places dataset¶
The places schema is the only one that is treated differently from the other data types.
By default, places use both the primary and alternate categories to define a feature.
Additionally, a filter is applied to keep only features with a confidence score >= 0.75.
Two dedicated parameters, places_minimal_confidence and places_use_primary_category_only, configure how the data is transformed.
Let's try an example with both of these parameters. We will also use the theme_type_pairs parameter to limit the scope of the downloaded data.
default_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")], places_use_primary_category_only=True
)
strict_confidence_loader = OvertureMapsLoader(
    theme_type_pairs=[("places", "place")],
    places_minimal_confidence=0.99,
    places_use_primary_category_only=True,
)
shibuya = geocode_to_region_gdf("Shibuya, Tokyo")
shibuya_default_confidence_features_gdf = default_confidence_loader.load(shibuya)
shibuya_strict_confidence_features_gdf = strict_confidence_loader.load(shibuya)
print(f"Default confidence score: {len(shibuya_default_confidence_features_gdf)}")
print(f"Strict confidence score: {len(shibuya_strict_confidence_features_gdf)}")
Finished operation in 0:00:05
Finished operation in 0:00:05
Default confidence score: 22592
Strict confidence score: 452
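The drop in the number of rows comes from the stricter confidence threshold. The following pure-Python sketch mimics the two places options on toy records. It is a simplified illustration, not the loader's implementation: the record fields (`confidence`, `primary`, `alternate`) and the `filter_places` helper are assumptions made for this example.

```python
# Toy place records; field names and values are assumptions for illustration.
places = [
    {"id": "p1", "confidence": 0.995, "primary": "restaurant", "alternate": ["bar"]},
    {"id": "p2", "confidence": 0.80, "primary": "clothing_store", "alternate": []},
    {"id": "p3", "confidence": 0.40, "primary": "beauty_salon", "alternate": ["spa"]},
]


def filter_places(places, minimal_confidence=0.75, primary_only=False):
    """Keep places at or above the threshold and choose which categories to use."""
    kept = []
    for place in places:
        if place["confidence"] < minimal_confidence:
            continue  # below the confidence threshold, dropped
        categories = [place["primary"]]
        if not primary_only:
            categories += place["alternate"]
        kept.append({"id": place["id"], "categories": categories})
    return kept


default_places = filter_places(places, primary_only=True)
strict_places = filter_places(places, minimal_confidence=0.99, primary_only=True)
print(len(default_places), len(strict_places))
```

Raising the threshold only removes rows, while the primary-only switch changes which category columns a remaining feature can be marked in.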
Let's see the count of each category in the places dataset with a confidence score >= 0.99.
shibuya_strict_confidence_features_df = shibuya_strict_confidence_features_gdf.drop(
    columns=GEOMETRY_COLUMN
)
shibuya_strict_confidence_features_df.sum().loc[lambda x: x > 0].sort_values(ascending=False)
places|place|eat_and_drink    120
places|place|retail    113
places|place|beauty_and_spa    56
places|place|education    35
places|place|health_and_medical    34
places|place|active_life    22
places|place|professional_services    22
places|place|attractions_and_activities    17
places|place|arts_and_entertainment    14
places|place|accommodation    5
places|place|real_estate    4
places|place|automotive    3
places|place|business_to_business    2
places|place|religious_organization    2
places|place|financial_service    1
places|place|mass_media    1
places|place|public_service_and_government    1
dtype: int64
Plot features¶
Now let's compare the default list of places (gray dots) with the strict ones (coloured circles).