Administrative boundary regionalizer
import geopandas as gpd
import plotly.express as px
from shapely.geometry import Point, box
from srai.regionalizers import AdministrativeBoundaryRegionalizer, geocode_to_region_gdf
from srai.plotting.folium_wrapper import plot_regions
Regionalize city¶
Basic usage of the AdministrativeBoundaryRegionalizer
with a city boundary.
Here admin_level
equal to 9 defines city districts in Poland.
wroclaw_gdf = geocode_to_region_gdf(query=["R451516"], by_osmid=True)
plot_regions(wroclaw_gdf)
abr = AdministrativeBoundaryRegionalizer(admin_level=9)
wro_result_gdf = abr.transform(gdf=wroclaw_gdf)
wro_result_gdf.head()
Loading boundaries: 48: 100%|██████████| 1/1 [00:00<00:00, 1.60it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union
geometry | |
---|---|
region_id | |
Osiedle Ołtaszyn | POLYGON ((17.01348 51.05742, 17.01446 51.05859... |
Osiedle Wojszyce | POLYGON ((17.05334 51.07127, 17.05398 51.06999... |
Osiedle Krzyki-Partynice | POLYGON ((16.98336 51.07738, 16.98405 51.07724... |
Osiedle Gaj | POLYGON ((17.02233 51.07306, 17.02267 51.07304... |
Osiedle Borek | POLYGON ((17.00113 51.08951, 17.00615 51.08810... |
plot_regions(wro_result_gdf)
Regionalize country¶
How to return an empty region covering water bodies outside of the land.
Here admin_level
equal to 4 defines country regions in Madagascar.
madagascar_bbox = box(minx=43.21418, miny=-25.61143, maxx=50.48704, maxy=-11.951126)
madagascar_bbox_gdf = gpd.GeoDataFrame({"geometry": [madagascar_bbox]}, crs="EPSG:4326")
abr = AdministrativeBoundaryRegionalizer(admin_level=4, return_empty_region=True)
madagascar_result_gdf = abr.transform(gdf=madagascar_bbox_gdf)
madagascar_result_gdf.tail()
Loading boundaries: 25: 100%|██████████| 1/1 [00:01<00:00, 1.64s/it] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union
geometry | |
---|---|
region_id | |
Diana Region | MULTIPOLYGON (((48.35872 -13.60528, 48.35853 -... |
Mayotte | MULTIPOLYGON (((45.01840 -12.64340, 45.01860 -... |
Moheli | MULTIPOLYGON (((43.62245 -12.28540, 43.62292 -... |
Anjouan | MULTIPOLYGON (((44.19878 -12.16511, 44.19946 -... |
EMPTY | MULTIPOLYGON (((43.21418 -25.61143, 43.21418 -... |
plot_regions(madagascar_result_gdf)
Regionalize Europe¶
Option to slightly increase the value of toposiplify
to simplify geometries even more.
Here admin_level
equal to 2 defines countries.
eu_bbox = box(minx=-10.478556, miny=34.633284672291, maxx=32.097916, maxy=70.096054)
eu_bbox_gdf = gpd.GeoDataFrame({"geometry": [eu_bbox]}, crs="EPSG:4326")
abr = AdministrativeBoundaryRegionalizer(admin_level=2, toposimplify=0.0005)
eu_result_gdf = abr.transform(gdf=eu_bbox_gdf)
Loading boundaries: 54: 100%|██████████| 1/1 [00:13<00:00, 13.55s/it] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union
eu_result_gdf.head()
geometry | |
---|---|
region_id | |
France | MULTIPOLYGON (((-2.77861 49.27876, -2.73805 49... |
Algeria | POLYGON ((-2.18305 35.33251, -2.05473 35.32209... |
Morocco | POLYGON ((-5.88545 35.96488, -5.66652 35.93567... |
Netherlands | MULTIPOLYGON (((3.08610 51.57045, 3.09529 51.5... |
Spain | MULTIPOLYGON (((-1.77176 43.48807, -1.77176 43... |
plot_regions(eu_result_gdf)
Toposimplify differences¶
Shows differences in simplification of small regions using four values: 1e-4
, 1e-3
, 1e-2
and 0.1
. Those values are in degress, since it uses Douglas-Peucker simplification algorithm.
1e-4
is the default value and is equal to about 11.1m accuracy.
More info: https://github.com/mattijn/topojson
Here admin_level
equal to 6 defines city districts in Singapore.
singapore_bbox = box(minx=103.5111238, miny=1.1263707, maxx=104.1313374, maxy=1.4787511)
singapore_bbox_gdf = gpd.GeoDataFrame({"geometry": [singapore_bbox]}, crs="EPSG:4326")
results = {}
for value in [0.0001, 0.001, 0.01, 0.1]:
abr = AdministrativeBoundaryRegionalizer(admin_level=6, toposimplify=value)
results[value] = abr.transform(gdf=singapore_bbox_gdf)
Loading boundaries: 8: 100%|██████████| 1/1 [00:00<00:00, 10.03it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union Loading boundaries: 8: 100%|██████████| 1/1 [00:00<00:00, 9.72it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union Loading boundaries: 8: 100%|██████████| 1/1 [00:00<00:00, 9.84it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union Loading boundaries: 8: 100%|██████████| 1/1 [00:00<00:00, 9.75it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:167: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead. ].geometry.unary_union
minx, miny, maxx, maxy = singapore_bbox.bounds
for epsilon, result in results.items():
fig = px.choropleth_mapbox(
result,
geojson=result,
color=result.index,
locations=result.index,
center={"lat": 1.3119350704252704, "lon": 103.82412242562575},
mapbox_style="carto-positron",
zoom=9.5,
)
fig.update_layout(margin={"r": 0, "t": 0, "l": 0, "b": 0})
fig.update_traces(marker={"opacity": 0.6}, selector=dict(type="choroplethmapbox"))
fig.update_traces(showlegend=False)
fig.update_geos(
projection_type="equirectangular",
lataxis_range=[miny - 0.1, maxy + 0.1],
lonaxis_range=[minx - 0.1, maxx + 0.1],
showlakes=False,
showcountries=False,
showframe=False,
resolution=50,
)
size = len(result.to_json().encode("utf-8"))
fig.update_layout(
height=450,
width=700,
margin={"r": 0, "t": 50, "l": 0, "b": 0},
title_text=f"Toposimplify value: {epsilon} ({size/1000} KB)",
)
fig.show(renderer="png") # replace with fig.show() to allow interactivity
Regionalize points¶
How to return original regions without clipping and select them using list of points. Showed using list of metro stations in Paris.
Here admin_level
equal to 8 defines communes in France.
import requests
r = requests.get("https://raw.githubusercontent.com/w8r/paris-metro-graph/master/metro.json").json()
stations_gdf = gpd.GeoDataFrame(
{"geometry": [Point(s["longitude"], s["latitude"]) for s in r["nodes"]]}, crs="EPSG:4326"
)
stations_gdf
geometry | |
---|---|
0 | POINT (2.30159 48.81081) |
1 | POINT (2.29703 48.81534) |
2 | POINT (2.24947 48.88824) |
3 | POINT (2.27210 48.88106) |
4 | POINT (2.28937 48.87558) |
... | ... |
309 | POINT (2.33812 48.88228) |
310 | POINT (2.34955 48.88729) |
311 | POINT (2.35482 48.84599) |
312 | POINT (2.29583 48.87493) |
313 | POINT (2.40655 48.87706) |
314 rows × 1 columns
abr = AdministrativeBoundaryRegionalizer(
admin_level=8, return_empty_region=False, clip_regions=False
)
paris_districts_result = abr.transform(gdf=stations_gdf)
Loading boundaries: 0: 0%| | 0/314 [00:00<?, ?it/s]/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/shapely/predicates.py:639: RuntimeWarning: invalid value encountered in covered_by Loading boundaries: 29: 100%|██████████| 314/314 [00:00<00:00, 731.07it/s] /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/srai/regionalizers/administrative_boundary_regionalizer.py:170: FutureWarning: `unary_union` returned None due to all-None GeoSeries. In future, `unary_union` will return 'GEOMETRYCOLLECTION EMPTY' instead.
paris_districts_result.head()
geometry | |
---|---|
region_id | |
Châtillon | POLYGON ((2.27988 48.81091, 2.28325 48.81056, ... |
Malakoff | POLYGON ((2.30132 48.82513, 2.31413 48.82226, ... |
Puteaux | POLYGON ((2.25372 48.88691, 2.25244 48.88551, ... |
Neuilly-sur-Seine | POLYGON ((2.24562 48.87636, 2.24777 48.87849, ... |
Paris | POLYGON ((2.23208 48.86951, 2.24046 48.87189, ... |
folium_map = plot_regions(paris_districts_result, tiles_style="CartoDB positron")
stations_gdf.explore(
m=folium_map,
style_kwds=dict(color="#444", opacity=1, fillColor="#f2f2f2", fillOpacity=1),
marker_kwds=dict(radius=1),
)