QuackOSM Basic Usage¶
QuackOSM exposes some basic functions in the main Python module. Full documentation for them is available here.
This notebook will show how to use the library in a few simple scenarios.
To learn more about CLI usage, see this page. The help page for the CLI is available here.
To learn more about the PbfFileReader class, see this page or the documentation page.
import quackosm as qosm
Reading existing PBF file to GeoDataFrame¶
qosm.convert_pbf_to_geodataframe("https://download.geofabrik.de/europe/monaco-latest.osm.pbf")
Finished operation in 0:00:05
feature_id | tags | geometry |
---|---|---|
way/773124328 | {'admin_level': '2', 'border_type': 'territori... | LINESTRING (7.5 43.5164, 7.49789 43.51525, 7.4... |
way/37811849 | {'admin_level': '8', 'boundary': 'administrati... | LINESTRING (7.3902 43.72642, 7.39023 43.72641,... |
way/1172678032 | {'layer': '-1', 'location': 'underground', 'op... | LINESTRING (7.41984 43.73961, 7.41901 43.74108... |
node/4970548842 | {'crossing': 'uncontrolled', 'crossing:marking... | POINT (7.4086 43.72958) |
way/155085114 | {'destination:colour:forward': 'bueu;green;whi... | LINESTRING (7.40836 43.72896, 7.40835 43.72901... |
... | ... | ... |
way/161191598 | {'admin_level': '2', 'border_type': 'contiguou... | LINESTRING (7.50024 43.51654, 7.50109 43.51438... |
way/770774362 | {'admin_level': '2', 'border_type': 'territori... | LINESTRING (7.58316 43.56156, 7.57976 43.56075... |
way/46428446 | {'border_type': 'contiguous', 'boundary': 'mar... | LINESTRING (7.53299 43.53683, 7.53384 43.53467... |
node/593676834 | {'ref': '14', 'source:website': 'http://www.le... | POINT (7.58316 43.56156) |
way/1225956054 | {'name': '7.5 degrees east'} | LINESTRING (7.5 43.5164, 7.5 37.26898) |
9697 rows × 2 columns
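Each `feature_id` in the output above combines an OSM element type (`node`, `way` or `relation`) with a numeric ID, separated by a slash. The following is a minimal sketch of splitting such an identifier; `parse_feature_id` is a hypothetical helper written for illustration and is not part of QuackOSM.

```python
def parse_feature_id(feature_id: str) -> tuple[str, int]:
    """Split an identifier such as 'way/773124328' into (element type, numeric ID)."""
    element_type, _, raw_id = feature_id.partition("/")
    if element_type not in {"node", "way", "relation"}:
        raise ValueError(f"Unexpected OSM element type: {element_type!r}")
    return element_type, int(raw_id)


# Identifiers taken from the output table above.
print(parse_feature_id("way/773124328"))
print(parse_feature_id("node/4970548842"))
```

A helper like this can be handy when grouping features by element type after loading the GeoDataFrame.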
Transforming existing PBF file to GeoParquet¶
qosm.convert_pbf_to_parquet("https://download.geofabrik.de/europe/monaco-latest.osm.pbf")
PosixPath('files/monaco-latest_nofilter_noclip_compact_sorted.parquet')
Find an OSM PBF extract file using text and read it as a GeoDataFrame¶
qosm.convert_osm_extract_to_geodataframe("Vatican City")
feature_id | tags | geometry |
---|---|---|
way/807854854 | {'electrified': 'contact_line', 'frequency': '... | LINESTRING (12.42648 41.89492, 12.42684 41.895... |
way/807854853 | {'electrified': 'contact_line', 'frequency': '... | LINESTRING (12.4476 41.89905, 12.44744 41.8990... |
way/48538610 | {'leisure': 'pitch', 'name': 'Campo Sportivo "... | POLYGON ((12.44559 41.89698, 12.44606 41.89743... |
way/35884353 | {'landcover': 'trees', 'landuse': 'forest'} | POLYGON ((12.44756 41.89597, 12.4476 41.89609,... |
way/1136874888 | {'access': 'private', 'highway': 'footway'} | LINESTRING (12.44818 41.8971, 12.44822 41.8970... |
... | ... | ... |
node/4838792571 | {'access': 'yes', 'addr:housenumber': '4', 'ad... | POINT (12.46107 41.89772) |
node/4838792569 | {'addr:housenumber': '2', 'addr:street': 'Piaz... | POINT (12.46126 41.8976) |
way/113567005 | {'addr:city': 'Roma', 'addr:housenumber': '4',... | POLYGON ((12.46021 41.89611, 12.46062 41.89611... |
node/9913395156 | {'entrance': 'yes'} | POINT (12.46115 41.89649) |
node/1564080931 | {'crossing': 'marked', 'highway': 'crossing', ... | POINT (12.46998 41.89199) |
8825 rows × 2 columns
Find an OSM PBF extract file using text and transform it to a GeoParquet¶
qosm.convert_osm_extract_to_parquet("Vatican City")
PosixPath('files/osmfr_europe_vatican_city_nofilter_noclip_compact_sorted.parquet')
Get OSM data for a given geometry as a GeoDataFrame¶
area = qosm.geocode_to_geometry("Songpa-gu, Seoul")
qosm.convert_geometry_to_geodataframe(area)
Finished operation in 0:00:15
feature_id | tags | geometry |
---|---|---|
way/378066781 | {'name': '한강', 'name:de': 'Han-Fluss', 'name:e... | LINESTRING (127.30772 37.51939, 127.30562 37.5... |
relation/13333338 | {'alt_name': '한강 이북 서울', 'boundary': 'region',... | POLYGON ((126.85363 37.57179, 126.8542 37.5710... |
way/923696946 | {'name': '한강수상택시 관광', 'name:ja': '漢江水上タクシー 観光'... | LINESTRING (127.08265 37.51851, 127.08276 37.5... |
relation/2419945 | {'admin_level': '6', 'boundary': 'administrati... | POLYGON ((127.0572 37.53012, 127.06287 37.5266... |
relation/3881559 | {'admin_level': '8', 'boundary': 'administrati... | POLYGON ((127.09362 37.54349, 127.09392 37.544... |
... | ... | ... |
relation/9035438 | {'name': '탄천', 'name:el': 'Τάντσον', 'name:en'... | POLYGON ((127.07439 37.50143, 127.07335 37.501... |
way/768407483 | {'name': '탄천', 'name:en': 'Tancheon Stream', '... | LINESTRING (127.12103 37.35083, 127.12113 37.3... |
relation/13333337 | {'alt_name': '한강 이남 서울', 'boundary': 'region',... | POLYGON ((126.76622 37.55424, 126.76636 37.553... |
relation/2297418 | {'ISO3166-2': 'KR-11', 'admin_level': '4', 'al... | POLYGON ((126.76622 37.55424, 126.76636 37.553... |
relation/2419954 | {'admin_level': '6', 'boundary': 'administrati... | POLYGON ((127.06659 37.52488, 127.0668 37.5233... |
21678 rows × 2 columns
Save OSM data for a given geometry as a GeoParquet¶
qosm.convert_geometry_to_parquet(area)
PosixPath('files/95be730f_nofilter_compact_sorted.parquet')
More advanced examples¶
Filter out data based on geometry from existing PBF file¶
area = qosm.geocode_to_geometry("Monaco-Ville, Monaco")
gdf = qosm.convert_pbf_to_geodataframe(
    "https://download.geofabrik.de/europe/monaco-latest.osm.pbf", geometry_filter=area
)
gdf
Finished operation in 0:00:07
feature_id | tags | geometry |
---|---|---|
relation/2220206 | {'ISO3166-2': 'MC-FO', 'admin_level': '10', 'b... | POLYGON ((7.41204 43.7281, 7.4121 43.72826, 7.... |
way/585881905 | {'man_made': 'pier'} | POLYGON ((7.41855 43.73091, 7.41878 43.73074, ... |
way/722801035 | {'layer': '-1', 'location': 'underground', 'ma... | LINESTRING (7.41335 43.72748, 7.4181 43.73083,... |
way/93091312 | {'foot': 'no', 'highway': 'secondary', 'name':... | LINESTRING (7.41831 43.73109, 7.41799 43.73086... |
way/167625718 | {'cycleway:left': 'no', 'cycleway:right': 'lan... | LINESTRING (7.41743 43.73034, 7.41806 43.73081... |
... | ... | ... |
way/1089844236 | {'highway': 'steps', 'incline': 'down'} | LINESTRING (7.42745 43.73268, 7.42748 43.7327) |
way/1089844229 | {'highway': 'steps', 'incline': 'down'} | LINESTRING (7.42745 43.73281, 7.42743 43.73285) |
relation/10691624 | {'admin_level': '2', 'border_type': 'territori... | POLYGON ((7.41852 43.72476, 7.41901 43.72512, ... |
relation/2220322 | {'admin_level': '8', 'boundary': 'administrati... | MULTIPOLYGON (((7.43459 43.74587, 7.43454 43.7... |
relation/2220207 | {'ISO3166-2': 'MC-MO', 'admin_level': '10', 'a... | POLYGON ((7.41752 43.73157, 7.41754 43.73179, ... |
933 rows × 2 columns
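To give an intuition for what a geometry filter does, here is a simplified sketch that keeps only the points falling inside a polygon, using the classic ray-casting test. It is not QuackOSM's implementation: the notebook does not show how the library evaluates the filter, and real features also include lines and polygons. The rectangle and `point_in_polygon` are made up for illustration.

```python
def point_in_polygon(x: float, y: float, ring: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular filter (lon, lat), not the real Monaco-Ville boundary.
area_ring = [(7.41, 43.72), (7.43, 43.72), (7.43, 43.74), (7.41, 43.74)]
points = [(7.42, 43.73), (7.5, 43.5164)]
kept = [p for p in points if point_in_polygon(p[0], p[1], area_ring)]
print(kept)
```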
Plot downloaded data
import geopandas as gpd
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(10, 10))
ax = fig.subplots()
# Draw the OSM features and their outlines.
gdf.plot(ax=ax, markersize=1, zorder=1, alpha=0.4)
gdf.boundary.plot(ax=ax, markersize=0, zorder=1, alpha=0.8)
# Overlay the geometry filter as a transparent, hatched polygon.
gpd.GeoSeries([area], crs=4326).plot(
    ax=ax,
    color=(0, 0, 0, 0),
    zorder=2,
    hatch="///",
    edgecolor="orange",
    linewidth=1.5,
)
# Build the legend entries by hand so that both layers are labelled.
blue_patch = mpatches.Patch(color="C0", alpha=0.8, label="OSM features")
orange_patch = mpatches.Patch(
    facecolor=(0, 0, 0, 0), edgecolor="orange", hatch="///", linewidth=1.5, label="Geometry filter"
)
ax.legend(handles=[blue_patch, orange_patch], loc="lower right")
plt.show()
Get OSM data for a given geometry filtered by tags¶
area = qosm.geocode_to_geometry("Barcelona")
gdf = qosm.convert_geometry_to_geodataframe(
    area, tags_filter={"amenity": "bicycle_rental"}
)
gdf
Finished operation in 0:00:14
feature_id | amenity | geometry |
---|---|---|
node/2281906710 | bicycle_rental | POINT (2.12949 41.37252) |
node/11330146031 | bicycle_rental | POINT (2.1329 41.36868) |
node/3859760369 | bicycle_rental | POINT (2.13421 41.36756) |
node/5262549939 | bicycle_rental | POINT (2.13677 41.36351) |
node/5262548938 | bicycle_rental | POINT (2.13309 41.36537) |
... | ... | ... |
node/5262548949 | bicycle_rental | POINT (2.18043 41.37204) |
node/5262548948 | bicycle_rental | POINT (2.18041 41.37203) |
node/3797124854 | bicycle_rental | POINT (2.16642 41.37162) |
node/430848848 | bicycle_rental | POINT (2.16673 41.37189) |
node/12620399151 | bicycle_rental | POINT (2.16434 41.36848) |
516 rows × 2 columns
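The two filter forms used in this notebook are a key with an exact value (`{"amenity": "bicycle_rental"}`) and a key with `True` (`{"building": True}`), which accepts any value for that key. The sketch below shows one plausible way to evaluate such a filter against a feature's tags. `matches_tags_filter` is a hypothetical helper, the sample features are invented, and treating multiple keys as alternatives is an assumption rather than documented behaviour.

```python
def matches_tags_filter(tags: dict, tags_filter: dict) -> bool:
    """Return True if at least one filter key matches the feature's tags."""
    for key, expected in tags_filter.items():
        if key not in tags:
            continue
        # True accepts any value for the key; a string must match exactly.
        if expected is True or tags[key] == expected:
            return True
    return False


# Invented features for illustration only.
features = {
    "node/1": {"amenity": "bicycle_rental", "capacity": "20"},
    "node/2": {"amenity": "cafe"},
    "way/3": {"building": "yes"},
}

rental = [fid for fid, tags in features.items() if matches_tags_filter(tags, {"amenity": "bicycle_rental"})]
buildings = [fid for fid, tags in features.items() if matches_tags_filter(tags, {"building": True})]
print(rental, buildings)
```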
Show downloaded data on a map
m = gdf.explore(color="orangered", tiles="CartoDB positron")
gpd.GeoSeries([area], crs=4326).boundary.explore(m=m)
Save the result GeoParquet with WKT geometry¶
qosm.convert_pbf_to_parquet(
    "https://download.geofabrik.de/europe/monaco-latest.osm.pbf",
    save_as_wkt=True,
    sort_result=False,  # sorting is disabled for WKT output
)
Finished operation in 0:00:05
PosixPath('files/monaco-latest_nofilter_noclip_compact_wkt.parquet')
Specify result file path¶
qosm.convert_geometry_to_parquet(
    area, result_file_path='barcelona_osm_output.parquet'
)
Finished operation in 0:00:24
PosixPath('barcelona_osm_output.parquet')
Force recalculation of the result¶
By default, running the same command twice reuses the saved GeoParquet file. To force QuackOSM to recalculate the data, pass ignore_cache=True.
qosm.convert_pbf_to_parquet(
    "https://download.geofabrik.de/europe/monaco-latest.osm.pbf", ignore_cache=True
)
Finished operation in 0:00:05
PosixPath('files/monaco-latest_nofilter_noclip_compact_sorted.parquet')
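The caching behaviour can be pictured as a simple "reuse the file if it already exists" check. The sketch below is only an illustration of that pattern under assumed names: `convert_with_cache` and `fake_conversion` are hypothetical, and QuackOSM's real cache logic is not shown in this notebook.

```python
import tempfile
from pathlib import Path
from typing import Callable


def convert_with_cache(
    result_path: Path, compute: Callable[[], bytes], ignore_cache: bool = False
) -> Path:
    """Reuse an existing result file unless the caller asks for recalculation."""
    if result_path.exists() and not ignore_cache:
        return result_path
    result_path.parent.mkdir(parents=True, exist_ok=True)
    result_path.write_bytes(compute())
    return result_path


calls = []


def fake_conversion() -> bytes:
    """Stand-in for an expensive conversion; records every invocation."""
    calls.append(1)
    return b"placeholder bytes"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "files" / "example.parquet"
    convert_with_cache(target, fake_conversion)  # computes the result
    convert_with_cache(target, fake_conversion)  # reuses the saved file
    convert_with_cache(target, fake_conversion, ignore_cache=True)  # recomputes

print(len(calls))  # prints 2
```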
Result file sorting¶
By default, QuackOSM sorts the result file by geometry along a Hilbert curve to make it smaller. Sorting adds some time to the overall execution, but can significantly reduce the file size.
To disable sorting, pass sort_result=False.
unsorted_pq = qosm.convert_geometry_to_parquet(
    area, tags_filter={"building": True}, sort_result=False
)
Finished operation in 0:00:18
sorted_pq = qosm.convert_geometry_to_parquet(
    area, tags_filter={"building": True}, sort_result=True
)
Finished operation in 0:00:20
unsorted_pq, sorted_pq
(PosixPath('files/41b45844_ae99e3d9_exploded.parquet'), PosixPath('files/41b45844_ae99e3d9_exploded_sorted.parquet'))
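A Hilbert curve visits every cell of a grid in an order where consecutive cells are always neighbours, so sorting features by their position along the curve places nearby geometries next to each other in the file. The sketch below uses the textbook grid-to-curve conversion on a tiny 4×4 grid. It is a conceptual illustration only: `hilbert_index` and the sample points are made up, and it says nothing about how QuackOSM computes the ordering internally.

```python
def hilbert_index(n: int, x: int, y: int) -> int:
    """Map cell (x, y) of an n x n grid (n a power of two) to its distance along the Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Invented grid cells standing in for feature locations.
points = [(3, 0), (0, 0), (2, 2), (0, 3), (1, 1)]
ordered = sorted(points, key=lambda p: hilbert_index(4, p[0], p[1]))
print(ordered)
```

Placing spatially close features in adjacent rows tends to make neighbouring values more similar, which is generally favourable for columnar compression.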
import geopandas as gpd
from matplotlib import pyplot as plt
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))
# Resetting the index twice yields an "index" column holding each row's position
# in the file, so the colour of a feature shows where it sits in the row order.
gpd.read_parquet(unsorted_pq).reset_index().reset_index().plot(
    column="index", ax=ax1, cmap="jet", markersize=1
)
gpd.read_parquet(sorted_pq).reset_index().reset_index().plot(
    column="index", ax=ax2, cmap="jet", markersize=1
)
# Compare the on-disk sizes of both files.
unsorted_size = unsorted_pq.stat().st_size
sorted_size = sorted_pq.stat().st_size
ax1.set_title(f"Unsorted: {unsorted_size} bytes")
ax2.set_title(
    f"Sorted: {sorted_size} bytes ({100 - (100 * sorted_size) / unsorted_size:.2f}% reduction)"
)
plt.show()
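The reduction shown in the second plot title is computed as `100 - (100 * sorted_size) / unsorted_size`. A short worked example of that formula follows; the byte counts are hypothetical and are not measurements from this notebook, and `size_reduction_percent` is a helper introduced only for illustration.

```python
def size_reduction_percent(unsorted_size: int, sorted_size: int) -> float:
    """Percentage by which the sorted file is smaller than the unsorted one."""
    return 100 - (100 * sorted_size) / unsorted_size


# Hypothetical sizes in bytes.
unsorted_bytes = 2_000_000
sorted_bytes = 1_500_000
reduction = size_reduction_percent(unsorted_bytes, sorted_bytes)
print(f"{reduction:.2f}% reduction")  # prints 25.00% reduction
```

A negative result would mean the sorted file came out larger than the unsorted one.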