Generated using DALLΒ·E 3 model with this prompt: A logo for a python library with White background, high quality, 8k. Cute duck and globe with cartography elements. Library for reading OpenStreetMap data using DuckDB.
QuackOSM¶
An open-source tool for reading OpenStreetMap PBF files using DuckDB.
What is QuackOSM π¦?¶
- Scalable reader for OpenStreetMap ProtoBuffer (
pbf
) files. - Is based on top of
DuckDB
1 with itsSpatial
2 extension. - Saves files in the
GeoParquet
3 file format for easier integration with modern cloud stacks. - Utilizes multithreading unlike GDAL that works in a single thread only.
- Can filter data based on geometry without the need for
ogr2ogr
clipping before operation. - Can filter data based on OSM tags.
- Utilizes caching to reduce repeatable computations.
- Can be used as Python module as well as a beautiful CLI based on
Typer
4.
Installing¶
As pure Python module¶
With beautiful CLI¶
Required Python version?¶
QuackOSM supports Python >= 3.9
Dependencies¶
Required: - duckdb (>=0.9.2) - pyarrow (>=13.0.0) - geoarrow-pyarrow (>=0.1.1) - geopandas - shapely - typeguard
Optional: - typer[all] (click, colorama, rich, shellingham)
Usage¶
Load data as a GeoDataFrame¶
>>> import quackosm as qosm
>>> qosm.get_features_gdf(monaco_pbf_path)
tags geometry
feature_id
node/10005045289 {'shop': 'bakery'} POINT (7.42245 43.73105)
node/10020887517 {'leisure': 'swimming_pool', ... POINT (7.41316 43.73384)
node/10021298117 {'leisure': 'swimming_pool', ... POINT (7.42777 43.74277)
node/10021298717 {'leisure': 'swimming_pool', ... POINT (7.42630 43.74097)
node/10025656383 {'ferry': 'yes', 'name': 'Qua... POINT (7.42550 43.73690)
... ... ...
way/990669427 {'amenity': 'shelter', 'shelt... POLYGON ((7.41461 43.7338...
way/990669428 {'highway': 'secondary', 'jun... LINESTRING (7.41366 43.73...
way/990669429 {'highway': 'secondary', 'jun... LINESTRING (7.41376 43.73...
way/990848785 {'addr:city': 'Monaco', 'addr... POLYGON ((7.41426 43.7339...
way/993121275 {'building': 'yes', 'name': ... POLYGON ((7.43214 43.7481...
[7906 rows x 2 columns]
Just convert PBF to GeoParquet¶
>>> import quackosm as qosm
>>> gpq_path = qosm.convert_pbf_to_gpq(monaco_pbf_path)
>>> gpq_path.as_posix()
'files/monaco_nofilter_noclip_compact.geoparquet'
Inspect the file with duckdb¶
>>> import duckdb
>>> duckdb.load_extension('spatial')
>>> duckdb.read_parquet(str(gpq_path)).project(
... "* REPLACE (ST_GeomFromWKB(geometry) AS geometry)"
... ).order("feature_id")
ββββββββββββββββββββ¬βββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββββββββββββββββ
β feature_id β tags β geometry β
β varchar β map(varchar, varchβ¦ β geometry β
ββββββββββββββββββββΌβββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββ€
β node/10005045289 β {shop=bakery} β POINT (7.4224498 43.7310532) β
β node/10020887517 β {leisure=swimming_β¦ β POINT (7.4131561 43.7338391) β
β node/10021298117 β {leisure=swimming_β¦ β POINT (7.4277743 43.7427669) β
β node/10021298717 β {leisure=swimming_β¦ β POINT (7.4263029 43.7409734) β
β node/10025656383 β {ferry=yes, name=Qβ¦ β POINT (7.4254971 43.7369002) β
β node/10025656390 β {amenity=restauranβ¦ β POINT (7.4269287 43.7368818) β
β node/10025656391 β {name=Capitainerieβ¦ β POINT (7.4272127 43.7359593) β
β node/10025656392 β {name=Direction deβ¦ β POINT (7.4270392 43.7365262) β
β node/10025656393 β {name=IQOS, openinβ¦ β POINT (7.4275175 43.7373195) β
β node/10025656394 β {artist_name=Anna β¦ β POINT (7.4293446 43.737448) β
β Β· β Β· β Β· β
β Β· β Β· β Β· β
β Β· β Β· β Β· β
β way/986864693 β {natural=bare_rock} β POLYGON ((7.4340482 43.745598, 7.4340263 4β¦ β
β way/986864694 β {barrier=wall} β LINESTRING (7.4327547 43.7445382, 7.432808β¦ β
β way/986864695 β {natural=bare_rock} β POLYGON ((7.4332994 43.7449315, 7.4332912 β¦ β
β way/986864696 β {barrier=wall} β LINESTRING (7.4356006 43.7464325, 7.435574β¦ β
β way/986864697 β {natural=bare_rock} β POLYGON ((7.4362767 43.74697, 7.4362983 43β¦ β
β way/990669427 β {amenity=shelter, β¦ β POLYGON ((7.4146087 43.733883, 7.4146192 4β¦ β
β way/990669428 β {highway=secondaryβ¦ β LINESTRING (7.4136598 43.7334433, 7.413640β¦ β
β way/990669429 β {highway=secondaryβ¦ β LINESTRING (7.4137621 43.7334251, 7.413746β¦ β
β way/990848785 β {addr:city=Monaco,β¦ β POLYGON ((7.4142551 43.7339622, 7.4143113 β¦ β
β way/993121275 β {building=yes, namβ¦ β POLYGON ((7.4321416 43.7481309, 7.4321638 β¦ β
ββββββββββββββββββββ΄βββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββββββββββββββ€
β 7906 rows (20 shown) 3 columns β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Use as CLI¶
$ quackosm monaco.osm.pbf
β Ή [ 1/18] Filtering nodes β’ 0:00:00
β § [ 2/18] Filtering ways β’ 0:00:00
β ΄ [ 3/18] Filtering relations β’ 0:00:00
β Ή [ 4/18] Loading required ways β’ 0:00:00
β Ό [ 5/18] Loading required nodes β’ 0:00:00
β [ 6/18] Saving nodes with geometries β’ 0:00:00
β [ 7/18] Saving filtered nodes with structs β’ 0:00:00
β [ 8/18] Grouping required ways β’ 0:00:00
[ 9/18] Saving required ways with linestrings 100% ββββββββββββββββββββββββββββββββββββββββ 1/1 β’ 0:00:00 β’ 0:00:00
β Ή [ 10/18] Saving ways with geometries β’ 0:00:00
β Έ [ 11/18] Saving valid relation parts β’ 0:00:00
β [12.1/18] Saving relation inner parts - valid geometries β’ 0:00:00
β [12.2/18] Saving relation inner parts - invalid geometries β’ 0:00:00
β [13.1/18] Saving relation outer parts - valid geometries β’ 0:00:00
β [13.2/18] Saving relation outer parts - invalid geometries β’ 0:00:00
β [ 14/18] Saving relation outer parts with holes β’ 0:00:00
β [ 15/18] Saving relation outer parts without holes β’ 0:00:00
β [ 16/18] Saving relation with geometries β’ 0:00:00
β Ή [17.1/18] Saving valid features β’ 0:00:00
β [ 18/18] Saving final geoparquet file β’ 0:00:00
files/monaco_nofilter_noclip_compact.geoparquet
You can find full API + more examples in the docs.