Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
0.12.1 - 2025-01-03¶
Added¶
- Automatic download progress bar hiding when verbosity is set to `silent`.
- Cached Nominatim geocoding results to speed up tests
0.12.0 - 2024-11-03¶
Added¶
- Option to pass custom SQL filters with `custom_sql_filter` (and `--custom-sql-filter`) parameter #67
Fixed¶
- Delayed progress bar appearing during nodes intersection step
0.11.4 - 2024-10-28¶
Changed¶
- Improved the multiprocessing intersection algorithm by stopping process start-up early if the work finishes quicker than expected
0.11.3 - 2024-10-25¶
Changed¶
- Moved location of the OSM extracts providers to the global cache #173
0.11.2 - 2024-10-14¶
Added¶
- Option to pass a bounding box as a geometry filter in CLI #169
Changed¶
- Modified CLI descriptions and hid unnecessary default values #169
0.11.1 - 2024-10-09¶
Added¶
0.11.0 - 2024-09-24¶
Changed¶
- Bumped minimal DuckDB version to `1.1.0`
- Refactored geoparquet operations for compatibility with new DuckDB version
- Excluded `conftest.py` file from the final library build
- Replaced `unary_union` calls with `union_all()` on all GeoDataFrames
- Silenced `pooch` library warnings regarding empty SHA hash
0.10.0 - 2024-09-23¶
Changed¶
- BREAKING Changed required minimal number of points in polygon from 3 to 4
- Added removal of repeated points in linestrings
Fixed¶
- Removed support for yanked polars version `1.7.0`
0.9.4 - 2024-09-11¶
Changed¶
- Excluded DuckDB `1.1.0` version from dependencies
0.9.3 - 2024-09-10¶
Removed¶
- `geoarrow-rust-core` from dependencies
0.9.2 - 2024-08-28¶
Changed¶
- Removed `pyarrow-ops` dependency and replaced it with simpler implementation
- Removed `srai` dependency from tests
- Set minimal `numpy` version
0.9.1 - 2024-08-28¶
Fixed¶
- Changed `geopy` dependency to required, to fix missing import for `quackosm.geocode_to_geometry` function
0.9.0 - 2024-08-12¶
Added¶
- Functions `convert_osm_extract_to_parquet` and `convert_osm_extract_to_geodataframe` with option to search and download OSM extracts by text query #119
- Function for downloading an OSM extract PBF file using a text query (`quackosm.osm_extracts.download_extract_by_query`)
- Function for displaying available OSM extracts from multiple sources (`quackosm.osm_extracts.display_available_extracts` and `--show-extracts` / `--show-osm-extracts` in the CLI) in the form of a tree
- New parameter `geometry_coverage_iou_threshold` (and `--iou-threshold` in the CLI) to enable configuration of the Intersection over Union metric value sensitivity for covering the geometry with OSM extracts
- Two new notebook examples for documentation purposes - basic usage and OSM extracts deep dive
- Improved tests configuration by downloading precalculated extracts indexes from a dedicated repository
Changed¶
- Refactored searching OSM extracts for a given geometry filter to utilize Intersection over Union metric #110 #115
- Moved multiple modules imports inside certain functions to speed up CLI responsiveness
- Replaced default `Geofabrik` OSM extract download source with `any` to include all available resources
- Refactored OSM extracts sources cache files to calculate area in kilometers squared and added `parent` and `file_name` fields
Deprecated¶
- Function `find_smallest_containing_extract` from `quackosm.osm_extracts` has been deprecated in favor of `find_smallest_containing_extracts`
0.8.3 - 2024-07-25¶
Added¶
- New function `quackosm.geocode_to_geometry` for quick geocoding of the text query to a geometry
Changed¶
- Replaced `OSMnx` dependency with `GeoPy` for geometry geocoding #135
0.8.2 - 2024-06-04¶
Added¶
- `geoarrow-rust-core` library to the main dependencies
- Test for hashing geometry filter with mixed order
- Test for parquet multiprocessing logic
- Test for new intersection step
- Option to pass URL directly as PBF path #114
- Dedicated `MultiprocessingRuntimeError` for multiprocessing errors
Changed¶
- Added new internal parquet dataset processing logic using multiprocessing
- Refactored nodes intersection step from `ST_Intersects` in DuckDB to Shapely's `STRtree` #112
- `PbfFileReader`'s internal `geometry_filter` is additionally clipped by PBF extract geometry to speed up intersections #116
- `OsmTagsFilter` and `GroupedOsmTagsFilter` type from `dict` to `Mapping` to make it covariant
- Tqdm's `disable` parameter for non-TTY environments from `None` to `False`
- Refactored final GeoParquet file saving logic to greatly reduce memory usage
- Bumped minimal `pyarrow` version to 16.0
- Default `multiprocessing.Pool` initialization mode from `fork` to `spawn`
0.8.1 - 2024-05-11¶
Added¶
- Option to convert multiple `*.osm.pbf` files to a single `parquet` file
Changed¶
- Names of the functions have been unified to all start with `convert_` prefix
- Simplified internal conversion API
Deprecated¶
- Functions `convert_pbf_to_gpq`, `convert_geometry_to_gpq` / `convert_geometry_filter_to_gpq`, `get_features_gdf` and `get_features_gdf_from_geometry` have been deprecated in favor of `convert_pbf_to_parquet`, `convert_geometry_to_parquet`, `convert_pbf_to_geodataframe` and `convert_geometry_to_geodataframe`
- Parameter `file_paths` has been replaced with `pbf_path`
Fixed¶
- Removed the `parquet` extension installation step after opening the DuckDB connection
0.8.0 - 2024-05-08¶
Added¶
- Polars library to the main dependencies
Changed¶
- Refactored ways grouping logic from duckdb to polars `LazyFrame` API for faster operations
- Default result file extension from `geoparquet` to `parquet` #99
- Moved `rich` to the main dependencies #95
- Set minimal versions of multiple dependencies
- Added tests for minimal dependencies versions
Fixed¶
- Steps numbering after encountering `MemoryError`
Removed¶
- `h3ronpy` from dependencies and replaced logic with pure `h3` calls
Deprecated¶
- Reusing existing `geoparquet` files from cache will be supported, but will result in deprecation warning #99
0.7.3 - 2024-05-07¶
Added¶
- Debug mode that keeps all temporary files for further inspection, activated with `debug` flag
Changed¶
- Refactored parsing native `LINESTRING_2D` types when reading them from saved parquet file
0.7.2 - 2024-04-28¶
Changed¶
- Refactored geometry fixing by utilizing `ST_MakeValid` function added in DuckDB `0.10.0` version
0.7.1 - 2024-04-25¶
Changed¶
- Simplified GDAL parity tests by precalculating result files and uploading them to additional repository
Fixed¶
- Added exception if parts of provided geometry have no area #85
0.7.0 - 2024-04-24¶
Added¶
- Transient mode of reporting progress with output being removed after operation #77
- Tracking for multiple files within single operation
- New tests for all 3 methods of combining result files together with duplicated features removal
Changed¶
- Refactored internal Rich progress reporting process
- Replaced `silent_mode` parameter with `verbosity_mode` argument
- Changed default `OSMExtractSource` value from `any` to `Geofabrik`
- Modified OpenStreetMap_fr scraping process with better progress bar UI
Removed¶
- `silent_mode` parameter from the Python API
Fixed¶
- Replaced slash characters in Geofabrik index IDs with underscore to prevent nested directories creation
- Added additional check on number of points in a LineString when trying to represent them as a polygon
0.6.1 - 2024-04-17¶
Changed¶
- Set minimal `duckdb` version to `0.10.2`
- Added support for Python 3.12
0.6.0 - 2024-04-16¶
Added¶
- Option to filter by OSM tags with negative values (`False`) and with wildcard asterisk (`*`) expansion in both keys and values #49 #53
Changed¶
- Set minimal `typer` version to `0.9.0`
0.5.3 - 2024-04-05¶
Fixed¶
- Made the geometry hash algorithm orientation-agnostic
0.5.2 - 2024-04-03¶
Added¶
- Progress bars for final merge of multiple geoparquet files into a single file
- Option to allow provided geometry to not be fully covered by existing OSM extracts #68
Fixed¶
- Changed tqdm's kwargs for parallel OSM extracts checks
0.5.1 - 2024-03-23¶
Fixed¶
- Added alternative way to remove `feature_id` duplicates for big data operations
- Slowed down rich progress bars refresh rate
0.5.0 - 2024-03-14¶
Added¶
- Option to disable progress reporting with the `--silent` flag and `silent_mode` argument #14
- New example notebook dedicated to the command line interface
- Option to save parquet files with WKT geometry #7
- Total elapsed time summary at the end #15
Changed¶
- Simplified and improved ways grouping process
- Renamed `rows_per_bucket` parameter to `rows_per_group`
Fixed¶
- Set minimal `h3` and `h3ronpy` versions in requirements
0.4.5 - 2024-03-07¶
Fixed¶
- Added automatic downscaling of the `rows_per_bucket` parameter for ways grouping operation #50
0.4.4 - 2024-02-14¶
Fixed¶
- Locked DuckDB's version to 0.9.2 to avoid segmentation fault
0.4.3 - 2024-02-13¶
Fixed¶
- Added parquet schema unification when joining multiple files together #42
0.4.2 - 2024-02-02¶
Fixed¶
- Removed last grouping step when using `keep_all_tags` parameter with `GroupedOsmTagsFilter` filter
0.4.1 - 2024-01-31¶
Changed¶
- Removed additional redundancy of GeoParquet result files when only one extract covers whole area #35
Fixed¶
- Added missing `requests` dependency
0.4.0 - 2024-01-31¶
Added¶
- Option to automatically download PBF files for geometries #32
- Filtering data using 3 global grid systems: Geohash, H3 and S2 #30
Changed¶
- Filter OSM IDs are now expected to be passed separated by commas instead of repeating `--filter-osm-id` every time #30
Fixed¶
- Remove duplicated features when parsing multiple PBF files
- Geometry orienting to eliminate hash differences between operating systems and different equal versions of the same geometry
0.3.3 - 2024-01-16¶
Added¶
- Option to pass OSM tags filter in the form of JSON file to the CLI
- Option to keep all tags when filtering with the OSM tags #25
Changed¶
- Logic for `explode_tags` parameter when filtering based on tags, but still keeping them all
Fixed¶
- Typos in the CLI docs
0.3.2 - 2024-01-10¶
Added¶
- Option to pass `parquet_compression` parameter to DuckDB
- Bigger PBF parsing test as a benchmark
Changed¶
- Increased number of rows per group for environments with more than 24 GB of memory
- Simplified temporary directory path propagation within `PbfFileReader` class
- Reduced disk spillage by removing more files during operation
- Optimized final geometries concatenation by removing `UNION` operation
- Tests execution order
0.3.1 - 2024-01-06¶
Added¶
- Speed column for Rich progress bar
Changed¶
- Simplified ways grouping logic by removing some steps
0.3.0 - 2024-01-02¶
Added¶
- Automatic scaling for grouping operations when working in an environment with less than 16 GB of memory
- More detailed steps names
Changed¶
- Locked minimal Shapely version
- Modified ways grouping logic to be faster
- Split filtered and required ways to be parsed separately
Fixed¶
- Increased speed estimation period for Rich time progress
0.2.0 - 2023-12-29¶
Added¶
- CLI based on Typer for converting PBF files into GeoParquet
0.1.0 - 2023-12-29¶
Added¶
- Created QuackOSM repository
- Implemented PbfFileReader