Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
0.12.1 - 2025-01-03¶
Added¶
- Automatic download progress bar hiding when verbosity is set to `silent`
- Cached Nominatim geocoding results to speed up tests
0.12.0 - 2024-11-03¶
Added¶
- Option to pass custom SQL filters with `custom_sql_filter` (and `--custom-sql-filter`) parameter #67
Fixed¶
- Delayed progress bar appearing during nodes intersection step
0.11.4 - 2024-10-28¶
Changed¶
- Improved multiprocessing intersection algorithm by early stopping processes start-up if finished quicker than expected
0.11.3 - 2024-10-25¶
Changed¶
- Moved location of the OSM extracts providers to the global cache #173
0.11.2 - 2024-10-14¶
Added¶
- Option to pass a bounding box as a geometry filter in CLI #169
Changed¶
- Modified CLI descriptions and hid unnecessary default values #169
0.11.1 - 2024-10-09¶
Added¶
0.11.0 - 2024-09-24¶
Changed¶
- Bumped minimal DuckDB version to `1.1.0`
- Refactored geoparquet operations for compatibility with new DuckDB version
- Excluded `conftest.py` file from the final library build
- Replaced `unary_union` calls with `union_all()` on all GeoDataFrames
- Silenced `pooch` library warnings regarding empty SHA hash
0.10.0 - 2024-09-23¶
Changed¶
- BREAKING Changed required minimal number of points in polygon from 3 to 4
- Added removal of repeated points in linestrings
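The two changes above can be illustrated with a minimal plain-Python sketch. This is not the library's actual code; the helper names `drop_repeated_points` and `can_be_polygon` are hypothetical. It assumes a closed ring repeats its first point at the end, so even a triangle needs 4 coordinates.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def drop_repeated_points(points: List[Point]) -> List[Point]:
    """Remove consecutive duplicate points, keeping the first of each run."""
    cleaned: List[Point] = []
    for point in points:
        if not cleaned or cleaned[-1] != point:
            cleaned.append(point)
    return cleaned


def can_be_polygon(points: List[Point], min_points: int = 4) -> bool:
    """Check that a linestring is closed and long enough to form a ring."""
    cleaned = drop_repeated_points(points)
    return len(cleaned) >= min_points and cleaned[0] == cleaned[-1]


# A triangle with one repeated vertex, and a closed line with only 3 points.
triangle = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
degenerate = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
```

Under these assumptions, `triangle` still qualifies as a polygon after cleaning, while `degenerate` does not.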
Fixed¶
- Removed support for yanked polars version `1.7.0`
0.9.4 - 2024-09-11¶
Changed¶
- Excluded DuckDB `1.1.0` version from dependencies
0.9.3 - 2024-09-10¶
Removed¶
- `geoarrow-rust-core` from dependencies
0.9.2 - 2024-08-28¶
Changed¶
- Removed `pyarrow-ops` dependency and replaced it with simpler implementation
- Removed `srai` dependency from tests
- Set minimal `numpy` version
0.9.1 - 2024-08-28¶
Fixed¶
- Changed `geopy` dependency to required, to fix missing import for `quackosm.geocode_to_geometry` function
0.9.0 - 2024-08-12¶
Added¶
- Functions `convert_osm_extract_to_parquet` and `convert_osm_extract_to_geodataframe` with option to search and download OSM extracts by text query #119
- Function for downloading an OSM extract PBF file using a text query (`quackosm.osm_extracts.download_extract_by_query`)
- Function for displaying available OSM extracts from multiple sources (`quackosm.osm_extracts.display_available_extracts` and `--show-extracts` / `--show-osm-extracts` in CLI) in the form of a tree
- New parameter `geometry_coverage_iou_threshold` (and `--iou-threshold` in CLI) to enable configuration of the Intersection over Union metric value sensitivity for covering the geometry with OSM extracts
- Two new notebook examples for documentation purposes - basic usage and OSM extracts deep dive
- Improved tests configuration by downloading precalculated extracts indexes from a dedicated repository
Changed¶
- Refactored searching OSM extracts for a given geometry filter to utilize Intersection over Union metric #110 #115
- Moved multiple modules imports inside certain functions to speed up CLI responsiveness
- Replaced default `Geofabrik` OSM extract download source with `any` to include all available resources
- Refactored OSM extracts sources cache files to calculate area in kilometers squared and added `parent` and `file_name` fields
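For readers unfamiliar with the Intersection over Union metric mentioned in this release, here is a minimal sketch using axis-aligned bounding boxes. The library works with real geometries, so this is only an illustration; the function names and the `iou_threshold` default are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def box_area(box: Box) -> float:
    """Area of a box; an inverted (empty) box has zero area."""
    min_x, min_y, max_x, max_y = box
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def intersection_over_union(a: Box, b: Box) -> float:
    """Shared area divided by combined area, in the range [0, 1]."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    overlap_area = box_area(overlap)
    union_area = box_area(a) + box_area(b) - overlap_area
    return overlap_area / union_area if union_area > 0 else 0.0


def is_good_match(geometry: Box, extract: Box, iou_threshold: float = 0.01) -> bool:
    """Hypothetical check: accept an extract only if it overlaps enough."""
    return intersection_over_union(geometry, extract) >= iou_threshold
```

A higher threshold rejects extracts that are far larger than the requested geometry, since their IoU with a small area is close to zero.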
Deprecated¶
- Function `find_smallest_containing_extract` from `quackosm.osm_extracts` has been deprecated in favor of `find_smallest_containing_extracts`
0.8.3 - 2024-07-25¶
Added¶
- New function `quackosm.geocode_to_geometry` for quick geocoding of the text query to a geometry
Changed¶
- Replaced `OSMnx` dependency with `GeoPy` for geometry geocoding #135
0.8.2 - 2024-06-04¶
Added¶
- `geoarrow-rust-core` library to the main dependencies
- Test for hashing geometry filter with mixed order
- Test for parquet multiprocessing logic
- Test for new intersection step
- Option to pass URL directly as PBF path #114
- Dedicated `MultiprocessingRuntimeError` for multiprocessing errors
Changed¶
- Added new internal parquet dataset processing logic using multiprocessing
- Refactored nodes intersection step from `ST_Intersects` in DuckDB to Shapely's `STRtree` #112
- `PbfFileReader`'s internal `geometry_filter` is additionally clipped by PBF extract geometry to speed up intersections #116
- `OsmTagsFilter` and `GroupedOsmTagsFilter` type from `dict` to `Mapping` to make it covariant
- Tqdm's `disable` parameter for non-TTY environments from `None` to `False`
- Refactored final GeoParquet file saving logic to greatly reduce memory usage
- Bumped minimal `pyarrow` version to 16.0
- Default `multiprocessing.Pool` initialization mode from `fork` to `spawn`
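The `dict` to `Mapping` change above matters mostly for static type checkers. The sketch below uses a simplified, hypothetical alias (`TagsFilter`), which may differ from the library's real `OsmTagsFilter`. `Mapping` is covariant in its value type, so a `Dict[str, bool]` is accepted where a mapping to a wider union is expected, whereas an invariant `Dict` annotation would be rejected by a checker such as mypy.

```python
from typing import Dict, List, Mapping, Union

# Hypothetical, simplified stand-in for a tags filter type.
TagValue = Union[bool, str, List[str]]
TagsFilter = Mapping[str, TagValue]


def count_filter_keys(tags_filter: TagsFilter) -> int:
    """Accepts any read-only mapping whose values fit TagValue."""
    return len(tags_filter)


# A narrower value type: every value is a bool.
only_bools: Dict[str, bool] = {"building": True, "highway": True}

# Type-checks because Mapping is covariant in its value type;
# annotating the parameter as Dict[str, TagValue] would not.
key_count = count_filter_keys(only_bools)
```

At runtime both annotations behave the same; the difference only shows up during type checking.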
0.8.1 - 2024-05-11¶
Added¶
- Option to convert multiple `*.osm.pbf` files to a single `parquet` file
Changed¶
- Names of the functions have been unified to all start with `convert_` prefix
- Simplified internal conversion API
Deprecated¶
- Functions `convert_pbf_to_gpq`, `convert_geometry_to_gpq` / `convert_geometry_filter_to_gpq`, `get_features_gdf` and `get_features_gdf_from_geometry` have been deprecated in favor of `convert_pbf_to_parquet`, `convert_geometry_to_parquet`, `convert_pbf_to_geodataframe` and `convert_geometry_to_geodataframe`
- Parameter `file_paths` has been replaced with `pbf_path`
Fixed¶
- Removed the `parquet` extension installation step after opening the DuckDB connection
0.8.0 - 2024-05-08¶
Added¶
- Polars library to the main dependencies
Changed¶
- Refactored ways grouping logic from duckdb to polars `LazyFrame` API for faster operations
- Default result file extension from `geoparquet` to `parquet` #99
- Moved `rich` to the main dependencies #95
- Set minimal versions of multiple dependencies
- Added tests for minimal dependencies versions
Fixed¶
- Steps numbering after encountering `MemoryError`
Removed¶
- `h3ronpy` from dependencies and replaced logic with pure `h3` calls
Deprecated¶
- Reusing existing `geoparquet` files from cache will be supported, but will result in deprecation warning #99
0.7.3 - 2024-05-07¶
Added¶
- Debug mode that keeps all temporary files for further inspection, activated with `debug` flag
Changed¶
- Refactored parsing native `LINESTRING_2D` types when reading them from saved parquet file
0.7.2 - 2024-04-28¶
Changed¶
- Refactored geometry fixing by utilizing `ST_MakeValid` function added in DuckDB `0.10.0` version
0.7.1 - 2024-04-25¶
Changed¶
- Simplified GDAL parity tests by precalculating result files and uploading them to additional repository
Fixed¶
- Added exception if parts of provided geometry have no area #85
0.7.0 - 2024-04-24¶
Added¶
- Transient mode of reporting progress with output being removed after operation #77
- Tracking for multiple files within single operation
- New tests for all 3 methods of combining result files together with duplicated features removal
Changed¶
- Refactored internal Rich progress reporting process
- Replaced `silent_mode` parameter with `verbosity_mode` argument
- Changed default `OSMExtractSource` value from `any` to `Geofabrik`
- Modified OpenStreetMap_fr scraping process with better progress bar UI
Removed¶
- `silent_mode` parameter from the Python API
Fixed¶
- Replaced slash characters in Geofabrik index IDs with underscore to prevent nested directories creation
- Added additional check on number of points in a LineString when trying to represent them as a polygon
0.6.1 - 2024-04-17¶
Changed¶
- Set minimal `duckdb` version to `0.10.2`
- Added support for Python 3.12
0.6.0 - 2024-04-16¶
Added¶
- Option to filter by OSM tags with negative values (`False`) and with wildcard asterisk (`*`) expansion in both keys and values #49 #53
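As a rough illustration of negative values and wildcard matching in a tags filter, here is a sketch built on the standard library's `fnmatch`. It is not the library's implementation; the function `tag_matches` and the exact semantics of `True` and `False` (key present vs. key absent) are assumptions made for this example.

```python
from fnmatch import fnmatchcase
from typing import Dict, Union


def tag_matches(tags: Dict[str, str], key_pattern: str, value: Union[bool, str]) -> bool:
    """Check one filter entry against a feature's tags.

    Assumed semantics: True = a matching key must exist, False = no matching
    key may exist, str = some matching key must have a matching value.
    """
    matching_values = [v for k, v in tags.items() if fnmatchcase(k, key_pattern)]
    if value is True:
        return bool(matching_values)
    if value is False:
        return not matching_values
    return any(fnmatchcase(v, value) for v in matching_values)


feature_tags = {"name:en": "Central Park", "leisure": "park"}
```

Note that `fnmatch` also treats `?` and `[...]` as wildcards, which goes beyond the asterisk-only expansion described in the entry above.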
Changed¶
- Set minimal `typer` version to `0.9.0`
0.5.3 - 2024-04-05¶
Fixed¶
- Made the geometry hash algorithm orientation-agnostic
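One way to make a geometry hash independent of ring orientation is to normalize the ring before hashing. The sketch below is an assumption about the general technique, not the library's code: it forces counter-clockwise order, rotates the ring to start at its smallest point, and hashes the result. All names are hypothetical.

```python
import hashlib
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(ring: List[Point]) -> float:
    """Shoelace formula; positive for counter-clockwise closed rings."""
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def normalized_ring(ring: List[Point]) -> List[Point]:
    """Return the closed ring as counter-clockwise, starting at its smallest point."""
    open_ring = ring[:-1]  # drop the repeated closing point
    if signed_area(ring) < 0:
        open_ring = open_ring[::-1]
    start = open_ring.index(min(open_ring))
    rotated = open_ring[start:] + open_ring[:start]
    return rotated + [rotated[0]]


def ring_hash(ring: List[Point]) -> str:
    """Hash of the normalized ring with fixed coordinate precision."""
    text = ",".join(f"{x:.7f} {y:.7f}" for x, y in normalized_ring(ring))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


counter_clockwise = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
clockwise = counter_clockwise[::-1]
shifted_start = [(1.0, 1.0), (0.0, 1.0), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
```

All three example rings describe the same square, so they produce the same hash after normalization.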
0.5.2 - 2024-04-03¶
Added¶
- Progress bars for final merge of multiple geoparquet files into a single file
- Option to allow provided geometry to not be fully covered by existing OSM extracts #68
Fixed¶
- Changed tqdm's kwargs for parallel OSM extracts checks
0.5.1 - 2024-03-23¶
Fixed¶
- Added alternative way to remove `feature_id` duplicates for big data operations
- Slowed down rich progress bars refresh rate
0.5.0 - 2024-03-14¶
Added¶
- Option to disable progress reporting with the `--silent` flag and `silent_mode` argument #14
- New example notebook dedicated to the command line interface
- Option to save parquet files with WKT geometry #7
- Total elapsed time summary at the end #15
Changed¶
- Simplified and improved ways grouping process
- Renamed `rows_per_bucket` parameter to `rows_per_group`
Fixed¶
- Set minimal `h3` and `h3ronpy` versions in requirements
0.4.5 - 2024-03-07¶
Fixed¶
- Added automatic downscaling of the `rows_per_bucket` parameter for ways grouping operation #50
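Automatic downscaling of a batch-size parameter can be sketched as a retry loop. The trigger (`MemoryError`), the halving strategy, and the names `run_with_downscaling` and `min_rows` are assumptions for illustration; the changelog does not describe how the library implements it.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_with_downscaling(
    operation: Callable[[int], T], rows_per_bucket: int, min_rows: int = 1
) -> Tuple[T, int]:
    """Retry `operation` with a halved bucket size after each MemoryError."""
    while True:
        try:
            return operation(rows_per_bucket), rows_per_bucket
        except MemoryError:
            if rows_per_bucket <= min_rows:
                raise  # cannot shrink any further
            rows_per_bucket = max(min_rows, rows_per_bucket // 2)


def fake_grouping(rows: int) -> str:
    """Stand-in for a memory-hungry step that fails on large buckets."""
    if rows > 250_000:
        raise MemoryError
    return "grouped"


result, used_rows = run_with_downscaling(fake_grouping, 1_000_000)
```

Here the first two attempts fail and the third succeeds with a bucket size of 250,000 rows.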
0.4.4 - 2024-02-14¶
Fixed¶
- Locked DuckDB's version to 0.9.2 to avoid segmentation fault
0.4.3 - 2024-02-13¶
Fixed¶
- Added parquet schema unification when joining multiple files together #42
0.4.2 - 2024-02-02¶
Fixed¶
- Removed last grouping step when using `keep_all_tags` parameter with `GroupedOsmTagsFilter` filter
0.4.1 - 2024-01-31¶
Changed¶
- Removed additional redundancy of GeoParquet result files when only one extract covers whole area #35
Fixed¶
- Added missing `requests` dependency
0.4.0 - 2024-01-31¶
Added¶
- Option to automatically download PBF files for geometries #32
- Filtering data using 3 global grid systems: Geohash, H3 and S2 #30
Changed¶
- Filter OSM IDs are now expected to be passed as a single comma-separated list instead of repeating `--filter-osm-id` every time #30
Fixed¶
- Remove duplicated features when parsing multiple PBF files
- Geometry orienting to eliminate hash differences between operating systems and between equivalent representations of the same geometry
0.3.3 - 2024-01-16¶
Added¶
- Option to pass OSM tags filter in the form of JSON file to the CLI
- Option to keep all tags when filtering with the OSM tags #25
Changed¶
- Logic for `explode_tags` parameter when filtering based on tags, but still keeping them all
Fixed¶
- Typos in the CLI docs
0.3.2 - 2024-01-10¶
Added¶
- Option to pass `parquet_compression` parameter to DuckDB
- Bigger PBF parsing test as a benchmark
Changed¶
- Increased number of rows per group for environments with more than 24 GB of memory
- Simplified temporary directory path propagation within `PbfFileReader` class
- Reduced disk spillage by removing more files during operation
- Optimized final geometries concatenation by removing `UNION` operation
- Tests execution order
0.3.1 - 2024-01-06¶
Added¶
- Speed column for Rich progress bar
Changed¶
- Simplified ways grouping logic by removing some steps
0.3.0 - 2024-01-02¶
Added¶
- Automatic scaling for grouping operations when working in the environment with less than 16GB of memory
- More detailed steps names
Changed¶
- Locked minimal Shapely version
- Modified ways grouping logic to be faster
- Split filtered and required ways to be parsed separately
Fixed¶
- Increased speed estimation period for Rich time progress
0.2.0 - 2023-12-29¶
Added¶
- CLI based on Typer for converting PBF files into GeoParquet
0.1.0 - 2023-12-29¶
Added¶
- Created QuackOSM repository
- Implemented PbfFileReader