Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased¶
0.2.5 - 2025-01-22¶
Added¶
- Option to pass list of hierarchy_depth values for multiple theme / type pairs
- Info about current theme / type pair to the HierarchyDepthOutOfBoundsWarning
0.2.4 - 2025-01-19¶
Added¶
- Places hierarchy based on the official taxonomy #63
- Option to change minimal confidence score for places and select only primary category for the wide form transformation #63
Changed¶
- Added option to use any non-negative integer as a hierarchy_depth value for wide form processing #64
- Shortened hash parts for generated file names to 8 characters per part
Fixed¶
- Bug where a constant value was overwritten instead of being copied before modification
0.2.3 - 2025-01-17¶
Fixed¶
- Changed wide format places definition for older release versions
- Changed get all columns function for wide format places definition
- Bug where code crashed when release index hit zero matches #11
0.2.2 - 2025-01-17¶
Fixed¶
- Changed wide format definitions for different release versions
0.2.1 - 2025-01-17¶
Added¶
- Wide format release index to precalculate all possible columns #43
- Flag include_all_possible_columns to keep or prune empty columns #43
- overturemaestro.advanced_functions.wide_form.get_all_possible_column_names for getting a list of all possible column names #46
- overturemaestro.cache.clear_cache function for clearing local release index cache from the API
0.2.0 - 2025-01-16¶
Added¶
- Automatic total time wrapper decorator to aggregate nested function calls
- Parameter columns_to_download for selecting columns to download from the dataset #23
- Option to pass a list of pyarrow filters and columns for download for each theme type pair when downloading multiple datasets at once
- Automatic columns detection in pyarrow filters when passing columns_to_download
- New advanced_functions module with a wide format for machine learning purposes #38
Fixed¶
- Replaced urllib HTTPError with requests HTTPError in release index download functions
Changed¶
- Refactored available release versions caching #24
- Removed hive partitioned parquet schema columns from GeoDataFrame loading
Deprecated¶
- Nested fields in PyArrow filter in CLI are now expected to be separated by a dot, not a comma #22
0.1.2 - 2024-12-17¶
Added¶
- Option to pass max number of workers for downloading the data #30
0.1.1 - 2024-11-24¶
Changed¶
- Modified release index consolidation script
0.1.0 - 2024-10-31¶
Added¶
- CLI #3
- Option to filter data with bounding box #4
- Tests for the library #6
- Automatic newest release version loading #7
- Library docs #2
- README content
- Verbosity modes
- Total operation time
- Overloads for the functions typing
- Function for displaying all available release versions
- GitHub Action workflows for docs deployment
Changed¶
- Moved location of the pregenerated release indexes to the global cache #19
- Moved scikit-learn and polars to the dedicated dependency group #9
- Sped up intersection algorithm
- Reduced number of max concurrent connections for parquet files download
Fixed¶
- Memory leak during concurrent parquet files download
- Added automatic retry for downloads with 10 retries
0.0.3 - 2024-09-08¶
Added¶
- Alternative bounding box related functions for downloading data
0.0.2 - 2024-09-06¶
Added¶
- Basic library tests
- CI/CD workflows
Fixed¶
- Added missing Pooch and geoarrow-rust-core dependencies
- Cleaned other dependencies
- Changed forward slashes in URLs on Windows
0.0.1 - 2024-09-02¶
Added¶
- Release index generation
- Functions for downloading the data using generated indexes
- Function for displaying available theme and type values