All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.


0.2.6 - 2025-01-26


  • Function detection in elapsed_time_decorator for Google Colab environment

0.2.5 - 2025-01-22


  • Option to pass a list of hierarchy_depth values for multiple theme / type pairs
  • Info about current theme / type pair to the HierarchyDepthOutOfBoundsWarning

0.2.4 - 2025-01-19


  • Places hierarchy based on the official taxonomy #63
  • Option to change the minimum confidence score for places and to select only the primary category for the wide form transformation #63


  • Added option to use any non-negative integer as a hierarchy_depth value for wide form processing #64
  • Shortened hash parts for generated file names to 8 characters per part


  • Bug where a constant value was overwritten instead of being copied before modification
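
The constant-overwrite bug in the entry above belongs to a common Python pitfall: binding a module-level mutable value to a local name creates an alias, not a copy. The sketch below is illustrative only. `DEFAULT_FILTERS`, `build_filters_buggy`, and `build_filters_fixed` are hypothetical names, not taken from the OvertureMaestro code base, and the actual fix may look different.

```python
import copy

# Hypothetical module-level constant (an assumption for illustration only).
DEFAULT_FILTERS = {"places": ["confidence"]}


def build_filters_buggy(extra):
    """Modify the shared constant in place: each call leaks into the next."""
    filters = DEFAULT_FILTERS  # alias to the constant, not a copy
    filters["places"].append(extra)
    return filters


def build_filters_fixed(extra):
    """Copy the constant first, so the module-level value stays untouched."""
    filters = copy.deepcopy(DEFAULT_FILTERS)  # nested lists are copied too
    filters["places"].append(extra)
    return filters
```

A deep copy is used here because the constant holds nested lists; a shallow `dict.copy()` would still share the inner list with the constant.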

0.2.3 - 2025-01-17


  • Changed wide format places definition for older release versions
  • Changed get all columns function for wide format places definition
  • Bug where code crashed when release index hit zero matches #11

0.2.2 - 2025-01-17


  • Changed wide format definitions for different release versions

0.2.1 - 2025-01-17


  • Wide format release index to precalculate all possible columns #43
  • Flag include_all_possible_columns to keep or prune empty columns #43
  • overturemaestro.advanced_functions.wide_form.get_all_possible_column_names for getting a list of all possible column names #46
  • overturemaestro.cache.clear_cache function for clearing local release index cache from the API

0.2.0 - 2025-01-16


  • Automatic total time wrapper decorator to aggregate nested function calls
  • Parameter columns_to_download for selecting columns to download from the dataset #23
  • Option to pass a list of pyarrow filters and columns to download for each theme / type pair when downloading multiple datasets at once
  • Automatic column detection in pyarrow filters when passing columns_to_download
  • New advanced_functions module with a wide format for machine learning purposes #38


  • Replaced urllib HTTPError with requests HTTPError in release index download functions


  • Refactored available release versions caching #24
  • Removed hive partitioned parquet schema columns from GeoDataFrame loading


  • Nested fields in a PyArrow filter in the CLI are now expected to be separated by a dot, not a comma #22

0.1.2 - 2024-12-17


  • Option to pass max number of workers for downloading the data #30

0.1.1 - 2024-11-24


  • Modified release index consolidation script

0.1.0 - 2024-10-31


  • CLI #3
  • Option to filter data with bounding box #4
  • Tests for the library #6
  • Automatic newest release version loading #7
  • Library docs #2
  • README content
  • Verbosity modes
  • Total operation time
  • Overloads for the functions typing
  • Function for displaying all available release versions
  • GitHub Action workflows for docs deployment


  • Moved location of the pregenerated release indexes to the global cache #19
  • Moved scikit-learn and polars to the dedicated dependency group #9
  • Sped up intersection algorithm
  • Reduced the maximum number of concurrent connections for parquet files download


  • Memory leak during concurrent parquet files download
  • Added automatic retry for downloads with 10 retries

0.0.3 - 2024-09-08


  • Alternative bounding box related functions for downloading data

0.0.2 - 2024-09-06


  • Basic library tests
  • CI/CD workflows


  • Added missing Pooch and geoarrow-rust-core dependencies
  • Cleaned other dependencies
  • Changed forward slashes in URLs on Windows

0.0.1 - 2024-09-02


  • Release index generation
  • Functions for downloading the data using generated indexes
  • Function for displaying available theme and type values