OvertureMapsLoader
srai.loaders.OvertureMapsLoader
OvertureMapsLoader(
theme_type_pairs: Optional[list[tuple[str, str]]] = None,
release: Optional[str] = None,
include_all_possible_columns: bool = True,
hierarchy_depth: Optional[Union[int, list[Optional[int]]]] = 1,
download_directory: Union[str, Path] = "files",
verbosity_mode: Literal["silent", "transient", "verbose"] = "transient",
max_workers: Optional[int] = None,
places_use_primary_category_only: bool = False,
places_minimal_confidence: float = 0.75,
)
Bases: Loader
The Overture Maps[1] loader loads Overture Maps features from a dedicated S3 bucket. It can download multiple data types for different release versions, and it can filter features using PyArrow[2] filters.
This loader is a wrapper around the OvertureMaestro[3] library. It uses PyArrow streaming capabilities as well as the DuckDB[4] engine to transform the data into the required format.
References
PARAMETER | DESCRIPTION |
---|---|
theme_type_pairs | List of theme/type pairs to download. If None, all available datasets are downloaded. Defaults to None. TYPE: Optional[list[tuple[str, str]]] |
release | Release version. If not provided, the newest available release version is loaded automatically. Defaults to None. TYPE: Optional[str] |
include_all_possible_columns | Whether to always return the same list of columns in the resulting file. This ensures that the same set of columns is returned for a given release regardless of the region. It also means that some columns might be filled entirely with False values. Defaults to True. TYPE: bool |
hierarchy_depth | Depth used to calculate how many hierarchy columns should be used to generate the wide form of the data. Can be a single integer or a list of integers. If None, all available columns are used. Must be a non-negative integer. Defaults to 1. TYPE: Optional[Union[int, list[Optional[int]]]] |
download_directory | Directory in which to save the downloaded GeoParquet files. Defaults to "files". TYPE: Union[str, Path] |
verbosity_mode | Progress verbosity mode. One of: silent, transient, verbose. Silent disables output completely. Transient tracks progress but removes the output once finished. Verbose leaves all progress output in stdout. Defaults to "transient". TYPE: Literal["silent", "transient", "verbose"] |
max_workers | Maximum number of multiprocessing workers used to process the dataset. Defaults to None. TYPE: Optional[int] |
places_use_primary_category_only | Whether to use only the primary category for places. Defaults to False. TYPE: bool |
places_minimal_confidence | Minimal confidence level for the places dataset. Defaults to 0.75. TYPE: float |
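As a rough illustration of how the constructor parameters above fit together, the sketch below collects them into a dictionary and builds the loader lazily. The theme/type pairs `("places", "place")` and `("buildings", "building")` are assumptions taken from the Overture Maps schema, not from this page, and `build_loader` is a hypothetical helper name. Running `build_loader()` requires `srai` with its Overture Maps dependencies installed.

```python
from pathlib import Path

# Keyword arguments mirroring the documented constructor signature.
# The theme/type pairs are assumed examples from the Overture Maps schema.
LOADER_KWARGS = {
    "theme_type_pairs": [("places", "place"), ("buildings", "building")],
    "release": None,  # None: use the newest available release
    "include_all_possible_columns": True,
    "hierarchy_depth": 1,
    "download_directory": Path("files"),
    "verbosity_mode": "silent",
    "places_minimal_confidence": 0.75,
}


def build_loader():
    """Hypothetical helper: create an OvertureMapsLoader from LOADER_KWARGS."""
    # Imported lazily so this module can be imported without srai installed.
    from srai.loaders import OvertureMapsLoader

    return OvertureMapsLoader(**LOADER_KWARGS)
```

Leaving `release` as `None` lets the loader pick the newest release, while pinning it to a specific version string would keep results reproducible across runs.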
Source code in srai/loaders/overturemaps_loader.py
load
load(
area: Union[
BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame
],
ignore_cache: bool = False,
) -> gpd.GeoDataFrame
Load Overture Maps features for a given area in a wide format.
The loader automatically downloads matching GeoParquet files from the S3 bucket provided by the Overture Maps Foundation. It then filters the features and transforms them into a wide format. It returns a GeoDataFrame containing the geometry column and a boolean column for each category.
Note: Remember to set count_categories to False in CountEmbedder and its descendants.
If used with include_all_possible_columns=False, some key/value pairs might be missing from the resulting GeoDataFrame, simply because there are no such objects in the given area.
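The effect of `include_all_possible_columns` on the wide format can be illustrated with a small, library-free sketch. This is a conceptual model only, not the loader's actual implementation: the function `to_wide_rows` and the category names such as `"place|cafe"` are hypothetical and chosen purely for illustration.

```python
def to_wide_rows(features, all_columns=None):
    """Turn per-feature category sets into boolean rows (conceptual sketch).

    features: list of sets, each holding the categories of one feature.
    all_columns: optional fixed column list; when given, categories absent
    from the area still get a column, filled with False.
    """
    present = sorted(set().union(*features)) if features else []
    columns = list(all_columns) if all_columns is not None else present
    rows = [[column in categories for column in columns] for categories in features]
    return columns, rows


# Hypothetical categories found in a small area.
area_features = [{"place|cafe"}, {"place|bank"}]

# Only columns that occur in the area (like include_all_possible_columns=False).
cols_present, rows_present = to_wide_rows(area_features)

# Fixed column list for the release (like include_all_possible_columns=True).
release_columns = ["place|bank", "place|cafe", "place|zoo"]
cols_all, rows_all = to_wide_rows(area_features, release_columns)
```

In the second call the `"place|zoo"` column exists even though no feature in the area has that category, so every value in it is `False`. In the first call that column is simply absent.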
PARAMETER | DESCRIPTION |
---|---|
area | Area for which to download objects. TYPE: Union[BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame] |
ignore_cache | Whether to ignore precalculated GeoParquet files. Defaults to False. TYPE: bool |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | gpd.GeoDataFrame: Downloaded features as a GeoDataFrame. |
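A minimal usage sketch of `load` follows, assuming `srai` (with its Overture Maps dependencies) and `shapely` are installed. The helper names `load_places` and `feature_columns` are hypothetical, and the `("places", "place")` pair is an assumed example from the Overture Maps schema. Calling `load_places` triggers a download from the S3 bucket, so it needs network access.

```python
def feature_columns(columns, geometry_column="geometry"):
    """Return the category columns, i.e. everything except the geometry column."""
    return [column for column in columns if column != geometry_column]


def load_places(min_lon, min_lat, max_lon, max_lat, ignore_cache=False):
    """Hypothetical helper: load places for a bounding box in wide format."""
    # Lazy imports so the module can be imported without the optional packages.
    from shapely.geometry import box
    from srai.loaders import OvertureMapsLoader

    # box(minx, miny, maxx, maxy) builds a rectangular polygon (a BaseGeometry).
    area = box(min_lon, min_lat, max_lon, max_lat)
    loader = OvertureMapsLoader(
        theme_type_pairs=[("places", "place")],
        verbosity_mode="silent",
    )
    return loader.load(area, ignore_cache=ignore_cache)
```

A call such as `gdf = load_places(17.00, 51.08, 17.06, 51.12)` would return a GeoDataFrame, and `feature_columns(gdf.columns)` would then list its boolean category columns. Passing `ignore_cache=True` asks the loader to skip any precalculated GeoParquet files.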