Index

This module contains regionalizers, which are used to divide space before analysis.

Embedding methods available in srai operate on regions, which can be defined in many ways. In this module, we aggregate different regionalization methods under a common Regionalizer interface. We include pre-defined spatial indexes (e.g. H3 or S2), data-based ones (e.g. Voronoi) and OSM-based ones (e.g. administrative boundaries).

Regionalizer

Bases: ABC

Abstract class for regionalizers.

transform(gdf)

abstractmethod

Regionalize a given GeoDataFrame.

This one should treat the input as a single region.

PARAMETER DESCRIPTION
gdf

GeoDataFrame to be regionalized.

TYPE: GeoDataFrame

RETURNS DESCRIPTION
GeoDataFrame

GeoDataFrame with the regionalized data.

Source code in srai/regionalizers/_base.py
@abc.abstractmethod
def transform(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:  # pragma: no cover
    """
    Regionalize a given GeoDataFrame.

    This one should treat the input as a single region.

    Args:
        gdf (gpd.GeoDataFrame): GeoDataFrame to be regionalized.

    Returns:
        GeoDataFrame with the regionalized data.
    """
    raise NotImplementedError
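
The contract described above (one abstract transform method that every regionalizer implements) can be illustrated with a minimal, dependency-free sketch. ToyRegionalizer and HalvesRegionalizer are hypothetical names that are not part of srai, and a plain bounding-box tuple stands in for the GeoDataFrame so the example runs without geopandas.

```python
import abc


class ToyRegionalizer(abc.ABC):
    """Hypothetical stand-in for the Regionalizer interface (not srai code)."""

    @abc.abstractmethod
    def transform(self, area):
        """Split a single input area into regions."""
        raise NotImplementedError


class HalvesRegionalizer(ToyRegionalizer):
    """Hypothetical subclass: splits a (minx, miny, maxx, maxy) box in two."""

    def transform(self, area):
        minx, miny, maxx, maxy = area
        midx = (minx + maxx) / 2
        # Region ids map to geometries, loosely mirroring an indexed output frame.
        return {
            "west": (minx, miny, midx, maxy),
            "east": (midx, miny, maxx, maxy),
        }


regions = HalvesRegionalizer().transform((0.0, 0.0, 2.0, 1.0))
print(regions)
```

Because transform is abstract, the base class itself cannot be instantiated; only subclasses that override it can.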

AdministrativeBoundaryRegionalizer(
    admin_level,
    clip_regions=True,
    return_empty_region=False,
    prioritize_english_name=True,
    toposimplify=True,
    remove_artefact_regions=True,
)

Bases: Regionalizer

AdministrativeBoundaryRegionalizer.

Administrative boundary regionalizer allows the given geometries to be divided into boundaries from OpenStreetMap on a given admin_level [1].

Downloads those boundaries online using overpass and osmnx libraries.

Note: offline .pbf loading will be implemented in the future.

Note: option to download historic data will be implemented in the future.

References
  1. https://wiki.openstreetmap.org/wiki/Key:admin_level
PARAMETER DESCRIPTION
admin_level

OpenStreetMap admin_level value. See [1] for detailed description of available values.

TYPE: int

clip_regions

Whether to clip regions using a provided mask. Turning it off can be useful when trying to load regions using a list of points. Defaults to True.

TYPE: bool DEFAULT: True

return_empty_region

Whether to return an empty region to fill remaining space or not. Defaults to False.

TYPE: bool DEFAULT: False

prioritize_english_name

Whether to use the English area name, if available, as a region id first. Defaults to True.

TYPE: bool DEFAULT: True

toposimplify

Whether to simplify the topology to reduce the size of geometries. The value is passed to the topojson library for topology-aware simplification. Since provided values are treated like degrees, values between 1e-4 and 1.0 are recommended. Defaults to True (which results in a value of 1e-4).

TYPE: Union[bool, float] DEFAULT: True

remove_artefact_regions

Whether to remove small regions barely intersecting the queried area. Turning it off can sometimes load unnecessary boundaries that only touch the edge. Regions whose intersection with the queried area is smaller than 1% of their own total area are removed. If the provided query GeoDataFrame contains points, the area calculation is skipped for any region that intersects a point. Defaults to True.

TYPE: bool DEFAULT: True

RAISES DESCRIPTION
ValueError

If admin_level is outside available range (1-11). See [2] for list of values with in_wiki selected.

References
  1. https://wiki.openstreetmap.org/wiki/Tag:boundary=administrative#10_admin_level_values_for_specific_countries
  2. https://taginfo.openstreetmap.org/keys/admin_level#values
Source code in srai/regionalizers/administrative_boundary_regionalizer.py
def __init__(
    self,
    admin_level: int,
    clip_regions: bool = True,
    return_empty_region: bool = False,
    prioritize_english_name: bool = True,
    toposimplify: Union[bool, float] = True,
    remove_artefact_regions: bool = True,
) -> None:
    """
    Init AdministrativeBoundaryRegionalizer.

    Args:
        admin_level (int): OpenStreetMap admin_level value. See [1] for detailed description of
            available values.
        clip_regions (bool, optional): Whether to clip regions using a provided mask.
            Turning it off can be useful when trying to load regions using a list of points.
            Defaults to True.
        return_empty_region (bool, optional): Whether to return an empty region to fill
            remaining space or not. Defaults to False.
        prioritize_english_name (bool, optional): Whether to use english area name if available
            as a region id first. Defaults to True.
        toposimplify (Union[bool, float], optional): Whether to simplify topology to reduce
            geometries size or not. Value is passed to `topojson` library for topology-aware
            simplification. Since provided values are treated like degrees, values between
            1e-4 and 1.0 are recommended. Defaults to True (which results in value equal 1e-4).
        remove_artefact_regions (bool, optional): Whether to remove small regions barely
            intersecting queried area. Turning it off can sometimes load unnecessary boundaries
            that touch on the edge. It removes regions that intersect with an area smaller
            than 1% of total self. Takes into consideration if provided query GeoDataFrame
            contains points and then skips calculating area when intersects any point.
            Defaults to True.

    Raises:
        ValueError: If admin_level is outside available range (1-11). See [2] for list of
            values with `in_wiki` selected.

    References:
        1. https://wiki.openstreetmap.org/wiki/Tag:boundary=administrative#10_admin_level_values_for_specific_countries
        2. https://taginfo.openstreetmap.org/keys/admin_level#values
    """  # noqa: W505, E501
    import_optional_dependencies(
        dependency_group="osm",
        modules=["osmnx", "overpass"],
    )
    from overpass import API

    if admin_level < 1 or admin_level > 11:
        raise ValueError("admin_level outside of available range.")

    self.admin_level = admin_level
    self.prioritize_english_name = prioritize_english_name
    self.clip_regions = clip_regions
    self.return_empty_region = return_empty_region
    self.remove_artefact_regions = remove_artefact_regions

    if isinstance(toposimplify, (int, float)) and toposimplify > 0:
        self.toposimplify = toposimplify
    elif isinstance(toposimplify, bool) and toposimplify:
        self.toposimplify = 1e-4
    else:
        self.toposimplify = False

    self.overpass_api = API(timeout=60)
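
The documented argument handling (admin_level must lie in 1-11; toposimplify=True maps to 1e-4, a positive number is used as given, and anything else disables simplification) can be sketched as two standalone helpers. Both function names are hypothetical and not part of srai. Since bool is a subclass of int in Python, this sketch tests for bool before testing for numbers.

```python
from typing import Union


def normalize_toposimplify(value: Union[bool, float]) -> Union[bool, float]:
    """Hypothetical helper: map the documented argument to a tolerance."""
    # isinstance(True, int) is True, so bool must be handled first.
    if isinstance(value, bool):
        return 1e-4 if value else False
    if isinstance(value, (int, float)) and value > 0:
        return value
    # Zero or negative numbers disable simplification.
    return False


def check_admin_level(admin_level: int) -> int:
    """Hypothetical helper mirroring the documented 1-11 range check."""
    if admin_level < 1 or admin_level > 11:
        raise ValueError("admin_level outside of available range.")
    return admin_level


print(normalize_toposimplify(True), normalize_toposimplify(0.01))
```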

transform(gdf)

Regionalize a given GeoDataFrame.

Will query Overpass [1] server using overpass [2] library for closed administrative boundaries on a given admin_level and then download geometries for each relation using osmnx [3] library.

If prioritize_english_name is set to True, method will try to extract the name:en tag first before resorting to the name tag. If boundary doesn't have a name tag, an id will be used.
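
The naming fallback described above can be sketched as a small function over an OSM tag dictionary. pick_region_id is a hypothetical name, and returning the id as a string is an assumption made for this illustration; the library's internal helper is not shown on this page.

```python
from typing import Mapping


def pick_region_id(
    tags: Mapping[str, str],
    osm_id: int,
    prioritize_english_name: bool = True,
) -> str:
    """Hypothetical helper: choose a region id from OSM tags."""
    # Prefer the English name when requested and present.
    if prioritize_english_name and "name:en" in tags:
        return tags["name:en"]
    # Otherwise fall back to the local name.
    if "name" in tags:
        return tags["name"]
    # No name at all: use the OSM id (stringified here by assumption).
    return str(osm_id)


tags = {"name": "Warszawa", "name:en": "Warsaw"}
print(pick_region_id(tags, osm_id=1))
print(pick_region_id(tags, osm_id=1, prioritize_english_name=False))
```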

Before returning downloaded regions, a topology is built and can be simplified to reduce size of geometries while keeping neighbouring regions together without introducing gaps. Topojson library [4] is used for this operation.

Additionally, an empty region with name EMPTY can be introduced if returned regions do not fully cover a clipping mask.

PARAMETER DESCRIPTION
gdf

GeoDataFrame to be regionalized. Will use this list of geometries to crop resulting regions.

TYPE: GeoDataFrame

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input GeoDataFrame.

RAISES DESCRIPTION
RuntimeError

If simplification can't preserve a topology.

References
  1. https://wiki.openstreetmap.org/wiki/Overpass_API
  2. https://github.com/mvexel/overpass-api-python-wrapper
  3. https://github.com/gboeing/osmnx
  4. https://github.com/mattijn/topojson
Source code in srai/regionalizers/administrative_boundary_regionalizer.py
def transform(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """
    Regionalize a given GeoDataFrame.

    Will query Overpass [1] server using `overpass` [2] library for closed administrative
    boundaries on a given admin_level and then download geometries for each relation using
    `osmnx` [3] library.

    If `prioritize_english_name` is set to `True`, method will try to extract the `name:en` tag
    first before resorting to the `name` tag. If boundary doesn't have a `name` tag, an `id`
    will be used.

    Before returning downloaded regions, a topology is built and can be simplified to reduce
    size of geometries while keeping neighbouring regions together without introducing gaps.
    `Topojson` library [4] is used for this operation.

    Additionally, an empty region with name `EMPTY` can be introduced if returned regions do
    not fully cover a clipping mask.

    Args:
        gdf (gpd.GeoDataFrame): GeoDataFrame to be regionalized.
            Will use this list of geometries to crop resulting regions.

    Returns:
        gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input
            GeoDataFrame.

    Raises:
        RuntimeError: If simplification can't preserve a topology.

    References:
        1. https://wiki.openstreetmap.org/wiki/Overpass_API
        2. https://github.com/mvexel/overpass-api-python-wrapper
        3. https://github.com/gboeing/osmnx
        4. https://github.com/mattijn/topojson
    """
    gdf_wgs84 = gdf.to_crs(crs=WGS84_CRS)

    regions_dicts = self._generate_regions_from_all_geometries(gdf_wgs84)

    if not regions_dicts:
        import warnings

        warnings.warn(
            "Couldn't find any administrative boundaries with"
            f" `admin_level`={self.admin_level}.",
            RuntimeWarning,
            stacklevel=2,
        )

        return self._get_empty_geodataframe(gdf_wgs84)

    regions_gdf = gpd.GeoDataFrame(data=regions_dicts, crs=WGS84_CRS).set_index(REGIONS_INDEX)
    regions_gdf = self._toposimplify_gdf(regions_gdf)

    if self.remove_artefact_regions:
        points_collection: Optional[BaseGeometry] = gdf_wgs84[
            gdf_wgs84.geom_type == "Point"
        ].geometry.unary_union
        clipping_polygon_area: Optional[BaseGeometry] = gdf_wgs84[
            gdf_wgs84.geom_type != "Point"
        ].geometry.unary_union

        regions_to_keep = [
            region_id
            for region_id, row in regions_gdf.iterrows()
            if self._check_intersects_with_points(row["geometry"], points_collection)
            or self._calculate_intersection_area_fraction(
                row["geometry"], clipping_polygon_area
            )
            > 0.01
        ]
        regions_gdf = regions_gdf.loc[regions_to_keep]

    if self.clip_regions:
        regions_gdf = regions_gdf.clip(mask=gdf_wgs84, keep_geom_type=False)

    if self.return_empty_region:
        empty_region = self._generate_empty_region(mask=gdf_wgs84, regions_gdf=regions_gdf)
        if not empty_region.is_empty:
            regions_gdf.loc[
                AdministrativeBoundaryRegionalizer.EMPTY_REGION_NAME,
                GEOMETRY_COLUMN,
            ] = empty_region

    return regions_gdf
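
The artefact filter in the listing above keeps a region when its intersection fraction exceeds 0.01. A rough, dependency-free sketch of that idea follows, using axis-aligned boxes instead of real geometries. All names are hypothetical, and measuring the fraction against the region's own area is an assumption based on the parameter description ("1% of total self"), since the private helper is not shown here.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def area(box: Box) -> float:
    minx, miny, maxx, maxy = box
    # Inverted boxes (no overlap) get zero area.
    return max(0.0, maxx - minx) * max(0.0, maxy - miny)


def intersection_fraction(region: Box, mask: Box) -> float:
    """Share of the region's own area that overlaps the mask."""
    overlap = (
        max(region[0], mask[0]),
        max(region[1], mask[1]),
        min(region[2], mask[2]),
        min(region[3], mask[3]),
    )
    region_area = area(region)
    return area(overlap) / region_area if region_area else 0.0


def drop_artefacts(
    regions: Dict[str, Box], mask: Box, threshold: float = 0.01
) -> List[str]:
    """Return ids of regions that overlap the mask by more than the threshold."""
    return [
        region_id
        for region_id, box in regions.items()
        if intersection_fraction(box, mask) > threshold
    ]


mask = (0.0, 0.0, 10.0, 10.0)
regions = {
    "inside": (2.0, 2.0, 6.0, 6.0),      # fully inside the mask
    "sliver": (9.95, 0.0, 19.95, 10.0),  # only a thin strip overlaps
    "half": (5.0, 5.0, 15.0, 10.0),      # half of the region overlaps
}
kept = drop_artefacts(regions, mask)
print(kept)
```

The "sliver" region overlaps the mask with roughly half a percent of its area, so it is dropped, while the other two are kept.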

H3Regionalizer(resolution, buffer=True)

Bases: Regionalizer

H3 Regionalizer.

H3 Regionalizer allows the given geometries to be divided into H3 cells: hexagons, with pentagons as a very rare exception.

PARAMETER DESCRIPTION
resolution

Resolution of the cells. See [1] for a full comparison.

TYPE: int

buffer

Whether to fully cover the geometries with H3 Cells (visible on the borders). Defaults to True.

TYPE: bool DEFAULT: True

RAISES DESCRIPTION
ValueError

If resolution is not between 0 and 15.

References
  1. https://h3geo.org/docs/core-library/restable/
Source code in srai/regionalizers/h3_regionalizer.py
def __init__(self, resolution: int, buffer: bool = True) -> None:
    """
    Init H3Regionalizer.

    Args:
        resolution (int): Resolution of the cells. See [1] for a full comparison.
        buffer (bool, optional): Whether to fully cover the geometries with
            H3 Cells (visible on the borders). Defaults to True.

    Raises:
        ValueError: If resolution is not between 0 and 15.

    References:
        1. https://h3geo.org/docs/core-library/restable/
    """
    if not (0 <= resolution <= 15):
        raise ValueError(f"Resolution {resolution} is not between 0 and 15.")

    self.resolution = resolution
    self.buffer = buffer

transform(gdf)

Regionalize a given GeoDataFrame.

Transforms given geometries into H3 cells of given resolution and optionally applies buffering.

PARAMETER DESCRIPTION
gdf

(Multi)Polygons to be regionalized.

TYPE: GeoDataFrame

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: H3 cells.

RAISES DESCRIPTION
ValueError

If provided GeoDataFrame has no crs defined.

Source code in srai/regionalizers/h3_regionalizer.py
def transform(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """
    Regionalize a given GeoDataFrame.

    Transforms given geometries into H3 cells of given resolution
    and optionally applies buffering.

    Args:
        gdf (gpd.GeoDataFrame): (Multi)Polygons to be regionalized.

    Returns:
        gpd.GeoDataFrame: H3 cells.

    Raises:
        ValueError: If provided GeoDataFrame has no crs defined.
    """
    gdf_wgs84 = gdf.to_crs(crs=WGS84_CRS)

    gdf_exploded = self._explode_multipolygons(gdf_wgs84)

    h3_indexes = list(
        set(
            shapely_geometry_to_h3(
                gdf_exploded[GEOMETRY_COLUMN],
                h3_resolution=self.resolution,
                buffer=self.buffer,
            )
        )
    )
    gdf_h3 = gpd.GeoDataFrame(
        data={REGIONS_INDEX: h3_indexes},
        geometry=h3_to_geoseries(h3_indexes),
        crs=WGS84_CRS,
    ).set_index(REGIONS_INDEX)

    return gdf_h3.to_crs(gdf.crs)

S2Regionalizer(resolution, buffer=True)

Bases: Regionalizer

S2 Regionalizer.

S2 Regionalizer gives an opportunity to divide the given geometries into square S2 cells.

PARAMETER DESCRIPTION
resolution

Resolution of the cells (S2 supports 0-30). See [1] for a full comparison.

TYPE: int

buffer

If True then fully cover geometries with S2 cells. Otherwise only use those cells that fully fit into them. Defaults to True.

TYPE: bool DEFAULT: True

RAISES DESCRIPTION
ValueError

If resolution is not between 0 and 30.

References
  1. https://s2geometry.io/resources/s2cell_statistics.html
Source code in srai/regionalizers/s2_regionalizer.py
def __init__(self, resolution: int, buffer: bool = True) -> None:
    """
    Init S2 Regionalizer.

    Args:
        resolution (int): Resolution of the cells (S2 supports 0-30). See [1] for
            a full comparison.
        buffer (bool, optional): If True then fully cover geometries with S2 cells.
            Otherwise only use those cells that fully fit into them. Defaults to True.

    Raises:
        ValueError: If resolution is not between 0 and 30.

    References:
        1. https://s2geometry.io/resources/s2cell_statistics.html
    """
    if not (0 <= resolution <= 30):
        raise ValueError(f"Resolution {resolution} is not between 0 and 30.")

    self.resolution = resolution
    self.buffer = buffer

transform(gdf)

Regionalize a given GeoDataFrame.

PARAMETER DESCRIPTION
gdf

GeoDataFrame to be regionalized.

TYPE: GeoDataFrame

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: GeoDataFrame with regionalized geometries.

Source code in srai/regionalizers/s2_regionalizer.py
def transform(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """
    Regionalize a given GeoDataFrame.

    Args:
        gdf (gpd.GeoDataFrame): GeoDataFrame to be regionalized.

    Returns:
        gpd.GeoDataFrame: GeoDataFrame with regionalized geometries.
    """
    gdf_wgs84 = gdf.to_crs(crs=WGS84_CRS)

    s2_gdf = self._fill_with_s2_cells(self._explode_multipolygons(gdf_wgs84))

    # s2 library fills also holes in Polygons, so here we remove redundant cells
    res: gpd.GeoDataFrame = gpd.sjoin(
        s2_gdf,
        gdf_wgs84,
        how="inner",
        predicate="intersects" if self.buffer else "within",
    ).drop(columns=["index_right"])

    res = res[~res.index.duplicated(keep="first")]

    res.index.name = REGIONS_INDEX

    return res
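
The effect of the buffer flag in the spatial join above ("intersects" when buffer is True, "within" otherwise) can be illustrated with plain axis-aligned boxes. This is only a sketch of the two predicates under simplified geometry; the function names are hypothetical and a square grid stands in for real S2 cells.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def intersects(cell: Box, area: Box) -> bool:
    """True when the two boxes share at least one point."""
    return (
        cell[0] <= area[2]
        and area[0] <= cell[2]
        and cell[1] <= area[3]
        and area[1] <= cell[3]
    )


def within(cell: Box, area: Box) -> bool:
    """True when the cell lies entirely inside the area."""
    return (
        area[0] <= cell[0]
        and area[1] <= cell[1]
        and cell[2] <= area[2]
        and cell[3] <= area[3]
    )


def select_cells(cells: Dict[str, Box], area: Box, buffer: bool = True) -> List[str]:
    # buffer=True keeps every cell touching the area, buffer=False only
    # those fully contained in it.
    predicate = intersects if buffer else within
    return [cell_id for cell_id, cell in cells.items() if predicate(cell, area)]


# A 4x4 grid of unit cells and an area that cuts through the outer ring.
cells = {
    f"{x}_{y}": (float(x), float(y), x + 1.0, y + 1.0)
    for x in range(4)
    for y in range(4)
}
area = (0.5, 0.5, 2.5, 2.5)

covering = select_cells(cells, area, buffer=True)
interior = select_cells(cells, area, buffer=False)
print(len(covering), interior)
```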

SlippyMapRegionalizer(zoom)

Bases: Regionalizer

SlippyMapRegionalizer class.

PARAMETER DESCRIPTION
zoom

Zoom level of the slippy map tiles.

TYPE: int

RAISES DESCRIPTION
ValueError

If zoom is not in the range [0, 19].

Source code in srai/regionalizers/slippy_map_regionalizer.py
def __init__(self, zoom: int) -> None:
    """
    Initialize SlippyMapRegionalizer.

    Args:
        zoom (int): zoom level

    Raises:
        ValueError: if zoom is not in [0, 19]
    """
    if not 0 <= zoom <= 19:
        raise ValueError
    self.zoom = zoom
    super().__init__()

transform(gdf)

Regionalize a given GeoDataFrame.

PARAMETER DESCRIPTION
gdf

GeoDataFrame to be regionalized.

TYPE: GeoDataFrame

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: GeoDataFrame with regionalized geometries.

RAISES DESCRIPTION
ValueError

If provided GeoDataFrame has no crs defined.

Source code in srai/regionalizers/slippy_map_regionalizer.py
def transform(self, gdf: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """
    Regionalize a given GeoDataFrame.

    Args:
        gdf (gpd.GeoDataFrame): GeoDataFrame to be regionalized.

    Returns:
        gpd.GeoDataFrame: GeoDataFrame with regionalized geometries.

    Raises:
        ValueError: If provided GeoDataFrame has no crs defined.
    """
    gdf_wgs84 = gdf.to_crs(crs=WGS84_CRS)
    gdf_exploded = self._explode_multipolygons(gdf_wgs84)

    values = (
        seq(gdf_exploded[GEOMETRY_COLUMN])
        .map(self._to_cells)
        .flatten()
        .map(
            lambda item: (
                item
                | {
                    REGIONS_INDEX: f"{item['x']}_{item['y']}_{self.zoom}",
                    "z": self.zoom,
                }
            )
        )
        .to_list()
    )

    gdf = gpd.GeoDataFrame(values, geometry=GEOMETRY_COLUMN, crs=WGS84_CRS).set_index(
        REGIONS_INDEX
    )
    return gdf.drop_duplicates()
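
The region ids built above have the form x_y_zoom. How a single coordinate maps to tile indices can be sketched with the standard OpenStreetMap slippy map tile formula. The helper names are hypothetical, and this is not the library's _to_cells implementation, which is not shown on this page.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Hypothetical helper: tile (x, y) containing a WGS84 coordinate."""
    if not 0 <= zoom <= 19:
        raise ValueError(f"Zoom {zoom} is not between 0 and 19.")
    n = 2**zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Web Mercator projection of the latitude, scaled to tile rows.
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_region_id(x: int, y: int, zoom: int) -> str:
    """Format an id the same way as the listing above: x_y_zoom."""
    return f"{x}_{y}_{zoom}"


x, y = lonlat_to_tile(lon=0.0, lat=0.0, zoom=1)
print(tile_region_id(x, y, 1))
```

The formula is only defined for latitudes inside the Web Mercator range (about ±85.05 degrees) and does not clamp coordinates on the outer edge, which a production implementation would need to handle.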

VoronoiRegionalizer(
    seeds,
    max_meters_between_points=10000,
    num_of_multiprocessing_workers=-1,
    multiprocessing_activation_threshold=None,
)

Bases: Regionalizer

VoronoiRegionalizer.

Voronoi [1] regionalizer allows the given geometries to be divided into Thiessen polygons using geometries that are the seeds. To minimize distortions, tessellation will be performed on a sphere using the SphericalVoronoi [2] function from the scipy library.

References
  1. https://en.wikipedia.org/wiki/Voronoi_diagram
  2. https://docs.scipy.org/doc/scipy-1.9.2/reference/generated/scipy.spatial.SphericalVoronoi.html

All (multi)polygons from seeds GeoDataFrame will be transformed to their centroids, because scipy function requires only points as an input.

PARAMETER DESCRIPTION
seeds

List of points or a GeoDataFrame with seeds for creating a tessellation. Every non-point geometry will be mapped to a centroid. Minimum 4 seeds are required. Seeds cannot lie on a single arc. Empty seeds will be removed.

TYPE: Union[GeoDataFrame, List[Point]]

max_meters_between_points

Maximal distance in meters between two points in a resulting polygon. A higher number results in a lower resolution of the polygon.

TYPE: int DEFAULT: 10000

num_of_multiprocessing_workers

Number of workers used for multiprocessing. Defaults to -1 which results in a total number of available cpu threads. 0 and 1 values disable multiprocessing. Similar to n_jobs parameter from scikit-learn library.

TYPE: int DEFAULT: -1

multiprocessing_activation_threshold

Number of seeds required to start processing on multiple processes. Activating multiprocessing for a small amount of points might not be feasible. Defaults to 100.

TYPE: int DEFAULT: None

RAISES DESCRIPTION
ValueError

If any seed is duplicated.

ValueError

If fewer than 4 seeds are provided.

ValueError

If provided seeds geodataframe has no crs defined.

ValueError

If any seed is outside WGS84 coordinates domain.

Source code in srai/regionalizers/voronoi_regionalizer.py
def __init__(
    self,
    seeds: Union[gpd.GeoDataFrame, list[Point]],
    max_meters_between_points: int = 10_000,
    num_of_multiprocessing_workers: int = -1,
    multiprocessing_activation_threshold: Optional[int] = None,
) -> None:
    """
    Init VoronoiRegionalizer.

    All (multi)polygons from seeds GeoDataFrame will be transformed to their centroids,
    because scipy function requires only points as an input.

    Args:
        seeds (Union[gpd.GeoDataFrame, List[Point]]): List of points or a GeoDataFrame
            with seeds for creating a tessellation. Every non-point geometry will be mapped
            to a centroid. Minimum 4 seeds are required. Seeds cannot lie on a single arc.
            Empty seeds will be removed.
        max_meters_between_points (int): Maximal distance in meters between two points
            in a resulting polygon. A higher number results in a lower polygon resolution.
        num_of_multiprocessing_workers (int): Number of workers used for
            multiprocessing. Defaults to `-1` which results in a total number of available
            cpu threads. `0` and `1` values disable multiprocessing.
            Similar to `n_jobs` parameter from `scikit-learn` library.
        multiprocessing_activation_threshold (int, optional): Number of seeds required to start
            processing on multiple processes. Activating multiprocessing for a small
            amount of points might not be feasible. Defaults to 100.

    Raises:
        ValueError: If any seed is duplicated.
        ValueError: If fewer than 4 seeds are provided.
        ValueError: If provided seeds geodataframe has no crs defined.
        ValueError: If any seed is outside WGS84 coordinates domain.
    """
    import_optional_dependencies(
        dependency_group="voronoi",
        modules=["haversine", "pymap3d", "scipy", "spherical_geometry"],
    )
    self.region_ids: list[Hashable] = []
    self.seeds: list[Point] = []

    if isinstance(seeds, gpd.GeoDataFrame):
        from ._spherical_voronoi import _parse_geodataframe_seeds

        self.seeds, self.region_ids = _parse_geodataframe_seeds(seeds)
    else:
        self.seeds = seeds
        self.region_ids = list(range(len(seeds)))

    self.max_meters_between_points = max_meters_between_points
    self.num_of_multiprocessing_workers = num_of_multiprocessing_workers
    self.multiprocessing_activation_threshold = multiprocessing_activation_threshold

    if len(self.seeds) < 4:
        raise ValueError("Minimum 4 seeds are required.")

    from ._spherical_voronoi import _check_if_in_bounds, _get_duplicated_seeds_ids

    duplicated_seeds_ids = _get_duplicated_seeds_ids(self.seeds, self.region_ids)
    if duplicated_seeds_ids:
        raise ValueError(f"Duplicate seeds present: {duplicated_seeds_ids}.")

    if not _check_if_in_bounds(self.seeds):
        raise ValueError("Seeds outside Earth WGS84 bounding box.")
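
The seed checks performed in the constructor above (at least 4 seeds, no duplicates, all inside the WGS84 bounding box) can be sketched without any geometry library. validate_seeds is a hypothetical function that takes (longitude, latitude) tuples instead of shapely points; it does not cover the "seeds on a single arc" condition, which is raised later by transform.

```python
from typing import List, Sequence, Tuple

Seed = Tuple[float, float]  # (longitude, latitude) in degrees


def validate_seeds(seeds: Sequence[Seed]) -> List[Seed]:
    """Hypothetical helper mirroring the documented constructor checks."""
    if len(seeds) < 4:
        raise ValueError("Minimum 4 seeds are required.")

    # Collect positions of seeds that repeat an earlier coordinate pair.
    seen = set()
    duplicates = []
    for index, seed in enumerate(seeds):
        if seed in seen:
            duplicates.append(index)
        seen.add(seed)
    if duplicates:
        raise ValueError(f"Duplicate seeds present: {duplicates}.")

    # Every seed must lie inside the (-180, -90, 180, 90) bounding box.
    if any(not (-180 <= lon <= 180 and -90 <= lat <= 90) for lon, lat in seeds):
        raise ValueError("Seeds outside Earth WGS84 bounding box.")

    return list(seeds)


seeds = [(17.03, 51.11), (21.01, 52.23), (19.94, 50.06), (18.65, 54.35)]
print(validate_seeds(seeds))
```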

transform(gdf=None)

Regionalize a given GeoDataFrame.

Returns a GeoDataFrame of disjoint regions consisting of Thiessen cells generated using a Voronoi diagram on the sphere.

PARAMETER DESCRIPTION
gdf

GeoDataFrame to be regionalized. Will use this list of geometries to crop resulting regions. If None, a bounding box with bounds (-180, -90, 180, 90) is used to return regions covering the whole Earth. Defaults to None.

TYPE: Optional[GeoDataFrame] DEFAULT: None

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input GeoDataFrame.

RAISES DESCRIPTION
ValueError

If provided geodataframe has no crs defined.

ValueError

If seeds are lying on a single arc.

Source code in srai/regionalizers/voronoi_regionalizer.py
def transform(self, gdf: Optional[gpd.GeoDataFrame] = None) -> gpd.GeoDataFrame:
    """
    Regionalize a given GeoDataFrame.

    Returns a GeoDataFrame of disjoint regions consisting of Thiessen cells generated
    using a Voronoi diagram on the sphere.

    Args:
        gdf (Optional[gpd.GeoDataFrame], optional): GeoDataFrame to be regionalized.
            Will use this list of geometries to crop resulting regions. If None, a bounding box
            with bounds (-180, -90, 180, 90) is used to return regions covering the whole Earth.
            Defaults to None.

    Returns:
        gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input
            GeoDataFrame.

    Raises:
        ValueError: If provided geodataframe has no crs defined.
        ValueError: If seeds are lying on a single arc.
    """
    from ._spherical_voronoi import generate_voronoi_regions

    if gdf is None:
        gdf = gpd.GeoDataFrame(
            {GEOMETRY_COLUMN: [box(minx=-180, maxx=180, miny=-90, maxy=90)]}, crs=WGS84_CRS
        )

    gdf_wgs84 = gdf.to_crs(crs=WGS84_CRS)
    generated_regions = generate_voronoi_regions(
        seeds=self.seeds,
        max_meters_between_points=self.max_meters_between_points,
        num_of_multiprocessing_workers=self.num_of_multiprocessing_workers,
        multiprocessing_activation_threshold=self.multiprocessing_activation_threshold,
    )
    regions_gdf = gpd.GeoDataFrame(
        data={GEOMETRY_COLUMN: generated_regions}, index=self.region_ids, crs=WGS84_CRS
    )
    regions_gdf.index.rename(REGIONS_INDEX, inplace=True)
    clipped_regions_gdf = regions_gdf.clip(mask=gdf_wgs84, keep_geom_type=False)
    return clipped_regions_gdf

geocode_to_region_gdf(query, by_osmid=False)

Geocode a query to the regions_gdf unified format.

This function is a wrapper around the ox.geocode_to_gdf [1] function from the osmnx library. For a description of the parameters, see the documentation of that function.

PARAMETER DESCRIPTION
query

Query string(s) or structured dict(s) to geocode.

TYPE: Union[str, List[str], Dict[str, Any]]

by_osmid

Flag to treat query as an OSM ID lookup rather than text search. Defaults to False.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
GeoDataFrame

gpd.GeoDataFrame: GeoDataFrame with geocoded regions.

References
  1. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.geocoder.geocode_to_gdf

Examples:

Download geometry for a city

>>> from srai.regionalizers import geocode_to_region_gdf
>>> geocode_to_region_gdf("Wrocław, PL")
                                                  geometry
region_id
Wrocław, Lower Silesian Voivodeship, Poland  POLYGON ((...

Download geometries for multiple cities

>>> geocode_to_region_gdf(["New York City", "Washington, DC"])
                                                            geometry
region_id
New York, United States                          MULTIPOLYGON (((...
Washington, District of Columbia, United States  POLYGON ((...

Use OSM relation IDs to get geometries.

>>> geocode_to_region_gdf(["R175342", "R5750005"], by_osmid=True)
                                                         geometry
region_id
Greater London, England, United Kingdom             POLYGON ((...
Sydney, Council of the City of Sydney, New Sout...  POLYGON ((...
Source code in srai/regionalizers/geocode.py
def geocode_to_region_gdf(
    query: Union[str, list[str], dict[str, Any]], by_osmid: bool = False
) -> gpd.GeoDataFrame:
    """
    Geocode a query to the `regions_gdf` unified format.

    This function is a wrapper around the `ox.geocode_to_gdf`[1] function from the `osmnx` library.
    For a description of the parameters, see the documentation of that function.

    Args:
        query (Union[str, List[str], Dict[str, Any]]): Query string(s) or structured dict(s)
            to geocode.
        by_osmid (bool, optional): Flag to treat query as an OSM ID lookup rather than text search.
            Defaults to False.

    Returns:
        gpd.GeoDataFrame: GeoDataFrame with geocoded regions.

    References:
        1. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.geocoder.geocode_to_gdf

    Examples:
        Download geometry for a city
        >>> from srai.regionalizers import geocode_to_region_gdf
        >>> geocode_to_region_gdf("Wrocław, PL")
                                                          geometry
        region_id
        Wrocław, Lower Silesian Voivodeship, Poland  POLYGON ((...

        Download geometries for multiple cities

        >>> geocode_to_region_gdf(["New York City", "Washington, DC"])
                                                                    geometry
        region_id
        New York, United States                          MULTIPOLYGON (((...
        Washington, District of Columbia, United States  POLYGON ((...

        Use OSM relation IDs to get geometries.

        >>> geocode_to_region_gdf(["R175342", "R5750005"], by_osmid=True)
                                                                 geometry
        region_id
        Greater London, England, United Kingdom             POLYGON ((...
        Sydney, Council of the City of Sydney, New Sout...  POLYGON ((...
    """
    import_optional_dependencies(
        dependency_group="osm",
        modules=["osmnx"],
    )

    import osmnx as ox

    geocoded_gdf = ox.geocode_to_gdf(query=query, by_osmid=by_osmid, which_result=None)
    regions_gdf = (
        geocoded_gdf[["display_name", "geometry"]]
        .rename(columns={"display_name": REGIONS_INDEX})
        .set_index(REGIONS_INDEX)
    )
    return regions_gdf