regionalizers ¶
This module contains regionalizers, which are used to divide space before analysis.
Embedding methods available in srai operate on regions, which can be defined in many ways. In
this module, we aggregate different regionalization methods under a common Regionalizer interface.
We include pre-defined spatial indexes (e.g. H3 or S2), data-based ones (e.g. Voronoi) and
OSM-based ones (e.g. administrative boundaries).
AdministrativeBoundaryRegionalizer ¶
AdministrativeBoundaryRegionalizer(
    admin_level: int,
    clip_regions: bool = True,
    return_empty_region: bool = False,
    prioritize_english_name: bool = True,
    toposimplify: Union[bool, float] = True,
    remove_artefact_regions: bool = True,
)
            Bases: Regionalizer
AdministrativeBoundaryRegionalizer.
Administrative boundary regionalizer allows the given geometries to be divided
into boundaries from OpenStreetMap on a given admin_level [1].
Downloads those boundaries online using overpass and osmnx libraries.
Note: offline .pbf loading will be implemented in the future.
Note: option to download historic data will be implemented in the future.
| PARAMETER | DESCRIPTION | 
|---|---|
| admin_level | OpenStreetMap admin_level value. See [1] for detailed description of available values. TYPE: int |
| clip_regions | Whether to clip regions using a provided mask. Turning it off can be useful when trying to load regions using a list of points. Defaults to True. TYPE: bool |
| return_empty_region | Whether to return an empty region to fill remaining space or not. Defaults to False. TYPE: bool |
| prioritize_english_name | Whether to use the English area name, if available, as a region id first. Defaults to True. TYPE: bool |
| toposimplify | Whether to simplify topology to reduce geometries size or not. Value is passed to … TYPE: Union[bool, float] |
| remove_artefact_regions | Whether to remove small regions that barely intersect the queried area. Turning it off can sometimes load unnecessary boundaries that only touch the edge. Regions whose intersection with the queried area is smaller than 1% of their own total area are removed. If the query GeoDataFrame contains points, the area calculation is skipped for regions that intersect any point. Defaults to True. TYPE: bool |
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If admin_level is outside available range (1-11). See [2] for list of values with … |
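The 1% rule described for remove_artefact_regions can be illustrated with a minimal sketch. It follows one reading of that description: a region is dropped when the part of it that intersects the query is smaller than 1% of the region's own area. The helper name `is_artefact`, its arguments and the use of plain float areas are assumptions made for illustration; the library works on geometries and its internal logic may differ.

```python
def is_artefact(
    intersection_area: float, region_area: float, threshold: float = 0.01
) -> bool:
    """Return True when a region barely overlaps the queried area.

    Hypothetical helper, not part of srai: the region counts as an artefact
    when its overlap with the query is below `threshold` of its own area.
    """
    if region_area <= 0:
        raise ValueError("region_area must be positive")
    return intersection_area / region_area < threshold


# A neighbouring district that only touches the query along its edge.
print(is_artefact(intersection_area=0.5, region_area=1000.0))  # True
# A district that lies mostly inside the query.
print(is_artefact(intersection_area=800.0, region_area=1000.0))  # False
```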
References
Source code in srai/regionalizers/administrative_boundary_regionalizer.py
                  transform ¶
Regionalize a given GeoDataFrame.
Will query Overpass [1] server using overpass [2] library for closed administrative
boundaries on a given admin_level and then download geometries for each relation using
osmnx [3] library.
If prioritize_english_name is set to True, method will try to extract the name:en tag
first before resorting to the name tag. If boundary doesn't have a name tag, an id
will be used.
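The naming fallback described above (name:en, then name, then the id) can be sketched as follows. The function `pick_region_name` and the plain dictionary of tags are assumptions for illustration only and are not the library's actual code.

```python
from typing import Mapping


def pick_region_name(
    tags: Mapping[str, str], osm_id: int, prioritize_english_name: bool = True
) -> str:
    """Choose a region identifier from OSM tags (hypothetical helper)."""
    # Prefer the English name when requested and present.
    if prioritize_english_name and tags.get("name:en"):
        return tags["name:en"]
    # Otherwise use the local name, if any.
    if tags.get("name"):
        return tags["name"]
    # No usable name tag: fall back to the OSM element id.
    return str(osm_id)


tags = {"name": "Polska", "name:en": "Poland"}
print(pick_region_name(tags, osm_id=1))  # Poland
print(pick_region_name(tags, osm_id=1, prioritize_english_name=False))  # Polska
print(pick_region_name({}, osm_id=1))  # 1
```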
Before returning downloaded regions, a topology is built and can be simplified to reduce
size of geometries while keeping neighbouring regions together without introducing gaps.
Topojson library [4] is used for this operation.
Additionally, an empty region with name EMPTY can be introduced if returned regions do
not fully cover a clipping mask.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | GeoDataFrame to be regionalized. Will use this list of geometries to crop resulting regions. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input GeoDataFrame. | 
| RAISES | DESCRIPTION | 
|---|---|
| RuntimeError | If simplification can't preserve a topology. | 
References
Source code in srai/regionalizers/administrative_boundary_regionalizer.py
H3Regionalizer ¶
            Bases: Regionalizer
H3 Regionalizer.
H3 Regionalizer allows the given geometries to be divided into H3 cells: hexagons, with pentagons as a very rare exception.
| PARAMETER | DESCRIPTION | 
|---|---|
| resolution | Resolution of the cells. See [1] for a full comparison. |
| buffer | Whether to fully cover the geometries with H3 Cells (visible on the borders). Defaults to True. |
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If resolution is not between 0 and 15. | 
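To give a sense of how rare the pentagons are across the valid resolutions 0 to 15, the sketch below uses the cell-count formula published in the H3 documentation (2 + 120 * 7^resolution cells worldwide, of which 12 are pentagons at every resolution). The function name `h3_cell_count` is hypothetical, and the range check only mirrors the ValueError described above; it is not the library's code.

```python
PENTAGONS_PER_RESOLUTION = 12  # constant at every H3 resolution


def h3_cell_count(resolution: int) -> int:
    """Total number of H3 cells covering the globe at a resolution."""
    if not 0 <= resolution <= 15:
        raise ValueError("Resolution must be between 0 and 15.")
    return 2 + 120 * 7**resolution


for res in (0, 5, 15):
    total = h3_cell_count(res)
    share = PENTAGONS_PER_RESOLUTION / total
    print(f"resolution {res}: {total} cells, pentagon share {share:.2e}")
```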
Source code in srai/regionalizers/h3_regionalizer.py
                  transform ¶
Regionalize a given GeoDataFrame.
Transforms given geometries into H3 cells of given resolution and optionally applies buffering.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | (Multi)Polygons to be regionalized. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: H3 cells. | 
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If provided GeoDataFrame has no crs defined. | 
Source code in srai/regionalizers/h3_regionalizer.py
            Regionalizer ¶
            Bases: ABC
Abstract class for regionalizers.
transform ¶
abstractmethod
Regionalize a given GeoDataFrame.
This method should treat the input as a single region.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | GeoDataFrame to be regionalized. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | GeoDataFrame with the regionalized data. | 
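The pattern behind this interface, an abstract base class with a single abstract transform method, can be sketched with the standard library alone. Everything below is a toy stand-in: `ToyRegionalizer`, `HalvesRegionalizer` and the bounding-box tuples are assumptions for illustration, whereas the real Regionalizer works on GeoDataFrames.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


class ToyRegionalizer(ABC):
    """Stand-in for the interface, using plain boxes instead of GeoDataFrames."""

    @abstractmethod
    def transform(self, boxes: List[Box]) -> List[Box]:
        """Divide the input, treated as a single region, into regions."""


class HalvesRegionalizer(ToyRegionalizer):
    """Splits the joint bounding box of all inputs into two halves."""

    def transform(self, boxes: List[Box]) -> List[Box]:
        # Treat the whole input as a single region: merge the bounds first.
        min_x = min(b[0] for b in boxes)
        min_y = min(b[1] for b in boxes)
        max_x = max(b[2] for b in boxes)
        max_y = max(b[3] for b in boxes)
        mid_x = (min_x + max_x) / 2
        return [(min_x, min_y, mid_x, max_y), (mid_x, min_y, max_x, max_y)]


regions = HalvesRegionalizer().transform([(0, 0, 2, 1), (2, 0, 4, 2)])
print(regions)
```

Because transform is abstract, the base class cannot be instantiated directly; only concrete subclasses that implement it can.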
Source code in srai/regionalizers/_base.py
            S2Regionalizer ¶
            Bases: Regionalizer
S2 Regionalizer.
S2 Regionalizer allows the given geometries to be divided into square S2 cells.
| PARAMETER | DESCRIPTION | 
|---|---|
| resolution | Resolution of the cells (S2 supports 0-30). See [1] for a full comparison. |
| buffer | If True then fully cover geometries with S2 cells. Otherwise only use those cells that fully fit into them. Defaults to True. |
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If resolution is not between 0 and 30. | 
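The difference between the two buffer modes can be illustrated on a toy planar grid of unit squares. This is only an analogy under simplifying assumptions: real S2 cells live on a sphere, and the function `grid_cells` and its rectangular input are hypothetical, not part of the library.

```python
from math import ceil, floor
from typing import List, Tuple


def grid_cells(
    bounds: Tuple[float, float, float, float], buffer: bool = True
) -> List[Tuple[int, int]]:
    """List unit grid cells for a rectangle, each named by its lower-left corner.

    buffer=True keeps every cell that overlaps the rectangle (full cover);
    buffer=False keeps only the cells that fit entirely inside it.
    """
    min_x, min_y, max_x, max_y = bounds
    if buffer:
        xs = range(floor(min_x), ceil(max_x))
        ys = range(floor(min_y), ceil(max_y))
    else:
        xs = range(ceil(min_x), floor(max_x))
        ys = range(ceil(min_y), floor(max_y))
    return [(x, y) for x in xs for y in ys]


area = (0.5, 0.5, 2.5, 2.5)
print(len(grid_cells(area, buffer=True)))  # 9 cells cover the rectangle
print(grid_cells(area, buffer=False))  # [(1, 1)] is the only cell fully inside
```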
Source code in srai/regionalizers/s2_regionalizer.py
                  transform ¶
Regionalize a given GeoDataFrame.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | GeoDataFrame to be regionalized. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: GeoDataFrame with regionalized geometries. | 
Source code in srai/regionalizers/s2_regionalizer.py
            SlippyMapRegionalizer ¶
            Bases: Regionalizer
SlippyMapRegionalizer class.
| PARAMETER | DESCRIPTION | 
|---|---|
| zoom | Zoom level. |
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If zoom is not in [0, 19]. |
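For readers unfamiliar with slippy map tiles, the sketch below applies the standard Web Mercator tile formula documented on the OpenStreetMap wiki to map a longitude and latitude to tile indices at a zoom level. The function `lonlat_to_tile` is a hypothetical helper and is not claimed to be how SlippyMapRegionalizer is implemented; the zoom check only mirrors the [0, 19] range stated above.

```python
import math
from typing import Tuple


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Return the (x, y) slippy map tile containing a WGS84 point."""
    if not 0 <= zoom <= 19:
        raise ValueError("zoom must be in [0, 19]")
    n = 2**zoom  # number of tiles along each axis at this zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


print(lonlat_to_tile(0.0, 0.0, 0))  # (0, 0): a single tile covers the world
print(lonlat_to_tile(0.0, 0.0, 1))  # (1, 1): south-east of the four tiles
```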
Source code in srai/regionalizers/slippy_map_regionalizer.py
                  
                transform ¶
Regionalize a given GeoDataFrame.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | GeoDataFrame to be regionalized. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: GeoDataFrame with regionalized geometries. | 
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If provided GeoDataFrame has no crs defined. | 
Source code in srai/regionalizers/slippy_map_regionalizer.py
            VoronoiRegionalizer ¶
VoronoiRegionalizer(
    seeds: Union[gpd.GeoDataFrame, list[Point]],
    max_meters_between_points: int = 10000,
    num_of_multiprocessing_workers: int = -1,
    multiprocessing_activation_threshold: Optional[int] = None,
)
            Bases: Regionalizer
VoronoiRegionalizer.
Voronoi [1] regionalizer allows the given geometries to be divided into Thiessen polygons using geometries that are the seeds. To minimize distortions, tessellation will be performed on a sphere using the SphericalVoronoi [2] function from the scipy library.
References
All (multi)polygons from the seeds GeoDataFrame will be transformed to their centroids, because the scipy function accepts only points as input.
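Tessellating on a sphere means the seed points have to be expressed as 3D coordinates on that sphere rather than as longitude and latitude pairs. A minimal sketch of such a conversion onto the unit sphere is shown below; the helper `lonlat_to_unit_sphere` is an assumption for illustration, and the conversion used internally by the library may differ.

```python
import math
from typing import Tuple


def lonlat_to_unit_sphere(lon: float, lat: float) -> Tuple[float, float, float]:
    """Convert WGS84 degrees to Cartesian (x, y, z) on the unit sphere."""
    lon_rad = math.radians(lon)
    lat_rad = math.radians(lat)
    x = math.cos(lat_rad) * math.cos(lon_rad)
    y = math.cos(lat_rad) * math.sin(lon_rad)
    z = math.sin(lat_rad)
    return x, y, z


# The point where the equator meets the prime meridian maps to the x axis.
print(lonlat_to_unit_sphere(0.0, 0.0))  # (1.0, 0.0, 0.0)
```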
| PARAMETER | DESCRIPTION | 
|---|---|
| seeds | List of points or a GeoDataFrame with seeds for creating a tessellation. Every non-point geometry will be mapped to a centroid. Minimum 4 seeds are required. Seeds cannot lie on a single arc. Empty seeds will be removed. TYPE: Union[gpd.GeoDataFrame, list[Point]] |
| max_meters_between_points | Maximal distance in meters between two points in a resulting polygon. A higher number results in a lower resolution of a polygon. TYPE: int |
| num_of_multiprocessing_workers | Number of workers used for multiprocessing. Defaults to -1. TYPE: int |
| multiprocessing_activation_threshold | Number of seeds required to start processing on multiple processes. Activating multiprocessing for a small number of points might not be feasible. Defaults to 100. TYPE: Optional[int] |
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If any seed is duplicated. | 
| ValueError | If fewer than 4 seeds are provided. |
| ValueError | If provided seeds geodataframe has no crs defined. | 
| ValueError | If any seed is outside WGS84 coordinates domain. | 
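Three of the checks listed above (duplicates, fewer than 4 seeds, coordinates outside the WGS84 domain) can be approximated on plain (lon, lat) tuples. The function `validate_seeds` and its error messages are hypothetical and only mimic the documented behaviour; the library validates actual geometries and also checks the CRS, which this sketch does not attempt.

```python
from typing import Sequence, Tuple


def validate_seeds(seeds: Sequence[Tuple[float, float]]) -> None:
    """Raise ValueError for seed sets that the documented rules reject."""
    if len(set(seeds)) != len(seeds):
        raise ValueError("Duplicate seeds found.")
    if len(seeds) < 4:
        raise ValueError("Minimum 4 seeds are required.")
    for lon, lat in seeds:
        # WGS84 domain: longitude in [-180, 180], latitude in [-90, 90].
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            raise ValueError(f"Seed ({lon}, {lat}) is outside the WGS84 domain.")


validate_seeds([(0, 0), (10, 0), (0, 10), (10, 10)])  # passes silently

try:
    validate_seeds([(0, 0), (10, 0), (0, 10)])
except ValueError as error:
    print(error)  # Minimum 4 seeds are required.
```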
Source code in srai/regionalizers/voronoi_regionalizer.py
                  transform ¶
Regionalize a given GeoDataFrame.
Returns a list of disjoint regions consisting of Thiessen cells generated using a Voronoi diagram on the sphere.
| PARAMETER | DESCRIPTION | 
|---|---|
| gdf | GeoDataFrame to be regionalized. Will use this list of geometries to crop resulting regions. If None, a bounding box with bounds (-180, -90, 180, 90) is used to return regions covering the whole Earth. Defaults to None. |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: GeoDataFrame with the regionalized data cropped using input GeoDataFrame. | 
| RAISES | DESCRIPTION | 
|---|---|
| ValueError | If provided geodataframe has no crs defined. | 
| ValueError | If seeds are lying on a single arc. |
Source code in srai/regionalizers/voronoi_regionalizer.py
            convert_to_regions_gdf ¶
convert_to_regions_gdf(
    geometry: Union[
        BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame
    ],
    index_column: Optional[str] = None,
) -> gpd.GeoDataFrame
Convert any geometry to a regions GeoDataFrame.
| PARAMETER | DESCRIPTION | 
|---|---|
| geometry | Geo objects to convert. TYPE: Union[BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame] |
| index_column | Name of the column used to define the index. If None, will rename the existing index. Defaults to None. TYPE: Optional[str] |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: Regions gdf with proper index definition. | 
Source code in srai/geometry.py
            geocode_to_region_gdf ¶
geocode_to_region_gdf(
    query: Union[str, list[str], dict[str, Any]], by_osmid: bool = False
) -> gpd.GeoDataFrame
Geocode a query to the regions_gdf unified format.
This function is a wrapper around the ox.geocode_to_gdf [1] function from the osmnx library.
For parameter descriptions, see the osmnx documentation.
| PARAMETER | DESCRIPTION | 
|---|---|
| query | Query string(s) or structured dict(s) to geocode. TYPE: Union[str, list[str], dict[str, Any]] |
| by_osmid | Flag to treat query as an OSM ID lookup rather than text search. Defaults to False. TYPE: bool |
| RETURNS | DESCRIPTION | 
|---|---|
| GeoDataFrame | gpd.GeoDataFrame: GeoDataFrame with geocoded regions. | 
Examples:
Download geometry for a city
>>> from srai.regionalizers import geocode_to_region_gdf
>>> geocode_to_region_gdf("Wrocław, PL")
                                                  geometry
region_id
Wrocław, Lower Silesian Voivodeship, Poland  POLYGON ((...
Download geometries for multiple cities
>>> geocode_to_region_gdf(["New York City", "Washington, DC"])
                                                            geometry
region_id
New York, United States                          MULTIPOLYGON (((...
Washington, District of Columbia, United States  POLYGON ((...
Use OSM relation IDs to get geometries.
>>> geocode_to_region_gdf(["R175342", "R5750005"], by_osmid=True)
                                                         geometry
region_id
Greater London, England, United Kingdom             POLYGON ((...
Sydney, Council of the City of Sydney, New Sout...  POLYGON ((...
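The identifiers in the last example carry a one-letter prefix for the OSM element type (N for node, W for way, R for relation), which is the convention the osmnx documentation describes for by_osmid lookups. A small sketch for splitting such identifiers is shown below; `parse_osm_id` is a hypothetical helper and is not part of srai or osmnx.

```python
from typing import Tuple

OSM_TYPE_BY_PREFIX = {"N": "node", "W": "way", "R": "relation"}


def parse_osm_id(osm_id: str) -> Tuple[str, int]:
    """Split a prefixed OSM ID such as "R175342" into (element type, number)."""
    prefix, number = osm_id[:1].upper(), osm_id[1:]
    if prefix not in OSM_TYPE_BY_PREFIX or not number.isdigit():
        raise ValueError(f"Not a prefixed OSM ID: {osm_id!r}")
    return OSM_TYPE_BY_PREFIX[prefix], int(number)


print(parse_osm_id("R175342"))  # ('relation', 175342)
```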