Index
This module contains loaders, used to load spatial data from different sources.
We want to unify loading from different data sources into a single interface. This gives us a unified spatial data format that can be fed into any of the embedding methods available in this library.
Bases: ABC
Abstract class for loaders.
abstractmethod
Load data for a given area.
PARAMETER | DESCRIPTION |
---|---|
*args | Positional arguments depending on a specific loader. |
**kwargs | Keyword arguments depending on a specific loader. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | GeoDataFrame with the downloaded data. |
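The abstract-loader pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the library's source: `InMemoryLoader` is a hypothetical subclass, and a list of dicts stands in for the GeoDataFrame a real loader would return.

```python
from abc import ABC, abstractmethod
from typing import Any


class Loader(ABC):
    """Stand-in for the abstract base class described above."""

    @abstractmethod
    def load(self, *args: Any, **kwargs: Any) -> Any:
        """Load data; the arguments depend on the concrete loader."""
        raise NotImplementedError


class InMemoryLoader(Loader):
    """Hypothetical loader that returns the rows it was constructed with."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self._rows = rows

    def load(self, *args: Any, **kwargs: Any) -> list[dict[str, Any]]:
        # A real loader would return a GeoDataFrame; a list of dicts keeps
        # this sketch free of third-party dependencies.
        return list(self._rows)
```

Because `load` is abstract, instantiating `Loader` directly raises `TypeError`, so every concrete loader has to provide its own implementation.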
Source code in srai/loaders/_base.py
Bases: Loader
GeoparquetLoader.
The GeoParquet [1] loader is a wrapper for the geopandas.read_parquet function and allows for automatic index setting and additional geometry clipping.
References
Load a geoparquet file.
PARAMETER | DESCRIPTION |
---|---|
file_path | Parquet file path. |
index_column | Column that will be used as an index. If not provided, automatic indexing will be applied. Defaults to None. |
columns | List of columns to load. If not provided, all will be loaded. Defaults to None. |
area | Mask to clip loaded data. If not provided, unaltered data will be returned. Defaults to None. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the provided index column doesn't exist in the list of loaded columns. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | Loaded geoparquet file as a GeoDataFrame. |
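The ValueError condition above can be illustrated with a small check. This is only a sketch of the documented behaviour: `check_index_column` is a hypothetical helper, and how the loader actually performs the validation is an assumption.

```python
from typing import Optional, Sequence


def check_index_column(
    index_column: Optional[str], columns: Optional[Sequence[str]]
) -> None:
    """Raise ValueError when the index column is excluded from the loaded columns."""
    # No explicit index, or no column restriction: nothing to validate.
    if index_column is None or columns is None:
        return
    if index_column not in columns:
        raise ValueError(
            f"Index column '{index_column}' is not in the loaded columns: {list(columns)}"
        )
```

Requesting `index_column="id"` together with `columns=["name", "geometry"]` would fail this check, because the index column is never loaded.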
Source code in srai/loaders/geoparquet_loader.py
Bases: Loader
GTFSLoader.
This loader reads a GTFS feed and calculates time aggregations in 1-hour slots.
Source code in srai/loaders/gtfs_loader.py
Load GTFS feed and calculate time aggregations for stops.
PARAMETER | DESCRIPTION |
---|---|
gtfs_file | Path to the GTFS feed. |
fail_on_validation_errors | Fail if the GTFS feed is invalid. Ignored when skip_validation is True. |
skip_validation | Skip GTFS feed validation. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | GeoDataFrame with trip counts and list of directions for stops. |
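The idea of aggregating trips into 1-hour slots can be sketched as follows. This is a simplified illustration, not the loader's implementation: `trips_per_hour` and its input format (pairs of stop ID and `HH:MM:SS` departure time) are assumptions, and wrapping hours past midnight into 0-23 is a choice made for this sketch.

```python
from collections import Counter
from typing import Iterable


def trips_per_hour(stop_times: Iterable[tuple[str, str]]) -> Counter:
    """Count departures per (stop_id, hour) from 'HH:MM:SS' strings."""
    counts: Counter = Counter()
    for stop_id, departure_time in stop_times:
        # GTFS allows hours >= 24 for trips running past midnight;
        # wrapping them into 0-23 is an assumption of this sketch.
        hour = int(departure_time.split(":")[0]) % 24
        counts[(stop_id, hour)] += 1
    return counts
```

Each key of the returned counter is a `(stop_id, hour)` pair, which could then be pivoted into one column per hour for every stop.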
Source code in srai/loaders/gtfs_loader.py
Bases: Loader, ABC
Abstract class for loaders.
abstractmethod
Load data for a given area.
PARAMETER | DESCRIPTION |
---|---|
area | Shapely geometry with the area of interest. |
tags | OSM tags filter. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | GeoDataFrame with the downloaded data. |
Source code in srai/loaders/osm_loaders/_base.py
Bases: OSMLoader
OSMOnlineLoader.
The OSM (OpenStreetMap) [1] online loader is capable of downloading objects from a given area from OSM. It filters features based on OSM tags [2] in the form of key:value pairs, which OSM users use to give meaning to geometries.
This loader is a wrapper around the osmnx library. It uses osmnx.geometries_from_polygon to make individual queries.
Source code in srai/loaders/osm_loaders/osm_online_loader.py
Download OSM features with specified tags for a given area.
The loader first downloads all objects with tags. It returns a GeoDataFrame containing the geometry column and columns for tag keys.
Some key/value pairs might be missing from the resulting GeoDataFrame, simply because there are no such objects in the given area.
PARAMETER | DESCRIPTION |
---|---|
area | Area for which to download objects. |
tags | A dictionary specifying which tags to download. The keys should be OSM tags (e.g. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | Downloaded features as a GeoDataFrame. |
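Filtering by key:value tag pairs can be sketched in plain Python. This is an illustration only: `matches_tags` is a hypothetical helper, and the accepted filter values (`True` for any value, a single string, or a list of strings) are an assumption modelled on the convention commonly used with osmnx, not something this page confirms.

```python
from typing import Mapping, Union

TagValue = Union[bool, str, list]


def matches_tags(
    feature_tags: Mapping[str, str], tags_filter: Mapping[str, TagValue]
) -> bool:
    """Return True if the feature matches at least one key:value pair of the filter."""
    for key, wanted in tags_filter.items():
        if key not in feature_tags:
            continue
        value = feature_tags[key]
        if wanted is True:
            return True  # any value of this key is accepted
        if isinstance(wanted, str) and value == wanted:
            return True
        if isinstance(wanted, list) and value in wanted:
            return True
    return False
```

A filter key that matches no feature in the area simply produces no column, which is why some key/value pairs may be missing from the result.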
Source code in srai/loaders/osm_loaders/osm_online_loader.py
Bases: OSMLoader
OSMPbfLoader.
The OSM (OpenStreetMap) [1] PBF (Protocolbuffer Binary Format) [2] loader is capable of loading OSM features from a PBF file. It filters features based on OSM tags [3] in the form of key:value pairs, which OSM users use to give meaning to geometries.
This loader uses PbfFileReader from the QuackOSM [3] library. It utilizes the duckdb [4] engine with the spatial [5] extension, capable of parsing an *.osm.pbf file.
Additionally, it can download a pbf file extract for a given area using different sources.
References
PARAMETER | DESCRIPTION |
---|---|
pbf_file | Downloaded |
download_source | Source to use when downloading PBF files. Can be one of: |
download_directory | Directory where to save the downloaded |
Source code in srai/loaders/osm_loaders/osm_pbf_loader.py
Load OSM features with specified tags for a given area from an *.osm.pbf file.
The loader will use the provided *.osm.pbf file, or download extracts automatically. It will then parse and filter features from the files using PbfFileReader from the QuackOSM library. It will return a GeoDataFrame containing the geometry column and columns for tag keys.
Some key/value pairs might be missing from the resulting GeoDataFrame, simply because there are no such objects in the given area.
PARAMETER | DESCRIPTION |
---|---|
area | Area for which to download objects. |
tags | A dictionary specifying which tags to download. The keys should be OSM tags (e.g. |
ignore_cache | (bool, optional): Whether to ignore precalculated geoparquet files or not. Defaults to False. |
explode_tags | (bool, optional): Whether to split OSM tags into multiple columns or keep them in a single dict. Defaults to True. |
keep_all_tags | (bool, optional): Whether to keep all tags related to the element, or return only those defined in the |
RAISES | DESCRIPTION |
---|---|
ValueError | If the PBF file is expected to be downloaded and the provided geometries aren't shapely.geometry.Polygons. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | Downloaded features as a GeoDataFrame. |
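The effect of explode_tags can be pictured with plain dictionaries. This is a sketch under assumptions: `shape_tags`, the `feature_id` key, and the `tags` column name are hypothetical names chosen for illustration and may not match the loader's actual output.

```python
from typing import Any


def shape_tags(
    features: list[dict[str, Any]], explode_tags: bool = True
) -> list[dict[str, Any]]:
    """Return rows with tags either spread into columns or kept in one dict."""
    rows = []
    for feature in features:
        row: dict[str, Any] = {"feature_id": feature["feature_id"]}
        if explode_tags:
            row.update(feature["tags"])  # one column per tag key
        else:
            row["tags"] = dict(feature["tags"])  # single dict column
        rows.append(row)
    return rows
```

With explode_tags enabled, each tag key becomes its own column; with it disabled, all tags of a feature stay together in one dictionary.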
Source code in srai/loaders/osm_loaders/osm_pbf_loader.py
Load OSM features with specified tags for a given area and save them to a GeoParquet file.
PARAMETER | DESCRIPTION |
---|---|
area | Area for which to download objects. |
tags | A dictionary specifying which tags to download. The keys should be OSM tags (e.g. |
ignore_cache | (bool, optional): Whether to ignore precalculated geoparquet files or not. Defaults to False. |
explode_tags | (bool, optional): Whether to split OSM tags into multiple columns or keep them in a single dict. Defaults to True. |
keep_all_tags | (bool, optional): Whether to keep all tags related to the element, or return only those defined in the |
RETURNS | DESCRIPTION |
---|---|
Path | Path to the saved GeoParquet file. |
Source code in srai/loaders/osm_loaders/osm_pbf_loader.py
OSMTileLoader(
tile_server_url,
zoom,
verbose=False,
resource_type="png",
auth_token=None,
data_collector=None,
storage_path=None,
)
OSM Tile Loader.
Downloads raster tiles from a user-specified tile server, such as those listed in [1]. The loader finds x, y tile coordinates [2] for the specified area and downloads the tiles. The address is built with the schema {tile_server_url}/{zoom}/{x}/{y}.{resource_type}
References
PARAMETER | DESCRIPTION |
---|---|
tile_server_url | URL of the tile server, without z, x, y parameters. |
zoom | Zoom level [1]. |
verbose | Whether to print logs. Defaults to False. |
resource_type | File extension, added to the end of the URL. Defaults to "png". |
auth_token | Auth token, added as the access_token parameter to the request. Defaults to None. |
data_collector | DataCollector object or |
storage_path | Path to save data, used with SavingDataCollector. Defaults to None. |
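The tile addressing described above can be sketched with the standard slippy-map (Web Mercator) formula. This is an illustration, not the loader's code: `tile_xy` and `tile_url` are hypothetical helpers, and whether the loader computes tile indices exactly this way is an assumption. Only the URL schema is taken from the description above.

```python
import math


def tile_xy(latitude: float, longitude: float, zoom: int) -> tuple[int, int]:
    """Convert WGS84 coordinates to slippy-map tile indices at a zoom level."""
    n = 2**zoom  # number of tiles along each axis
    x = int((longitude + 180.0) / 360.0 * n)
    lat_rad = math.radians(latitude)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_url(
    tile_server_url: str, zoom: int, x: int, y: int, resource_type: str = "png"
) -> str:
    """Build the tile address using the schema from the description above."""
    return f"{tile_server_url}/{zoom}/{x}/{y}.{resource_type}"
```

For example, `tile_url("https://tiles.example.org", 1, *tile_xy(0.0, 0.0, 1))` builds the address of the tile covering the point at latitude 0, longitude 0; the server name here is a placeholder.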
Source code in srai/loaders/osm_loaders/osm_tile_loader.py
Return all tiles of region.
PARAMETER | DESCRIPTION |
---|---|
area | Area for which to download objects. |
RETURNS | DESCRIPTION |
---|---|
GeoDataFrame | GeoDataFrame of tiles for each region in the area, transformed by the DataCollector. |
Source code in srai/loaders/osm_loaders/osm_tile_loader.py
Download single tile from tile server. Return tile processed by DataCollector.
PARAMETER | DESCRIPTION |
---|---|
x | (int) x tile coordinate. |
y | (int) y tile coordinate. |
idx | ID of the tile; if None, it is created as x_y_self.zoom. |
Source code in srai/loaders/osm_loaders/osm_tile_loader.py
Bases: str, Enum
Type of the street network.
See [1] for more details.
OSMWayLoader(
network_type,
contain_within_area=False,
preprocess=True,
wide=True,
metadata=False,
osm_way_tags=constants.OSM_WAY_TAGS,
)
Bases: Loader
OSMWayLoader downloads road infrastructure from OSM.
OSMWayLoader is a wrapper for osmnx.graph_from_polygon() and osmnx.graph_to_gdfs() that simplifies obtaining road infrastructure data from OpenStreetMap. As the OSM data is often noisy, it can also take an opinionated approach to preprocessing it with standardisation in mind, e.g. unification of units, discarding non-wiki values and rounding them.
PARAMETER | DESCRIPTION |
---|---|
network_type | Type of the network to download. |
contain_within_area | Whether to remove the roads that have one of their nodes outside of the given area. Defaults to False. |
preprocess | Whether to preprocess the data. Defaults to True. |
wide | Whether to return the roads in wide format. Defaults to True. |
metadata | Whether to return metadata for roads. Defaults to False. |
osm_way_tags | Dict of tags to take into consideration during computing. Defaults to constants.OSM_WAY_TAGS. |
Source code in srai/loaders/osm_way_loader/osm_way_loader.py
Load road infrastructure for a given GeoDataFrame.
PARAMETER | DESCRIPTION |
---|---|
area | (Multi)Polygons for which to download road infrastructure data. |
RAISES | DESCRIPTION |
---|---|
ValueError | If provided GeoDataFrame has no crs defined. |
ValueError | If provided GeoDataFrame is empty. |
TypeError | If provided geometries are not of type Polygon or MultiPolygon. |
LoadedDataIsEmptyException | If none of the supplied area polygons contains any road infrastructure data. |
RETURNS | DESCRIPTION |
---|---|
tuple[GeoDataFrame, GeoDataFrame] | Road infrastructure as (intersections, roads). |
Source code in srai/loaders/osm_way_loader/osm_way_loader.py
Download a file with a progress bar.
PARAMETER | DESCRIPTION |
---|---|
url | URL to download. |
fname | File name. |
chunk_size | Chunk size. |
force_download | Flag to force download even if the file exists. |
Source: https://gist.github.com/yanqd0/c13ed29e29432e3cf3e7c38467f42f51
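The chunked writing and the force_download flag can be sketched with the standard library. This is an illustration, not the function documented above: `save_stream` is a hypothetical helper that reads from any binary stream instead of a URL, the `on_progress` callback stands in for a progress bar, and the default values are assumptions.

```python
import os
from typing import BinaryIO, Callable, Optional


def save_stream(
    stream: BinaryIO,
    fname: str,
    chunk_size: int = 1024,
    force_download: bool = True,
    on_progress: Optional[Callable[[int], None]] = None,
) -> int:
    """Write a binary stream to fname chunk by chunk; return bytes written."""
    if os.path.exists(fname) and not force_download:
        return 0  # keep the existing file untouched
    written = 0
    with open(fname, "wb") as file:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            written += file.write(chunk)
            if on_progress is not None:
                on_progress(written)  # report cumulative bytes written
    return written
```

Reading in fixed-size chunks keeps memory use bounded for large files and gives a natural point at which to update a progress indicator.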