OSMPbfLoader

srai.loaders.OSMPbfLoader(
    pbf_file=None,
    download_source="protomaps",
    download_directory="files",
    switch_to_geofabrik_on_error=True,
)

Bases: OSMLoader

OSMPbfLoader.

The OSM (OpenStreetMap)[1] PBF (Protocolbuffer Binary Format)[2] loader loads OSM features from a PBF file. It filters features by OSM tags, given as key:value pairs, which OSM users attach to geometries to give them meaning.

This loader uses the pyosmium[3] library, which can parse *.osm.pbf files.

Additionally, it can download a PBF file extract for a given area from one of several sources.

References
  1. https://www.openstreetmap.org/
  2. https://wiki.openstreetmap.org/wiki/PBF_Format
  3. https://osmcode.org/pyosmium/
PARAMETER DESCRIPTION
pbf_file

Path to a previously downloaded *.osm.pbf file to be used by the loader. If not provided, a file is downloaded automatically for the given area. Defaults to None.

TYPE: Union[str, Path] DEFAULT: None

download_source

Source to use when downloading PBF files. Can be one of: geofabrik, openstreetmap_fr, protomaps. Defaults to "protomaps".

TYPE: PbfSourceLiteral DEFAULT: 'protomaps'

download_directory

Directory where to save the downloaded *.osm.pbf files. Ignored if pbf_file is provided. Defaults to "files".

TYPE: Union[str, Path] DEFAULT: 'files'

switch_to_geofabrik_on_error

Flag indicating whether to automatically switch download_source to 'geofabrik' if an error occurs. Defaults to True.

TYPE: bool DEFAULT: True

Source code in srai/loaders/osm_loaders/osm_pbf_loader.py
def __init__(
    self,
    pbf_file: Optional[Union[str, Path]] = None,
    download_source: PbfSourceLiteral = "protomaps",
    download_directory: Union[str, Path] = "files",
    switch_to_geofabrik_on_error: bool = True,
) -> None:
    """
    Initialize OSMPbfLoader.

    Args:
        pbf_file (Union[str, Path], optional): Downloaded `*.osm.pbf` file to be used by
            the loader. If not provided, it will be automatically downloaded for a given area.
            Defaults to None.
        download_source (PbfSourceLiteral, optional): Source to use when downloading PBF files.
            Can be one of: `geofabrik`, `openstreetmap_fr`, `protomaps`.
            Defaults to "protomaps".
        download_directory (Union[str, Path], optional): Directory where to save the downloaded
            `*.osm.pbf` files. Ignored if `pbf_file` is provided. Defaults to "files".
        switch_to_geofabrik_on_error (bool, optional): Flag indicating whether to automatically
            switch `download_source` to 'geofabrik' if an error occurs. Defaults to `True`.
    """
    import_optional_dependencies(dependency_group="osm", modules=["osmium"])
    self.pbf_file = pbf_file
    self.download_source = download_source
    self.download_directory = download_directory
    self.switch_to_geofabrik_on_error = switch_to_geofabrik_on_error

load(area, tags)

Load OSM features with specified tags for a given area from an *.osm.pbf file.

The loader uses the provided *.osm.pbf file, or downloads extracts using PbfFileDownloader. It then parses and filters features from the files using PbfFileHandler, and returns a GeoDataFrame containing the geometry column and one column per tag key.

Note: Some key/value pairs might be missing from the resulting GeoDataFrame simply because there are no such objects in the given area.

PARAMETER DESCRIPTION
area

Area for which to download objects.

TYPE: Union[BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame]

tags

A dictionary specifying which tags to download. The keys should be OSM tags (e.g. building, amenity). Each value should be either True to retrieve all objects with that tag, a string to retrieve a single tag-value pair, or a list of strings to retrieve all values given in the list. tags={'leisure': 'park'} would return parks from the area. tags={'leisure': 'park', 'amenity': True, 'shop': ['bakery', 'bicycle']} would return parks, all amenity types, bakeries and bicycle shops.

TYPE: Union[OsmTagsFilter, GroupedOsmTagsFilter]
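The three kinds of filter values described for tags (True, a single string, a list of strings) can be illustrated with a small pure-Python sketch. This is not the filtering code used by the loader; the function name feature_matches and the representation of a feature as a plain dict of its OSM tags are assumptions made only for this example.

```python
from typing import Dict, List, Union

# Assumed shape of a flat tags filter, following the parameter description above.
TagsFilter = Dict[str, Union[bool, str, List[str]]]


def feature_matches(feature_tags: Dict[str, str], tags_filter: TagsFilter) -> bool:
    """Return True if any of the feature's tags satisfies the filter (hypothetical helper)."""
    for key, wanted in tags_filter.items():
        if key not in feature_tags:
            continue
        value = feature_tags[key]
        if wanted is True:
            # True: any value of this tag key is accepted.
            return True
        if isinstance(wanted, str) and value == wanted:
            # String: exactly one tag-value pair is accepted.
            return True
        if isinstance(wanted, list) and value in wanted:
            # List: any of the listed values is accepted.
            return True
    return False


example_filter: TagsFilter = {
    "leisure": "park",
    "amenity": True,
    "shop": ["bakery", "bicycle"],
}
```

With example_filter, a feature tagged leisure=park or amenity=cafe matches, while one tagged shop=florist or building=yes does not.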

RAISES DESCRIPTION
ValueError

If a PBF file has to be downloaded and the provided geometries aren't shapely.geometry.Polygon objects.

RETURNS DESCRIPTION
gpd.GeoDataFrame

Downloaded features as a GeoDataFrame.
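A minimal usage sketch of the constructor and load, based only on the signatures documented here. The wrapper function load_example_features and the bounding-box coordinates are hypothetical and chosen purely for illustration. Calling it requires srai with its OSM dependencies (the source imports osmium from the "osm" dependency group) and, when no pbf_file is passed, network access to download an extract.

```python
def load_example_features(pbf_file=None):
    """Sketch: load parks, all amenities and two shop types for a small area."""
    # Imported lazily so that defining this function does not require
    # srai or shapely to be installed.
    from shapely.geometry import box
    from srai.loaders import OSMPbfLoader

    # Hypothetical bounding box (longitude/latitude) used only as an example.
    # A Polygon is used because downloading a PBF file expects Polygon geometries.
    area = box(17.02, 51.10, 17.05, 51.12)

    loader = OSMPbfLoader(
        pbf_file=pbf_file,  # None means an extract is downloaded for the area
        download_source="geofabrik",
        download_directory="files",
        switch_to_geofabrik_on_error=True,
    )

    # Same filter as in the parameter description of `tags`.
    tags = {"leisure": "park", "amenity": True, "shop": ["bakery", "bicycle"]}

    # Returns a GeoDataFrame with a geometry column and columns for tag keys.
    return loader.load(area, tags)
```

Passing a path to an existing *.osm.pbf file as pbf_file skips the download step, in which case download_directory is ignored.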

Source code in srai/loaders/osm_loaders/osm_pbf_loader.py
def load(
    self,
    area: Union[BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame],
    tags: Union[OsmTagsFilter, GroupedOsmTagsFilter],
) -> gpd.GeoDataFrame:
    """
    Load OSM features with specified tags for a given area from an `*.osm.pbf` file.

    The loader will use provided `*.osm.pbf` file, or download extracts
    using `PbfFileDownloader`. Later it will parse and filter features from files
    using `PbfFileHandler`. It will return a GeoDataFrame containing the `geometry` column
    and columns for tag keys.

    Note: Some key/value pairs might be missing from the resulting GeoDataFrame,
        simply because there are no such objects in the given area.

    Args:
        area (Union[BaseGeometry, Iterable[BaseGeometry], gpd.GeoSeries, gpd.GeoDataFrame]):
            Area for which to download objects.
        tags (Union[OsmTagsFilter, GroupedOsmTagsFilter]): A dictionary
            specifying which tags to download.
            The keys should be OSM tags (e.g. `building`, `amenity`).
            The values should either be `True` for retrieving all objects with the tag,
            string for retrieving a single tag-value pair
            or list of strings for retrieving all values specified in the list.
            `tags={'leisure': 'park'}` would return parks from the area.
            `tags={'leisure': 'park', 'amenity': True, 'shop': ['bakery', 'bicycle']}`
            would return parks, all amenity types, bakeries and bicycle shops.

    Raises:
        ValueError: If PBF file is expected to be downloaded and provided geometries
            aren't shapely.geometry.Polygons.

    Returns:
        gpd.GeoDataFrame: Downloaded features as a GeoDataFrame.
    """
    from srai.loaders.osm_loaders.pbf_file_downloader import PbfFileDownloader
    from srai.loaders.osm_loaders.pbf_file_handler import PbfFileHandler

    area_wgs84 = self._prepare_area_gdf(area)

    downloaded_pbf_files: Mapping[Hashable, Sequence[Union[str, Path]]]
    if self.pbf_file is not None:
        downloaded_pbf_files = {Path(self.pbf_file).name: [self.pbf_file]}
    else:
        downloaded_pbf_files = PbfFileDownloader(
            download_source=self.download_source,
            download_directory=self.download_directory,
            switch_to_geofabrik_on_error=self.switch_to_geofabrik_on_error,
        ).download_pbf_files_for_regions_gdf(regions_gdf=area_wgs84)

    merged_tags = self._merge_osm_tags_filter(tags)

    pbf_handler = PbfFileHandler(tags=merged_tags)

    results = []
    for region_id, pbf_files in downloaded_pbf_files.items():
        features_gdf = pbf_handler.get_features_gdf(
            file_paths=pbf_files, region_id=str(region_id)
        )
        matching_features_ids = features_gdf.sjoin(area_wgs84).index
        results.append(features_gdf.loc[matching_features_ids])

    result_gdf = self._group_gdfs(results).set_crs(WGS84_CRS)

    features_columns = [
        column
        for column in result_gdf.columns
        if column != GEOMETRY_COLUMN and result_gdf[column].notnull().any()
    ]
    result_gdf = result_gdf[[GEOMETRY_COLUMN, *sorted(features_columns)]]

    return self._parse_features_gdf_to_groups(result_gdf, tags)
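The load method accepts either a flat or a grouped tags filter and merges it into a single filter before handing it to PbfFileHandler. The sketch below shows one way such a merge could work. It is not the body of _merge_osm_tags_filter, which is not shown here. It assumes that a grouped filter is a dict mapping a group name to a flat tags filter, and the name merge_grouped_tags is hypothetical.

```python
from typing import Dict, List, Set, Union

TagValue = Union[bool, str, List[str]]
TagsFilter = Dict[str, TagValue]
GroupedTagsFilter = Dict[str, TagsFilter]  # assumed shape: group name -> flat filter


def merge_grouped_tags(grouped: GroupedTagsFilter) -> TagsFilter:
    """Combine all groups into one flat filter (illustrative sketch only)."""
    merged: Dict[str, Union[bool, Set[str]]] = {}
    for tags_filter in grouped.values():
        for key, value in tags_filter.items():
            if merged.get(key) is True:
                # The key already accepts every value; nothing to add.
                continue
            if isinstance(value, bool):
                if value:
                    merged[key] = True
                continue
            # Normalize a single string to a one-element list, then collect values.
            values = [value] if isinstance(value, str) else list(value)
            collected = merged.setdefault(key, set())
            assert isinstance(collected, set)
            collected.update(values)
    # Sets become sorted lists so the result has the same shape as a flat filter.
    return {
        key: (True if value is True else sorted(value))
        for key, value in merged.items()
    }


grouped_example: GroupedTagsFilter = {
    "green": {"leisure": "park"},
    "food": {"shop": ["bakery"], "amenity": True},
    "sport": {"leisure": ["pitch", "park"], "amenity": "cafe"},
}
merged_example = merge_grouped_tags(grouped_example)
```

In this sketch a True value for a key overrides any specific values for that key from other groups, since True already selects every object carrying that tag.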