Highway2Vec Embedder
In [1]:
from IPython.display import display
from srai.plotting import plot_numeric_data, plot_regions
Get an area to embed
In [2]:
from srai.regionalizers import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Wrocław, PL")
plot_regions(area_gdf, tiles_style="CartoDB positron")
Out[2]:
[Interactive map: boundary of the Wrocław area]
Regionalize the area with a regionalizer
In [3]:
from srai.regionalizers import H3Regionalizer
regionalizer = H3Regionalizer(9)
regions_gdf = regionalizer.transform(area_gdf)
print(len(regions_gdf))
display(regions_gdf.head(3))
plot_regions(regions_gdf, tiles_style="CartoDB positron")
3168
region_id | geometry |
---|---|
891e2042dd7ffff | POLYGON ((17.01563 51.17702, 17.01567 51.17534... |
891e2042b1bffff | POLYGON ((17.02152 51.13682, 17.02155 51.13514... |
891e2050b0bffff | POLYGON ((16.91712 51.20177, 16.91716 51.20009... |
Out[3]:
[Interactive map: H3 regions covering the area]
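The region IDs above are H3 cell indexes written as hexadecimal strings, and each one carries its own resolution. As a quick sanity check that the cells really are at resolution 9, the sketch below decodes the resolution field directly from the ID. It relies on the documented H3 bit layout (a 4-bit resolution field at bits 52 to 55); the helper name `h3_resolution` is made up for this example and is not part of srai or the h3 library.

```python
def h3_resolution(cell_id: str) -> int:
    """Return the resolution stored in an H3 cell index given as a hex string."""
    value = int(cell_id, 16)
    # In the H3 index bit layout, the 4-bit resolution field sits at bits 52-55.
    return (value >> 52) & 0xF


# The three region IDs shown in the table above.
region_ids = ["891e2042dd7ffff", "891e2042b1bffff", "891e2050b0bffff"]
resolutions = [h3_resolution(region_id) for region_id in region_ids]
print(resolutions)  # [9, 9, 9]
```

In practice the h3 library offers its own accessor for this; the manual decoding is only meant to show why all IDs here start with `89`.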
Download the road infrastructure for the area
In [4]:
from srai.loaders import OSMNetworkType, OSMWayLoader
loader = OSMWayLoader(OSMNetworkType.DRIVE)
nodes_gdf, edges_gdf = loader.load(area_gdf)
display(nodes_gdf.head(3))
display(edges_gdf.head(3))
ax = edges_gdf.plot(linewidth=1, figsize=(12, 7))
nodes_gdf.plot(ax=ax, markersize=3, color="red")
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/srai/loaders/osm_way_loader/osm_way_loader.py:229: FutureWarning: The clean_periphery argument has been deprecated and will be removed in the v2.0.0 release. Future behavior will be as though clean_periphery=True. See the OSMnx v2 migration guide: https://github.com/gboeing/osmnx/issues/1123
  G_directed = ox.graph_from_polygon(
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/osmnx/_overpass.py:359: FutureWarning: `settings.timeout` is deprecated and will be removed in the v2.0.0 release: use `settings.requests_timeout` instead. See the OSMnx v2 migration guide: https://github.com/gboeing/osmnx/issues/1123
  overpass_settings = _make_overpass_settings()
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/osmnx/_overpass.py:369: FutureWarning: `settings.timeout` is deprecated and will be removed in the v2.0.0 release: use `settings.requests_timeout` instead. See the OSMnx v2 migration guide: https://github.com/gboeing/osmnx/issues/1123
  yield _overpass_request(data={"data": query_str})
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/srai/loaders/osm_way_loader/osm_way_loader.py:237: FutureWarning: The `get_undirected` function is deprecated and will be removed in the v2.0.0 release. Replace it with `convert.to_undirected` instead. See the OSMnx v2 migration guide: https://github.com/gboeing/osmnx/issues/1123
  G_undirected = ox.utils_graph.get_undirected(G_directed)
osmid | y | x | street_count | highway | ref | geometry |
---|---|---|---|---|---|---|
95584835 | 51.083111 | 17.049513 | 4 | NaN | NaN | POINT (17.04951 51.08311) |
95584841 | 51.084699 | 17.064367 | 3 | NaN | NaN | POINT (17.06437 51.08470) |
95584850 | 51.083328 | 17.035057 | 4 | NaN | NaN | POINT (17.03506 51.08333) |
feature_id | oneway | lanes-1 | lanes-2 | lanes-3 | lanes-4 | lanes-5 | lanes-6 | lanes-7 | lanes-8 | lanes-9 | ... | bicycle-official | lit-yes | lit-no | lit-sunset-sunrise | lit-24/7 | lit-automatic | lit-disused | lit-limited | lit-interval | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04947 51.083... |
1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04933 51.083... |
2 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.05357 51.08301, 17.05335 51.082... |
3 rows × 219 columns
Out[4]:
<Axes: >
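The edge columns shown above (`lanes-1`, `lit-yes`, and so on) follow a one-hot pattern: one 0/1 column per tag key and value. The sketch below illustrates that encoding on a single way. It is not the loader's actual preprocessing; the function name `one_hot_tags`, the small vocabulary, and the example tags are all made up for illustration, and it handles only one value per key.

```python
def one_hot_tags(tags, vocabulary):
    """Encode a dict of OSM tags as {"key-value": 0 or 1} over a fixed vocabulary."""
    return {
        f"{key}-{value}": int(tags.get(key) == value)
        for key, values in vocabulary.items()
        for value in values
    }


# Hypothetical vocabulary and tags; the real loader produces 219 columns.
vocabulary = {"lanes": ["1", "2", "3"], "lit": ["yes", "no"]}
way_tags = {"highway": "residential", "lanes": "2", "lit": "yes"}

encoded = one_hot_tags(way_tags, vocabulary)
print(encoded)
# {'lanes-1': 0, 'lanes-2': 1, 'lanes-3': 0, 'lit-yes': 1, 'lit-no': 0}
```

Tags outside the vocabulary (here `highway`) are simply ignored by this sketch.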
Find out which edges correspond to which regions
In [5]:
from srai.joiners import IntersectionJoiner
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, edges_gdf)
joint_gdf
Out[5]:
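region_id | feature_id |
---|---|
891e2042dd7ffff | 6270 |
891e2042dd7ffff | 6265 |
891e2042dd7ffff | 6266 |
891e2042dd7ffff | 6242 |
891e2042dd7ffff | 6263 |
... | ... |
891e204720bffff | 2694 |
891e204720bffff | 5285 |
891e204720bffff | 2696 |
891e204720bffff | 2695 |
891e204720bffff | 9047 |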
15545 rows × 0 columns
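The join result has no data columns; all the information lives in its `(region_id, feature_id)` index. The sketch below shows, on toy data, how such pairs can be summarized: how many edges touch each region, and which regions have no edges at all. The pairs and region names are hypothetical, and plain Python tuples stand in for the real index.

```python
from collections import Counter

# Hypothetical (region_id, feature_id) pairs standing in for the joint index.
# Edge 11 crosses a region border, so it is paired with both r1 and r2.
joint_index = [("r1", 10), ("r1", 11), ("r2", 11), ("r2", 12), ("r2", 13)]
all_regions = ["r1", "r2", "r3"]

# Number of intersecting edges per region.
edges_per_region = Counter(region for region, _ in joint_index)

# Regions that intersect no edge never appear in the join.
regions_without_edges = [r for r in all_regions if r not in edges_per_region]

print(dict(edges_per_region))  # {'r1': 2, 'r2': 3}
print(regions_without_edges)   # ['r3']
```

With the real objects, the region labels should be available through `joint_gdf.index.get_level_values("region_id")`, assuming the index level names match the ones displayed above.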
Get the embeddings for regions based on the road infrastructure
In [6]:
from pytorch_lightning import seed_everything
from srai.embedders import Highway2VecEmbedder
seed_everything(42)
embedder = Highway2VecEmbedder()
embeddings = embedder.fit_transform(regions_gdf, edges_gdf, joint_gdf)
embeddings
Seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:75: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 16.0 K
1 | decoder | Sequential | 16.2 K
---------------------------------------
32.1 K    Trainable params
0         Non-trainable params
32.1 K    Total params
0.128     Total estimated model params size (MB)
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:441: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=3` in the `DataLoader` to improve performance.
`Trainer.fit` stopped: `max_epochs=10` reached.
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/srai/embedders/highway2vec/embedder.py:75: FutureWarning: The behavior of array concatenation with empty entries is deprecated. In a future version, this will no longer exclude empty items when determining the result dtype. To retain the old behavior, exclude the empty entries before the concat operation.
  embeddings_joint = joint_gdf.join(embeddings_df)
Out[6]:
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.539541 | -0.147202 | 0.231133 | -0.155268 | 0.318578 | -0.122770 | 0.081642 | -0.649768 | -0.241923 | 0.726528 | ... | 0.307526 | -0.507819 | 0.135545 | 0.039534 | 0.025653 | 0.079719 | 0.150593 | 0.348484 | 0.157026 | 0.348836 |
891e2040013ffff | -0.153741 | -0.081649 | 0.326839 | -0.272349 | 0.590248 | -0.133988 | -0.032478 | -0.647943 | -0.416105 | 0.836500 | ... | 0.403983 | -0.796961 | 0.215343 | -0.348054 | 0.034720 | 0.147255 | 0.421888 | 0.416943 | 0.242064 | 0.397768 |
891e2040017ffff | -0.265606 | -0.259180 | 0.262159 | -0.227918 | 0.437086 | -0.231525 | 0.185855 | -0.838344 | -0.175982 | 0.586379 | ... | 0.246594 | -0.570084 | 0.175362 | -0.032558 | 0.046645 | 0.103537 | 0.481308 | 0.251798 | 0.145708 | 0.375634 |
891e2040023ffff | -0.257300 | -0.354628 | 0.342268 | -0.040691 | 0.502657 | -0.160569 | 0.144855 | -0.433879 | -0.077589 | 0.586579 | ... | 0.139541 | -0.493104 | 0.179853 | -0.014301 | -0.046311 | 0.260333 | 0.386325 | 0.069908 | -0.028361 | 0.441664 |
891e2040027ffff | -0.577240 | -0.245056 | 0.116357 | -0.331536 | 0.622523 | 0.034275 | 0.088231 | -0.758699 | -0.301155 | 0.729003 | ... | 0.018124 | -0.601429 | 0.183795 | -0.155685 | -0.012062 | 0.326336 | 0.176341 | 0.187382 | 0.018452 | 0.588956 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.531588 | 0.053953 | 0.218250 | -0.149345 | 0.485677 | -0.337040 | 0.185409 | -0.816564 | -0.293617 | 0.745303 | ... | 0.466325 | -0.575672 | -0.017546 | -0.210806 | -0.246534 | 0.317339 | 0.094239 | 0.400965 | 0.386304 | 0.760467 |
891e2055bc7ffff | -0.312561 | 0.180587 | 0.216111 | -0.196476 | 0.789139 | -0.478202 | 0.298272 | -0.609006 | -0.374453 | 0.736907 | ... | 0.351175 | -0.600261 | 0.029559 | -0.233984 | -0.333623 | 0.331489 | -0.026461 | 0.374098 | 0.311198 | 0.490092 |
891e2055bcbffff | -0.739363 | 0.035218 | 0.297716 | -0.088070 | 0.444381 | -0.183434 | 0.193246 | -1.048380 | -0.251863 | 0.719740 | ... | 0.605018 | -0.496426 | -0.185179 | -0.304660 | -0.223717 | 0.455643 | 0.172639 | 0.451482 | 0.569794 | 1.042124 |
891e205a967ffff | -0.643766 | 0.128776 | 0.195003 | -0.297602 | 0.324029 | 0.154858 | -0.095509 | -0.372894 | -0.371210 | 0.532472 | ... | -0.283816 | -0.305983 | 0.359435 | 0.000081 | 0.210609 | 0.025653 | 0.207406 | 0.221255 | 0.052564 | 0.575522 |
891e205a9a7ffff | -1.478251 | -1.421816 | 0.651385 | -1.063441 | 0.967914 | -0.829468 | 0.323825 | -0.825694 | -0.475194 | 1.608099 | ... | 0.584450 | -0.549811 | -0.137307 | -0.018250 | -0.286358 | 0.310460 | 0.385605 | 0.967372 | 0.244139 | 0.914667 |
2032 rows × 30 columns
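The model summary above shows an encoder and a decoder, and the warning in the log shows edge embeddings being joined onto the region-edge pairs. A plausible reading is that each region's vector is an aggregate of the vectors of its edges. The sketch below assumes the aggregate is a plain mean, which is an assumption rather than something the output confirms. All IDs and vectors are toy values.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical 2-dimensional edge embeddings keyed by feature_id.
edge_embeddings = {10: [0.0, 2.0], 11: [1.0, 0.0], 12: [3.0, 4.0]}

# Hypothetical (region_id, feature_id) pairs from the join step.
joint_index = [("r1", 10), ("r1", 11), ("r2", 11), ("r2", 12)]

# Collect the edge vectors that belong to each region.
vectors_by_region = defaultdict(list)
for region, feature in joint_index:
    vectors_by_region[region].append(edge_embeddings[feature])

# Average each dimension across a region's edges (assumed aggregation).
region_embeddings = {
    region: [fmean(dimension) for dimension in zip(*vectors)]
    for region, vectors in vectors_by_region.items()
}
print(region_embeddings)  # {'r1': [0.5, 1.0], 'r2': [2.0, 2.0]}
```

Under this reading, a region with no intersecting edges has nothing to average and gets no row, which would be consistent with the 2032 embedded regions out of 3168 in the output above.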
In [7]:
from sklearn.cluster import KMeans
clusterizer = KMeans(n_clusters=5, random_state=42)
clusterizer.fit(embeddings)
embeddings["cluster"] = clusterizer.labels_
embeddings
Out[7]:
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | cluster |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.539541 | -0.147202 | 0.231133 | -0.155268 | 0.318578 | -0.122770 | 0.081642 | -0.649768 | -0.241923 | 0.726528 | ... | -0.507819 | 0.135545 | 0.039534 | 0.025653 | 0.079719 | 0.150593 | 0.348484 | 0.157026 | 0.348836 | 3 |
891e2040013ffff | -0.153741 | -0.081649 | 0.326839 | -0.272349 | 0.590248 | -0.133988 | -0.032478 | -0.647943 | -0.416105 | 0.836500 | ... | -0.796961 | 0.215343 | -0.348054 | 0.034720 | 0.147255 | 0.421888 | 0.416943 | 0.242064 | 0.397768 | 0 |
891e2040017ffff | -0.265606 | -0.259180 | 0.262159 | -0.227918 | 0.437086 | -0.231525 | 0.185855 | -0.838344 | -0.175982 | 0.586379 | ... | -0.570084 | 0.175362 | -0.032558 | 0.046645 | 0.103537 | 0.481308 | 0.251798 | 0.145708 | 0.375634 | 0 |
891e2040023ffff | -0.257300 | -0.354628 | 0.342268 | -0.040691 | 0.502657 | -0.160569 | 0.144855 | -0.433879 | -0.077589 | 0.586579 | ... | -0.493104 | 0.179853 | -0.014301 | -0.046311 | 0.260333 | 0.386325 | 0.069908 | -0.028361 | 0.441664 | 3 |
891e2040027ffff | -0.577240 | -0.245056 | 0.116357 | -0.331536 | 0.622523 | 0.034275 | 0.088231 | -0.758699 | -0.301155 | 0.729003 | ... | -0.601429 | 0.183795 | -0.155685 | -0.012062 | 0.326336 | 0.176341 | 0.187382 | 0.018452 | 0.588956 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.531588 | 0.053953 | 0.218250 | -0.149345 | 0.485677 | -0.337040 | 0.185409 | -0.816564 | -0.293617 | 0.745303 | ... | -0.575672 | -0.017546 | -0.210806 | -0.246534 | 0.317339 | 0.094239 | 0.400965 | 0.386304 | 0.760467 | 4 |
891e2055bc7ffff | -0.312561 | 0.180587 | 0.216111 | -0.196476 | 0.789139 | -0.478202 | 0.298272 | -0.609006 | -0.374453 | 0.736907 | ... | -0.600261 | 0.029559 | -0.233984 | -0.333623 | 0.331489 | -0.026461 | 0.374098 | 0.311198 | 0.490092 | 4 |
891e2055bcbffff | -0.739363 | 0.035218 | 0.297716 | -0.088070 | 0.444381 | -0.183434 | 0.193246 | -1.048380 | -0.251863 | 0.719740 | ... | -0.496426 | -0.185179 | -0.304660 | -0.223717 | 0.455643 | 0.172639 | 0.451482 | 0.569794 | 1.042124 | 2 |
891e205a967ffff | -0.643766 | 0.128776 | 0.195003 | -0.297602 | 0.324029 | 0.154858 | -0.095509 | -0.372894 | -0.371210 | 0.532472 | ... | -0.305983 | 0.359435 | 0.000081 | 0.210609 | 0.025653 | 0.207406 | 0.221255 | 0.052564 | 0.575522 | 3 |
891e205a9a7ffff | -1.478251 | -1.421816 | 0.651385 | -1.063441 | 0.967914 | -0.829468 | 0.323825 | -0.825694 | -0.475194 | 1.608099 | ... | -0.549811 | -0.137307 | -0.018250 | -0.286358 | 0.310460 | 0.385605 | 0.967372 | 0.244139 | 0.914667 | 2 |
2032 rows × 31 columns
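KMeans labels are arbitrary integers, so a useful first check is how many regions fall into each cluster. The sketch below counts labels with the standard library, using only the ten labels visible in the truncated output above; it says nothing about the full 2032-row distribution.

```python
from collections import Counter

# The ten cluster labels visible in the truncated output above.
labels = [3, 0, 0, 3, 1, 4, 4, 2, 3, 2]

# Count regions per cluster and list them in label order.
cluster_sizes = sorted(Counter(labels).items())
print(cluster_sizes)  # [(0, 2), (1, 1), (2, 2), (3, 3), (4, 2)]
```

On the real DataFrame, `embeddings["cluster"].value_counts()` gives the same kind of summary for all regions.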
In [8]:
plot_numeric_data(regions_gdf, "cluster", embeddings)
Out[8]:
[Interactive map: regions colored by cluster]