Highway2Vec Embedder
In [1]:
from IPython.display import display
from srai.plotting import plot_numeric_data, plot_regions
Get an area to embed
In [2]:
from srai.regionalizers import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Wrocław, PL")
plot_regions(area_gdf, tiles_style="CartoDB positron")
Regionalize the area with a regionalizer
In [3]:
from srai.regionalizers import H3Regionalizer
regionalizer = H3Regionalizer(9)
regions_gdf = regionalizer.transform(area_gdf)
print(len(regions_gdf))
display(regions_gdf.head(3))
plot_regions(regions_gdf, tiles_style="CartoDB positron")
3168
region_id | geometry |
---|---|
891e20421cbffff | POLYGON ((16.94046 51.16970, 16.94050 51.16802... |
891e2042a2fffff | POLYGON ((17.00658 51.13133, 17.00661 51.12965... |
891e204e557ffff | POLYGON ((17.05566 51.06461, 17.05570 51.06293... |
Download the road infrastructure for the area
In [4]:
from srai.loaders import OSMNetworkType, OSMWayLoader
loader = OSMWayLoader(OSMNetworkType.DRIVE)
nodes_gdf, edges_gdf = loader.load(area_gdf)
display(nodes_gdf.head(3))
display(edges_gdf.head(3))
ax = edges_gdf.plot(linewidth=1, figsize=(12, 7))
nodes_gdf.plot(ax=ax, markersize=3, color="red")
osmid | y | x | street_count | highway | ref | geometry |
---|---|---|---|---|---|---|
95584835 | 51.083111 | 17.049513 | 4 | NaN | NaN | POINT (17.04951 51.08311) |
95584841 | 51.084699 | 17.064367 | 3 | NaN | NaN | POINT (17.06437 51.08470) |
95584850 | 51.083328 | 17.035057 | 4 | NaN | NaN | POINT (17.03506 51.08333) |
feature_id | oneway | lanes-1 | lanes-2 | lanes-3 | lanes-4 | lanes-5 | lanes-6 | lanes-7 | lanes-8 | lanes-9 | ... | bicycle-official | lit-yes | lit-no | lit-sunset-sunrise | lit-24/7 | lit-automatic | lit-disused | lit-limited | lit-interval | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04947 51.083... |
1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04933 51.083... |
2 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.05357 51.08301, 17.05335 51.082... |
3 rows × 219 columns
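The edge table holds 0/1 flag columns named `tag-value` (for example `lanes-2` or `lit-yes`), and the row with `feature_id` 2 sets `lanes-2`, `lanes-3` and `lanes-4` at once, which suggests a multi-hot encoding of OSM tags. The sketch below shows one way such columns could be produced with pandas. The toy `edges` frame and the `multi_hot` helper are hypothetical illustrations, not the loader's actual implementation.

```python
import pandas as pd

# Toy edge table: each tag holds a list of values (hypothetical data).
edges = pd.DataFrame(
    {"lanes": [["2"], ["2", "3", "4"]], "lit": [["yes"], ["yes"]]},
    index=pd.Index([0, 1], name="feature_id"),
)


def multi_hot(column: pd.Series) -> pd.DataFrame:
    """Turn a column of value lists into 0/1 columns named '<tag>-<value>'."""
    exploded = column.explode()  # one row per (edge, value) pair
    dummies = pd.get_dummies(
        exploded, prefix=column.name, prefix_sep="-", dtype=int
    )
    # Collapse back to one row per edge; a flag is 1 if any value set it.
    return dummies.groupby(level=0).max()


encoded = pd.concat([multi_hot(edges[name]) for name in edges.columns], axis=1)
print(encoded)
```

An edge that carries several values for one tag ends up with several flags set in the same row, matching the pattern seen in the output above.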
Find out which edges correspond to which regions
In [5]:
from srai.joiners import IntersectionJoiner
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, edges_gdf)
joint_gdf
Out[5]:
region_id | feature_id |
---|---|
891e204e557ffff | 4190 |
891e204e557ffff | 4202 |
891e204e557ffff | 4201 |
891e204e557ffff | 4199 |
891e2042697ffff | 8901 |
... | ... |
891e20406b3ffff | 8286 |
891e20406b3ffff | 3362 |
891e20406b3ffff | 6578 |
891e2055a43ffff | 7511 |
891e2055a43ffff | 8970 |
15722 rows × 0 columns
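The joiner output has 0 columns because all of its information sits in the two-level index of `(region_id, feature_id)` pairs. As a minimal sketch of how to query such a frame, the example below rebuilds a tiny index-only frame by hand (the three pairs are taken from the output above, the variable names are illustrative) and reads edge counts and edge ids per region from the index.

```python
import pandas as pd

# Index-only frame mimicking the joiner output: no data columns at all.
joint = pd.DataFrame(
    index=pd.MultiIndex.from_tuples(
        [
            ("891e204e557ffff", 4190),
            ("891e204e557ffff", 4202),
            ("891e2042697ffff", 8901),
        ],
        names=["region_id", "feature_id"],
    )
)

# Number of intersecting edges per region, read from the first index level.
edges_per_region = joint.index.get_level_values("region_id").value_counts()

# Edge ids that intersect one particular region.
edge_ids = joint.xs("891e204e557ffff", level="region_id").index.tolist()

print(edges_per_region)
print(edge_ids)
```

The same two calls should work on `joint_gdf` itself, assuming its index levels are named `region_id` and `feature_id` as shown in the output.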
Get the embeddings for regions based on the road infrastructure
In [6]:
from pytorch_lightning import seed_everything
from srai.embedders import Highway2VecEmbedder
seed_everything(42)
embedder = Highway2VecEmbedder()
embeddings = embedder.fit_transform(regions_gdf, edges_gdf, joint_gdf)
embeddings
Seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 16.0 K
1 | decoder | Sequential | 16.2 K
---------------------------------------
32.1 K    Trainable params
0         Non-trainable params
32.1 K    Total params
0.128     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=10` reached.
Out[6]:
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.500876 | -0.170577 | 0.175679 | -0.138017 | 0.351574 | -0.137988 | 0.097659 | -0.601147 | -0.241708 | 0.738275 | ... | 0.326857 | -0.527040 | 0.132806 | 0.072183 | -0.015617 | 0.040504 | 0.171830 | 0.370400 | 0.146430 | 0.348382 |
891e2040013ffff | -0.096451 | -0.105448 | 0.285260 | -0.296229 | 0.622387 | -0.110835 | 0.036012 | -0.672212 | -0.479695 | 0.815160 | ... | 0.413562 | -0.754025 | 0.198783 | -0.328841 | -0.017702 | 0.092769 | 0.455044 | 0.404475 | 0.166973 | 0.372385 |
891e2040017ffff | -0.284640 | -0.256166 | 0.183288 | -0.233499 | 0.446882 | -0.211849 | 0.230594 | -0.792736 | -0.233340 | 0.618573 | ... | 0.195669 | -0.576084 | 0.197247 | 0.038464 | 0.048569 | 0.083445 | 0.470839 | 0.281041 | 0.138333 | 0.396957 |
891e2040023ffff | -0.241505 | -0.363091 | 0.301251 | -0.026235 | 0.519482 | -0.183924 | 0.170449 | -0.417670 | -0.102240 | 0.612353 | ... | 0.133085 | -0.501896 | 0.179845 | 0.035775 | -0.074151 | 0.209274 | 0.391761 | 0.097197 | -0.033675 | 0.442482 |
891e2040027ffff | -0.561028 | -0.269774 | 0.069739 | -0.329874 | 0.631586 | 0.011030 | 0.127321 | -0.714779 | -0.320608 | 0.769084 | ... | -0.003671 | -0.611373 | 0.175998 | -0.074924 | -0.033651 | 0.277068 | 0.207851 | 0.231221 | 0.038559 | 0.586329 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.499624 | -0.015107 | 0.130929 | -0.192005 | 0.520919 | -0.350832 | 0.242332 | -0.776085 | -0.352089 | 0.773911 | ... | 0.466479 | -0.584363 | -0.091274 | -0.211894 | -0.304317 | 0.270350 | 0.190800 | 0.388361 | 0.321336 | 0.761356 |
891e2055bc7ffff | -0.353990 | 0.102986 | 0.096897 | -0.284011 | 0.792185 | -0.537846 | 0.339987 | -0.508364 | -0.397602 | 0.779856 | ... | 0.277303 | -0.611381 | -0.034055 | -0.250024 | -0.416371 | 0.289269 | 0.047325 | 0.326037 | 0.210747 | 0.463528 |
891e2055bcbffff | -0.649712 | -0.050774 | 0.189476 | -0.130992 | 0.493045 | -0.208345 | 0.289957 | -1.025107 | -0.309696 | 0.747634 | ... | 0.634918 | -0.514004 | -0.267735 | -0.272899 | -0.313647 | 0.365553 | 0.275041 | 0.450990 | 0.498539 | 1.033792 |
891e205a967ffff | -0.605251 | 0.114015 | 0.175078 | -0.281711 | 0.336612 | 0.154542 | -0.026165 | -0.365951 | -0.396474 | 0.531186 | ... | -0.299310 | -0.325677 | 0.356714 | 0.028953 | 0.215758 | -0.004623 | 0.233533 | 0.279212 | 0.088734 | 0.554686 |
891e205a9a7ffff | -1.404263 | -1.382761 | 0.503229 | -1.089563 | 0.939539 | -0.933452 | 0.222206 | -0.786773 | -0.500759 | 1.735679 | ... | 0.565642 | -0.589892 | -0.091554 | 0.035516 | -0.432250 | 0.239600 | 0.643934 | 1.037692 | 0.163673 | 0.953123 |
2034 rows × 30 columns
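The embedding table has 2034 rows although the regionalizer produced 3168 regions, presumably because regions that intersect no road edge receive no embedding. The sketch below illustrates that effect under the assumption that a region vector is the mean of the vectors of the edges joined to it. The aggregation rule, the toy data and all variable names are assumptions made for illustration, not a confirmed description of the embedder's internals.

```python
import pandas as pd

# Hypothetical per-edge vectors (two dimensions instead of 30).
edge_vectors = pd.DataFrame(
    {0: [1.0, 3.0], 1: [0.0, 2.0]},
    index=pd.Index([0, 1], name="feature_id"),
)

# Region-edge pairs; region "r3" intersects no edge and is therefore absent.
joint = pd.DataFrame(
    index=pd.MultiIndex.from_tuples(
        [("r1", 0), ("r1", 1), ("r2", 1)],
        names=["region_id", "feature_id"],
    )
)

# Look up the vector of every joined edge, then average within each region.
per_pair = edge_vectors.loc[joint.index.get_level_values("feature_id")]
per_pair.index = joint.index
region_vectors = per_pair.groupby(level="region_id").mean()

# Reindex against the full region list to make the missing regions explicit.
all_regions = region_vectors.reindex(["r1", "r2", "r3"])
print(all_regions)
```

Reindexing against `regions_gdf.index` in the same way would expose the regions without roads as rows of `NaN`, which can then be dropped or filled before clustering or plotting.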
In [7]:
from sklearn.cluster import KMeans
clusterizer = KMeans(n_clusters=5, random_state=42)
clusterizer.fit(embeddings)
embeddings["cluster"] = clusterizer.labels_
embeddings
Out[7]:
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | cluster |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.500876 | -0.170577 | 0.175679 | -0.138017 | 0.351574 | -0.137988 | 0.097659 | -0.601147 | -0.241708 | 0.738275 | ... | -0.527040 | 0.132806 | 0.072183 | -0.015617 | 0.040504 | 0.171830 | 0.370400 | 0.146430 | 0.348382 | 1 |
891e2040013ffff | -0.096451 | -0.105448 | 0.285260 | -0.296229 | 0.622387 | -0.110835 | 0.036012 | -0.672212 | -0.479695 | 0.815160 | ... | -0.754025 | 0.198783 | -0.328841 | -0.017702 | 0.092769 | 0.455044 | 0.404475 | 0.166973 | 0.372385 | 3 |
891e2040017ffff | -0.284640 | -0.256166 | 0.183288 | -0.233499 | 0.446882 | -0.211849 | 0.230594 | -0.792736 | -0.233340 | 0.618573 | ... | -0.576084 | 0.197247 | 0.038464 | 0.048569 | 0.083445 | 0.470839 | 0.281041 | 0.138333 | 0.396957 | 3 |
891e2040023ffff | -0.241505 | -0.363091 | 0.301251 | -0.026235 | 0.519482 | -0.183924 | 0.170449 | -0.417670 | -0.102240 | 0.612353 | ... | -0.501896 | 0.179845 | 0.035775 | -0.074151 | 0.209274 | 0.391761 | 0.097197 | -0.033675 | 0.442482 | 4 |
891e2040027ffff | -0.561028 | -0.269774 | 0.069739 | -0.329874 | 0.631586 | 0.011030 | 0.127321 | -0.714779 | -0.320608 | 0.769084 | ... | -0.611373 | 0.175998 | -0.074924 | -0.033651 | 0.277068 | 0.207851 | 0.231221 | 0.038559 | 0.586329 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.499624 | -0.015107 | 0.130929 | -0.192005 | 0.520919 | -0.350832 | 0.242332 | -0.776085 | -0.352089 | 0.773911 | ... | -0.584363 | -0.091274 | -0.211894 | -0.304317 | 0.270350 | 0.190800 | 0.388361 | 0.321336 | 0.761356 | 2 |
891e2055bc7ffff | -0.353990 | 0.102986 | 0.096897 | -0.284011 | 0.792185 | -0.537846 | 0.339987 | -0.508364 | -0.397602 | 0.779856 | ... | -0.611381 | -0.034055 | -0.250024 | -0.416371 | 0.289269 | 0.047325 | 0.326037 | 0.210747 | 0.463528 | 2 |
891e2055bcbffff | -0.649712 | -0.050774 | 0.189476 | -0.130992 | 0.493045 | -0.208345 | 0.289957 | -1.025107 | -0.309696 | 0.747634 | ... | -0.514004 | -0.267735 | -0.272899 | -0.313647 | 0.365553 | 0.275041 | 0.450990 | 0.498539 | 1.033792 | 2 |
891e205a967ffff | -0.605251 | 0.114015 | 0.175078 | -0.281711 | 0.336612 | 0.154542 | -0.026165 | -0.365951 | -0.396474 | 0.531186 | ... | -0.325677 | 0.356714 | 0.028953 | 0.215758 | -0.004623 | 0.233533 | 0.279212 | 0.088734 | 0.554686 | 1 |
891e205a9a7ffff | -1.404263 | -1.382761 | 0.503229 | -1.089563 | 0.939539 | -0.933452 | 0.222206 | -0.786773 | -0.500759 | 1.735679 | ... | -0.589892 | -0.091554 | 0.035516 | -0.432250 | 0.239600 | 0.643934 | 1.037692 | 0.163673 | 0.953123 | 2 |
2034 rows × 31 columns
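The notebook fixes `n_clusters=5` without justification. One common way to compare candidate values is the silhouette score, sketched below on synthetic blobs rather than on the real embeddings. The data, the range of `k` and the variable names are illustrative assumptions, and this selection step is not part of the original notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the embedding matrix: three well-separated groups.
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=42,
)

# Fit KMeans for several values of k and record the silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# A higher silhouette score indicates tighter, better separated clusters.
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

When applying this to the notebook's data, fit on the 30 embedding columns only. The cell above writes the labels into `embeddings["cluster"]`, so re-running it would feed the previous labels back into `KMeans` as an extra feature.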
In [8]:
plot_numeric_data(regions_gdf, "cluster", embeddings)