Highway2Vec Embedder
from IPython.display import display
from srai.plotting import plot_regions, plot_numeric_data
Get an area to embed
from srai.utils import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Wrocław, PL")
plot_regions(area_gdf, tiles_style="CartoDB positron")
Regionalize the area with a regionalizer
from srai.regionalizers import H3Regionalizer
regionalizer = H3Regionalizer(9)
regions_gdf = regionalizer.transform(area_gdf)
print(len(regions_gdf))
display(regions_gdf.head(3))
plot_regions(regions_gdf, tiles_style="CartoDB positron")
3168
region_id | geometry |
---|---|
891e204718fffff | POLYGON ((17.13181 51.14511, 17.13184 51.14343... |
891e2040037ffff | POLYGON ((16.95419 51.11713, 16.95423 51.11545... |
891e20424dbffff | POLYGON ((16.88728 51.18908, 16.88479 51.18817... |
Download the road infrastructure for the area
from srai.loaders import OSMWayLoader
from srai.loaders.osm_way_loader import NetworkType
loader = OSMWayLoader(NetworkType.DRIVE)
nodes_gdf, edges_gdf = loader.load(area_gdf)
display(nodes_gdf.head(3))
display(edges_gdf.head(3))
ax = edges_gdf.plot(linewidth=1, figsize=(12, 7))
nodes_gdf.plot(ax=ax, markersize=3, color="red")
osmid | y | x | street_count | highway | ref | geometry |
---|---|---|---|---|---|---|
95584835 | 51.083111 | 17.049513 | 4 | NaN | NaN | POINT (17.04951 51.08311) |
95584841 | 51.084699 | 17.064367 | 3 | NaN | NaN | POINT (17.06437 51.08470) |
95584850 | 51.083328 | 17.035057 | 4 | NaN | NaN | POINT (17.03506 51.08333) |
feature_id | oneway | lanes-1 | lanes-2 | lanes-3 | lanes-4 | lanes-5 | lanes-6 | lanes-7 | lanes-8 | lanes-9 | ... | bicycle-official | lit-yes | lit-no | lit-sunset-sunrise | lit-24/7 | lit-automatic | lit-disused | lit-limited | lit-interval | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04947 51.083... |
1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.04951 51.08311, 17.04933 51.083... |
2 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | LINESTRING (17.05357 51.08301, 17.05335 51.082... |
3 rows × 219 columns
Find out which edges correspond to which regions
from srai.joiners import IntersectionJoiner
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, edges_gdf)
joint_gdf
region_id | feature_id |
---|---|
891e204718fffff | 4836 |
891e2047113ffff | 4836 |
891e204718bffff | 4836 |
891e204718fffff | 2395 |
891e2047117ffff | 2395 |
... | ... |
891e2041dbbffff | 3199 |
891e2041dbbffff | 3197 |
891e2041dbbffff | 3198 |
891e2041dbbffff | 3201 |
891e20476b3ffff | 8809 |
15330 rows × 0 columns
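The joiner output has no data columns; all of its information lives in the two-level (region_id, feature_id) index. A minimal sketch of how such an index could be queried with plain pandas, using a hand-made toy frame with made-up identifiers rather than the real joint_gdf:

```python
import pandas as pd

# Toy stand-in for joint_gdf: a column-less frame indexed by (region_id, feature_id).
# The identifiers below are invented for illustration only.
toy_joint = pd.DataFrame(
    index=pd.MultiIndex.from_tuples(
        [("r1", 10), ("r1", 11), ("r2", 10), ("r3", 12)],
        names=["region_id", "feature_id"],
    )
)

# How many edges intersect each region.
edges_per_region = (
    toy_joint.index.get_level_values("region_id").value_counts().sort_index()
)

# How many regions each edge passes through (one edge can cross several cells).
regions_per_edge = (
    toy_joint.index.get_level_values("feature_id").value_counts().sort_index()
)

print(edges_per_region.to_dict())
print(regions_per_edge.to_dict())
```

The same two queries should work on the real frame as long as its index levels carry these names, as the output above suggests.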
Get the embeddings for regions based on the road infrastructure
from srai.embedders import Highway2VecEmbedder
from pytorch_lightning import seed_everything
seed_everything(42)
embedder = Highway2VecEmbedder()
embeddings = embedder.fit_transform(regions_gdf, edges_gdf, joint_gdf)
embeddings
Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default
  warning_cache.warn(

  | Name    | Type       | Params
---------------------------------------
0 | encoder | Sequential | 16.0 K
1 | decoder | Sequential | 16.2 K
---------------------------------------
32.1 K    Trainable params
0         Non-trainable params
32.1 K    Total params
0.128     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=10` reached.
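The parameter counts in the training log can be sanity-checked by hand. The sketch below assumes an autoencoder with one hidden layer on each side: 218 input features (the 219 edge columns minus geometry), a hidden width of 64, and a 30-dimensional embedding. The hidden width is an assumption, not something the log states; it is chosen because it reproduces the logged totals.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


n_features = 218  # 219 edge columns minus the geometry column
hidden = 64       # assumed hidden width (not shown in the log)
embedding = 30    # assumed embedding size

encoder = linear_params(n_features, hidden) + linear_params(hidden, embedding)
decoder = linear_params(embedding, hidden) + linear_params(hidden, n_features)

print(encoder, decoder, encoder + decoder)  # 15966 16154 32120
```

Rounded to one decimal place these are 16.0 K, 16.2 K and 32.1 K, which agrees with the summary printed by the trainer.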
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.592675 | -0.156439 | 0.316878 | -0.178116 | 0.335074 | -0.042099 | -0.007037 | -0.726710 | -0.242276 | 0.664817 | ... | 0.257848 | -0.375854 | 0.089992 | 0.182088 | 0.141173 | 0.031616 | 0.163341 | 0.365907 | 0.287733 | 0.324711 |
891e2040013ffff | -0.136257 | -0.156195 | 0.342551 | -0.308239 | 0.554553 | 0.027854 | 0.007827 | -0.755653 | -0.406361 | 0.826457 | ... | 0.372594 | -0.740286 | 0.246507 | -0.357965 | 0.101201 | 0.115533 | 0.531880 | 0.537617 | 0.212468 | 0.362886 |
891e2040017ffff | -0.249226 | -0.161647 | 0.195461 | -0.112282 | 0.178534 | -0.124589 | 0.141553 | -0.740693 | -0.168635 | 0.554595 | ... | 0.102795 | -0.516651 | 0.172073 | 0.008919 | -0.032252 | 0.096935 | 0.454974 | 0.335007 | 0.145881 | 0.324424 |
891e2040023ffff | -0.276689 | -0.388363 | 0.408318 | -0.038385 | 0.531989 | -0.118302 | 0.136248 | -0.450662 | -0.077615 | 0.640293 | ... | 0.135172 | -0.499666 | 0.185636 | 0.018971 | -0.032847 | 0.255846 | 0.418071 | 0.117046 | -0.024102 | 0.473609 |
891e2040027ffff | -0.611483 | -0.297155 | 0.173165 | -0.332716 | 0.650224 | 0.118049 | 0.033207 | -0.731662 | -0.273948 | 0.811148 | ... | -0.020694 | -0.601581 | 0.176176 | -0.137177 | 0.014194 | 0.348125 | 0.219086 | 0.209495 | 0.015815 | 0.607438 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.571268 | -0.047053 | 0.172409 | -0.259168 | 0.539724 | -0.241797 | 0.096836 | -0.876133 | -0.340842 | 0.839133 | ... | 0.510309 | -0.610962 | -0.013011 | -0.226429 | -0.204265 | 0.358358 | 0.093020 | 0.468527 | 0.310629 | 0.819315 |
891e2055bc7ffff | -0.407740 | 0.078420 | 0.166854 | -0.306929 | 0.829329 | -0.365957 | 0.160496 | -0.659686 | -0.388537 | 0.869107 | ... | 0.425436 | -0.633082 | 0.004459 | -0.229100 | -0.245808 | 0.360813 | -0.000522 | 0.437012 | 0.214676 | 0.564988 |
891e2055bcbffff | -0.758721 | -0.087215 | 0.255242 | -0.243854 | 0.546038 | -0.091597 | 0.130745 | -1.106571 | -0.344764 | 0.810302 | ... | 0.660555 | -0.529476 | -0.146418 | -0.327390 | -0.225057 | 0.499087 | 0.147140 | 0.524861 | 0.496142 | 1.100273 |
891e205a967ffff | -0.675398 | 0.119309 | 0.198991 | -0.282234 | 0.294037 | 0.197420 | -0.068271 | -0.393761 | -0.373669 | 0.551305 | ... | -0.302837 | -0.330806 | 0.369927 | 0.026749 | 0.207280 | 0.056827 | 0.234496 | 0.244398 | 0.050003 | 0.572030 |
891e205a9a7ffff | -1.501739 | -1.545459 | 0.703367 | -0.901502 | 1.115304 | -0.587796 | 0.051766 | -0.549502 | -0.296983 | 1.950246 | ... | 0.629084 | -0.512419 | -0.191052 | -0.132126 | -0.361702 | 0.345152 | 0.425911 | 1.122826 | 0.088288 | 1.056574 |
2027 rows × 30 columns
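The embeddings table has 2027 rows, while the regionalizer produced 3168 regions, so some regions received no embedding; presumably these are cells that no road edge intersects. A minimal sketch on toy data of how to list such regions and, if a later step needs one row per region, how to reindex with a fill value. Filling with zeros is an illustrative choice here, not something the notebook does.

```python
import pandas as pd

# Toy stand-ins for regions_gdf and embeddings; identifiers and values are made up.
toy_regions = pd.DataFrame(
    index=pd.Index(["r1", "r2", "r3", "r4"], name="region_id")
)
toy_embeddings = pd.DataFrame(
    {0: [0.1, -0.3], 1: [0.5, 0.2]},
    index=pd.Index(["r1", "r3"], name="region_id"),
)

# Regions that have no embedding row.
missing = toy_regions.index.difference(toy_embeddings.index)

# One row per region; regions without roads get an all-zero vector (a design choice).
full = toy_embeddings.reindex(toy_regions.index, fill_value=0.0)

print(list(missing))
print(full)
```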
from sklearn.cluster import KMeans
# n_init is set explicitly (to the current default of 10) to avoid the FutureWarning
clusterizer = KMeans(n_clusters=5, random_state=42, n_init=10)
clusterizer.fit(embeddings)
embeddings["cluster"] = clusterizer.labels_
embeddings
region_id | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | cluster |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
891e2040007ffff | -0.592675 | -0.156439 | 0.316878 | -0.178116 | 0.335074 | -0.042099 | -0.007037 | -0.726710 | -0.242276 | 0.664817 | ... | -0.375854 | 0.089992 | 0.182088 | 0.141173 | 0.031616 | 0.163341 | 0.365907 | 0.287733 | 0.324711 | 4 |
891e2040013ffff | -0.136257 | -0.156195 | 0.342551 | -0.308239 | 0.554553 | 0.027854 | 0.007827 | -0.755653 | -0.406361 | 0.826457 | ... | -0.740286 | 0.246507 | -0.357965 | 0.101201 | 0.115533 | 0.531880 | 0.537617 | 0.212468 | 0.362886 | 1 |
891e2040017ffff | -0.249226 | -0.161647 | 0.195461 | -0.112282 | 0.178534 | -0.124589 | 0.141553 | -0.740693 | -0.168635 | 0.554595 | ... | -0.516651 | 0.172073 | 0.008919 | -0.032252 | 0.096935 | 0.454974 | 0.335007 | 0.145881 | 0.324424 | 3 |
891e2040023ffff | -0.276689 | -0.388363 | 0.408318 | -0.038385 | 0.531989 | -0.118302 | 0.136248 | -0.450662 | -0.077615 | 0.640293 | ... | -0.499666 | 0.185636 | 0.018971 | -0.032847 | 0.255846 | 0.418071 | 0.117046 | -0.024102 | 0.473609 | 3 |
891e2040027ffff | -0.611483 | -0.297155 | 0.173165 | -0.332716 | 0.650224 | 0.118049 | 0.033207 | -0.731662 | -0.273948 | 0.811148 | ... | -0.601581 | 0.176176 | -0.137177 | 0.014194 | 0.348125 | 0.219086 | 0.209495 | 0.015815 | 0.607438 | 4 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
891e2055bc3ffff | -0.571268 | -0.047053 | 0.172409 | -0.259168 | 0.539724 | -0.241797 | 0.096836 | -0.876133 | -0.340842 | 0.839133 | ... | -0.610962 | -0.013011 | -0.226429 | -0.204265 | 0.358358 | 0.093020 | 0.468527 | 0.310629 | 0.819315 | 0 |
891e2055bc7ffff | -0.407740 | 0.078420 | 0.166854 | -0.306929 | 0.829329 | -0.365957 | 0.160496 | -0.659686 | -0.388537 | 0.869107 | ... | -0.633082 | 0.004459 | -0.229100 | -0.245808 | 0.360813 | -0.000522 | 0.437012 | 0.214676 | 0.564988 | 0 |
891e2055bcbffff | -0.758721 | -0.087215 | 0.255242 | -0.243854 | 0.546038 | -0.091597 | 0.130745 | -1.106571 | -0.344764 | 0.810302 | ... | -0.529476 | -0.146418 | -0.327390 | -0.225057 | 0.499087 | 0.147140 | 0.524861 | 0.496142 | 1.100273 | 2 |
891e205a967ffff | -0.675398 | 0.119309 | 0.198991 | -0.282234 | 0.294037 | 0.197420 | -0.068271 | -0.393761 | -0.373669 | 0.551305 | ... | -0.330806 | 0.369927 | 0.026749 | 0.207280 | 0.056827 | 0.234496 | 0.244398 | 0.050003 | 0.572030 | 4 |
891e205a9a7ffff | -1.501739 | -1.545459 | 0.703367 | -0.901502 | 1.115304 | -0.587796 | 0.051766 | -0.549502 | -0.296983 | 1.950246 | ... | -0.512419 | -0.191052 | -0.132126 | -0.361702 | 0.345152 | 0.425911 | 1.122826 | 0.088288 | 1.056574 | 2 |
2027 rows × 31 columns
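Because the labels are written back into embeddings, re-running the clustering cell would feed the cluster column into KMeans as an extra feature. A minimal sketch on random toy data of one way to keep the step repeatable: drop the label column, if present, before fitting. The toy frame and its labels are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Toy embeddings: 6 regions, 3 dimensions, plus labels from a pretend earlier run.
toy = pd.DataFrame(rng.normal(size=(6, 3)), index=[f"r{i}" for i in range(6)])
toy["cluster"] = [0, 1, 0, 2, 1, 0]

# errors="ignore" makes this safe whether or not the label column exists yet.
features = toy.drop(columns="cluster", errors="ignore")

# Quick look at how many regions fall into each cluster.
cluster_sizes = toy["cluster"].value_counts().sort_index()

print(list(features.columns))
print(cluster_sizes.to_dict())
```

Passing features instead of the full frame to the clusterer gives the same input on the first run and on every re-run.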
plot_numeric_data(regions_gdf, embeddings, "cluster", tiles_style="CartoDB positron")