Contextual count embedder
from srai.embedders import ContextualCountEmbedder
from srai.joiners import IntersectionJoiner
from srai.loaders.osm_loaders import OSMPbfLoader
from srai.neighbourhoods import H3Neighbourhood
from srai.plotting.folium_wrapper import plot_numeric_data, plot_regions
from srai.regionalizers import H3Regionalizer
Data preparation
To use ContextualCountEmbedder, we need to prepare some data: regions_gdf, features_gdf, and joint_gdf. These are the outputs of Regionalizers, Loaders, and Joiners, respectively.
from srai.regionalizers import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Lisboa, PT")
plot_regions(area_gdf)
Regionalize the area using an H3Regionalizer
regionalizer = H3Regionalizer(resolution=9, buffer=True)
regions_gdf = regionalizer.transform(area_gdf)
regions_gdf
region_id | geometry |
---|---|
89393375eafffff | POLYGON ((-9.15199 38.78201, -9.15370 38.78067... |
89393375bd7ffff | POLYGON ((-9.12230 38.77006, -9.12400 38.76872... |
89393367593ffff | POLYGON ((-9.09957 38.75969, -9.10128 38.75835... |
893933674c7ffff | POLYGON ((-9.12699 38.74520, -9.12869 38.74386... |
8939336291bffff | POLYGON ((-9.15222 38.73712, -9.15392 38.73578... |
... | ... |
893933670c7ffff | POLYGON ((-9.10645 38.72842, -9.10816 38.72708... |
89393375e77ffff | POLYGON ((-9.15526 38.77238, -9.15696 38.77104... |
8939336050fffff | POLYGON ((-9.20920 38.70168, -9.21090 38.70034... |
89393362953ffff | POLYGON ((-9.15733 38.73310, -9.15904 38.73176... |
89393362923ffff | POLYGON ((-9.13721 38.73716, -9.13892 38.73582... |
830 rows × 1 columns
Download some objects from OpenStreetMap
You can use both OsmTagsFilter and GroupedOsmTagsFilter filters. In this example, a predefined GroupedOsmTagsFilter, BASE_OSM_GROUPS_FILTER, is used.
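To make the idea of a grouped filter concrete, here is a minimal pure-Python sketch. It assumes a grouped filter is a mapping from a group name to a tag filter, and that a tag filter maps an OSM key to a single value, a list of values, or True for "any value". The group contents and the matching_groups helper are hypothetical and written for illustration only; they are not the contents of BASE_OSM_GROUPS_FILTER or the library's matching code.

```python
# Hypothetical grouped filter: group name -> {OSM key -> accepted value(s)}.
# These groups are illustrative, not the real BASE_OSM_GROUPS_FILTER.
custom_grouped_filter = {
    "greenery": {"landuse": ["grass", "meadow"]},
    "tourism": {"tourism": "hotel"},
    "leisure": {"leisure": True},  # True is assumed to mean "any value of this key"
}


def matching_groups(tags, grouped_filter):
    """Return the names of the groups that an object's OSM tags fall into."""
    groups = []
    for group, tag_filter in grouped_filter.items():
        for key, accepted in tag_filter.items():
            if key not in tags:
                continue
            if accepted is True:
                matched = True
            elif isinstance(accepted, str):
                matched = tags[key] == accepted
            else:
                matched = tags[key] in accepted
            if matched:
                groups.append(group)
                break  # one matching key is enough for this group
    return groups


print(matching_groups({"landuse": "meadow"}, custom_grouped_filter))
print(matching_groups({"leisure": "swimming_pool", "tourism": "hotel"}, custom_grouped_filter))
```

Each group becomes one column in features_gdf, which is why a single object can carry values in more than one column.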
from srai.loaders.osm_loaders.filters import BASE_OSM_GROUPS_FILTER
loader = OSMPbfLoader()
features_gdf = loader.load(area_gdf, tags=BASE_OSM_GROUPS_FILTER)
features_gdf
feature_id | geometry | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation | water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
node/33082025 | POINT (-9.12436 38.72831) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
node/33082026 | POINT (-9.12458 38.72836) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
node/33084039 | POINT (-9.10711 38.72899) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
node/33360557 | POINT (-9.13908 38.71066) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | railway=subway_entrance | None |
node/254753716 | POINT (-9.14165 38.72275) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
way/1249257754 | POLYGON ((-9.14865 38.72863, -9.14881 38.72855... | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | tourism=hotel | None | None |
way/1249344025 | POLYGON ((-9.10443 38.76544, -9.10439 38.76511... | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | amenity=parking | None |
way/1249656110 | POLYGON ((-9.13161 38.71549, -9.13163 38.71548... | None | None | None | None | None | None | None | None | None | None | leisure=swimming_pool | None | None | None | None | None | None | None |
way/1257228044 | POLYGON ((-9.11005 38.74418, -9.11002 38.74418... | None | None | None | None | None | None | None | landuse=meadow | None | None | None | None | None | None | None | None | None | None |
way/1257228053 | POLYGON ((-9.10917 38.74387, -9.10909 38.74383... | None | None | None | None | None | None | None | landuse=grass | None | None | None | None | None | None | None | None | None | None |
29648 rows × 19 columns
Join the objects with the regions they belong to
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, features_gdf)
joint_gdf
region_id | feature_id |
---|---|
89393375eafffff | node/11287438896 |
89393375eafffff | way/182889607 |
89393375eafffff | node/6467058270 |
89393375eafffff | node/6467058269 |
89393375eafffff | way/1103408947 |
... | ... |
89393362923ffff | node/4364239902 |
89393362923ffff | node/11918427761 |
89393362923ffff | node/5973556958 |
89393362923ffff | way/427210041 |
89393362923ffff | way/93763775 |
33141 rows × 0 columns
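The joined index is just a list of (region_id, feature_id) pairs, and count-based embedders build on it by tallying features per region. The sketch below illustrates that idea in plain Python; the identifiers and the feature_category lookup are made up for the example, and this is not the library's implementation.

```python
from collections import Counter, defaultdict

# Made-up (region_id, feature_id) pairs in the shape of joint_gdf's index.
pairs = [
    ("region_a", "node/1"),
    ("region_a", "way/2"),
    ("region_b", "node/3"),
]

# Hypothetical lookup from a feature to the category column it belongs to.
feature_category = {
    "node/1": "transportation",
    "way/2": "greenery",
    "node/3": "transportation",
}

# Tally how many features of each category fall into each region.
counts = defaultdict(Counter)
for region_id, feature_id in pairs:
    counts[region_id][feature_category[feature_id]] += 1

print(dict(counts["region_a"]))
print(dict(counts["region_b"]))
```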
Embed using features existing in data
ContextualCountEmbedder extends the capabilities of the basic CountEmbedder by incorporating the neighbourhood of each embedded region. In this example we will use the H3Neighbourhood.
h3n = H3Neighbourhood()
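As a rough picture of what "neighbours at distance k" means on a hexagonal grid, the sketch below walks a ring of cells in idealised axial coordinates. It is not how H3 or H3Neighbourhood computes neighbours (real H3 grids also contain pentagons); hex_ring and hex_distance are hypothetical helpers used only for illustration.

```python
# Six axial-coordinate direction vectors of an idealised hexagonal grid.
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_distance(a, b):
    """Grid distance between two cells given in axial (q, r) coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def hex_ring(center, distance):
    """Return the cells lying exactly `distance` steps away from `center`."""
    if distance == 0:
        return [center]
    # Start `distance` steps away in one direction, then walk around the ring.
    q = center[0] + DIRECTIONS[4][0] * distance
    r = center[1] + DIRECTIONS[4][1] * distance
    cells = []
    for dq, dr in DIRECTIONS:
        for _ in range(distance):
            cells.append((q, r))
            q += dq
            r += dr
    return cells


for k in range(4):
    print(k, len(hex_ring((0, 0), k)))
```

On such a grid the ring at distance k holds 6k cells, so distant rings contribute many more regions than near ones. Regions near the edge of the study area see fewer neighbours, because only cells present in regions_gdf can contribute.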
Squashed vector version (default)
The embedder returns a vector of the same length as the one from CountEmbedder, but sums into it the averaged values from the neighbourhoods, each diminished by the square of the neighbour distance.
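The arithmetic behind the squashed vector can be sketched in a few lines of plain Python. The weight used here, 1 / (distance + 1)², is an assumption chosen so that the region itself (distance 0) has weight 1; the text only states that values are diminished by the distance squared, so check the library source for the exact formula. The squash helper and the numbers are hypothetical.

```python
def squash(ring_averages):
    """Combine per-ring averages into one vector.

    ring_averages[d] holds the per-feature average over the neighbours at
    distance d, where d = 0 is the region's own counts.
    """
    n_features = len(ring_averages[0])
    result = [0.0] * n_features
    for distance, averages in enumerate(ring_averages):
        divisor = (distance + 1) ** 2  # assumed weighting, see the note above
        for i, value in enumerate(averages):
            result[i] += value / divisor
    return result


ring_averages = [
    [4.0, 1.0],  # distance 0: the region's own counts
    [2.0, 0.0],  # distance 1: average over the first ring
    [9.0, 9.0],  # distance 2: average over the second ring
]
print(squash(ring_averages))
```

Under this assumption the first feature becomes 4 + 2/4 + 9/9 = 5.5, which shows how quickly distant rings lose influence.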
cce = ContextualCountEmbedder(
neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=False
)
embeddings = cce.transform(regions_gdf, features_gdf, joint_gdf)
embeddings
region_id | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation | water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393375eafffff | 0.000000 | 0.072681 | 0.172622 | 0.061501 | 1.528754 | 0.006953 | 0.077612 | 7.063242 | 1.232352 | 0.102468 | 6.066067 | 0.085638 | 0.511175 | 2.184690 | 0.344730 | 0.359538 | 17.358640 | 3.179877 |
89393375bd7ffff | 0.002571 | 0.085539 | 0.342563 | 0.025650 | 1.318792 | 0.051086 | 0.175110 | 40.441995 | 0.314964 | 0.091541 | 6.361588 | 0.339900 | 1.471082 | 0.849505 | 0.648031 | 0.384639 | 10.448222 | 0.059617 |
89393367593ffff | 0.029869 | 0.005094 | 3.895792 | 0.047278 | 1.467486 | 0.059723 | 0.247737 | 6.789595 | 1.613498 | 0.121502 | 3.628074 | 0.183688 | 6.269584 | 1.581679 | 6.517240 | 0.863253 | 23.838351 | 0.622676 |
893933674c7ffff | 0.000000 | 0.004865 | 0.453913 | 0.079248 | 1.734145 | 0.004311 | 0.339349 | 7.434820 | 0.406112 | 0.190524 | 6.272197 | 1.324375 | 3.781918 | 3.942077 | 2.996424 | 2.912269 | 11.585708 | 0.111753 |
8939336291bffff | 0.002999 | 0.001107 | 7.677233 | 0.563714 | 3.179352 | 0.061792 | 1.373996 | 6.856475 | 5.665501 | 4.728557 | 7.547984 | 0.625591 | 21.008825 | 1.389960 | 20.018408 | 11.039550 | 39.739968 | 3.437012 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
893933670c7ffff | 0.000000 | 0.000000 | 0.449794 | 0.118885 | 0.203727 | 0.002279 | 0.267195 | 0.983572 | 0.242713 | 0.542425 | 0.768711 | 0.319655 | 1.130012 | 0.221153 | 1.261395 | 0.562957 | 6.168536 | 2.595571 |
89393375e77ffff | 0.001260 | 0.062867 | 1.575070 | 1.034028 | 0.670339 | 0.006059 | 0.453115 | 10.085802 | 0.648666 | 0.327498 | 6.117789 | 0.220976 | 2.393762 | 0.639035 | 3.040032 | 4.257465 | 8.345633 | 4.231709 |
8939336050fffff | 0.000000 | 0.000000 | 1.052120 | 1.145575 | 3.899146 | 0.002337 | 0.201507 | 5.341798 | 0.289971 | 1.672643 | 25.349041 | 1.434187 | 1.601335 | 7.882438 | 1.586429 | 1.700717 | 10.169676 | 0.724271 |
89393362953ffff | 0.003912 | 0.000140 | 2.691972 | 0.368362 | 1.050498 | 0.068171 | 1.030027 | 8.839801 | 1.391188 | 2.846826 | 12.063737 | 1.648895 | 8.512483 | 2.703981 | 8.078421 | 8.435443 | 24.200379 | 2.625915 |
89393362923ffff | 0.000721 | 0.001473 | 5.116668 | 0.216235 | 7.217274 | 0.010448 | 1.573282 | 5.838340 | 1.552192 | 1.702333 | 17.844794 | 2.837060 | 26.570471 | 5.375883 | 22.269152 | 4.335175 | 78.863438 | 0.166823 |
830 rows × 18 columns
Concatenated vector version
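The embedder returns a vector of length n * (distance + 1), where n is the number of features from the CountEmbedder and distance is the number of neighbourhoods analysed; the extra block holds the region's own counts at distance 0. In this example that gives 18 * 11 = 198 columns. Each feature is suffixed with _d, where d is the current distance (0 through 10 here). Values are averaged over all neighbours at that distance.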
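The column layout of the concatenated output can be reproduced with a short sketch. The 18 feature names are taken from the tables in this example; the ordering (all features for distance 0 first, then distance 1, and so on) is inferred from the output shown here rather than from the library source.

```python
# The 18 feature columns produced by BASE_OSM_GROUPS_FILTER in this example.
feature_names = [
    "aerialway", "airports", "buildings", "culture_art_entertainment",
    "education", "emergency", "finances", "greenery", "healthcare",
    "historic", "leisure", "other", "shops", "sport", "sustenance",
    "tourism", "transportation", "water",
]
neighbourhood_distance = 10

# Distance 0 is the region itself, so there are neighbourhood_distance + 1 blocks.
wide_columns = [
    f"{name}_{distance}"
    for distance in range(neighbourhood_distance + 1)
    for name in feature_names
]

print(len(wide_columns))
print(wide_columns[0], wide_columns[-1])
```

With neighbourhood_distance=10 this yields 18 * 11 = 198 names, from aerialway_0 to water_10.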
wide_cce = ContextualCountEmbedder(
neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=True
)
wide_embeddings = wide_cce.transform(regions_gdf, features_gdf, joint_gdf)
wide_embeddings
region_id | aerialway_0 | airports_0 | buildings_0 | culture_art_entertainment_0 | education_0 | emergency_0 | finances_0 | greenery_0 | healthcare_0 | historic_0 | ... | healthcare_10 | historic_10 | leisure_10 | other_10 | shops_10 | sport_10 | sustenance_10 | tourism_10 | transportation_10 | water_10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393375eafffff | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 4.0 | 1.0 | 0.0 | ... | 1.103448 | 0.344828 | 5.862069 | 0.517241 | 2.655172 | 2.241379 | 2.586207 | 1.241379 | 15.275862 | 0.103448 |
89393375bd7ffff | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 33.0 | 0.0 | 0.0 | ... | 0.756757 | 0.729730 | 4.459459 | 0.702703 | 4.810811 | 1.918919 | 3.243243 | 2.054054 | 15.918919 | 0.945946 |
89393367593ffff | 0.0 | 0.0 | 2.0 | 0.0 | 1.0 | 0.0 | 0.0 | 4.0 | 1.0 | 0.0 | ... | 0.321429 | 0.285714 | 4.357143 | 0.464286 | 2.678571 | 2.214286 | 1.500000 | 1.107143 | 11.821429 | 0.785714 |
893933674c7ffff | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0.0 | ... | 1.545455 | 2.181818 | 4.045455 | 1.022727 | 11.704545 | 1.659091 | 11.454545 | 6.704545 | 19.181818 | 0.659091 |
8939336291bffff | 0.0 | 0.0 | 5.0 | 0.0 | 2.0 | 0.0 | 0.0 | 3.0 | 4.0 | 4.0 | ... | 0.533333 | 1.033333 | 3.366667 | 0.716667 | 3.583333 | 1.300000 | 5.950000 | 2.900000 | 12.300000 | 0.666667 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
893933670c7ffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 2.000000 | 2.222222 | 3.777778 | 1.444444 | 19.962963 | 0.592593 | 15.851852 | 7.555556 | 22.777778 | 0.703704 |
89393375e77ffff | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 7.0 | 0.0 | 0.0 | ... | 1.027778 | 0.277778 | 4.222222 | 0.527778 | 4.638889 | 1.222222 | 3.083333 | 2.500000 | 17.138889 | 0.666667 |
8939336050fffff | 0.0 | 0.0 | 0.0 | 1.0 | 3.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | ... | 0.352941 | 0.529412 | 4.411765 | 0.235294 | 1.588235 | 2.000000 | 1.470588 | 1.529412 | 6.470588 | 0.411765 |
89393362953ffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 2.0 | ... | 0.559322 | 0.542373 | 2.983051 | 0.559322 | 2.338983 | 1.016949 | 2.627119 | 1.711864 | 10.322034 | 0.796610 |
89393362923ffff | 0.0 | 0.0 | 3.0 | 0.0 | 6.0 | 0.0 | 0.0 | 3.0 | 0.0 | 1.0 | ... | 0.812500 | 0.541667 | 3.250000 | 0.687500 | 5.020833 | 0.687500 | 4.395833 | 2.354167 | 10.062500 | 1.104167 |
830 rows × 198 columns
Plotting example features
plot_numeric_data(regions_gdf, "leisure", embeddings)
plot_numeric_data(regions_gdf, "transportation", embeddings)