Contextual count embedder
from srai.embedders import ContextualCountEmbedder
from srai.joiners import IntersectionJoiner
from srai.loaders.osm_loaders import OSMPbfLoader
from srai.neighbourhoods import H3Neighbourhood
from srai.plotting.folium_wrapper import plot_numeric_data, plot_regions
from srai.regionalizers import H3Regionalizer
Data preparation
In order to use ContextualCountEmbedder we need to prepare some data. Namely, we need regions_gdf, features_gdf, and joint_gdf.
These are the outputs of Regionalizers, Loaders and Joiners, respectively.
from srai.regionalizers import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Lisboa, PT")
plot_regions(area_gdf)
Regionalize the area using an H3Regionalizer
regionalizer = H3Regionalizer(resolution=9, buffer=True)
regions_gdf = regionalizer.transform(area_gdf)
regions_gdf
region_id | geometry |
---|---|
89393375807ffff | POLYGON ((-9.12892 38.78367, -9.13063 38.78233... |
89393362903ffff | POLYGON ((-9.14820 38.73793, -9.14990 38.73659... |
89393375e37ffff | POLYGON ((-9.14905 38.77961, -9.15076 38.77827... |
89393362b77ffff | POLYGON ((-9.14994 38.71068, -9.15165 38.70933... |
89393367097ffff | POLYGON ((-9.10318 38.73804, -9.10489 38.73671... |
... | ... |
89393375a27ffff | POLYGON ((-9.12557 38.76043, -9.12727 38.75909... |
89393362cb7ffff | POLYGON ((-9.18964 38.75946, -9.19134 38.75812... |
89393367683ffff | POLYGON ((-9.13243 38.72916, -9.13414 38.72782... |
8939337535bffff | POLYGON ((-9.19954 38.76344, -9.20125 38.76210... |
8939337595bffff | POLYGON ((-9.11500 38.78050, -9.11671 38.77916... |
830 rows × 1 columns
Download some objects from OpenStreetMap
You can use both OsmTagsFilter and GroupedOsmTagsFilter filters. In this example, a predefined GroupedOsmTagsFilter filter, BASE_OSM_GROUPS_FILTER, is used.
from srai.loaders.osm_loaders.filters import BASE_OSM_GROUPS_FILTER
loader = OSMPbfLoader()
features_gdf = loader.load(area_gdf, tags=BASE_OSM_GROUPS_FILTER)
features_gdf
feature_id | geometry | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation | water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
node/259112393 | POINT (-9.16792 38.76113) | None | None | None | None | None | None | None | None | None | None | None | amenity=post_office | None | None | None | None | None | None |
node/259113774 | POINT (-9.16954 38.76066) | None | None | None | None | None | None | amenity=bank | None | None | None | None | None | None | None | None | None | None | None |
node/277831550 | POINT (-9.16476 38.73881) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
node/277831931 | POINT (-9.15994 38.73804) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
node/277834117 | POINT (-9.15942 38.75257) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position | None |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
relation/18073831 | POLYGON ((-9.16824 38.70755, -9.16811 38.70766... | None | None | None | None | None | None | None | None | None | None | leisure=garden | None | None | None | None | None | None | None |
relation/18114682 | POLYGON ((-9.19031 38.74668, -9.18983 38.74664... | None | None | None | None | None | None | None | None | amenity=clinic | None | None | None | None | None | None | None | None | None |
relation/18129628 | POLYGON ((-9.14697 38.71892, -9.14695 38.71867... | None | None | None | amenity=theatre | None | None | None | None | None | None | None | None | None | None | None | None | None | None |
relation/18168748 | POLYGON ((-9.17061 38.75452, -9.17057 38.75449... | None | None | None | None | None | None | None | landuse=recreation_ground | None | None | None | None | None | None | None | None | None | None |
relation/18216967 | POLYGON ((-9.13568 38.71231, -9.13578 38.71233... | None | None | None | None | None | None | None | None | amenity=clinic | historic=castle | None | None | None | None | None | None | None | None |
31730 rows × 19 columns
Join the objects with the regions they belong to
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, features_gdf)
joint_gdf
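region_id | feature_id |
---|---|
89393375807ffff | way/336112741 |
89393375807ffff | way/857632816 |
89393375807ffff | way/336112743 |
89393375807ffff | way/246197499 |
89393375807ffff | way/336146807 |
... | ... |
8939337595bffff | way/784894014 |
8939337595bffff | node/5746421700 |
8939337595bffff | way/891675661 |
8939337595bffff | way/216524628 |
8939337595bffff | way/531347329 |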
35256 rows × 0 columns
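To make the role of the three inputs concrete, here is a minimal sketch in plain Python of how joined pairs can be turned into per-region counts. The toy regions, features and pairs are hypothetical, and the helper count_features is written only for illustration; it is not the library's implementation.

```python
# Hypothetical toy data mirroring the three inputs (not real OSM records).
regions = ["r1", "r2"]
features = {
    "node/1": {"shops": "shop=bakery", "sustenance": None},
    "node/2": {"shops": None, "sustenance": "amenity=cafe"},
    "way/3": {"shops": "shop=books", "sustenance": None},
}
# (region_id, feature_id) pairs, like the joint index; way/3 touches both regions.
joint = [("r1", "node/1"), ("r1", "way/3"), ("r2", "node/2"), ("r2", "way/3")]


def count_features(regions, features, joint):
    """Count, per region, how many joined features have a value in each column."""
    columns = sorted({column for tags in features.values() for column in tags})
    counts = {region: dict.fromkeys(columns, 0) for region in regions}
    for region_id, feature_id in joint:
        for column, value in features[feature_id].items():
            if value is not None:
                counts[region_id][column] += 1
    return counts


counts = count_features(regions, features, joint)
print(counts)
```

A feature that intersects several regions appears in several pairs, so it is counted once in each of them.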
Embed using features existing in data
ContextualCountEmbedder extends the capabilities of the basic CountEmbedder by incorporating the neighbourhood of the embedded region. In this example we will use the H3Neighbourhood.
h3n = H3Neighbourhood()
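As a rough picture of what "neighbours at a given distance" means on a hexagonal grid, the sketch below enumerates a ring of cells in plain Python. It assumes axial (q, r) coordinates purely for illustration; H3 uses its own cell indices, and the helpers hex_distance and ring are hypothetical names, not part of srai or h3.

```python
def hex_distance(a, b):
    """Grid distance between two hexagons given in axial (q, r) coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def ring(center, k):
    """All cells exactly k steps away from center (brute-force scan of a bounding box)."""
    cq, cr = center
    return [
        (q, r)
        for q in range(cq - k, cq + k + 1)
        for r in range(cr - k, cr + k + 1)
        if hex_distance((q, r), center) == k
    ]


first_ring = ring((0, 0), 1)
tenth_ring = ring((0, 0), 10)
print(len(first_ring), len(tenth_ring))
```

On an unbounded hexagonal grid a ring at distance k holds 6 * k cells, so the amount of context grows with every additional distance step.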
Squashed vector version (default)
The embedder returns a vector of the same length as CountEmbedder, but adds to each value the averaged values from the neighbourhoods, diminished by the neighbour distance squared.
cce = ContextualCountEmbedder(
    neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=False
)
embeddings = cce.transform(regions_gdf, features_gdf, joint_gdf)
embeddings
region_id | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation | water |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393375807ffff | 0.000424 | 2.522626 | 0.202267 | 0.023442 | 0.196962 | 0.036525 | 0.062314 | 5.735401 | 0.075130 | 0.038168 | 2.346146 | 1.093156 | 0.523263 | 0.223166 | 0.352128 | 0.135426 | 8.539636 | 0.031633 |
89393362903ffff | 0.001940 | 0.001698 | 12.409412 | 1.353760 | 7.261210 | 0.042150 | 10.452963 | 9.422425 | 7.618586 | 0.945735 | 8.270947 | 1.623016 | 36.309447 | 0.473300 | 24.248795 | 8.817891 | 44.415862 | 0.321079 |
89393375e37ffff | 0.000000 | 0.143304 | 0.224910 | 0.064807 | 0.653069 | 0.011579 | 0.066747 | 13.999185 | 0.291650 | 0.073693 | 3.581006 | 0.118047 | 0.935143 | 1.611358 | 0.649094 | 4.206955 | 10.095597 | 1.195257 |
89393362b77ffff | 0.000000 | 0.000000 | 6.817581 | 0.729065 | 2.049014 | 0.076358 | 2.907855 | 4.456693 | 2.094359 | 9.132607 | 5.523225 | 3.482696 | 27.225454 | 0.297701 | 58.750673 | 25.605754 | 29.185735 | 0.860737 |
89393367097ffff | 0.001709 | 0.000000 | 6.942717 | 0.262311 | 0.266154 | 0.002019 | 0.065782 | 2.840018 | 0.253256 | 0.289361 | 1.303416 | 1.365994 | 1.724712 | 0.417280 | 3.397065 | 0.585443 | 16.755198 | 2.490156 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
89393375a27ffff | 0.001658 | 0.044809 | 0.338675 | 0.098740 | 3.498379 | 0.012504 | 0.116507 | 18.024361 | 1.260719 | 0.053485 | 8.944176 | 0.156641 | 12.324194 | 4.875606 | 3.781297 | 0.390916 | 17.238997 | 0.246963 |
89393362cb7ffff | 0.004139 | 0.000000 | 1.988572 | 0.341248 | 0.747185 | 0.044313 | 0.483579 | 6.850177 | 0.550667 | 1.584998 | 6.457678 | 0.428569 | 10.724208 | 2.003579 | 3.130296 | 0.777097 | 10.226391 | 0.127299 |
89393367683ffff | 0.000000 | 0.000000 | 2.191808 | 0.294570 | 6.114881 | 0.017328 | 1.331265 | 3.302630 | 2.345614 | 0.660878 | 4.392056 | 1.334071 | 37.455788 | 0.620049 | 19.487535 | 6.469591 | 38.132233 | 0.131340 |
8939337535bffff | 0.002129 | 0.000000 | 0.421833 | 0.108654 | 0.477528 | 0.006600 | 0.161815 | 9.306525 | 0.349763 | 0.310472 | 3.469307 | 0.252805 | 2.428283 | 0.380434 | 1.033579 | 0.364513 | 27.715797 | 0.052521 |
8939337595bffff | 0.004116 | 0.061044 | 0.354094 | 0.059926 | 2.432663 | 0.017368 | 0.181453 | 6.757675 | 0.287201 | 1.186616 | 7.522743 | 0.114218 | 1.823443 | 1.557880 | 0.881500 | 0.276990 | 20.818908 | 0.079381 |
830 rows × 18 columns
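The squashing idea can be sketched in a few lines of plain Python. This is only an illustration under a stated assumption: each ring's averaged counts are divided by (distance + 1) squared before being added to the region's own counts. That divisor is one reading of "diminished by the neighbour distance squared" and should be checked against the library documentation; the function squash and the toy numbers are hypothetical.

```python
def squash(base, neighbours_by_distance):
    """Add distance-weighted ring averages to a region's own count vector."""
    result = [float(value) for value in base]
    for distance, vectors in neighbours_by_distance.items():
        if not vectors:
            continue  # no neighbours at this distance, nothing to add
        # Average each feature over all neighbours found at this distance.
        averaged = [sum(column) / len(vectors) for column in zip(*vectors)]
        # ASSUMPTION: weight is 1 / (distance + 1)^2; the library may differ.
        divisor = (distance + 1) ** 2
        for i, value in enumerate(averaged):
            result[i] += value / divisor
    return result


base = [2, 0]  # the region's own counts for two features
neighbours_by_distance = {
    1: [[2, 8], [6, 8]],  # two neighbours at distance 1
    2: [[9, 0]],          # one neighbour at distance 2
}
squashed = squash(base, neighbours_by_distance)
print(squashed)
```

The output keeps the same length as the input vector, which is why the squashed table above has one column per feature and contains fractional values even though the raw counts are integers.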
Concatenated vector version
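The embedder will return a vector of length n * (distance + 1), where n is the number of features from the CountEmbedder and distance is the number of neighbourhoods analysed; the additional block holds the region's own counts at distance 0. With 18 features and a distance of 10, this gives the 198 columns shown below.
Each feature name is suffixed with _k, where k is the current distance. Values are averaged from all neighbours at that distance.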
wide_cce = ContextualCountEmbedder(
    neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=True
)
wide_embeddings = wide_cce.transform(regions_gdf, features_gdf, joint_gdf)
wide_embeddings
region_id | aerialway_0 | airports_0 | buildings_0 | culture_art_entertainment_0 | education_0 | emergency_0 | finances_0 | greenery_0 | healthcare_0 | historic_0 | ... | healthcare_10 | historic_10 | leisure_10 | other_10 | shops_10 | sport_10 | sustenance_10 | tourism_10 | transportation_10 | water_10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393375807ffff | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | ... | 1.102564 | 0.256410 | 3.948718 | 0.307692 | 5.820513 | 1.564103 | 3.923077 | 1.333333 | 11.128205 | 1.179487 |
89393362903ffff | 0.0 | 0.0 | 9.0 | 1.0 | 6.0 | 0.0 | 9.0 | 6.0 | 6.0 | 0.0 | ... | 0.644068 | 1.033898 | 3.152542 | 0.559322 | 4.508475 | 0.898305 | 6.610169 | 2.694915 | 10.661017 | 0.779661 |
89393375e37ffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 11.0 | 0.0 | 0.0 | ... | 1.322581 | 0.193548 | 4.870968 | 0.451613 | 4.451613 | 2.161290 | 2.806452 | 1.032258 | 14.870968 | 0.000000 |
89393362b77ffff | 0.0 | 0.0 | 4.0 | 0.0 | 1.0 | 0.0 | 1.0 | 2.0 | 1.0 | 6.0 | ... | 1.129032 | 1.032258 | 3.064516 | 0.870968 | 10.064516 | 0.709677 | 6.096774 | 3.419355 | 17.064516 | 0.612903 |
89393367097ffff | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | 1.321429 | 1.214286 | 4.928571 | 1.750000 | 14.357143 | 1.392857 | 9.321429 | 3.785714 | 22.892857 | 0.821429 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
89393375a27ffff | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 14.0 | 1.0 | 0.0 | ... | 1.289474 | 0.526316 | 5.605263 | 0.710526 | 8.473684 | 2.394737 | 6.131579 | 1.947368 | 16.368421 | 0.500000 |
89393362cb7ffff | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 1.0 | ... | 0.516129 | 0.419355 | 3.580645 | 0.419355 | 1.354839 | 1.387097 | 1.709677 | 1.806452 | 11.483871 | 0.322581 |
89393367683ffff | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | ... | 1.228571 | 0.857143 | 5.685714 | 0.914286 | 7.257143 | 2.257143 | 5.142857 | 2.342857 | 16.800000 | 0.400000 |
8939337535bffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 6.0 | 0.0 | 0.0 | ... | 0.692308 | 0.846154 | 5.115385 | 0.346154 | 2.653846 | 2.961538 | 1.730769 | 1.538462 | 10.769231 | 0.500000 |
8939337595bffff | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 2.0 | 0.0 | 1.0 | ... | 0.592593 | 0.111111 | 2.481481 | 0.296296 | 1.481481 | 0.666667 | 1.333333 | 1.481481 | 7.666667 | 0.518519 |
830 rows × 198 columns
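The layout of the wide vector can be sketched in plain Python as follows. The table above has 198 columns for 18 features with suffixes _0 to _10, so the sketch keeps one block per distance, including distance 0 for the region itself. The function concatenate and the toy data are hypothetical, and filling an empty ring with zeros is an assumption made for the sketch.

```python
def concatenate(feature_names, base, neighbours_by_distance, max_distance):
    """Build one wide row: own counts (_0) followed by ring averages (_1 .. _max)."""
    row = {f"{name}_0": float(value) for name, value in zip(feature_names, base)}
    for distance in range(1, max_distance + 1):
        vectors = neighbours_by_distance.get(distance, [])
        for i, name in enumerate(feature_names):
            if vectors:
                average = sum(vector[i] for vector in vectors) / len(vectors)
            else:
                average = 0.0  # ASSUMPTION: an empty ring contributes zeros
            row[f"{name}_{distance}"] = average
    return row


feature_names = ["leisure", "shops"]
base = [2, 0]
neighbours_by_distance = {1: [[2, 8], [6, 8]], 2: [[9, 0]]}
wide_row = concatenate(feature_names, base, neighbours_by_distance, max_distance=3)
print(wide_row)
```

Unlike the squashed version, no distance weighting is applied here; each distance keeps its own columns, so a downstream model can learn how much each ring matters.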
Plotting example features
plot_numeric_data(regions_gdf, "leisure", embeddings)
plot_numeric_data(regions_gdf, "transportation", embeddings)