Contextual count embedder
from srai.embedders import ContextualCountEmbedder
from srai.joiners import IntersectionJoiner
from srai.loaders.osm_loaders import OSMPbfLoader
from srai.neighbourhoods import H3Neighbourhood
from srai.plotting.folium_wrapper import plot_numeric_data, plot_regions
from srai.regionalizers import H3Regionalizer
Data preparation¶
In order to use ContextualCountEmbedder we need to prepare some data. Namely we need: regions_gdf, features_gdf, and joint_gdf.
These are the outputs of Regionalizers, Loaders and Joiners respectively.
from srai.regionalizers import geocode_to_region_gdf
area_gdf = geocode_to_region_gdf("Lisboa, PT")
plot_regions(area_gdf)
Regionalize the area using an H3Regionalizer¶
regionalizer = H3Regionalizer(resolution=9, buffer=True)
regions_gdf = regionalizer.transform(area_gdf)
regions_gdf
region_id | geometry |
---|---|
89393360437ffff | POLYGON ((-9.22823 38.70081, -9.22993 38.69947... |
89393367413ffff | POLYGON ((-9.12296 38.74601, -9.12467 38.74467... |
89393375e73ffff | POLYGON ((-9.15928 38.77157, -9.16099 38.77023... |
89393375bc3ffff | POLYGON ((-9.12339 38.76685, -9.12509 38.76551... |
89393360573ffff | POLYGON ((-9.21028 38.69847, -9.21199 38.69712... |
... | ... |
89393362d0bffff | POLYGON ((-9.16984 38.75150, -9.17154 38.75016... |
893933664dbffff | POLYGON ((-9.09160 38.79417, -9.09331 38.79283... |
89393362e07ffff | POLYGON ((-9.20051 38.72736, -9.20221 38.72602... |
89393375bb3ffff | POLYGON ((-9.10913 38.77570, -9.11084 38.77436... |
89393362897ffff | POLYGON ((-9.18006 38.74346, -9.18177 38.74212... |
830 rows × 1 columns
Download some objects from OpenStreetMap¶
You can use either an OsmTagsFilter or a GroupedOsmTagsFilter. In this example, the predefined grouped filter BASE_OSM_GROUPS_FILTER is used.
from srai.loaders.osm_loaders.filters import BASE_OSM_GROUPS_FILTER
loader = OSMPbfLoader()
features_gdf = loader.load(area_gdf, tags=BASE_OSM_GROUPS_FILTER)
features_gdf
/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/srai/loaders/osm_loaders/osm_pbf_loader.py:128: FutureWarning: Use `convert_geometry_to_geodataframe` instead. Deprecated since 0.8.1 version. features_gdf = pbf_reader.get_features_gdf_from_geometry(
Downloading data from 'https://download.geofabrik.de/europe/portugal-latest.osm.pbf' to file '/home/runner/work/srai/srai/examples/embedders/files/Geofabrik_portugal.osm.pbf'.
SHA256 hash of downloaded file: f8261733f32516541ba034100497cb1496f2575632bfc743dcfe246d160fff14 Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Finished operation in 0:01:15
feature_id | geometry | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
node/245937922 | POINT (-9.15963 38.77932) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position |
node/245937941 | POINT (-9.15934 38.77340) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position |
node/246763946 | POINT (-9.15911 38.75150) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position |
node/246763964 | POINT (-9.14820 38.74703) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position |
node/246763968 | POINT (-9.14679 38.74111) | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | public_transport=stop_position |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
relation/15467072 | POLYGON ((-9.20272 38.69441, -9.20266 38.69421... | None | None | None | None | None | None | None | None | None | None | None | None | None | None | amenity=restaurant | None | None |
relation/15469675 | POLYGON ((-9.16145 38.70450, -9.16123 38.70455... | None | None | None | None | None | None | None | None | None | None | leisure=garden | None | None | None | None | None | None |
relation/15482422 | POLYGON ((-9.20167 38.69843, -9.20164 38.69835... | None | None | None | None | None | None | None | None | None | None | leisure=garden | None | None | None | None | None | None |
relation/15482569 | MULTIPOLYGON (((-9.20123 38.69749, -9.20120 38... | None | None | None | None | None | None | None | None | None | None | None | None | None | None | None | tourism=museum | None |
relation/15475183 | POLYGON ((-9.20112 38.69844, -9.20106 38.69825... | None | None | None | None | None | None | None | None | None | historic=castle | None | None | None | None | None | tourism=attraction | None |
27861 rows × 18 columns
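To make the shape of a grouped filter more concrete, here is a minimal sketch using plain dictionaries: a mapping from group name to a mapping of OSM keys to accepted values. The contents of `toy_filter` and the `match_groups` helper are hypothetical and exist only for illustration; they are neither the contents of BASE_OSM_GROUPS_FILTER nor srai's matching code. The `key=value` strings mirror the cells in the table above.

```python
from typing import Dict, List, Optional

# Hypothetical grouped filter: group name -> {OSM key -> accepted values}.
toy_filter: Dict[str, Dict[str, List[str]]] = {
    "sustenance": {"amenity": ["restaurant", "cafe"]},
    "leisure": {"leisure": ["garden", "park"]},
}


def match_groups(tags: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Return, for every group, the first matching 'key=value' string or None."""
    result: Dict[str, Optional[str]] = {}
    for group, osm_filter in toy_filter.items():
        result[group] = None
        for key, values in osm_filter.items():
            if tags.get(key) in values:
                result[group] = f"{key}={tags[key]}"
                break
    return result


# A made-up feature tagged as a restaurant; unrelated tags are ignored.
row = match_groups({"amenity": "restaurant", "name": "Example"})
print(row)
```

Each group becomes one column, which is why most cells in a row are None: a single feature usually matches only one or two groups.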
Join the objects with the regions they belong to¶
joiner = IntersectionJoiner()
joint_gdf = joiner.transform(regions_gdf, features_gdf)
joint_gdf
region_id | feature_id |
---|---|
89393360437ffff | node/3575234314 |
89393360437ffff | way/423589903 |
89393360437ffff | node/9822428226 |
89393360437ffff | node/3237731973 |
89393360437ffff | node/4161794847 |
... | ... |
89393362897ffff | node/11886423457 |
89393362897ffff | way/874285083 |
89393362897ffff | node/7940856475 |
89393362897ffff | node/7940856474 |
89393362897ffff | way/675647967 |
30922 rows × 0 columns
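The joint index is what a count-based embedder aggregates over. As a rough illustration, and not srai's implementation, the sketch below counts features per region and category from (region_id, feature_id) pairs. The pairs and the `feature_category` lookup are made-up toy data.

```python
from collections import Counter, defaultdict

# Toy (region_id, feature_id) pairs, shaped like the joint index above.
pairs = [
    ("region_a", "node/1"),
    ("region_a", "way/2"),
    ("region_b", "node/1"),  # a feature may intersect more than one region
    ("region_b", "node/3"),
]

# Hypothetical category of each feature.
feature_category = {"node/1": "shops", "way/2": "leisure", "node/3": "shops"}

# Count how many features of each category fall into each region.
counts = defaultdict(Counter)
for region_id, feature_id in pairs:
    counts[region_id][feature_category[feature_id]] += 1

for region_id, counter in sorted(counts.items()):
    print(region_id, dict(counter))
```

Because the join is based on intersection, the number of pairs can exceed the number of features, as it does here (30922 pairs for 27861 features).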
Embed using features existing in data¶
ContextualCountEmbedder extends the capabilities of the basic CountEmbedder by incorporating the neighbourhood of the embedded region. In this example we will use the H3Neighbourhood.
h3n = H3Neighbourhood()
Squashed vector version (default)¶
The embedder returns a vector of the same length as the CountEmbedder output, but sums the averaged values from the neighbourhoods, each diminished by the square of the neighbour distance.
cce = ContextualCountEmbedder(
    neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=False
)
embeddings = cce.transform(regions_gdf, features_gdf, joint_gdf)
embeddings
region_id | aerialway | airports | buildings | culture_art_entertainment | education | emergency | finances | greenery | healthcare | historic | leisure | other | shops | sport | sustenance | tourism | transportation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393360437ffff | 0.000000 | 0.000000 | 1.566794 | 0.058516 | 0.295012 | 0.000000 | 0.044657 | 1.393323 | 2.263529 | 0.156731 | 4.430030 | 0.148450 | 1.957805 | 0.342181 | 2.894121 | 0.639341 | 16.815981 |
89393367413ffff | 0.000394 | 0.005289 | 0.316381 | 0.077913 | 3.529797 | 0.002856 | 0.218676 | 6.954374 | 0.343162 | 0.159862 | 6.027114 | 0.276212 | 1.969268 | 1.865828 | 1.169358 | 1.621739 | 7.417918 |
89393375e73ffff | 0.001490 | 0.033447 | 1.793896 | 0.069441 | 2.761765 | 0.004927 | 1.624825 | 7.177268 | 1.825760 | 3.352643 | 6.170979 | 0.256814 | 3.582355 | 0.595172 | 5.208807 | 5.159251 | 19.958557 |
89393375bc3ffff | 0.002216 | 0.076584 | 0.397382 | 0.035140 | 2.278677 | 0.045652 | 0.205720 | 40.431557 | 0.345720 | 0.062345 | 4.001767 | 1.352747 | 3.684079 | 0.535094 | 1.634117 | 1.258528 | 8.626941 |
89393360573ffff | 0.000000 | 0.000000 | 0.807103 | 2.169448 | 2.568803 | 0.001042 | 0.202192 | 7.721205 | 0.259675 | 4.578178 | 22.804796 | 1.469022 | 3.334838 | 0.825021 | 4.221944 | 12.751230 | 17.387044 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
89393362d0bffff | 0.078704 | 0.001793 | 1.790843 | 1.127624 | 1.472983 | 0.003402 | 0.548506 | 16.199944 | 1.662988 | 0.331778 | 8.453426 | 1.357318 | 3.723447 | 3.526749 | 4.351257 | 1.577514 | 10.268593 |
893933664dbffff | 0.015406 | 0.004618 | 0.478192 | 0.011176 | 0.278273 | 0.035137 | 0.087369 | 1.996503 | 0.146695 | 0.025697 | 3.031453 | 0.062787 | 0.954741 | 0.951701 | 0.633918 | 0.299331 | 2.335816 |
89393362e07ffff | 0.000605 | 0.000000 | 0.119587 | 0.030421 | 0.110645 | 0.000666 | 0.030066 | 0.730527 | 0.067403 | 0.083488 | 3.886630 | 0.167383 | 0.350228 | 1.929292 | 0.298817 | 4.294335 | 1.973611 |
89393375bb3ffff | 0.009787 | 0.029081 | 0.453562 | 1.077177 | 0.430348 | 0.037822 | 0.147575 | 7.194592 | 0.301792 | 0.114512 | 3.645969 | 1.300072 | 1.544128 | 0.832990 | 0.793722 | 0.376199 | 16.541416 |
89393362897ffff | 0.034722 | 0.000000 | 0.912173 | 0.047574 | 1.608507 | 0.002452 | 0.247293 | 3.246519 | 0.455682 | 5.428691 | 7.099025 | 4.365710 | 6.533717 | 0.435269 | 2.806465 | 5.620680 | 15.934737 |
830 rows × 17 columns
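The squashing step for a single region can be sketched in plain Python. This is a toy illustration that reads the description literally and is not the library's code: it assumes the region's own counts are kept as the base and that the averages over neighbours at distance d are added with weight 1/d². The exact weighting used by srai may differ, and the `squash` helper and its inputs are hypothetical.

```python
from typing import List


def squash(base: List[float], ring_means: List[List[float]]) -> List[float]:
    """Add per-distance neighbour averages to the base counts.

    ring_means[d - 1] holds the per-feature averages over neighbours at distance d.
    """
    result = list(base)
    for distance, means in enumerate(ring_means, start=1):
        weight = 1.0 / distance**2  # assumed weighting, read literally from the text
        for i, value in enumerate(means):
            result[i] += weight * value
    return result


base = [1.0, 0.0]  # counts in the region itself (two toy features)
ring_means = [
    [2.0, 4.0],  # averages over neighbours at distance 1
    [4.0, 8.0],  # averages over neighbours at distance 2
]
squashed = squash(base, ring_means)
print(squashed)  # [4.0, 6.0]
```

The result has the same length as the base vector, so nearby regions contribute context without adding columns.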
Concatenated vector version¶
The embedder returns a vector of length n * (distance + 1), where n is the number of features from the CountEmbedder and distance is the number of neighbourhoods analysed; the extra block holds the values of the region itself at distance 0. Each feature name is postfixed with an _n string, where n is the current distance. Values are averaged over all neighbours at that distance.
wide_cce = ContextualCountEmbedder(
    neighbourhood=h3n, neighbourhood_distance=10, concatenate_vectors=True
)
wide_embeddings = wide_cce.transform(regions_gdf, features_gdf, joint_gdf)
wide_embeddings
region_id | aerialway_0 | airports_0 | buildings_0 | culture_art_entertainment_0 | education_0 | emergency_0 | finances_0 | greenery_0 | healthcare_0 | historic_0 | ... | greenery_10 | healthcare_10 | historic_10 | leisure_10 | other_10 | shops_10 | sport_10 | sustenance_10 | tourism_10 | transportation_10 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89393360437ffff | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | ... | 2.750000 | 0.166667 | 1.083333 | 2.750000 | 0.333333 | 0.166667 | 0.583333 | 0.916667 | 1.416667 | 5.916667 |
89393367413ffff | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | ... | 7.857143 | 1.000000 | 2.404762 | 4.023810 | 1.095238 | 5.523810 | 0.880952 | 8.690476 | 6.214286 | 19.333333 |
89393375e73ffff | 0.0 | 0.0 | 1.0 | 0.0 | 2.0 | 0.0 | 1.0 | 4.0 | 1.0 | 3.0 | ... | 7.783784 | 0.945946 | 0.270270 | 4.189189 | 0.486486 | 8.243243 | 1.027027 | 3.945946 | 1.027027 | 15.648649 |
89393375bc3ffff | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 33.0 | 0.0 | 0.0 | ... | 4.540541 | 0.756757 | 0.621622 | 3.972973 | 0.486486 | 5.756757 | 1.891892 | 4.270270 | 1.243243 | 13.513514 |
89393360573ffff | 0.0 | 0.0 | 0.0 | 2.0 | 2.0 | 0.0 | 0.0 | 3.0 | 0.0 | 4.0 | ... | 3.200000 | 0.266667 | 0.400000 | 3.933333 | 0.333333 | 2.066667 | 1.600000 | 3.000000 | 2.333333 | 8.600000 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
89393362d0bffff | 0.0 | 0.0 | 1.0 | 1.0 | 1.0 | 0.0 | 0.0 | 11.0 | 1.0 | 0.0 | ... | 3.461538 | 1.134615 | 0.730769 | 3.596154 | 0.788462 | 7.519231 | 1.269231 | 4.596154 | 2.211538 | 10.826923 |
893933664dbffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | 7.214286 | 0.428571 | 0.142857 | 5.142857 | 0.285714 | 1.000000 | 1.071429 | 0.642857 | 1.214286 | 15.142857 |
89393362e07ffff | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 8.317073 | 1.268293 | 1.024390 | 6.146341 | 1.024390 | 8.658537 | 0.658537 | 6.195122 | 2.902439 | 13.951220 |
89393375bb3ffff | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 0.0 | ... | 5.200000 | 0.480000 | 0.160000 | 3.320000 | 0.400000 | 0.400000 | 0.800000 | 0.680000 | 1.600000 | 6.040000 |
89393362897ffff | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 5.0 | ... | 7.977273 | 1.227273 | 1.568182 | 5.045455 | 0.659091 | 5.818182 | 1.386364 | 6.340909 | 3.613636 | 13.022727 |
830 rows × 187 columns
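The column layout of the wide table can be reproduced with a short sketch. The table above has 187 columns, which is 17 features times 11 distances (0 through 10), with all features for one distance listed before the next distance begins. The three feature names and the distance of 2 below are a toy subset chosen for brevity, and the code only builds column names; it does not call srai.

```python
features = ["leisure", "shops", "transportation"]  # toy subset of feature names
neighbourhood_distance = 2  # toy value; the example above uses 10

# One block of columns per distance, from 0 (the region itself) up to the limit.
columns = [
    f"{feature}_{distance}"
    for distance in range(neighbourhood_distance + 1)
    for feature in features
]

print(len(columns))
print(columns[:4])

# The same arithmetic for the table above: 17 features, distances 0..10.
full_width = 17 * (10 + 1)
print(full_width)
```

Selecting every column that ends with a given suffix, for example `_3`, then gives the averaged counts for the ring of neighbours at that distance.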
Plotting example features¶
plot_numeric_data(regions_gdf, "leisure", embeddings)
plot_numeric_data(regions_gdf, "transportation", embeddings)