Lab 1

Exploring geo-embeddings in the real world

Vitaly Kryukov, Newcastle University

Geospatial cube

We have more data than ever!
Mo Data, Mo Problems?

Diagram of datasets

Google Satellite Embeddings: context

Google Earth Engine

https://developers.google.com/earth-engine/datasets/catalog/GOOGLE_SATELLITE_EMBEDDING_V1_ANNUAL


RGB for bands 1,2,3


RGB for random bands

Multiple dimensions

Dimensions are interlinked and together form a ‘semantic map’,

but even a few may tell a story…

IMAGO pipeline

Description

IMAGO data product

Description

Annual changes

65th dimension

Description

Moving to GIS exercises

Exercise 1. LSOA exploration

Imago Data Product: https://data.imago.ac.uk/datasets/google-satellite-embedding-v1-london-lsoas-2020-2024

Steps

  1. Download two GeoPackage files (2020 and 2024), open 2024
  2. Explore Layer Properties – CRS, format, fields, field types
  3. Create a layer style to illustrate the spatial differences for the A00_mean column:
    Layer Properties → Symbology → Graduated → Default mode → 20 classes → Histogram

Screenshot

  4. Repeat for the second and third bands (A01_mean and A02_mean)
  5. Do you see any visual difference?

  6. Visualise the layer with 50% transparency over an OSM or Google Satellite background. To find the background, use the QuickMapServices plugin

  7. Find an LSOA on the outskirts and one in Westminster:
    Processing Toolbox → Select by expression → Expression:
    "data_zone_code" IN ('E01002292', 'E01004690')

How different are they?
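One way to put a number on "how different" is the straight-line (Euclidean) distance between the two LSOAs' embedding values. The sketch below is only an illustration: the values are made up, only three of the 64 columns are shown, and Euclidean distance is an assumed choice of measure, not something prescribed by the lab.

```python
import math

# Hypothetical values for the first three of the 64 *_mean columns (not real data).
outskirts = {"A00_mean": 0.12, "A01_mean": -0.05, "A02_mean": 0.20}
westminster = {"A00_mean": -0.08, "A01_mean": 0.15, "A02_mean": 0.02}


def euclidean_distance(a, b):
    """Straight-line distance between two records over their shared columns."""
    shared = sorted(set(a) & set(b))
    return math.sqrt(sum((a[col] - b[col]) ** 2 for col in shared))


distance = euclidean_distance(outskirts, westminster)
print(f"distance over {len(outskirts)} columns: {distance:.3f}")
```

With the real layers, the same function would take all 64 *_mean values copied from the two selected features.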

screenshot

Exercise 2. Is London dynamic?

The 65th dimension is back: time

Steps

  1. Ensure you have Embeddings-LSOA for two timeframes (2020 and 2024)
  2. Tick both layers on, choose a random LSOA and click Identify features. Then click Identify all.
  3. Do you see 64 mean embedding values for this feature?
  4. Visually compare the values in the first three columns for 2020 and 2024:
    A00_mean, A01_mean, A02_mean.

Do they differ a lot?


  5. Let’s move to numbers!
  • Use Processing Toolbox → Vector general → Join attributes by field value
  • Join the 2020 layer to the 2024 layer
  • Set the joined field prefix to 2020_


  6. In the output dataset:
  • open Field Calculator
  • calculate the difference between 2024 and 2020 for the A02_mean dimension:
    "A02_mean" - "2020_A02_mean"
  • choose output field type:
    Decimal number (real)


  7. Let’s visualise:
  • Layer Properties → Symbology → Graduated
  • In the Value field, choose the last column (the difference you just calculated)
  • Colour ramp → RdBu
  • Mode → Equal interval
  • Symmetric classification → Around 0
  • Classes → 10
  • Classify
  • Check the histogram


  8. Do you see any trends in London?
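The join, difference and classification steps above can be sketched outside QGIS as well. This is a minimal illustration under stated assumptions: the LSOA codes and values are placeholders, and `symmetric_breaks` is a hypothetical helper that approximates what "Equal interval" with "Symmetric classification around 0" is meant to produce; QGIS's own algorithm may differ in detail.

```python
# Hypothetical A02_mean values keyed by LSOA code (placeholder data, not real).
a02_2020 = {"E01000001": 0.10, "E01000002": -0.04, "E01000003": 0.07}
a02_2024 = {"E01000001": 0.16, "E01000002": -0.01, "E01000003": -0.03}

# Join on the shared key, then subtract: 2024 minus 2020.
diff = {
    code: a02_2024[code] - a02_2020[code]
    for code in a02_2024
    if code in a02_2020
}


def symmetric_breaks(values, n_classes):
    """Equal-interval class edges spanning [-m, +m], where m = max |value|."""
    m = max(abs(v) for v in values)
    width = 2 * m / n_classes
    return [-m + i * width for i in range(n_classes + 1)]


breaks = symmetric_breaks(diff.values(), 10)
for code, value in sorted(diff.items()):
    print(f"{code}: {value:+.2f}")
```

With an even number of classes, zero falls on a class edge, so a diverging ramp such as RdBu gives decreases one colour family and increases the other.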

A02 dimension

(Optional)

  1. Let’s calculate the difference for other dimensions!
  2. Repeat the steps for dimension A03_mean

A03 dimension

Exercise 3. Near or Far?

Similarity search: steps

  1. Choose GeoJSON files from the data bundle (2020 and 2024)
  2. Go to Imago Similarity Explorer Tool
    (Chrome recommended)
  3. Upload the Embedding-LSOA dataset (2020) in GeoJSON format
  4. Pick your area and find the most similar one
  5. Let’s find more similar areas…
  6. What about the least similar?
  7. Now upload the Embedding-LSOA dataset from 2024
  8. Do you spot differences in similar places?
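To get a feel for what a similarity search does with embeddings, here is a small sketch that ranks areas by cosine similarity to a chosen one. It rests on assumptions: cosine similarity is a common choice for embeddings, but the metric actually used by the Similarity Explorer is not stated here, and the area names and 3-D vectors are hypothetical stand-ins for the 64-D LSOA means.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-D stand-ins for the 64-D mean embeddings (not real data).
embeddings = {
    "area_a": [0.9, 0.1, 0.0],
    "area_b": [0.8, 0.2, 0.1],
    "area_c": [-0.7, 0.1, 0.6],
}


def rank_by_similarity(target, embeddings):
    """Other areas sorted from most to least similar to `target`."""
    scores = {
        code: cosine_similarity(embeddings[target], vector)
        for code, vector in embeddings.items()
        if code != target
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_by_similarity("area_a", embeddings)
print("most similar:", ranking[0], "| least similar:", ranking[-1])
```

Running the same ranking on the 2020 and 2024 values of one area is a quick way to see whether its nearest neighbours change between years.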

Results

Did something surprise you?


Exercise 4. London in Motion

Any prominent examples of land use or land cover changes in London (2020-2024)?
Web, AI, your memory - everything works!

Steps

  1. Ensure you have Embeddings-LSOA data in QGIS for 2020 and 2024
  2. Find a catchy example of land use/land cover change in London (2020-2024)
  3. Spot the location. Which LSOA(s) does it cover?
  4. Do the embeddings follow the 2020-2024 change?
    You can explore multiple dimensions…
  5. Share your results!
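When exploring multiple dimensions for one LSOA, it can help to rank the columns by how much they moved between the two years. The sketch below assumes you have copied the values for a single LSOA from the Identify results; the numbers shown are invented and cover only four of the 64 columns.

```python
# Hypothetical values for one LSOA in both years (not real data).
lsoa_2020 = {"A00_mean": 0.11, "A01_mean": -0.20, "A02_mean": 0.05, "A03_mean": 0.30}
lsoa_2024 = {"A00_mean": 0.12, "A01_mean": 0.05, "A02_mean": 0.04, "A03_mean": 0.18}

# Signed change per dimension: 2024 minus 2020.
changes = {col: lsoa_2024[col] - lsoa_2020[col] for col in lsoa_2020}

# Dimensions ordered by the size of the change, largest first.
ranked = sorted(changes, key=lambda col: abs(changes[col]), reverse=True)

for col in ranked[:3]:
    print(f"{col}: {changes[col]:+.2f}")
```

A large change in a single dimension is only a hint; comparing the ranking against a neighbouring LSOA with no known change gives a rough baseline.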

[EXTRA] Exercise 5. ‘Ground’ Embeddings

Embeddings might seem quite abstract…

Let’s ground it!

Do Embeddings really ‘feel’ other data?

Steps

  1. Open Layer Properties of Embeddings-2024 GeoPackage
  2. Check out Fields: population (from ONS estimates)
screenshot
  3. Calculate population density:
    • Open Field Calculator
    • Create a new field popdens
    • Enter the expression: ("population"/$area)*1000000
      (this gives people per km² when $area is in m²)
    • Choose Output field type: Integer (32 bit)
    • Save changes
screenshot
  4. Statistical analysis
    • Install the DataPlotly plugin from the Plugins menu
    • Open DataPlotly and choose the layer with population density
    • Choose plot type → Scatter Plot
    • X field: A04_mean, Y field: popdens
    • Change the marker size to 2 and stroke width to 1
    • Run Create Plot
    • Switch to the plot window
    • No obvious relationship?
    • Switch back and clean the plot canvas
    • Create the same scatter plot for the A08_mean column
    • Can you see any relationship?

Correlations?

screenshot
  • Each band is unique but weak alone
  • We can’t use just one dimension for analysis!
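The scatter plots can be backed up with a number. The sketch below computes population density the same way as the Field Calculator expression and then a Pearson correlation coefficient against one embedding column. Everything in it is illustrative: the rows are invented, `population_density` and `pearson_r` are hypothetical helper names, and the result says nothing about the real relationship in the London data.

```python
import math


def population_density(population, area_m2):
    """People per km², mirroring ("population" / $area) * 1000000."""
    return population / area_m2 * 1_000_000


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rows: (A08_mean, population, area in m²). Not real data.
rows = [
    (0.02, 1500, 2_000_000),
    (0.10, 1800, 400_000),
    (0.15, 2100, 150_000),
    (0.07, 1600, 900_000),
]

a08 = [row[0] for row in rows]
popdens = [population_density(row[1], row[2]) for row in rows]

correlation = pearson_r(a08, popdens)
print(f"Pearson r between A08_mean and popdens: {correlation:.2f}")
```

Repeating this for each of the 64 columns would show which single dimensions relate most strongly to population density, and how weak each one is on its own.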

Thanks, you were great!

You will have another set of exercises soon…