{"id":6210,"date":"2025-11-26T14:47:23","date_gmt":"2025-11-26T13:47:23","guid":{"rendered":"https:\/\/revodata.outlawz.dev\/?p=6210"},"modified":"2026-01-23T14:27:20","modified_gmt":"2026-01-23T13:27:20","slug":"geospatial-location-allocation-in-databricks-a-scalable-gis-approach","status":"publish","type":"post","link":"https:\/\/revodata.nl\/nl\/geospatial-location-allocation-in-databricks-a-scalable-gis-approach\/","title":{"rendered":"Geospatial Location-Allocation in Databricks: A Scalable GIS Approach"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"6210\" class=\"elementor elementor-6210\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-52459a6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"52459a6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-50b64aa\" data-id=\"50b64aa\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-be25c4f elementor-widget elementor-widget-text-editor\" data-id=\"be25c4f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Editor&#8217;s note: This post was originally published June 6th, 2025.&nbsp;<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f932491 elementor-widget elementor-widget-heading\" data-id=\"f932491\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is a location-allocation problem in GIS?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-462a8a8 elementor-widget elementor-widget-text-editor\" data-id=\"462a8a8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"b76e\" class=\"pw-post-body-paragraph xz ya td yb b yc yd ye yf yg yh yi yj rs yk yl ym rv yn yo yp ry yq yr ys yt go bl\" data-selectable-paragraph=\"\">Summer\u2019s almost here, so let\u2019s start with a sunny example. Imagine you\u2019re tasked with finding the top 1,000 most profitable locations across the UK to park an ice cream cart. It\u2019s not just about sweet treats; it\u2019s a geospatial location-allocation problem, one that uses data and analysis to determine optimal placement based on demand, foot traffic, and competition.<\/p><p id=\"324c\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">But let\u2019s be honest: this isn\u2019t really about ice cream carts.<\/p><p id=\"dbb6\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">In our personal lives, we constantly solve these kinds of problems, sometimes without even realizing it.<\/p><p id=\"9c9d\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">Take a simple scenario: you\u2019re planning a weekly dinner with your five closest friends, each living in a different part of the city. You want to choose a restaurant that minimizes the total travel time for everyone. 
Or think bigger: you\u2019re looking for a house to rent or buy. You want a location that balances multiple factors: your commute, your partner\u2019s job, your child\u2019s school, your gym, and maybe even your favorite grocery store.<\/p><p id=\"e2ac\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\"><em class=\"yz\">These are location-allocation problems, and they are everywhere.<\/em><\/p><p id=\"a452\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">Governments and businesses face them every day. Where should we place new fire stations to achieve 8-minute response times across a city? Where is the best spot for a new wastewater treatment plant? How do we select optimal locations for bus depots, EV charging stations, or substations to relieve electricity grid congestion?<\/p><p id=\"dc5c\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">The list of applications is endless. And the process doesn\u2019t stop after choosing the top candidate locations. You\u2019ll often want to monitor how these locations perform over time. Do they still serve the demand well? Has the population shifted? What insights can we gain from usage data, customer behavior, or operational costs?<\/p><p id=\"dd55\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">This evolution naturally leads into the world of <strong class=\"yb mb\">data products<\/strong>, <strong class=\"yb mb\">big data<\/strong>, <strong class=\"yb mb\">predictive modeling<\/strong>, and <strong class=\"yb mb\">machine learning<\/strong>. 
Since these challenges are inherently spatial, <strong class=\"yb mb\">Geographic Information Systems (GIS)<\/strong> remain indispensable, not just for mapping, but as comprehensive information systems that integrate <strong class=\"yb mb\">people, data, analysis, software, and hardware<\/strong>. To truly harness the power of GIS and spatial intelligence, you need a platform that supports all these components in a seamless and scalable way.<\/p><p id=\"3000\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">There are many tools out there to solve these problems, but in this article, I want to focus on Databricks and Apache Sedona, two powerful technologies I\u2019ve chosen for tackling large-scale location-allocation problems. I\u2019ll explain why these tools make sense for spatial big data analysis and share how they fit into the modern geospatial stack.<\/p><p id=\"8ed7\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">As you read, I encourage you to reflect on your own work. What kind of location-allocation problems are you facing in your industry? 
Share your thoughts in the comments; I\u2019d love to hear your perspective.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5bf0ca0 elementor-widget elementor-widget-heading\" data-id=\"5bf0ca0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Implementation:<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e66f2cc elementor-widget elementor-widget-text-editor\" data-id=\"e66f2cc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"ef5f\" class=\"pw-post-body-paragraph xz ya td yb b yc yd ye yf yg yh yi yj rs yk yl ym rv yn yo yp ry yq yr ys yt go bl\" data-selectable-paragraph=\"\">The following implementation is part of the training we offer at RevoData, which focuses specifically on leveraging the geospatial capabilities of Databricks.<\/p><p id=\"3f08\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">As someone interested in data product development, I prefer to frame this as an Agile user story, as shown below:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-82bb2c8 elementor-widget elementor-widget-text-editor\" data-id=\"82bb2c8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"go sy sz ta tb\"><div class=\"ac cr\"><div class=\"hd bi he hf hg hh\"><p id=\"f261\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" 
data-selectable-paragraph=\"\"><strong class=\"yb mb\"><em class=\"yz\">As an ice-cream company, I want to know where the best spots are to locate 1000 ice-cream carts across the UK, so that I can maximize sales and profit.<\/em><\/strong><\/p><p id=\"0cb2\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\"><strong class=\"yb mb\"><em class=\"yz\">\u00b7 We want to know how many spots we can allocate to each county based on its area.<\/em><\/strong><\/p><p id=\"7d5f\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\"><strong class=\"yb mb\"><em class=\"yz\">\u00b7 Based on our BI dashboards, we know that parks have the highest sales. So, the spots should be near park entrances.<\/em><\/strong><\/p><p id=\"0e2a\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\"><strong class=\"yb mb\"><em class=\"yz\">\u00b7 Larger parks with more functionalities (playgrounds, sports fields, etc.) 
are more attractive.<\/em><\/strong><\/p><p id=\"b26f\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\"><strong class=\"yb mb\"><em class=\"yz\">\u00b7 Park entrances that are more accessible are more desirable.<\/em><\/strong><\/p><\/div><\/div><\/div><div class=\"ac cr og za zb zc\" role=\"separator\">\u00a0<\/div>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a4a9377 elementor-widget elementor-widget-text-editor\" data-id=\"a4a9377\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now, as a data engineer or GIS specialist taking on this task, here is how I would approach it:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5eb89c2 elementor-widget elementor-widget-heading\" data-id=\"5eb89c2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Datasets:<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-587e946 elementor-widget elementor-widget-text-editor\" data-id=\"587e946\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><span style=\"font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-size: var( --e-global-typography-text-font-size );\">First, I need to gather the relevant datasets. 
I found the following datasets from the UK Ordnance Survey:<\/span><\/p><p id=\"3930\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">\u00b7 <a class=\"ah pe\" href=\"https:\/\/osdatahub.os.uk\/downloads\/open\/OpenGreenspace\" target=\"_blank\" rel=\"noopener ugc nofollow\">OS Open Greenspace<\/a><\/p><p id=\"b6f9\" class=\"pw-post-body-paragraph xz ya td yb b yc yu ye yf yg yv yi yj rs yw yl ym rv yx yo yp ry yy yr ys yt go bl\" data-selectable-paragraph=\"\">\u00b7 <a class=\"ah pe\" href=\"https:\/\/osdatahub.os.uk\/downloads\/open\/BoundaryLine\" target=\"_blank\" rel=\"noopener ugc nofollow\">Boundary-Line\u2122<\/a><\/p><p data-selectable-paragraph=\"\">\u00b7 <a class=\"ah pe\" href=\"https:\/\/osdatahub.os.uk\/downloads\/open\/OpenRoads\" target=\"_blank\" rel=\"noopener ugc nofollow\">OS Open Roads<\/a><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-606de55 elementor-widget elementor-widget-image\" data-id=\"606de55\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"659\" height=\"535\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-11-26-at-14.54.08.png\" class=\"attachment-large size-large wp-image-6212\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-11-26-at-14.54.08.png 659w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-11-26-at-14.54.08-300x244.png 300w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-11-26-at-14.54.08-15x12.png 15w\" sizes=\"(max-width: 659px) 100vw, 659px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6a06afb elementor-widget elementor-widget-heading\" 
data-id=\"6a06afb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Data design pattern:<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8b0cc0d elementor-widget elementor-widget-text-editor\" data-id=\"8b0cc0d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"a5a3\" class=\"pw-post-body-paragraph xz ya td yb b yc yd ye yf yg yh yi yj rs yk yl ym rv yn yo yp ry yq yr ys yt go bl\" data-selectable-paragraph=\"\">Next, from a conceptual perspective, I want to follow the Medallion Architecture (or Multi-hop Architecture). This involves ingesting the raw data as-is (bronze tables), applying filtering, joins, and enrichment (silver tables), and producing the top parks and park entrances (gold table).<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9b0ab8c elementor-widget elementor-widget-heading\" data-id=\"9b0ab8c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Ingestion bronze tables: <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4a5c1b9 elementor-widget elementor-widget-text-editor\" data-id=\"4a5c1b9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Using Apache Sedona, ingestion is quite straightforward. 
In the code below, you can see how the GeoPackage files are read from an Azure storage account or an AWS S3 bucket:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ba65271 elementor-widget elementor-widget-code-highlight\" data-id=\"ba65271\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\nfrom sedona.spark import *\nfrom sedona.maps.SedonaPyDeck import SedonaPyDeck\nfrom sedona.maps.SedonaKepler import SedonaKepler\nfrom pyspark.sql import functions as F\nfrom sedona.sql import st_functions as st\nfrom sedona.sql.types import GeometryType\nfrom pyspark.sql.functions import expr\n\n\nconfig = SedonaContext.builder() .\\\n    config('spark.jars.packages',\n           'org.apache.sedona:sedona-spark-shaded-3.3_2.12:1.7.1,'\n           'org.datasyslab:geotools-wrapper:1.7.1-28.5'). 
\\\n    getOrCreate()\n\nsedona = SedonaContext.create(config)\n\n\ncatalog_name = \"geospatial\"\ncloud_provider = \"aws\"\n\n# Creating a dictionary that specifies the name of the schemas and tables based on the geopackage layers\nschema_tables = {\n    \"lookups\": {\n        \"bdline_gb.gpkg\": [\"boundary_line_ceremonial_counties\"],\n    },\n    \"greenspaces\": {\n        \"opgrsp_gb.gpkg\": [\"greenspace_site\", \"access_point\"]\n    },\n    \"networks\": {\n        \"oproad_gb.gpkg\": [\"road_link\", \"road_node\"],\n    },\n}\n\n\n# Writing each geopackage layer into a Delta table\nfor schema, files in schema_tables.items():\n    for gpkg_file, layers in files.items():\n        for table_name in layers:\n            if cloud_provider == \"azure\":\n                df = sedona.read.format(\"geopackage\").option(\"tableName\", table_name).load(f\"abfss:\/\/{dataset_container_name}@{dataset_storage_account_name}.dfs.core.windows.net\/{dataset_dir}\/{gpkg_file}\")\n            elif cloud_provider == \"aws\":\n                df = sedona.read.format(\"geopackage\").option(\"tableName\", table_name).load(f\"s3:\/\/{dataset_bucket_name}\/{dataset_input_dir}\/{gpkg_file}\")\n            df.write.mode(\"overwrite\").saveAsTable(f\"{catalog_name}.{schema}.{table_name}\")\n            print(f\"Table {catalog_name}.{schema}.{table_name} is created, yay!\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-15b136b elementor-widget elementor-widget-spacer\" data-id=\"15b136b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"spacer.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-spacer\">\n\t\t\t<div class=\"elementor-spacer-inner\"><\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7996431 elementor-widget 
elementor-widget-heading\" data-id=\"7996431\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Explore and visualize:  <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5feaf67 elementor-widget elementor-widget-text-editor\" data-id=\"5feaf67\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now that we have transformed the GeoPackage files into Delta tables with a geometry column, it\u2019s easy and insightful to explore the data and translate business requirements into meaningful code:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cf6a429 elementor-widget elementor-widget-code-highlight\" data-id=\"cf6a429\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-sql line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-sql\">\n\t\t\t\t\t<xmp>\n%sql\nSELECT DISTINCT function\nFROM geospatial.greenspaces.greenspace_site\nORDER BY function;<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-29f2150 elementor-widget elementor-widget-code-highlight\" data-id=\"29f2150\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python 
line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Read the pedestrian access points from a Delta table into a Spark DataFrame\naccess_point_df = spark.sql(\"\"\"\n  SELECT fid, access_type, geometry\n  FROM geospatial.greenspaces.access_point\n  WHERE access_type = 'Pedestrian'\n\"\"\").limit(500)\n\n# Visualize 500 access points on a map using SedonaKepler\n# (named kepler_map to avoid shadowing Python's built-in map)\nkepler_map = SedonaKepler.create_map()\nSedonaKepler.add_df(kepler_map, access_point_df, name=\"park access points\")\nkepler_map<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6d2437d elementor-widget elementor-widget-image\" data-id=\"6d2437d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"670\" height=\"336\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-13.11.29.png\" class=\"attachment-large size-large wp-image-6541\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-13.11.29.png 670w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-13.11.29-300x150.png 300w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-13.11.29-18x9.png 18w\" sizes=\"(max-width: 670px) 100vw, 670px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d2b6771 elementor-widget elementor-widget-spacer\" data-id=\"d2b6771\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"spacer.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-spacer\">\n\t\t\t<div class=\"elementor-spacer-inner\"><\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4d956cd elementor-widget elementor-widget-heading\" 
data-id=\"4d956cd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Enrich silver tables: <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d192c1c elementor-widget elementor-widget-text-editor\" data-id=\"d192c1c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"7c93\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\">With the GeoPackage files now in Delta format, we apply a few enrichments to each table:<\/p><p id=\"7ca8\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">1. 
Using UK administrative boundaries, we distribute the 1000 ice-cream carts in proportion to the area of each county, so that larger counties receive more carts.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8ede838 elementor-widget elementor-widget-code-highlight\" data-id=\"8ede838\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Reading the administrative boundaries and calculating the area\nadministrative_boundaries = spark.sql(\"\"\"\n  SELECT b.fid, b.name, ST_Area(b.geometry) AS area, b.geometry, ST_Geohash(ST_Transform(b.geometry,'epsg:27700','epsg:4326'), 5) AS geohash\n  FROM geospatial.lookups.boundary_line_ceremonial_counties b \n\"\"\").repartitionByRange(2, \"geohash\")\n\ntotal_locations = 1000\nuk_area = administrative_boundaries.selectExpr(\"SUM(area) AS total_area\").first().total_area\n\n# Calculating how many ice-cream carts can be located in each county based on its area.\n# Each share is rounded to a whole number, so the shares may not add up to exactly 1000.\nadministrative_boundaries = administrative_boundaries.withColumn(\n    \"number_of_locations\",\n    F.round(F.col(\"area\") \/ uk_area * F.lit(total_locations)).cast(\"integer\")\n).orderBy(\"number_of_locations\", ascending=True)\n\nadministrative_boundaries.write.mode(\"overwrite\").option(\"mergeSchema\", \"true\").saveAsTable(\"geospatial.lookups.boundary_line_ceremonial_counties_silver\")\nadministrative_boundaries.createOrReplaceTempView(\"administrative_boundaries_vw\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5432322 elementor-widget elementor-widget-text-editor\" data-id=\"5432322\" 
data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>2. The greenspace table contains parks, cemeteries, church gardens, etc. We focus on parks. While parks may include playgrounds and other features, we are not interested in the geometries of those smaller areas; the larger park geometry is sufficient. We also categorize parks by size using area quantiles. Finally, we want to know which county each park belongs to.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-33571a3 elementor-widget elementor-widget-code-highlight\" data-id=\"33571a3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Reading the greenspace_site table and filtering relevant objects\ndf_greenspaces_bronze = spark.table(\"geospatial.greenspaces.greenspace_site\") \\\n    .filter(\"function IN ('Play Space', 'Playing Field', 'Public Park Or Garden')\")\ndf_greenspaces_bronze.createOrReplaceTempView(\"greenspace_site_bronze_vw\")\n\n# Finding the small playgrounds inside larger parks\ngreenspace_site_covered = spark.sql(\"\"\"\n  SELECT \n    g1.id AS g1_id,\n    g2.id AS g2_id,\n    g1.function AS g1_function,\n    g2.function AS g2_function,\n    g2.distinctive_name_1 AS g2_name,\n    g2.geometry AS geometry,\n    ST_Geohash(ST_Transform(g2.geometry ,'epsg:27700','epsg:4326'), 5) AS geohash\n  FROM greenspace_site_bronze_vw g1\n  INNER JOIN greenspace_site_bronze_vw g2\n    ON ST_CoveredBy(g1.geometry, g2.geometry)\n   AND g1.id != g2.id\n\"\"\").repartitionByRange(10, 
\"geohash\")\n\ngreenspace_site_covered.createOrReplaceTempView(\"greenspace_site_covered_vw\")\n\n# Aggregating the small playgrounds in the larger parks\ngreenspace_site_aggregated = spark.sql(\"\"\"\n  SELECT \n    g2_id AS id,\n    concat_ws(', ', any_value(g2_function), collect_set(g1_function)) AS functions,\n    count(*) + 1 AS num_functions,\n    g2_name AS name,\n    ST_Area(geometry) AS area,\n    geometry,\n    ST_Geohash(ST_Transform(geometry,'epsg:27700','epsg:4326'), 5) AS geohash\n  FROM greenspace_site_covered_vw\n  GROUP BY g2_id, g2_name, geometry\n\"\"\").repartitionByRange(10, \"geohash\")\ngreenspace_site_aggregated.createOrReplaceTempView(\"greenspace_site_aggregated_vw\")\n\n# Find the parks without any smaller playgrounds inside them\ngreenspace_site_non_covered = spark.sql(\"\"\"\n  SELECT id, function, 1 AS num_functions, distinctive_name_1 AS name, ST_Area(geometry) AS area, geometry,\n  ST_Geohash(ST_Transform(geometry,'epsg:27700','epsg:4326'), 5) AS geohash\n  FROM greenspace_site_bronze_vw\n  WHERE id NOT IN (SELECT g1_id FROM greenspace_site_covered_vw)\n    AND id NOT IN (SELECT g2_id FROM greenspace_site_covered_vw)\n\"\"\").repartitionByRange(10, \"geohash\")\ngreenspace_site_non_covered.createOrReplaceTempView(\"greenspace_site_non_covered_vw\")\n\n# Union the above two dataframes (UNION matches columns by position, so both views keep the same column order)\ngreenspace_site_all = spark.sql(\"\"\"\nSELECT * FROM greenspace_site_aggregated_vw\nUNION\nSELECT * FROM greenspace_site_non_covered_vw\"\"\").repartitionByRange(10, \"geohash\")\n\n# Calculate 0%, 20%, 40%, 60%, 80%, 100% quantiles\nquantiles = greenspace_site_all.approxQuantile(\"area\", [0.0, 0.2, 0.4, 0.6, 0.8, 1.0], 0.001)\nprint(\"Quintile breakpoints:\", quantiles)\n\nq0, q20, q40, q60, q80, q100 = quantiles\n\n\n# Categorize each park based on its area\ngreenspace_site_all = greenspace_site_all.withColumn(\n    \"area_category\",\n    F.when(F.col(\"area\") <= q20, 20)\n     .when(F.col(\"area\") <= q40, 40)\n     .when(F.col(\"area\") <= q60, 
60)\n     .when(F.col(\"area\") <= q80, 80)\n     .otherwise(100)\n)\n\ndisplay(greenspace_site_all.groupBy(\"area_category\").count().orderBy(\"area_category\"))\ngreenspace_site_all.createOrReplaceTempView(\"greenspace_site_all_vw\")\n\n# In the final step, we identify the county each park falls within.\n# A park that straddles a boundary is assigned to the county with the largest overlap.\ngreenspace_site_silver = spark.sql(\"\"\"\nWITH tmp AS (\n  SELECT a.id, a.functions, a.num_functions, a.name, a.area, a.area_category, a.geometry, a.geohash,\n  RANK() OVER(PARTITION BY a.id ORDER BY ST_Area(ST_Intersection(a.geometry, b.geometry)) DESC) AS administrative_rank,\n  b.fid as administrative_fid\n  FROM greenspace_site_all_vw a\n  INNER JOIN administrative_boundaries_vw b\n  ON ST_Intersects(a.geometry, b.geometry)\n)\nSELECT tmp.id, tmp.functions, tmp.num_functions, tmp.name, tmp.area, tmp.area_category, tmp.administrative_fid, tmp.geometry, tmp.geohash\nFROM tmp\nWHERE administrative_rank = 1\n\"\"\").repartitionByRange(10, \"geohash\")\n\ngreenspace_site_silver.createOrReplaceTempView(\"greenspace_site_silver_vw\")\n\n# Write the dataframe into the corresponding silver Delta table\ngreenspace_site_silver.write.mode(\"overwrite\").option(\"mergeSchema\", \"true\").saveAsTable(\"geospatial.greenspaces.greenspace_site_silver\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e4656b3 elementor-widget elementor-widget-text-editor\" data-id=\"e4656b3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>3. For each road node (i.e. street junction), we calculate the number of edges (streets) connected to it (i.e. the node degree). 
The more connections, the more prominent the junction.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-271160c elementor-widget elementor-widget-code-highlight\" data-id=\"271160c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Calculate the degree of each road node based on the edges it connects to\nroad_nodes_silver = spark.sql(\"\"\"\nSELECT a.fid, a.id, a.form_of_road_node, COUNT(DISTINCT b.id) AS degree, a.geometry, ST_Geohash(ST_Transform(a.geometry,'epsg:27700','epsg:4326'), 5) AS geohash \nFROM geospatial.networks.road_node a\nJOIN geospatial.networks.road_link b\nON a.id = b.start_node\nOR a.id = b.end_node\nGROUP BY a.fid, a.id, a.form_of_road_node, a.geometry\nORDER BY COUNT(DISTINCT b.id) DESC\"\"\").repartitionByRange(10, \"geohash\")\n\n# Write to a Delta table\nroad_nodes_silver.write.mode(\"overwrite\").saveAsTable(f\"geospatial.networks.road_node_silver\")\n<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d4b43c9 elementor-widget elementor-widget-text-editor\" data-id=\"d4b43c9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>4. For park entrances, we focus on pedestrian entries, as they attract more foot traffic, ideal for ice-cream carts. 
We also consider the node degree of the nearest street junction to each entrance, as this reflects how well-connected and accessible the entrance is.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0961004 elementor-widget elementor-widget-code-highlight\" data-id=\"0961004\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Find relevant park entrances\ngreenspace_entries = spark.sql(\"\"\"\nSELECT a.fid, a.id, a.access_type, a.ref_to_greenspace_site, a.geometry, ST_Geohash(ST_Transform(a.geometry,'epsg:27700','epsg:4326'), 5) AS geohash\nFROM geospatial.greenspaces.access_point a\nWHERE a.ref_to_greenspace_site IN (SELECT id FROM greenspace_site_silver_vw) AND a.access_type IN ('Pedestrian', 'Motor Vehicle And Pedestrian')\"\"\").repartitionByRange(10, \"geohash\")\n\ngreenspace_entries.createOrReplaceTempView(\"greenspace_entries_vw\")\n\n# Find its nearest road junction\nentry_road_1nn = spark.sql(\"\"\"\nSELECT\n    a.fid,\n    a.id,\n    a.access_type,\n    a.ref_to_greenspace_site,\n    b.fid AS nearest_road_node_fid,\n    ST_Distance(a.geometry, b.geometry) AS distance_to_road_node,\n    a.geometry, \n    a.geohash\nFROM greenspace_entries_vw a \nINNER JOIN geospatial.networks.road_node_silver b \nON ST_kNN(a.geometry, b.geometry, 1, FALSE)\"\"\").repartitionByRange(10, \"geohash\")\n\n# Write to a Delta table\nentry_road_1nn.write.mode(\"overwrite\").saveAsTable(f\"geospatial.greenspaces.access_point_silver\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-602d0b4 
elementor-widget elementor-widget-heading\" data-id=\"602d0b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Aggregated gold tables: <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-42fdee6 elementor-widget elementor-widget-text-editor\" data-id=\"42fdee6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Now that we\u2019ve enriched our silver tables, we can rank each park and entrance based on the defined criteria and select our top 1000 locations.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-b420ae7 elementor-widget elementor-widget-code-highlight\" data-id=\"b420ae7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-sql line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-sql\">\n\t\t\t\t\t<xmp>\n\n%sql\n-- Top parks to locate the ice-cream carts\nCREATE OR REPLACE TABLE geospatial.greenspaces.top_parks_gold AS ( \n  WITH tmp AS (\n    SELECT \n    gs.id,\n    gs.name AS park_name,\n    gs.functions,\n    gs.num_functions,\n    gs.area_category,\n    sum(rn.degree) AS accessibility_degree,\n    cc.name AS county,\n    cc.number_of_locations,\n    row_number() OVER (PARTITION BY cc.name ORDER BY gs.area_category DESC, gs.num_functions DESC, sum(degree) DESC) AS park_rank,\n    ST_AsEWKB(gs.geometry) AS geometry\n    FROM geospatial.greenspaces.greenspace_site_silver gs\n    LEFT JOIN geospatial.greenspaces.access_point_silver ga\n    ON 
gs.id = ga.ref_to_greenspace_site\n    LEFT JOIN geospatial.networks.road_node_silver rn\n    ON ga.nearest_road_node_fid = rn.fid\n    LEFT JOIN geospatial.lookups.boundary_line_ceremonial_counties_silver cc\n    ON gs.administrative_fid = cc.fid\n    GROUP BY gs.name, gs.functions, gs.num_functions, gs.area_category, cc.number_of_locations, cc.name, gs.geometry, gs.id, cc.name)\n  SELECT * FROM tmp WHERE park_rank <= number_of_locations);<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6641e54 elementor-widget elementor-widget-code-highlight\" data-id=\"6641e54\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-sql line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-sql\">\n\t\t\t\t\t<xmp>\n%sql\n-- Top entrances to locate the ice-cream carts\nCREATE OR REPLACE TABLE geospatial.greenspaces.top_entrances_gold AS (\n  WITH A AS (\n    SELECT \n    ga.id,\n    ga.ref_to_greenspace_site AS park_id,\n    gs.park_rank,\n    ga.access_type,\n    rn.degree,\n    ga.distance_to_road_node,\n    row_number() OVER (PARTITION BY gs.id ORDER BY rn.degree DESC, ga.distance_to_road_node ASC) AS entry_rank,\n    ST_AsEWKB(ga.geometry) AS geometry,\n    ST_X(ST_Transform(ga.geometry,'epsg:27700','epsg:4326')) AS longitude,\n    ST_Y(ST_Transform(ga.geometry,'epsg:27700','epsg:4326')) AS latitude\n    FROM geospatial.greenspaces.top_parks_gold gs\n    LEFT JOIN geospatial.greenspaces.access_point_silver ga\n    ON gs.id = ga.ref_to_greenspace_site\n    LEFT JOIN geospatial.networks.road_node_silver rn\n    ON ga.nearest_road_node_fid = rn.fid\n  ) \n  SELECT * FROM A WHERE entry_rank = 
1\n);<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4da69e7 elementor-widget elementor-widget-heading\" data-id=\"4da69e7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is next? <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-16b095a elementor-widget elementor-widget-text-editor\" data-id=\"16b095a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"rf rg rh ri rj m\"><article><div class=\"m\"><div class=\"m\"><section><div><div class=\"go rs rt ru rv\"><div class=\"ac cr\"><div class=\"hd bi he hf hg hh\"><p id=\"bc97\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\"><strong class=\"wv mb\">In our live training <em class=\"xt\">\u201cDatabricks Geospatial in a Day\u201d<\/em> at RevoData Office<\/strong>, we\u2019ll delve deeper into the logic behind this code and use this example to demonstrate how to:<\/p><p id=\"160d\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">\u00b7 Set up a cluster with all necessary geospatial libraries<\/p><p id=\"e938\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">\u00b7 Follow best practices for working with Unity Catalog<\/p><p id=\"63af\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">\u00b7 
Organize data using the Medallion Architecture<\/p><p id=\"2b28\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">\u00b7 Build an orchestration workflow<\/p><p id=\"2336\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">\u00b7 Create an AI\/BI dashboard to visualize the top entrances<\/p><\/div><\/div><\/div><\/div><\/section><\/div><\/div><\/article><\/div><div class=\"ac cr\"><div class=\"hd bi he hf hg hh\"><div class=\"ace acf ac gn\"><div class=\"gt ac\">\u00a0<\/div><\/div><\/div><\/div>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-20b8a1c elementor-widget elementor-widget-spacer\" data-id=\"20b8a1c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"spacer.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-spacer\">\n\t\t\t<div class=\"elementor-spacer-inner\"><\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-96a198e elementor-author-box--layout-image-left elementor-author-box--align-left elementor-widget elementor-widget-author-box\" data-id=\"96a198e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"author-box.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-author-box\">\n\t\t\t\t\t\t\t<div  class=\"elementor-author-box__avatar\">\n\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-09-08-at-14.16.05-288x300.png\" alt=\"Photo of Melika Sajadian\" loading=\"lazy\">\n\t\t\t\t<\/div>\n\t\t\t\n\t\t\t<div class=\"elementor-author-box__text\">\n\t\t\t\t\t\t\t\t\t<div >\n\t\t\t\t\t\t<h4 class=\"elementor-author-box__name\">\n\t\t\t\t\t\t\tMelika 
Sajadian\t\t\t\t\t\t<\/h4>\n\t\t\t\t\t<\/div>\n\t\t\t\t\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-author-box__bio\">\n\t\t\t\t\t\t<p>Senior Geospatial Consultant at RevoData, sharing with you her knowledge about Databricks Geospatial <\/p>\n\t\t\t\t\t<\/div>\n\t\t\t\t\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-7f92b56 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"7f92b56\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9518311\" data-id=\"9518311\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Editor&#8217;s note: This post was originally published June 6th, 2025.&nbsp; What is a location-allocation problem in GIS? Summer\u2019s almost here, so let\u2019s start with a sunny example. Imagine you\u2019re tasked with finding the top 1,000 most profitable locations across the UK to park an ice cream cart. 
It\u2019s not just about sweet treats, it\u2019s a [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":6545,"comment_status":"open","ping_status":"closed","sticky":false,"template":"elementor_theme","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[14,21,28],"tags":[],"class_list":["post-6210","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-it","category-databricks","category-geospatial"],"_links":{"self":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/comments?post=6210"}],"version-history":[{"count":12,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6210\/revisions"}],"predecessor-version":[{"id":6549,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6210\/revisions\/6549"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/media\/6545"}],"wp:attachment":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/media?parent=6210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/categories?post=6210"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/tags?post=6210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}