{"id":6563,"date":"2026-01-23T15:32:43","date_gmt":"2026-01-23T14:32:43","guid":{"rendered":"https:\/\/revodata.nl\/?p=6563"},"modified":"2026-01-23T15:57:39","modified_gmt":"2026-01-23T14:57:39","slug":"eyes-to-the-sky-lidar-point-cloud-in-databricks-for-urban-canopy-insights","status":"publish","type":"post","link":"https:\/\/revodata.nl\/nl\/eyes-to-the-sky-lidar-point-cloud-in-databricks-for-urban-canopy-insights\/","title":{"rendered":"Eyes to the Sky: LiDAR Point Cloud in Databricks for Urban Canopy Insights"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"6563\" class=\"elementor elementor-6563\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-52459a6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"52459a6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-50b64aa\" data-id=\"50b64aa\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-be25c4f elementor-widget elementor-widget-text-editor\" data-id=\"be25c4f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Editor&#8217;s note: This post was originally published June 20th, 2025.\u00a0<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f932491 elementor-widget elementor-widget-heading\" data-id=\"f932491\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is the Sky View Factor and why does it matter in our cities?\n\n\n<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-462a8a8 elementor-widget elementor-widget-text-editor\" data-id=\"462a8a8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"3a6a\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\">I\u2019m back with another sunny, summer-inspired geospatial adventure!<br \/>Ever wonder how much sky you can actually see when you\u2019re standing on a busy city street or under a cluster of city trees? That visible slice of sky, called the <strong class=\"wv mb\">Sky View Factor (SVF)<\/strong>, has a surprisingly big say in how cities heat up, cool down, and even how comfortable we feel outside. The less sky you see, the more heat gets trapped between buildings and trees, creating those <strong class=\"wv mb\">urban heat islands<\/strong>, where city temperatures climb higher than in surrounding areas. These heat islands don\u2019t just make summer days unbearable; they can worsen air pollution, increase energy demand for cooling, strain public health by amplifying heat-related illnesses, and even accelerate the wear and tear on city infrastructure.<\/p><p id=\"d7ca\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">Measuring SVF takes more than just a weather app or a quick satellite snapshot; you need a detailed, 3D, street-level view of the city. 
That\u2019s where <strong class=\"wv mb\">LiDAR point cloud data<\/strong> shines, capturing billions of laser-scanned points from every rooftop, treetop, street, and sidewalk. This treasure trove of 3D data lets us model the <strong class=\"wv mb\">urban canopy<\/strong>, the intricate layer of buildings and greenery that controls how sunlight and air flow through the cityscape. From this, we can calculate SVF and generate cool <strong class=\"wv mb\">fisheye plots<\/strong> that show exactly how much sky you\u2019d see lying anywhere on the ground.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a1376a4 elementor-widget elementor-widget-image\" data-id=\"a1376a4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img fetchpriority=\"high\" decoding=\"async\" width=\"730\" height=\"442\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.38.33.png\" class=\"attachment-large size-large wp-image-6564\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.38.33.png 730w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.38.33-300x182.png 300w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.38.33-18x12.png 18w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4db95a2 elementor-widget elementor-widget-text-editor\" data-id=\"4db95a2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Back in my master\u2019s program, some classmates and I built an app for the municipality of The Hague that let users click on a map or upload a list of points to get SVF values and 
how much the sky is blocked by buildings and trees. You can check out our full report with the front-end and back-end code <a class=\"ah yg\" href=\"https:\/\/repository.tudelft.nl\/file\/File_286cba8a-c3cb-4613-a169-b613d747f509\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a>. Back then, we ran the calculations using NumPy arrays and plenty of good old-fashioned for-loops. I\u2019m now reworking the project using PySpark, with a focus on scalable data warehousing for big data analytics rather than real-time, on-the-fly processing, and without delving deeply into the underlying mathematical computations. That\u2019s where Databricks shines: its cloud-native platform effortlessly handles massive LiDAR datasets, turning mountains of 3D points into quick, actionable insights. Whether you\u2019re a city planner aiming to cool down urban streets or simply a curious urban data explorer, it\u2019s never been easier or more fun to look up and ask: <em class=\"xt\">how much sky do we really see?<\/em><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5bf0ca0 elementor-widget elementor-widget-heading\" data-id=\"5bf0ca0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Implementation<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-e66f2cc elementor-widget elementor-widget-text-editor\" data-id=\"e66f2cc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p class=\"pw-post-body-paragraph xz ya td yb b yc yd ye yf yg yh yi yj rs yk yl ym rv yn yo yp ry yq yr ys yt go bl\" data-selectable-paragraph=\"\">The following implementation is part of the training we offer at RevoData focused 
specifically on leveraging the geospatial capabilities of Databricks.<\/p><p id=\"327b\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">In this implementation, I want to focus on code refactoring and explore the options available to start with the low-hanging fruit, highlighting which parts of the code can be adapted for distributed processing with minimal changes.<\/p><p id=\"0f0a\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">Let\u2019s get started!<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5eb89c2 elementor-widget elementor-widget-heading\" data-id=\"5eb89c2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Datasets<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-587e946 elementor-widget elementor-widget-text-editor\" data-id=\"587e946\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"82c3\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\">As I mentioned earlier, we originally used point cloud data from the City of The Hague for this project. But today, I\u2019m taking you to Washington, partly to switch up the scenery for myself, and partly so you can download the data more easily from an English-language website. 
The data is fully available <a class=\"ah yg\" href=\"https:\/\/opendata.dc.gov\/datasets\/DCGIS::2020-lidar-classified-las\/explore?location=38.893538%2C-77.011550%2C11.48\" target=\"_blank\" rel=\"noopener ugc nofollow\">here<\/a>. I also generated a grid of 576 points across the area, which we\u2019ll use to calculate the Sky View Factor (SVF).\u00a0<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-0fdd4fb elementor-widget elementor-widget-image\" data-id=\"0fdd4fb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"704\" height=\"751\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.40.45.png\" class=\"attachment-large size-large wp-image-6565\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.40.45.png 704w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.40.45-281x300.png 281w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.40.45-11x12.png 11w\" sizes=\"(max-width: 704px) 100vw, 704px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6a06afb elementor-widget elementor-widget-heading\" data-id=\"6a06afb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Import libraries<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8b0cc0d elementor-widget elementor-widget-text-editor\" data-id=\"8b0cc0d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>In this 
implementation, we need a couple of libraries, which I import in one go:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ba65271 elementor-widget elementor-widget-code-highlight\" data-id=\"ba65271\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>import math\nimport numpy as np\nimport boto3\nimport os\nimport matplotlib.pyplot as plt\nimport pdal\nimport json\nimport io\nimport pyarrow as pa\nfrom pyspark.sql.functions import col, sqrt, pow, lit, when, atan2, degrees, floor\nfrom pyspark.sql.types import StructType, StructField, DoubleType, FloatType, IntegerType, ShortType, LongType, ByteType, BooleanType, MapType, StringType, ArrayType\nimport pandas as pd\nfrom sedona.spark import *\nfrom pyspark.sql import functions as F\nfrom pyspark.sql.window import Window\nimport base64\nfrom PIL import Image<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-9b0ab8c elementor-widget elementor-widget-heading\" data-id=\"9b0ab8c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Data ingestion<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4a5c1b9 elementor-widget elementor-widget-text-editor\" data-id=\"4a5c1b9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>For the 
generated grid points, Apache Sedona makes geospatial data ingestion remarkably easy.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-35335b5 elementor-widget elementor-widget-code-highlight\" data-id=\"35335b5\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\nconfig = SedonaContext.builder() .\\\n    config('spark.jars.packages',\n           'org.apache.sedona:sedona-spark-shaded-3.3_2.12:1.7.1,'\n           'org.datasyslab:geotools-wrapper:1.7.1-28.5'). \\\n    getOrCreate()\n\nsedona = SedonaContext.create(config)\n\n# The path to the grid geopackage and point cloud las file\ndataset_bucket_name = \"revodata-databricks-geospatial\"\ndataset_input_dir=\"geospatial-dataset\/point-cloud\/washington\"\ngpkg_file = \"grid\/pc_grid.gpkg\"\npointcloud_file = \"las-laz\/1816.las\"\ninput_path = f\"s3:\/\/{dataset_bucket_name}\/{dataset_input_dir}\/{pointcloud_file}\"\n\n# Read the grid data\ndf_grid = sedona.read.format(\"geopackage\").option(\"tableName\", \"grid\").load(f\"s3:\/\/{dataset_bucket_name}\/{dataset_input_dir}\/{gpkg_file}\").withColumnRenamed(\"geom\", \"geometry\").withColumn(\"x1\", F.expr(\"ST_X(geometry)\")).withColumn(\"y1\", F.expr(\"ST_Y(geometry)\")).select(\"fid\", \"x1\", \"y1\", \"geometry\")\n\nnum_partitions = math.ceil(df_grid.count()\/2)\n<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-15b136b elementor-widget elementor-widget-spacer\" data-id=\"15b136b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"spacer.default\">\n\t\t\t\t<div 
class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-spacer\">\n\t\t\t<div class=\"elementor-spacer-inner\"><\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-907edbb elementor-widget elementor-widget-text-editor\" data-id=\"907edbb\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"fe51\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\">To ingest point cloud data, we can use libraries like <code class=\"gb amx amy amz abr b\">laspy<\/code> or <code class=\"gb amx amy amz abr b\">PDAL<\/code>. In this case, I used <code class=\"gb amx amy amz abr b\">PDAL<\/code>, applying a few read-time optimizations to efficiently convert the output array into a PySpark DataFrame:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c355603 elementor-widget elementor-widget-code-highlight\" data-id=\"c355603\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\ndef _create_arrow_schema_from_pdal(pdal_array):\n    \"\"\"Create Arrow schema from PDAL array structure.\"\"\"\n    fields = []\n    \n    # Map PDAL types to Arrow types\n    type_mapping = {\n        'float32': pa.float32(),\n        'float64': pa.float64(),\n        'int32': pa.int32(),\n        'int16': pa.int16(),\n        'uint8': pa.uint8(),\n        'uint16': pa.uint16(),\n        'uint32': pa.uint32()\n    }\n    \n    for field_name in 
pdal_array.dtype.names:\n        field_type = pdal_array[field_name].dtype\n        arrow_type = type_mapping.get(str(field_type), pa.float32())  # default to float32\n        fields.append((field_name, arrow_type))\n    \n    return pa.schema(fields)\n\ndef _create_spark_schema(arrow_schema):\n    \"\"\"Convert PyArrow schema to Spark DataFrame schema.\"\"\"\n    spark_fields = []\n    \n    type_mapping = {\n        pa.float32(): FloatType(),\n        pa.float64(): DoubleType(),\n        pa.int32(): IntegerType(),\n        pa.int16(): ShortType(),\n        pa.int8(): ByteType(),\n        pa.uint8(): ByteType(),\n        pa.uint16(): IntegerType(),  # Spark doesn't have unsigned types\n        pa.uint32(): LongType(),     # Spark doesn't have unsigned types\n        pa.string(): StringType(),\n        # Add other type mappings as needed\n    }\n    \n    for field in arrow_schema:\n        arrow_type = field.type\n        spark_type = type_mapping.get(arrow_type, StringType())  # default to StringType\n        spark_fields.append(\n            StructField(field.name, spark_type, nullable=True)\n        )\n    \n    return StructType(spark_fields)\n\n\ndef pdal_to_spark_dataframe_large(pipeline_config, spark, chunk_size=1000000):\n    \"\"\"Streaming version for very large files.\"\"\"\n    pipeline = pdal.Pipeline(json.dumps(pipeline_config))\n    pipeline.execute()\n    \n    # Get schema from first array\n    first_array = pipeline.arrays[0]\n    schema = _create_arrow_schema_from_pdal(first_array)\n    \n    # Create empty RDD\n    rdd = spark.sparkContext.emptyRDD()\n\n    \n    # Process arrays in chunks\n    for array in pipeline.arrays:\n        for i in range(0, len(array), chunk_size):\n            chunk = array[i:i+chunk_size]\n            data_dict = {name: chunk[name] for name in chunk.dtype.names}\n            arrow_table = pa.Table.from_pydict(data_dict, schema=schema)\n            pdf = arrow_table.to_pandas()\n            chunk_rdd = 
spark.sparkContext.parallelize(pdf.to_dict('records'))\n            rdd = rdd.union(chunk_rdd)\n    \n    # Convert to DataFrame\n    return spark.createDataFrame(rdd, schema=_create_spark_schema(schema))<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-2c8da5f elementor-widget elementor-widget-code-highlight\" data-id=\"2c8da5f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\npipeline_config = {\n    \"pipeline\": [\n        {\n            \"type\": \"readers.las\",\n            \"filename\": input_path,\n        }\n    ]\n}\n\n# Convert point cloud array to Spark DataFrame\ndf_pc = pdal_to_spark_dataframe_large(pipeline_config, spark)\ndf_pc = df_pc.withColumn(\"geometry\", F.expr(\"ST_Point(X, Y)\"))\ndf_pc.write.mode(\"overwrite\").saveAsTable(\"geospatial.pointcloud.washington_pc\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-7996431 elementor-widget elementor-widget-heading\" data-id=\"7996431\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Identifying point cloud data surrounding each grid point <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-5feaf67 elementor-widget elementor-widget-text-editor\" data-id=\"5feaf67\" data-element_type=\"widget\" data-e-type=\"widget\" 
data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Next, we need to retrieve all point cloud data within 100 meters for buildings and high vegetation; these are used for the SVF calculation. For ground points, we only consider those within 10 meters, as they\u2019re used solely to estimate the elevation of each grid point.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cf6a429 elementor-widget elementor-widget-code-highlight\" data-id=\"cf6a429\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>df_selected = df_pc.select(\"X\", \"Y\", \"Z\", \"Classification\")\n\ndome_radius = 100\nheight_radius = 10\n\n# Register as temp views\ndf_pc.createOrReplaceTempView(\"pc_vw\")\ndf_grid.createOrReplaceTempView(\"grid_vw\")\n\n# Perform spatial join using ST_DWithin with 100 meters\ngrid_join_pc = spark.sql(f\"\"\"\n    SELECT \n        g.fid, \n        ST_X(g.geometry) AS x1,\n        ST_Y(g.geometry) AS y1,\n        p.classification,\n        p.x AS pc_x,\n        p.y AS pc_y,\n        p.z AS pc_z,\n        ST_Distance(g.geometry, p.geometry) AS distance,\n        g.geometry AS g_geometry,\n        p.geometry AS pc_geometry \n    FROM grid_vw g\n    JOIN pc_vw p\n        ON ST_DWithin(g.geometry, p.geometry, {dome_radius})\n    WHERE p.classification IN (5, 6) OR (p.classification = 2 AND ST_DWithin(g.geometry, p.geometry, {height_radius}))\n\"\"\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4d956cd elementor-widget 
elementor-widget-heading\" data-id=\"4d956cd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Estimating grid point elevation from nearby point cloud data (10m Radius) <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d192c1c elementor-widget elementor-widget-text-editor\" data-id=\"d192c1c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"2e9d\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\">Here, we determine the elevation of each grid point by identifying its dominant surrounding class, either building or ground, and then computing the average elevation of that class within a defined radius.<\/p><p data-selectable-paragraph=\"\">\u00a0<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8ede838 elementor-widget elementor-widget-code-highlight\" data-id=\"8ede838\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp># Filter only classification 2 and 6 and count occurrences of (fid, classification)\ngrouped = grid_join_pc.filter(\n    (F.col(\"classification\").isin(2, 6)) & (F.col(\"distance\") <= height_radius)\n).groupBy(\"fid\", \"classification\").count()\n\n# Define window: partition by fid, order by count descending\nwindow_spec = 
Window.partitionBy(\"fid\").orderBy(F.desc(\"count\"))\n\n# Apply row_number\nranked = grouped.withColumn(\"rn\", F.row_number().over(window_spec))\n\n# Keep only the dominant class (highest count) for each grid point\ng_classification_df = ranked.filter(F.col(\"rn\") == 1).select(\"fid\", \"classification\")\n\n# Compute the average elevation for each grid point using nearby point cloud data within a specified radius.\ngrid_pc_elevation = grid_join_pc.join(g_classification_df, on=[\"fid\", \"classification\"]).filter(\n    (F.col(\"distance\") <= height_radius)\n).groupBy(\"fid\").agg(\n    (F.sum(\"pc_z\") \/ F.count(\"pc_z\")).alias(\"height\")\n)\n\n# Combine point cloud data with classification info and computed height, optimized with repartitioning.\ngrid_pc_elevation_all = grid_join_pc.withColumnRenamed(\"classification\", \"p_classification\").join(g_classification_df, on=[\"fid\"]).join(grid_pc_elevation, on=[\"fid\"]).repartitionByRange(num_partitions, \"fid\")\n\n# Filter out ground points (e.g., class 2) to retain only buildings and high vegetation points for analysis.\ngrid_pc_cleaned = grid_pc_elevation_all.filter(\"p_classification != 2\").repartitionByRange(num_partitions, \"fid\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-602d0b4 elementor-widget elementor-widget-heading\" data-id=\"602d0b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Creating the dome, generating the plot, and calculating the SVF<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-42fdee6 elementor-widget elementor-widget-text-editor\" data-id=\"42fdee6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>The dome is a representation of the sky, going from the horizon all the way to the zenith (directly on 
top) of the viewpoint. The dome can be split into sectors based on horizontal and vertical directions, in essence creating a dome-shaped grid. The units used to split the sectors are 2 degrees horizontally (azimuth angle) and 1 degree vertically (elevation angle), which are considered appropriate values for the calculation.<\/p><p id=\"827c\" class=\"pw-post-body-paragraph wt wu rx wv b ww wy wz xa xc xd qm xf xg qp xi xj qs xl xm gx xn go bl\" data-selectable-paragraph=\"\">To calculate the Sky View Factor (SVF), point cloud data is projected onto a dome divided into sectors, marking which sectors are blocked from view. The closest point in each sector determines the obstruction, and if that point is a building, all sectors below it in that direction are also considered blocked. The unobstructed proportion of the dome\u2019s area gives the SVF. For clarity, the results are visualized in a circular plot showing which sectors are clear sky or obstructed by buildings or vegetation, oriented to the north for easy interpretation.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-8300e9e elementor-widget elementor-widget-image\" data-id=\"8300e9e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" width=\"717\" height=\"347\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.51.10.png\" class=\"attachment-large size-large wp-image-6566\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.51.10.png 717w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.51.10-300x145.png 300w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.51.10-18x9.png 18w\" sizes=\"(max-width: 717px) 100vw, 717px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div 
class=\"elementor-element elementor-element-b420ae7 elementor-widget elementor-widget-code-highlight\" data-id=\"b420ae7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp># Calculate raw azimuth angle (in degrees) from each grid point to each point in the point cloud.\n# Shifted by -90 to align with the 0\u00b0 direction being north.\ngrid_pc_az = grid_pc_cleaned.withColumn(\n    \"azimuth_raw\",\n    degrees(F.atan2(F.col(\"pc_y\") - F.col(\"y1\"), F.col(\"pc_x\") - F.col(\"x1\"))) - 90\n)\n\n# Normalize azimuth angle to fall within the range [0, 360).\ngrid_pc_az = grid_pc_az.withColumn(\n    \"azimuth\",\n    when(F.col(\"azimuth_raw\") < 0, F.col(\"azimuth_raw\") + 360).otherwise(F.col(\"azimuth_raw\"))\n)\n\n# Drop the temporary azimuth_raw column to clean up the DataFrame.\ngrid_pc_az = grid_pc_az.drop(\"azimuth_raw\")\n\n# Calculate the elevation angle (in degrees) from the grid point to each point in the point cloud.\n# Height is divided by 1000 to convert from millimeters to meters if necessary.\ngrid_pc_az = grid_pc_az.withColumn(\n    \"elevation\",\n    degrees(F.atan2(F.col(\"pc_z\") - F.col(\"height\") \/ 1000, F.col(\"distance\")))\n)\n\n# Bin azimuth angles into 2-degree intervals (0\u2013179 bins for 360\u00b0).\ngrid_pc_az = grid_pc_az.withColumn(\"azimuth_bin\", F.floor(F.col(\"azimuth\") \/ 2))\n\n# Get the minimum elevation angle across all records to define the lower bound of elevation bins.\nmin_val = F.lit(grid_pc_az.select(F.min(\"elevation\")).first()[0])\n\n# Get the maximum elevation angle across all records to define the upper bound of elevation bins.\nmax_val = 
F.lit(grid_pc_az.select(F.max(\"elevation\")).first()[0])\n\n# Compute bin width by dividing the elevation range into 90 equal parts (bin indices 0 to 89).\nbin_width = (max_val - min_val) \/ 90\n\n# Bin elevation angles into 90 intervals, ensuring they stay within the [0, 89] range.\ngrid_pc_az = grid_pc_az.withColumn(\"elevation_bin\", \n    F.least(\n        F.greatest(\n            F.floor(\n                (F.col(\"elevation\") - min_val) \/ bin_width\n            ).cast(\"int\"),\n            F.lit(0)  # Clamp minimum bin index to 0\n        ),\n        F.lit(89)  # Clamp maximum bin index to 89\n    )\n)\n\n\n# Define a window that partitions the data by grid point (fid) and by azimuth and elevation bins,\n# and orders points within each bin by their distance to the grid point.\nwindow_spec = Window.partitionBy(\"fid\", \"azimuth_bin\", \"elevation_bin\").orderBy(\"distance\")\n\n# Assign a row number within each azimuth-elevation bin, so the closest point (smallest distance) gets rank 1.\ndf_with_rank = grid_pc_az.withColumn(\"rn\", F.row_number().over(window_spec))\n\n# Keep only the closest point (rank 1) in each bin and drop the temporary rank column.\n# Then repartition the result by 'fid' to optimize parallel processing in subsequent steps.\nclosest_points = df_with_rank.filter(col(\"rn\") == 1).drop(\"rn\").repartitionByRange(num_partitions, \"fid\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-a00c8fd elementor-widget elementor-widget-code-highlight\" data-id=\"a00c8fd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\ndef 
create_dome(pdf: pd.DataFrame, max_azimuth: int = 180, max_elevation: int = 90) -> np.ndarray:\n    \"\"\"\n    Creates a dome matrix based on azimuth and elevation bins, with obstruction handling for buildings.\n    \"\"\"\n    dome = np.zeros((max_azimuth, max_elevation), dtype=int)\n    domeDists = np.zeros((max_azimuth, max_elevation), dtype=float)\n\n    for _, row in pdf.iterrows():\n        a = int(row[\"azimuth_bin\"])\n        e = int(row[\"elevation_bin\"])\n        dome[a, e] = row[\"p_classification\"]\n        domeDists[a, e] = row[\"distance\"]\n\n    # Mark parts of the dome that are obstructed by buildings\n    if np.any(dome == 6):  # 6 = buildings\n        bhor, bver = np.where(dome == 6)\n        builds = np.stack((bhor, bver), axis=-1)\n        shape = (builds.shape[0] + 1, builds.shape[1])\n        builds = np.append(builds, (bhor[0], bver[0])).reshape(shape)\n        azimuth_change = builds[:, 0][:-1] != builds[:, 0][1:]\n        keep = np.where(azimuth_change)\n        roof_rows, roof_cols = builds[keep][:, 0], builds[keep][:, 1]\n        for roof_row, roof_col in zip(roof_rows, roof_cols):\n            condition = np.where(np.logical_or(\n                domeDists[roof_row, :roof_col] > domeDists[roof_row, roof_col],\n                dome[roof_row, :roof_col] == 0\n            ))\n            dome[roof_row, :roof_col][condition] = 6\n\n    return dome\n\n# Plot dome\ndef generate_plot_image(dome):\n    # Create circular grid\n    theta = np.linspace(0, 2*np.pi, 180, endpoint=False)\n    radius = np.linspace(0, 90, 90)\n    theta_grid, radius_grid = np.meshgrid(theta, radius)\n\n    Z = dome.copy().astype(float)\n\n    Z = Z.T[::-1, :]  # Transpose and flip vertically\n\n    # Map classifications to plot values: 0 = open sky, 5 = vegetation, 6 = buildings\n    Z[np.isin(Z, [5])] = 0.5\n    Z[Z == 6] = 1\n\n    if not np.any(Z == 1):\n        Z[0, 0] = 1  # Force the colour scale to span its full range when no buildings are present\n\n    fig = plt.figure(figsize=(4, 4))\n    ax = fig.add_subplot(111, projection='polar')\n    cmap = 
plt.get_cmap('tab20c')\n    ax.pcolormesh(theta, radius, Z, cmap=cmap)\n    ax.set_ylim([0, 90])\n    ax.tick_params(labelleft=False)\n    ax.set_theta_zero_location(\"N\")\n    ax.set_xticks([])\n    ax.set_yticks([])\n\n    buf = io.BytesIO()\n    plt.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)\n    plt.close(fig)\n    buf.seek(0)\n    img_base64 = base64.b64encode(buf.read()).decode('utf-8')\n    return img_base64\n\n\ndef process_and_plot(pdf: pd.DataFrame) -> pd.DataFrame:\n    fid = pdf[\"fid\"].iloc[0]\n\n    # Create dome with building\/vegetation obstruction\n    dome = create_dome(pdf)\n\n    # Generate base64-encoded fisheye plot image\n    plot_base64 = generate_plot_image(dome)\n\n    # Compute SVF and obstruction metrics\n    SVF, tree_percentage, build_percentage = calculate_SVF(100, dome)\n\n    return pd.DataFrame(\n        [[fid, dome.tolist(), plot_base64, SVF, tree_percentage, build_percentage]],\n        columns=[\"fid\", \"dome\", \"plot\", \"SVF\", \"treeObstruction\", \"buildObstruction\"]\n    )<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-6f6d5f7 elementor-widget elementor-widget-heading\" data-id=\"6f6d5f7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Results <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-1d2428d elementor-widget elementor-widget-text-editor\" data-id=\"1d2428d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>As you can see in the code above, some parts still rely on NumPy structures for operations like dome construction, plotting and SVF calculation. 
Since each grid point corresponds to a single dome, we can safely apply these NumPy-based functions per grid point. To do this efficiently in PySpark, <em class=\"xt\">without too much code refactoring<\/em>, I use the <code class=\"gb amx amy amz abr b\">applyInPandas<\/code> method. This allows us to apply our existing Pandas-based logic directly to each group of rows (grouped by the <code class=\"gb amx amy amz abr b\">\"fid\"<\/code> column) within the PySpark DataFrame. This way, we can leverage distributed processing in Spark while reusing existing, well-tested NumPy code for the dome and SVF calculations.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c2c62a0 elementor-widget elementor-widget-code-highlight\" data-id=\"c2c62a0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp># Output schema returned by process_and_plot\noutput_schema = StructType([\n    StructField(\"fid\", IntegerType()),\n    StructField(\"dome\", ArrayType(ArrayType(IntegerType()))),\n    StructField(\"plot\", StringType()),\n    StructField(\"SVF\", FloatType()),\n    StructField(\"treeObstruction\", FloatType()),\n    StructField(\"buildObstruction\", FloatType())\n])\n\nresult_df = closest_points.groupBy(\"fid\").applyInPandas(process_and_plot, schema=output_schema)\nresult_df.write.mode(\"overwrite\").saveAsTable(\"geospatial.pointcloud.washington_grid\")<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-60253b9 elementor-widget elementor-widget-code-highlight\" data-id=\"60253b9\" data-element_type=\"widget\" 
data-e-type=\"widget\" data-widget_type=\"code-highlight.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"prismjs-default copy-to-clipboard \">\n\t\t\t<pre data-line=\"\" class=\"highlight-height language-python line-numbers\">\n\t\t\t\t<code readonly=\"true\" class=\"language-python\">\n\t\t\t\t\t<xmp>\n# Fetch the the grid point with fid = 105 for a sample visualization\npdf = result_df.filter(\"fid = 105\").select(\"fid\", \"plot\").toPandas()\n\nfor index, row in pdf.iterrows():\n  # Decode base64 string to bytes\n  img_bytes = base64.b64decode(img_base64)\n\n  # Load image with PIL\n  image = Image.open(io.BytesIO(img_bytes))\n\n  # Display using matplotlib (preserves original colors)\n  plt.figure(figsize=(6, 6))\n  plt.imshow(image)\n  plt.axis('off')  # Hide axes\n  plt.show()<\/xmp>\n\t\t\t\t<\/code>\n\t\t\t<\/pre>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-ea42f0f elementor-widget elementor-widget-image\" data-id=\"ea42f0f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"702\" height=\"374\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.54.36.png\" class=\"attachment-large size-large wp-image-6567\" alt=\"\" srcset=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.54.36.png 702w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.54.36-300x160.png 300w, https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2026-01-23-at-15.54.36-18x10.png 18w\" sizes=\"(max-width: 702px) 100vw, 702px\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-4da69e7 elementor-widget elementor-widget-heading\" data-id=\"4da69e7\" data-element_type=\"widget\" 
data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is next? <\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-16b095a elementor-widget elementor-widget-text-editor\" data-id=\"16b095a\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p id=\"4366\" class=\"pw-post-body-paragraph wt wu rx wv b ww wx wy wz xa xb xc xd qm xe xf xg qp xh xi xj qs xk xl xm xn go bl\" data-selectable-paragraph=\"\"><strong class=\"wv mb\">In our live training <em class=\"xt\">\u201c<\/em><\/strong><a class=\"ah yg\" href=\"https:\/\/revodata.nl\/databricks-geospatial-in-a-day\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><strong class=\"wv mb\"><em class=\"xt\">Databricks Geospatial in a Day<\/em><\/strong><\/a><strong class=\"wv mb\"><em class=\"xt\">\u201d<\/em> at RevoData Office<\/strong>, we\u2019ll delve deeper into the logic behind this code and use this example to demonstrate how to:<\/p><ul class=\"\"><li id=\"af6b\" class=\"wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn anc and ane bl\" data-selectable-paragraph=\"\">Set up a cluster capable of processing point cloud data<\/li><li id=\"8ddd\" class=\"wt wu rx wv b ww anf wy wz xa ang xc xd qm anh xf xg qp ani xi xj qs anj xl xm xn anc and ane bl\" data-selectable-paragraph=\"\">Visualize LiDAR point clouds directly in Databricks<\/li><li id=\"6a8e\" class=\"wt wu rx wv b ww anf wy wz xa ang xc xd qm anh xf xg qp ani xi xj qs anj xl xm xn anc and ane bl\" data-selectable-paragraph=\"\">Efficiently partition point cloud data for distributed processing<\/li><li id=\"b858\" class=\"wt wu rx wv b ww anf wy wz xa ang xc xd qm anh xf xg qp ani xi xj qs anj xl xm xn anc and ane bl\" 
data-selectable-paragraph=\"\">Tackle the challenges of code migration and minimal refactoring<\/li><\/ul><p id=\"7704\" class=\"pw-post-body-paragraph wt wu rx wv b ww xo wy wz xa xp xc xd qm xq xf xg qp xr xi xj qs xs xl xm xn go bl\" data-selectable-paragraph=\"\"><strong class=\"wv mb\">Go ahead and grab your spot for the training using the link below, can\u2019t wait to see you there!<\/strong><br \/><a class=\"ah yg\" href=\"https:\/\/revodata.nl\/databricks-geospatial-in-a-day\/\" target=\"_blank\" rel=\"noopener ugc nofollow\"><em class=\"xt\">https:\/\/revodata.nl\/databricks-geospatial-in-a-day\/<\/em><\/a><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-20b8a1c elementor-widget elementor-widget-spacer\" data-id=\"20b8a1c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"spacer.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-spacer\">\n\t\t\t<div class=\"elementor-spacer-inner\"><\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-96a198e elementor-author-box--layout-image-left elementor-author-box--align-left elementor-widget elementor-widget-author-box\" data-id=\"96a198e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"author-box.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-author-box\">\n\t\t\t\t\t\t\t<div  class=\"elementor-author-box__avatar\">\n\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/revodata.nl\/wp-content\/uploads\/Screenshot-2025-09-08-at-14.16.05-288x300.png\" alt=\"Foto van Melika Sajadian\" loading=\"lazy\">\n\t\t\t\t<\/div>\n\t\t\t\n\t\t\t<div class=\"elementor-author-box__text\">\n\t\t\t\t\t\t\t\t\t<div >\n\t\t\t\t\t\t<h4 class=\"elementor-author-box__name\">\n\t\t\t\t\t\t\tMelika Sajadian\t\t\t\t\t\t<\/h4>\n\t\t\t\t\t<\/div>\n\t\t\t\t\n\t\t\t\t\t\t\t\t\t<div 
class=\"elementor-author-box__bio\">\n\t\t\t\t\t\t<p>Senior Geospatial Consultant at RevoData, sharing with you her knowledge about Databricks Geospatial <\/p>\n\t\t\t\t\t<\/div>\n\t\t\t\t\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-7f92b56 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"7f92b56\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9518311\" data-id=\"9518311\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap\">\n\t\t\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Editor&#8217;s note: This post was originally published June 20th, 2025.\u00a0 What is the Sky View Factor and why does it matter in our cities? I\u2019m back with another sunny, summer-inspired geospatial adventure!Ever wonder how much sky you can actually see when you\u2019re standing on a busy city street or under a cluster of city trees? 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":6562,"comment_status":"open","ping_status":"closed","sticky":false,"template":"elementor_theme","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[14,21,28],"tags":[],"class_list":["post-6563","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-it","category-databricks","category-geospatial"],"_links":{"self":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6563","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/comments?post=6563"}],"version-history":[{"count":3,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6563\/revisions"}],"predecessor-version":[{"id":6570,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/posts\/6563\/revisions\/6570"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/media\/6562"}],"wp:attachment":[{"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/media?parent=6563"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/categories?post=6563"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/revodata.nl\/nl\/wp-json\/wp\/v2\/tags?post=6563"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}