
Mastering Location Data: Geospatial Magic Meets Databricks Power

Ever used Google Maps to find your way around? That’s geospatial data in action! It’s information tied to a place on Earth, like where your favorite ice-cream shop is, where roads go, where cities are expanding, and how places change over time, just to name a few.

GIS, or Geographic Information Systems, takes this data and turns it into smart maps and tools that help people make better decisions. From choosing the safest route for a delivery truck to planning where to build a new hospital or identifying areas at risk of floods or urban heat islands, GIS helps us understand where things happen and how to act on that insight.

Geospatial experts often use tools like FME or ArcGIS to look at maps and analyze location data. They usually keep their data in databases like Postgres or Oracle Spatial, and write code in SQL or Python using libraries like PostGIS, GeoPandas, GDAL, or PDAL to get the job done.

But today, we’re dealing with way more data than before. That’s where platforms like Databricks come in. It’s a modern tool that can handle huge amounts of data, run complex workflows faster, and work alongside the tools geospatial folks already use. Think of it as a powerful new teammate for your geospatial projects.

Where should you begin your journey into geospatial data on Databricks? The good news is that RevoData is offering a specialized training session focused entirely on using Databricks for geospatial workflows. This session will guide you through the essentials of working with Databricks. We’ll also look at how Databricks works together with other geospatial tools like FME, ArcGIS, and Postgres. Whether you’re just getting started or looking to optimize your current processes, this training will help you understand the core principles and practical applications of geospatial data integration within the Databricks ecosystem.

You’ll explore the benefits of migrating your geospatial workflows to Databricks, leveraging its modern lakehouse architecture that merges scalable storage with lightning-fast analytics. We’ll walk through the key Python and Spark libraries that enable efficient and flexible spatial data processing, helping you unlock Databricks’ full potential. By the end of the session, you’ll have a clear understanding of when and how to make the shift, and the tools you’ll need to get there.
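To give a feel for how these libraries work together, here is a minimal sketch (not taken from the training material) that reads vector data with GeoPandas on the driver and hands it to Spark for distributed processing. The file path and view name are placeholders, and it assumes the spark session that Databricks provides in a notebook.

import geopandas as gpd

# Read a vector dataset into a GeoDataFrame on the driver node
gdf = gpd.read_file("/dbfs/data/neighbourhoods.geojson")

# Serialise geometries to WKT text so Spark can carry them as strings
gdf["geometry_wkt"] = gdf.geometry.to_wkt()

# Hand the attribute table (geometry as WKT) to Spark for distributed processing
sdf = spark.createDataFrame(gdf.drop(columns="geometry"))
sdf.createOrReplaceTempView("neighbourhoods")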

This training is designed to be hands-on and practical, with exercises that guide you through real-world applications. We’ll work with a variety of geospatial data types – including vector data (like topographic maps and point clouds), raster data (such as aerial imagery and netCDF files), and even graph-based data – to solve meaningful geospatial problems.

Here’s a quick sneak peek at the hands-on training cases:

Location-allocation

Location-allocation problem: Summer’s almost here, and what better way to celebrate than with a sunny use case? We’ll dive into a geospatial analysis to uncover the top 1,000 sweetest spots in the UK to park an ice cream cart and scoop up the highest profits.

Shortest path between A and B

Shortest path calculation: The shortest path algorithm is one of the most widely used techniques in network analysis, often applied to optimize routes and reduce travel time. In this case, we’ll use it to map out the most efficient paths from a well-known landmark to all other locations within a selected area in the UK, helping us better understand connectivity and accessibility across the region.
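To make the idea concrete, here is a tiny, self-contained illustration of single-source shortest paths using NetworkX; the nodes and edge weights are invented for demonstration and are not the UK road network used in the training.

import networkx as nx

# Build a toy road graph; weights could be segment lengths in kilometres
G = nx.Graph()
G.add_weighted_edges_from([
    ("landmark", "a", 1.2), ("a", "b", 0.8),
    ("landmark", "c", 2.5), ("c", "b", 0.4), ("b", "d", 1.1),
])

# Dijkstra's algorithm from one source to every reachable node
lengths = nx.single_source_dijkstra_path_length(G, "landmark", weight="weight")
print(lengths)  # e.g. {'landmark': 0, 'a': 1.2, 'b': 2.0, 'c': 2.4, 'd': 3.1}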

Change Detection

Temporal change detection using aerial images: This use case compares high-resolution (0.25 meter) aerial orthophotos with RGB and infrared bands from 2022 and 2025 to detect changes in land use, buildings, and vegetation in SoMa, San Francisco. The results support urban planning and development decisions by highlighting growth and transformation in the neighborhood.
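As a simplified illustration of the idea, the sketch below differences one band of two aligned rasters and flags strongly changed pixels; the file paths, the choice of band 4 as near-infrared, and the threshold are assumptions for demonstration only.

import numpy as np
import rasterio

# Open the two epochs; rasters are assumed to be aligned and of equal extent
with rasterio.open("/dbfs/data/ortho_2022.tif") as t0, rasterio.open("/dbfs/data/ortho_2025.tif") as t1:
    nir_2022 = t0.read(4).astype("float32")  # band 4 assumed to be near-infrared
    nir_2025 = t1.read(4).astype("float32")

# Pixel-wise difference and a simple change mask
diff = nir_2025 - nir_2022
changed = np.abs(diff) > 40  # illustrative threshold
print(f"{changed.mean():.1%} of pixels flagged as changed")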

Sky View Factor

Sky view factor analysis: The sky view factor measures how much of the sky is visible from a given point on the ground, which makes it a key indicator in urban heat island and solar exposure studies. This case demonstrates how to derive it at scale within Databricks.

In the upcoming posts, we’ll dive deeper into each use case. Stay tuned!

Photo of Melika Sajadian

Melika Sajadian

Senior Geospatial Consultant at RevoData, sharing with you her knowledge about geospatial analytics on Databricks

SQL Server vs Apache Spark: A Deep Dive into Execution Differences

The way SQL Server and Apache Spark (the backbone of Databricks) process queries is fundamentally different, and understanding these differences is crucial when migrating or optimizing workloads. While SQL Server relies on a single-node, transaction-optimized execution engine, Spark in Databricks is built for distributed, parallel processing.

Execution Model: Single-Node vs. Distributed Processing

SQL Server executes queries within a single-node environment, meaning all operations—such as joins, aggregations, and filtering—occur on a centralized database server. The query optimizer determines the best execution plan, using indexes, statistics, and caching to improve efficiency. However, performance is ultimately limited by the resources (CPU, memory, and disk) of a single machine.

Databricks, powered by Apache Spark, distributes query execution across multiple nodes in a cluster. Instead of a single execution plan operating on one server, Spark breaks down queries into smaller tasks, which are executed in parallel across worker nodes. This approach enables Databricks to handle massive datasets efficiently, leveraging memory and compute resources across a distributed system.

Query Execution Breakdown

  • SQL Server: A query is parsed, optimized into an execution plan, and executed on a single machine. It reads data from disk (or memory if cached), processes it using indexes and statistics, and returns results.
  • Databricks (Spark): A query is parsed and transformed into a Directed Acyclic Graph (DAG), which is then broken down into stages and tasks. The Spark scheduler distributes these tasks across worker nodes, where computations are executed in memory as much as possible before writing results back to storage.
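A quick way to see this difference in practice is to ask Spark for its plan. The hedged snippet below (table and column names are placeholders) shows how a simple aggregation becomes a multi-stage distributed plan with a shuffle (exchange) step:

# Inspect the distributed plan Databricks/Spark builds for a query
df = spark.read.table("sales").groupBy("region").sum("amount")

# The formatted plan shows a scan, a partial aggregation per task, an
# exchange (shuffle) step, and a final aggregation across the cluster
df.explain(mode="formatted")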

Data Shuffling and Joins

One of the biggest differences between the two systems is how they handle joins and aggregations.

  • SQL Server: Since all data is processed on a single machine, joins rely heavily on indexes and sorting. If indexes are missing or inefficient, operations like hash joins or merge joins can cause expensive disk I/O.
  • Databricks (Spark): Joins require shuffling, where data is redistributed across nodes to ensure matching keys are on the same worker. This introduces network overhead but allows for massive scalability. Techniques like broadcast joins (sending a small table to all nodes) help reduce shuffle costs and improve performance.
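As a small illustration of the broadcast technique, here is a hedged PySpark sketch (table and column names are placeholders) in which the small dimension table is shipped to every worker so the large table is never shuffled:

from pyspark.sql.functions import broadcast

fact = spark.read.table("sales_transactions")  # large, distributed table
dim = spark.read.table("store_locations")      # small table that fits in memory

# Broadcast hint: copy the small table to all workers, avoiding a shuffle of the fact table
joined = fact.join(broadcast(dim), on="store_id", how="left")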

Caching and Storage Optimization

SQL Server relies on the buffer pool to cache frequently accessed data in memory, minimizing disk reads. Indexed data is stored efficiently on disk, and execution plans are cached for reuse.

Databricks, on the other hand, benefits from in-memory caching using Spark’s caching feature, reducing repeated reads from cloud storage (e.g., Azure Blob or AWS S3). Additionally, techniques like Z-ordering and partitioning help optimize data layout, reducing scan times for large datasets.
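To illustrate the caching side, a minimal sketch (the table name and filter are placeholders):

# Cache a frequently reused DataFrame to avoid repeated reads from cloud storage
events = spark.read.table("raw_events").filter("event_date >= '2025-01-01'")
events.cache()   # mark the DataFrame for in-memory caching
events.count()   # the first action materialises the cache
events.groupBy("event_type").count().show()  # later queries reuse the cached data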

Fault Tolerance and Scalability

SQL Server operates with ACID transactions and high availability mechanisms like Always On Availability Groups, but it lacks inherent fault tolerance in query execution. If a process fails, it must restart.

Databricks, through Spark, provides fault tolerance via lineage and recomputation. If a node fails, Spark reruns only the affected tasks, ensuring resilience without manual intervention. Additionally, horizontal scalability allows it to scale dynamically based on workload demands.

Do you want to know more?

Are you considering migrating workloads from SQL Server to Databricks? Understanding execution models is key to designing efficient queries and avoiding performance pitfalls. Let’s connect and discuss how to make your transition seamless!

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

Optimizing Performance: SQL Server vs Databricks

Optimization in a Databricks Data Lakehouse differs significantly from traditional SQL Server environments due to its architecture and the nature of data storage. While SQL Server relies on indexing, row-based storage, and dedicated disk structures, Databricks leverages distributed storage, columnar formats, and advanced clustering techniques to enhance performance.

Storage Differences: SQL Server vs. Databricks

SQL Server primarily operates with row-oriented storage, which is optimized for transactional workloads where entire records are frequently accessed. It uses indexes to speed up queries by pre-sorting and structuring data efficiently within a disk-based system. On the other hand, Databricks and other modern Lakehouse platforms use columnar storage formats like Parquet, which enable efficient compression and retrieval for analytical workloads. Instead of fixed disk storage, data in Databricks is often stored in cloud-based solutions such as Azure Blob Storage or AWS S3, leveraging distributed file systems to improve scalability and performance.

Indexing in SQL Server vs. Partitioning in Databricks

In SQL Server, indexing is one of the primary ways to optimize queries, allowing fast lookups within structured tables. However, in Databricks, indexing works differently due to the distributed nature of storage. Instead of relying on indexes, Databricks employs partitioning, which segments large datasets into smaller, manageable chunks based on logical keys like date ranges or categories. While SQL Server indexing is crucial for reducing scan times on relational tables, partitioning in Databricks minimizes the amount of data read, significantly improving query performance.
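As a hedged sketch of what partitioning looks like in practice (the table, column, and catalog names are placeholders):

# Write a Delta table partitioned by a logical key so queries that filter on it
# only read the relevant partitions
(spark.read.table("staging_orders")
    .write
    .format("delta")
    .partitionBy("order_date")
    .mode("overwrite")
    .saveAsTable("analytics.orders"))

# This query can now skip every partition except a single day
spark.sql("SELECT count(*) FROM analytics.orders WHERE order_date = '2025-06-01'").show()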

Advanced Optimizations: Z-Ordering, Liquid Clustering, and Vacuum

Beyond partitioning, Databricks offers additional optimization techniques such as Z-Ordering and Liquid Clustering. Z-Ordering helps co-locate related data within files, reducing the amount of data scanned during queries and enhancing performance for range-based filtering. Liquid Clustering further refines this process by dynamically managing data clustering over time, adjusting to changing query patterns without manual intervention.

Another critical aspect of performance tuning in Databricks is Vacuuming. Unlike SQL Server, where deleted data is managed through transaction logs and page reorganizations, Databricks maintains historical file versions that can accumulate over time. Running Vacuum operations purges obsolete data, ensuring storage efficiency and preventing performance degradation.
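The snippets below are hedged examples of what these commands look like when run from a notebook via spark.sql; the table and column names are placeholders, and 168 hours is simply the common seven-day retention window.

# Z-Ordering: co-locate rows with similar values of the chosen column(s)
spark.sql("OPTIMIZE analytics.orders ZORDER BY (customer_id)")

# Liquid Clustering: declared on the table and maintained automatically over time
spark.sql("""
CREATE TABLE IF NOT EXISTS analytics.orders_clustered (
  order_id BIGINT, customer_id BIGINT, order_date DATE, amount DOUBLE
) CLUSTER BY (customer_id)
""")

# Vacuum: purge data files no longer referenced by the table's recent history
spark.sql("VACUUM analytics.orders RETAIN 168 HOURS")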

Making the Most of Lakehouse Optimization

Optimizing your Data Lakehouse isn’t just about applying best practices—it’s about continuously refining your approach based on your data and workloads. Whether you’re transitioning from SQL Server or looking to enhance your Databricks performance, now is the time to take action.

Are you ready to implement these optimization techniques in your own environment? Start by analyzing your query patterns, revisiting your partitioning strategy, or experimenting with Z-Ordering and Liquid Clustering. If you’re facing challenges, let’s talk! Reach out, share your experiences, and let’s navigate the path to high-performance data together.

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

Orchestration - SQL Server Agent vs. Workflows

One of the pillars of a migration from MSBI to Databricks is orchestration. For years, SQL Server Agent has been the trusted solution for scheduling and automating tasks. It’s simple, well integrated with SQL Server, and has been the backbone of countless ETL jobs, backups, and maintenance routines. But as we look at modern data platforms like Databricks, the question arises: how do Databricks Workflows compare to the familiar SQL Server Agent?

SQL Server Agent: A Reliable Classic with Limits

SQL Server Agent excels in its simplicity. Its GUI-based interface makes it easy to schedule jobs and monitor execution, and its integration with SQL Server ensures a seamless experience for database administrators and BI developers. However, it was built for an era of monolithic systems, and its limitations become apparent in today’s landscape. Scaling beyond SQL Server, working with distributed data, or integrating with cloud-native tools often feels like trying to fit a square peg into a round hole.

Databricks Workflows: Built for Modern Data Needs

Databricks Workflows, on the other hand, are designed for the complexities of modern data engineering. They bring scalability and flexibility to the forefront, enabling you to orchestrate complex pipelines that span Spark jobs, machine learning models, and real-time analytics. Unlike SQL Server Agent, which is tightly tied to SQL Server, Workflows embrace a multi-cloud, multi-tool environment, integrating seamlessly with APIs, cloud services, and third-party platforms.
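To make this a bit more tangible, here is a hedged sketch of a two-task workflow defined through the Databricks Jobs API (2.1). The workspace host, token, notebook paths, and job name are placeholders, and compute configuration is omitted for brevity.

import requests

job_spec = {
    "name": "daily_sales_pipeline",
    "tasks": [
        {"task_key": "ingest",
         "notebook_task": {"notebook_path": "/Repos/pipelines/ingest_sales"}},
        {"task_key": "transform",
         "depends_on": [{"task_key": "ingest"}],  # run only after ingest succeeds
         "notebook_task": {"notebook_path": "/Repos/pipelines/transform_sales"}},
    ],
}

resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=job_spec,
)
print(resp.json())  # contains the new job_id on success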

The shift to Databricks Workflows also introduces new paradigms, such as event-driven orchestration. Tasks can be triggered by events like file arrivals (Auto Loader) or changes in a database, allowing for real-time automation that SQL Server Agent struggles to achieve. Additionally, Databricks provides advanced monitoring and alerting capabilities, giving you deeper insights into your workflows and the ability to resolve issues quickly.

Making the Transition: Challenges and Opportunities

While the transition might feel daunting at first, it’s essential to focus on the opportunities it brings. The flexibility of Workflows allows teams to start small, using familiar SQL tasks, while gradually exploring more advanced capabilities like PySpark. This approach not only reduces the learning curve but also ensures that your team remains productive during the migration.

Orchestration is more than a technical challenge—it’s a transformation in how we think about automation and scalability. Transitioning from SQL Server Agent to Databricks Workflows requires a shift in mindset, but it’s one that unlocks immense potential for modern data teams.

Join the Conversation

Have you started rethinking your approach to orchestration? What challenges or insights have you encountered? Let’s discuss! And if you’re ready to take the next step, we’re here to help you navigate the transition and make the most of what Databricks has to offer.

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

Quick wins in your Databricks journey: Show value early

The common trap: Starting from the bottom

Many companies approach their Databricks migration by starting at the bottom of the stack: rolling out the platform, re-integrating data sources (often via ODBC/JDBC), and building a bronze layer before modelling and consuming the data. While this method seems logical, it often leaves teams “below the surface” for too long, struggling to demonstrate value as they work through foundational layers.

To avoid this, it’s crucial to rethink how you start. Databricks, for instance, can pull data via JDBC, but its true strength lies in AutoLoader and working with files stored in cost-effective blob storage. Adding change data capture (CDC) capabilities with tools like Debezium can enhance this, but it may also introduce dependencies on platform or infrastructure teams who may not share your timeline or goals.
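As a hedged sketch of the file-based ingestion pattern mentioned above, the snippet below uses Auto Loader to pick up newly arrived Parquet files from blob storage and append them to a bronze Delta table; all paths and the table name are placeholders.

# Incrementally ingest new files from blob storage with Auto Loader (cloudFiles)
landing = "abfss://landing@yourstorage.dfs.core.windows.net/cdc/orders"

stream = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", landing + "/_schema")
    .load(landing))

(stream.writeStream
    .option("checkpointLocation", landing + "/_checkpoint")
    .trigger(availableNow=True)   # process the current backlog, then stop
    .toTable("bronze.orders"))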

The quickest unlock: Federate into legacy

If your data already resides in a cloud platform like Azure or AWS, the quickest path to success is leveraging native services such as Azure Data Factory (ADF) or AWS Database Migration Service (DMS). These can convert CDC streams into Parquet files, which are easily stored on blob storage. By using these existing tools, you simplify the process, reduce dependencies, and get data into Databricks faster.

When this isn’t an option, or if you really want to go fast, Unity Catalog’s Federation capabilities can provide a workaround. By making your SQL Server databases available in Databricks, you can federate queries directly to the source, enabling you to join live data with datasets already in Databricks. Whether it’s staging databases, data warehouses, or data marts, this approach allows you to build on your existing infrastructure while transitioning to a modern platform.
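A hedged sketch of what that federation setup can look like in Databricks SQL (run here via spark.sql); the connection name, host, credentials, secret scope, database, and table names are placeholders.

# 1. Register the SQL Server instance as a Unity Catalog connection
spark.sql("""
CREATE CONNECTION IF NOT EXISTS sqlserver_dwh TYPE sqlserver
OPTIONS (host 'dwh.example.com', port '1433',
         user 'reader', password secret('dwh-scope', 'reader-password'))
""")

# 2. Expose a database from that server as a foreign catalog
spark.sql("""
CREATE FOREIGN CATALOG IF NOT EXISTS legacy_dwh
USING CONNECTION sqlserver_dwh OPTIONS (database 'DataWarehouse')
""")

# 3. Join live SQL Server data with a table that already lives in Databricks
spark.sql("""
SELECT d.customer_id, d.segment, f.total_spend
FROM legacy_dwh.dbo.dim_customer AS d
JOIN main.gold.customer_spend AS f USING (customer_id)
""").show()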

Show business value from day one

Instead of focusing solely on ingestion pipelines and modelling workflows, prioritise moving consumption use cases to Databricks early. By demonstrating business value—almost from day one—you can gain buy-in from stakeholders and justify further investments in the migration process.

Once the immediate needs are met, gradually shift your data sources from staging into a new ingestion pattern that leverages blob storage and AutoLoader. This step-by-step approach ensures a smoother transition while delivering results that matter to your business.

Ready to take the next step?

At RevoData, we specialize in helping organizations unlock the full potential of Databricks. Whether you’re migrating from SQL Server, optimizing your workflows, or building a modern data platform, our consultants are here to guide you every step of the way. Let us show you how Databricks can transform your data strategy and drive real business impact. Contact RevoData today to get started!

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

From BI to Databricks: Simplifying Architecture Layers

Over the past few weeks, we’ve been exploring the journey from traditional Business Intelligence (BI) to Databricks. As part of this transition, it’s essential to address a key aspect: architecture. While the terminology of Bronze, Silver, and Gold might seem daunting at first, these layers aren’t so different from what you’re already familiar with. Let’s break it down and show how you can adapt this framework to suit your organization.

Layers Are Layers—Let’s Keep It Simple


When it comes to data architectures, we all think in layers. They bring structure and clarity to an otherwise complex ecosystem. So, if you’re transitioning to the medallion architecture with its Bronze, Silver, and Gold layers, don’t let the terminology overwhelm you. We’ve even seen customers add Platinum and Diamond to their layers—why not? If it works for your organization, it works! Remember, a framework is just a starting point; tailor it to fit your needs.

Mapping Staging to the Bronze Layer

The key is to focus on the characteristics of each layer. For example, in the MSBI world, a staging layer is where raw source data lands. It’s still structured around the source, with minimal transformation. The Bronze layer in Databricks serves the same purpose: it’s the raw, unprocessed representation of the source data. Once you see this connection, the transition becomes less intimidating.

Mapping the Data Warehouse to the Silver Layer

The Data Warehouse layer in MSBI aligns closely with the Silver layer in the medallion architecture. In this stage, you introduce organizational standards, naming conventions, and other structures while keeping data at its lowest granularity. This layer is your backbone, designed to remain stable over time.

One key difference in Databricks is the flexibility around traditional data modeling approaches like Kimball or Inmon (star-schema), Anchor modeling, or Data Vault. Here, you can choose how strictly to adhere to these techniques based on your organizational needs. However, it’s critical to ensure this layer is resilient. Changes to data sources or organizational structures should have minimal impact on your models. To achieve this, consider domain-driven design, bounded contexts, and data mesh principles—these sociotechnical concepts help keep your architecture flexible and future-proof.

The Data Mart Layer: Gold (or Platinum, or Diamond)

The final layer—often referred to as the Gold layer in Databricks—is where you optimize data for consumption. Whether it’s a one-big-table design, 3NF, or star-schema, this layer is about delivering business value. Because of its direct impact on the end user, this is where companies tend to allocate the most investment. However, it’s vital not to overlook the upstream layers. A stable foundation is the only way to ensure a reliable and effective Gold layer.

At RevoData, we’ve learned that a logical and user-friendly structure for your Data Catalog is key. Instead of naming catalogs “Bronze,” “Silver,” or “Gold,” we use descriptive labels like “sources,” “domains,” or “data products” and apply the familiar terms as metadata tags. This approach provides a clear path to data for all users while keeping the architecture intuitive and scalable.
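As a hedged illustration of that naming-plus-tagging approach in Unity Catalog (the catalog, schema, and tag names below are examples, not a prescribed standard):

# Descriptive catalog and schema names, with the medallion terms kept as tags
spark.sql("CREATE CATALOG IF NOT EXISTS sources")
spark.sql("CREATE SCHEMA IF NOT EXISTS sources.erp")
spark.sql("ALTER SCHEMA sources.erp SET TAGS ('medallion_layer' = 'bronze')")

spark.sql("CREATE CATALOG IF NOT EXISTS data_products")
spark.sql("ALTER CATALOG data_products SET TAGS ('medallion_layer' = 'gold')")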

Make your Architecture Work for You

Transitioning to Databricks doesn’t mean starting from scratch. By mapping your existing architecture to the medallion framework and customizing it for your organization, you can create a system that’s both familiar and future-ready.

Ready to Take the Next Step?

At RevoData, we specialize in helping organizations make the most of Databricks. Whether you’re starting your journey or looking to refine your approach, we’re here to support you. Let us show you how Databricks can transform your data strategy and deliver real business impact. Reach out to us today to get started!

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

BI Developer - what does migrating to Databricks mean for you?

When transitioning from MSBI to Databricks, the hardest part often isn’t the tools or the technology—it’s the people and their skills. That’s why, even though it’s listed last on our leaflet, we’re tackling this topic first. Let’s talk about what this migration means for your team and how to align their expertise with the Databricks ecosystem.

Expanding horizons - from Dashboards to Analysis and beyond

In the MSBI world, BI developers hold a central role. They’re highly skilled in SQL, possess extensive domain knowledge, and excel at creating dashboards, reports, and even complex cubes using MDX or DAX. Traditionally, this expertise has been closely tied to the classic Data Warehouse (DWH) environment, where structured data models and ETL processes form the backbone of the work. However, in the Databricks landscape, the BI Developer role evolves significantly, adapting to new paradigms and technologies that emphasize scalability, agility, and advanced data analytics.

With Databricks, SQL remains a vital skill, forming a strong foundation for exploring the platform’s capabilities. However, Databricks also introduces the world of Spark, with PySpark emerging as a favored tool among organizations. For BI developers, this shift offers an exciting opportunity to expand their skill set and evolve their role. Rather than a departure from strengths (SQL), this transition represents a chance to adapt and thrive in a rapidly changing environment and to become a more complete data professional.
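To show how directly those SQL instincts carry over, here is an illustrative pair of snippets (table and column names are placeholders) expressing the same aggregation first in familiar SQL and then with the PySpark DataFrame API:

# The SQL a BI developer already knows, runnable as-is on Databricks
sql_result = spark.sql("""
SELECT region, SUM(revenue) AS total_revenue
FROM gold.sales
GROUP BY region
""")

# The same logic with the PySpark DataFrame API
from pyspark.sql import functions as F

df_result = (spark.read.table("gold.sales")
    .groupBy("region")
    .agg(F.sum("revenue").alias("total_revenue")))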

The Data Engineer - why software skills matter

As organizations venture into modern platforms like Databricks, the role of the Data Engineer emerges as critical for unlocking its full potential. To set the stage, it’s important to understand why Databricks excels—it’s a platform designed for flexibility, scalability, and advanced processing. However, it truly shines when operated by individuals with strong software engineering skills, particularly if PySpark is a key component of the data processing strategy.

For teams missing this expertise, our advice is clear: stick to SQL-based workloads in the beginning. This approach minimizes migration risks and ensures your team isn’t overwhelmed by the demands of Spark. After all, you don’t want to leave anyone behind at the station as the data train rolls forward.

The Platform Engineer - bringing infrastructure in-house

In an MSBI environment, platform support often comes from external teams, such as platform, infrastructure, or cloud operations. With Databricks, embedding a Platform Engineer within your team—even temporarily—can make all the difference.

This person ensures your team owns and optimizes the Azure Subscription and/or Resource Group. They help leverage Databricks’ robust security, isolate data storage and workloads, and manage dependencies effectively. Without this role integrated into your team, you risk missing out on these critical capabilities.

Building a future data team

Migrating to Databricks is more than just a technological shift; it’s a transformation of roles, skills, and team dynamics. This change brings challenges but also opportunities to build a robust, future-proof data team.

  1. Leverage existing SQL expertise as the starting point for migration to reduce risk and maintain momentum.
  2. Invest in upskilling your team to embrace new tools and workflows, positioning them for long-term growth.
  3. Embed platform engineering expertise, whether internally or through temporary support, to fully optimize Databricks’ capabilities.

Ultimately, the success of your Databricks implementation hinges on aligning your team’s skills with the platform’s strengths. By empowering your people and providing the right resources, you’ll not only navigate the migration smoothly but also unlock the full potential of a modern, agile data ecosystem. If you’re ready to make the leap, let’s start the journey together—reach out, and we’ll help you chart the course.

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

Still on MSBI? You're not alone

You might be surprised by how many organizations still rely on the full MSBI stack. Despite the rapid shifts in the data landscape, MSBI continues to be a robust, reliable solution that delivers significant value to businesses worldwide. Its enduring presence is a testament to its strength—but is it enough for what lies ahead?

That question is even more interesting when we consider Microsoft’s Azure-based alternatives. Take Azure Data Factory (ADF) for example—does it meet your expectations? Are Synapse and SQL Server Pools delivering the seamless performance and scalability you need? For many, the answer is lukewarm at best.

Then there’s the question of cubes: MDX or DAX? Are you stuck with Multi-Dimensional cubes, or have you transitioned to Tabular cubes via Azure Analysis Services (AAS) or Power BI models? Excel users (sorry, Mac folks!) still find these features useful for self-service analytics. And while Power BI has emerged as a standout in Microsoft’s ecosystem, even it doesn’t solve every challenge posed by modern data demands.

However, there is another option: enter Databricks, a modern data platform that addresses many of the challenges of moving from descriptive to predictive and automated analytics. Just as Obi-Wan Kenobi called the lightsaber an elegant weapon for a more civilized age, Databricks is a modern tool built for current and future data problems.

Why Companies are Looking to Databricks

Faced with these challenges, it’s no surprise that many organizations are turning to Databricks as their next-generation data platform. Databricks offers an open, unified platform that goes beyond traditional BI capabilities, making it possible to integrate advanced analytics, data engineering, and machine learning in one place.

Interestingly, many of our customers choose to retain Power BI as their primary data consumption layer while leveraging Databricks to power their data processing and engineering needs. This hybrid approach offers the best of both worlds: familiar tools for end users and cutting-edge capabilities for data teams.

Join the Conversation

Over the coming weeks, I will share a series of posts designed to help you navigate the shift from MSBI to Databricks. Through an opinionated mental map (attached), I will provide apples-to-pears comparisons, practical advice, and insights into building a future-proof data architecture.

These conversations may spark debate—and that’s a good thing! I invite you to join the dialogue, share your experiences, and explore new perspectives. Stay tuned for the next post in the series. Let’s chart this journey together!

Photo of Rafal Frydrych

Rafal Frydrych

Senior Consultant at RevoData, sharing with you his knowledge in the opinionated series: Migrating from MSBI to Databricks.

Databricks Demystified

You may have come across the term Databricks and wondered what it’s all about. Is it just another buzzword in the world of big data? Or can it genuinely impact your organisation’s data management and analytics capabilities? What follows is a simple introduction to Databricks, explaining what it is, how it works, and how it can be relevant to your organisation and team.

What is Databricks, and what can it do for you?

Databricks is a unified platform for managing and analysing vast amounts of data, combining the power of data engineering, machine learning, and analytics in one place. It offers an array of tools for processing, storing, cleaning, sharing, and analysing data, making it easier for (non-technical) managers to understand and leverage the insights that data can provide. In a nutshell, Databricks helps organisations derive value from their data, guiding decision-making and driving growth.

Making sense of Databricks’ features

Let’s break down some of the key features and functionalities of Databricks:

  1. Data processing and management: Databricks makes it easy to schedule and manage data processing workflows, ingest data from various sources, and discover and explore datasets.
  2. Analytics and visualisation: With tools for working in SQL and generating visualisations and dashboards, Databricks simplifies the process of gleaning insights from your data.
  3. Machine learning: Databricks offers tools for creating and tracking machine learning models, making it easier to incorporate artificial intelligence into your organisation’s operations.
  4. Open-source integrations: As a platform committed to the open-source community, Databricks integrates with popular open-source projects like Apache Spark, Delta Lake, and MLflow.
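For the more technically curious, here is a tiny hedged taste of two of these open-source building blocks, Delta Lake and MLflow, as they are commonly used on Databricks; the table name, parameters, and metric values are placeholders.

import mlflow

# Delta Lake: write a small table with transactional, versioned storage
spark.range(1000).write.format("delta").mode("overwrite").saveAsTable("demo_numbers")

# MLflow: record the parameters and metrics of an experiment run
with mlflow.start_run(run_name="demo"):
    mlflow.log_param("model_type", "baseline")
    mlflow.log_metric("accuracy", 0.91)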

Databricks, AWS and Azure: a perfect match?

Databricks works closely with Amazon Web Services (AWS) and Microsoft Azure to provide seamless integration and optimal performance. Instead of forcing you to migrate your data into proprietary storage systems, Databricks connects with your cloud account and deploys compute clusters using cloud resources that you control. This flexibility ensures that your organisation’s data remains secure and accessible while still benefiting from Databricks’ powerful tools and features.

Real-world applications of Databricks

So, how can Databricks be useful in your organisation? Here are some common use cases:

  1. Building an enterprise data lakehouse: A data lakehouse combines the strengths of data warehouses and data lakes to create a single source of truth for your data.
  2. ETL and data engineering: Databricks simplifies the process of extracting, transforming, and loading (ETL) data, making it easier for your organisation to manage and analyse its data.
  3. Machine learning and AI: Databricks provides tools tailored for data scientists and ML engineers, supporting the development of AI applications that can drive growth and innovation.
  4. Data warehousing, analytics, and BI: it provides a powerful platform for running analytic queries and generating insights that inform your decision-making processes.
  5. Data governance and secure data sharing: Databricks helps you manage permissions and secure access to your data, enabling collaboration both within and outside your organisation.

In summary

In today’s data-driven world, having the right tools and platforms to manage and analyse data is crucial. Databricks is a powerful solution that can help you unlock the full potential of your data, transforming raw information into actionable insights that drive growth and success.

So, next time you hear the term Databricks, you’ll know that it’s not just another buzzword. On the contrary, it’s a powerful platform that can transform the way you harness the power of data. By simplifying data processing, analytics, machine learning, and data governance, Databricks enables you to make better-informed decisions, improve operational efficiency, and drive innovation across your organisation.

So why not explore the potential of Databricks and see how it can help you turn your data into a valuable strategic asset? Contact us for more information, a Proof of Concept (PoC), or a Value Assessment.

How the 9-box method for replenishment can add value for your business

In the retail world, it is essential to have insight into the sales and margin of your products. An effective way to do this is by using the 9-box method for replenishment. This method uses sales and margin data to build a picture of how your products are selling and where there is room for improvement. In this blog post, we take a closer look at how the 9-box method can add value for your business.

What is the 9-box method?

The 9-box method is a tool that lets you visualise the sales and margin of your products in a nine-cell matrix. Each cell in the matrix represents a combination of sales and margin, so you can quickly see which products are performing well and which need improvement.

How does the 9-box method work?

To use the 9-box method, you need data on the sales and margin of your products. These figures are then entered into the matrix, where each cell represents a combination of sales and margin, so you can quickly see which products perform well and which need improvement.
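To make this concrete, here is an illustrative sketch in pandas that assigns products to the nine boxes by scoring sales and margin as low, medium, or high; the figures are made up for demonstration.

import pandas as pd

products = pd.DataFrame({
    "product": ["A", "B", "C", "D", "E", "F"],
    "sales":   [120, 800, 450, 90, 1500, 300],
    "margin":  [0.05, 0.32, 0.18, 0.40, 0.22, 0.10],
})

# Split both measures into three bands, giving 3 x 3 = 9 possible boxes
bands = ["low", "medium", "high"]
products["sales_band"] = pd.qcut(products["sales"], 3, labels=bands)
products["margin_band"] = pd.qcut(products["margin"], 3, labels=bands)
products["box"] = (products["sales_band"].astype(str) + " sales / "
                   + products["margin_band"].astype(str) + " margin")
print(products[["product", "box"]])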

Why is the 9-box method valuable?

The 9-box method is valuable because it is a simple way to gain insight into the performance of your products, allowing you to take targeted action to increase sales and margin. It also highlights the products with the greatest potential and those where quick results can be achieved.

Conclusion

The 9-box method for replenishment is a powerful tool for gaining insight into the performance of your products and taking targeted action. By using sales and margin data, you can quickly see which products are performing well and which need improvement. The 9-box method can therefore add value for your business by enabling more efficient inventory management and higher profit margins.

]]>