Day 01 of 31 Days of SQL For Data Engineering: Why SQL Still Runs the Modern Data Stack
Part of Week 1: Foundations
Let’s start this off with a realistic fact:
SQL is never going to be deprecated from the data stack, because it's still the best language out there for pulling, modelling and organising data.
In Stack Overflow's 2024 Developer Survey, 51% of all respondents reported using SQL. The database stats tell the same story: variations of SQL (PostgreSQL, MySQL, SQLite, SQL Server and friends) dominate the rankings, and taken together roughly two thirds of engineers touch SQL on a weekly basis.
This comes from the declarative nature of the language. Because you describe what you want, not how to get it, the optimiser can keep reinventing faster execution plans while your query text stays the same. That portability is why tools from dbt to Power BI and even AI copilots generate SQL behind the scenes.
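You can see this declarative split in miniature with SQLite, which ships with Python. In the sketch below (a toy `orders` table with made-up data), the query text never changes, yet adding an index makes the planner swap a full table scan for an index search:

```python
import sqlite3

# Toy table with hypothetical data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 45.5)])

query = ("SELECT region, SUM(amount) FROM orders "
         "WHERE region = 'EU' GROUP BY region")

# Without an index, the planner scans the whole table.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan_before[0][3])  # e.g. a SCAN over orders

# Add an index and the planner changes strategy,
# even though the query text is untouched.
conn.execute("CREATE INDEX idx_region ON orders(region)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan_after[0][3])   # e.g. a SEARCH using idx_region

print(conn.execute(query).fetchall())  # [('EU', 165.5)]
```

The same principle is what lets a warehouse engine roll out smarter plans release after release without anyone rewriting their queries.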
This makes learning SQL one of the most valuable skillsets in the industry, in my humble opinion.
Breaking Down Microsoft Fabric: One Lake, Three SQL Engines
Microsoft Fabric is a versatile SaaS product, capable of handling very intense, very different workloads:
Warehouse: Built for BI and ETL, queried using T-SQL and runs on Polaris.
Lakehouse: Large-scale batch and ML workloads, queried using T-SQL and Spark SQL. Delta tables in OneLake give you open-format storage shared across Spark, notebooks and the Warehouse.
Real-Time Intelligence: Sub-second streaming capabilities and log analytics, queried using T-SQL and KQL (Kusto Query Language). Eventhouse databases spin up within seconds allowing quick access to fresh streams.
The common denominator is SQL: one language surface that lets you pivot from historical facts (Warehouse), to petabyte-scale transformations (Lakehouse), to live event telemetry (Real-Time) without refactoring business logic.
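As an illustration, an everyday aggregation like the one below (hypothetical table and column names) runs unchanged against a Fabric Warehouse or a Lakehouse SQL analytics endpoint, because both expose the same T-SQL surface:

```sql
-- Hypothetical fact table; the same T-SQL works whether the data
-- sits in a Warehouse or in a Lakehouse Delta table queried
-- through its SQL analytics endpoint.
SELECT
    region,
    COUNT(*)    AS order_count,
    SUM(amount) AS total_amount
FROM dbo.fact_orders
WHERE order_date >= DATEADD(day, -30, GETDATE())
GROUP BY region
ORDER BY total_amount DESC;
```

The business logic lives in the query, not in the engine, which is exactly the portability the series will lean on.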
With all that being said, there's a lot we'll be getting into in this series, with a heavy focus on Fabric. You'll be learning SQL skills alongside Microsoft's new flagship data product, meaning by the end of these 31 days you'll have a few hands-on projects under your belt that position you in the best possible way to secure that data engineering role, help drive engineering best practices, or even refactor old data platforms onto Fabric.
Up Next: Building Your Lab (Day 2)
We'll kick the next day off by getting into the meaty side of things: spinning up a Fabric Trial workspace, creating Warehouse and Lakehouse items, and standing up local Postgres + SQL Server with Docker. The aim is to get your environment ready for learning, rather than having you bounce back and forth between multiple environments.
This sets the tone nicely for what's coming up, as this is going to be a huge series!