In our previous discussion, we examined operational databases within the Microsoft Data Platform and noted how they differ from analytic databases, such as data warehouses in the cloud. Operational databases prioritize hardware and architecture for optimal write throughput, whereas analytic databases focus on large batch data ingestion and analytic querying.
Analytical systems underwent a pivotal shift between 2008 and 2013, largely influenced by the emergence of Hadoop. Prior to this, data warehouses predominantly resided within relational databases on single servers. The advent of massively parallel processing (MPP) clusters altered this landscape by introducing large-scale parallelism and adopting an object-storage model, laying the foundation for data lakes, data lakehouses, and modern data warehouses.
With this historical backdrop, let’s scrutinize the current array of available solutions. A significant development in this sphere occurred last year with Microsoft’s introduction and subsequent general availability of Fabric, an all-encompassing data analytics solution. While Fabric holds promise with its comprehensive vision and direction, the current iteration remains a work in progress, marked by certain rough edges. Nonetheless, Fabric’s nature as a Software as a Service (SaaS) product ensures frequent updates and enhancements.
Microsoft Fabric
Fabric encompasses a wide spectrum of capabilities, predominantly leveraging existing Azure components:
- Data Engineering: Employs Apache Spark within Fabric to facilitate authoring experiences and leverage the entire Spark surface area.
- Data Factory: Harnesses Azure Data Factory functionality to enable Extract, Transform, Load (ETL) processes using Power Query and other tools.
- Data Science: Utilizes Azure Machine Learning for model management and training, enabling the integration of predictive models into Power BI dashboards.
- Data Warehouse: Offers a SQL-based data warehousing experience akin to Azure Synapse Analytics, albeit with distinctive nuances.
- Real-Time Analytics: Introduces observational data analytics through Kusto Query Language (KQL), the query language of Azure Data Explorer that is also used in Azure Log Analytics.
- Power BI: Seamlessly integrates Microsoft’s BI data visualization platform with Fabric’s data storage and management solutions.
While Fabric presents a plethora of capabilities, its current state may not radically transform existing analytics stacks. However, the potential for game-changing impact lies in the successful implementation of features like mirroring, which replicates data from operational databases into Fabric and could democratize business analytics across diverse organizations.
Despite Fabric’s prominence, inquiries abound regarding the future trajectory of Azure Synapse Analytics. Initially conceived as Azure SQL Data Warehouse, Synapse Analytics evolved to incorporate serverless SQL and Spark functionality, now shared with Fabric. For existing Synapse customers content with their experiences, a migration to Fabric may not be urgent. Conversely, for those contemplating the establishment of new large-scale data warehouses, Synapse remains a compelling choice on Azure owing to its familiarity and likely superior performance.
What About Databricks?
Azure Databricks, despite its external ownership, has emerged as a leading data warehousing solution, rooted in Apache Spark and bolstered by notebook-based development. Offering support for multiple languages and robust functionality, Databricks has made strides in data governance through its Unity Catalog service. While Microsoft’s introduction of Fabric underscores its ambitions in the data warehousing realm, Databricks retains its appeal, especially for organizations already invested in its ecosystem.
In conclusion, the modular separation of data and compute in analytic workloads facilitates agility and adaptability to evolving solutions. This paradigm shift has propelled significant development in the analytics domain, necessitating meticulous evaluation, planning, and alignment with organizational requirements and skill sets. Amidst high visibility and executive scrutiny, selecting the optimal solution mandates a nuanced understanding of the full feature sets and strategic implications of each offering.
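To make the separation of data and compute more concrete, here is a minimal sketch in Python using only the standard library. It is not how Fabric, Synapse, or Databricks are implemented; it simply assumes a dataset written once in an open format (a CSV file standing in for files in a data lake) and two independent, interchangeable "engines" that compute the same result from it. The sample rows and all function names are hypothetical and exist only for illustration.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample data standing in for files in a data lake.
SALES_ROWS = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
]


def write_shared_dataset(folder: Path) -> Path:
    """Write the dataset once, in an open format, to shared storage."""
    path = folder / "sales.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["region", "amount"])
        writer.writerows(SALES_ROWS)
    return path


def totals_with_python(path: Path) -> dict[str, float]:
    """Engine A: aggregate by streaming the file in plain Python."""
    totals: dict[str, float] = {}
    with path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            region = row["region"]
            totals[region] = totals.get(region, 0.0) + float(row["amount"])
    return totals


def totals_with_sql(path: Path) -> dict[str, float]:
    """Engine B: load the same file into an in-memory SQL engine and query it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            conn.executemany(
                "INSERT INTO sales VALUES (?, ?)",
                ((r["region"], float(r["amount"])) for r in reader),
            )
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


def compare_engines() -> tuple[dict[str, float], dict[str, float]]:
    """Run both engines against the same stored file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = write_shared_dataset(Path(tmp))
        return totals_with_python(path), totals_with_sql(path)


if __name__ == "__main__":
    python_totals, sql_totals = compare_engines()
    print(python_totals)
    print(sql_totals)
```

Because the data lives outside either engine, one engine can be swapped for the other without moving or rewriting the data, which is the kind of agility the paragraph above describes.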