Part 2. Direct Lake in Microsoft Fabric: High Speed Analytics Without Import or DirectQuery

Introduction

Direct Lake is one of the most important innovations introduced in Microsoft Fabric. It is not an incremental improvement over Import or DirectQuery. It is a new storage mode that lets the VertiPaq engine read Delta Lake files directly from OneLake without data imports, scheduled data refreshes, or constant round trips to a SQL database. For many organizations on Fabric, Direct Lake can be the fastest and most cost-effective way to serve large analytical models at enterprise scale. [1]

This article breaks down how Direct Lake works, where it fits compared to Import and DirectQuery, performance benefits, design patterns, and common pitfalls. You will also get a hands-on workshop building a real Direct Lake semantic model in Fabric.

What Direct Lake Is and Why It Matters

Direct Lake is a storage mode for semantic models that requires a Fabric or Power BI Premium capacity. Instead of loading data into the VertiPaq cache through traditional imports, semantic models can mount Delta Lake files from OneLake directly into the analytical engine. This creates a near in-memory experience without the overhead of managing data refreshes. [2]

In simple terms:
Your semantic model reads the data sitting in your lakehouse as if it were already compressed and optimized in memory.

Inline reference: VertiPaq Direct Lake whitepapers describe this as “delta hydration into compressed memory contexts.” [3]

Key benefits

• Zero copy access to Delta tables
• No data-copy refreshes: a lightweight metadata refresh (framing) picks up new Delta table versions
• Much faster than DirectQuery
• Lower overhead than full Import
• Scaling based on storage rather than complex pipelines
• No latency buildup from incremental processing
• Great for near real time analytics in Fabric

Where Direct Lake Fits

Direct Lake sits between Import and DirectQuery, but behaves closer to Import in performance. [4]

Mode        | Latency   | Data Freshness | Storage Use  | Ideal For
Import      | Fastest   | Scheduled      | In-memory    | High performance reporting
Direct Lake | Very fast | Near real time | Lake storage | Lakehouse analytics
DirectQuery | Slowest   | Real time      | External     | Operational dashboards

How Direct Lake Works Under the Hood

1. Delta Lake Tables

Fabric stores Lakehouse and Warehouse tables in Delta format. Each table is a set of Parquet data files plus a transaction log that tracks which files belong to the current version. [5]
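To make the transaction log idea concrete, here is a minimal sketch of how a reader could replay a Delta log to find the Parquet files that make up the current table version. It assumes the commits are already loaded as strings, and it handles only "add" and "remove" actions. Real logs also contain metadata, protocol, and checkpoint entries, which this sketch ignores. The file names and the function name are hypothetical.

```python
import json

def active_files(log_entries):
    """Replay Delta commits in order and return the set of live Parquet files.

    log_entries: iterable of commit file contents (strings), oldest first.
    Each commit is newline-delimited JSON; 'add' and 'remove' actions carry a 'path'.
    """
    live = set()
    for commit in log_entries:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live

# Hypothetical two-commit history for a FactSales table.
commit_0 = "\n".join([
    json.dumps({"add": {"path": "part-0000.parquet", "size": 1024}}),
    json.dumps({"add": {"path": "part-0001.parquet", "size": 2048}}),
])
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "part-0000.parquet"}}),
    json.dumps({"add": {"path": "part-0002.parquet", "size": 4096}}),
])

print(sorted(active_files([commit_0, commit_1])))
# → ['part-0001.parquet', 'part-0002.parquet']
```

The point of the sketch is that a new table version is just a new list of files, which is why an engine reading Delta directly can pick up changes without copying data.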

2. OneLake Virtualization

OneLake exposes Delta tables through a single logical namespace, including tables reached through shortcuts, so models can bind directly to them without data movement. [6]

3. VertiPaq Page Mounting

VertiPaq reads Delta files directly and hydrates column segments into memory only when needed.
This means:
• Queries hit in-memory column segments
• Data is not pre-imported
• Refresh cycles are no longer needed
• Transactions propagate quickly through the metadata layer

Inline reference: VertiPaq Direct Lake technical notes [7]
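The on-demand behavior described above can be pictured with a toy model. This is not how VertiPaq is implemented. It is only a sketch of the idea that a column is read from storage the first time a query touches it and is served from memory afterwards. The class name, the loader function, and the sample data are all hypothetical.

```python
class LazyColumnStore:
    """Toy stand-in for on-demand loading: a column is read only when first queried."""

    def __init__(self, loader):
        self._loader = loader      # function: column name -> list of values
        self._resident = {}        # columns currently held in memory
        self.loads = 0             # how many times we went back to storage

    def column(self, name):
        if name not in self._resident:
            self._resident[name] = self._loader(name)
            self.loads += 1
        return self._resident[name]

    def evict(self):
        """Drop all resident columns, for example after the table changes."""
        self._resident.clear()

# Hypothetical table held in "storage".
storage = {"SalesAmount": [10.0, 25.5, 4.5], "Quantity": [1, 3, 1]}
store = LazyColumnStore(lambda name: storage[name])

total = sum(store.column("SalesAmount"))        # first touch loads the column
total_again = sum(store.column("SalesAmount"))  # second touch is served from memory
print(total, store.loads)
# → 40.0 1
```

Note that the Quantity column is never loaded here because no query asked for it, which is the property that keeps the memory footprint tied to what reports actually use.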


Direct Lake Requirements

To use Direct Lake:
• You need a Power BI Premium or Fabric capacity (F SKU or P SKU). [8]
• Your semantic model must be built on Delta Lake tables in OneLake.
• You must use a Lakehouse or Warehouse that stores tables in Delta.

Direct Lake is not supported on:
• SQL Server
• Azure SQL
• Dataverse
• Excel
• Text or CSV sources without conversion
• Sources outside OneLake


Performance Patterns

Below are the main patterns where Direct Lake shines.

1. High volume fact tables

Direct Lake is ideal for event tables, IoT streams, clickstream logs, and other fast-growing transactional data. [9]

2. Mixed workload analytics

SQL teams can ingest into a Lakehouse while analysts consume the same tables through the semantic model immediately. [10]

3. Real time dashboards

Direct Lake eliminates the pain of “refresh duration vs freshness” trade-offs.

4. Large model support

Because columns are loaded on demand, Direct Lake tables do not bloat the in-memory footprint, which allows models that would exceed Import storage limits. Capacity guardrails such as per-table row limits still apply. [11]


Direct Lake vs Import: When To Use Each

Use Direct Lake when:

• You are working inside Fabric
• Data changes frequently
• Low latency matters
• You want to eliminate refresh costs
• Your fact table exceeds memory limits

Use Import when:

• You need the fastest possible performance
• Your pipeline is outside Fabric
• You require fully compressed in-memory columns
• You depend heavily on complex aggregations
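The two checklists above can be condensed into a small decision helper. This is only a sketch of the article's rules of thumb, not an official sizing tool. The field names are hypothetical, and the tie-break (frequent change or memory pressure outweighs the wish for peak speed) is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    data_in_onelake: bool        # tables already live as Delta in OneLake
    changes_frequently: bool     # data freshness matters
    exceeds_import_memory: bool  # fact table too large to import comfortably
    needs_peak_speed: bool       # absolute fastest query time is the priority

def suggest_mode(w: Workload) -> str:
    """Return 'Direct Lake' or 'Import' following the checklists above."""
    if not w.data_in_onelake:
        return "Import"          # pipeline is outside Fabric
    if w.needs_peak_speed and not (w.changes_frequently or w.exceeds_import_memory):
        return "Import"          # small, stable model where raw speed wins
    return "Direct Lake"

print(suggest_mode(Workload(True, True, False, False)))
# → Direct Lake
```

A helper like this is mostly useful as a conversation starter in design reviews, since real decisions also depend on capacity size and report complexity.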


Design Best Practices for Direct Lake

Keep your Lakehouse clean

Poor folder structure or mixed file formats can cause the model to revert to DirectQuery mode. [12]

Use supported data types

Unsupported column types (binary, complex nested structures) will force fallbacks.
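A quick schema check before modeling can catch these columns early. The sketch below takes the article's list (binary and complex nested types) as its rule set and assumes type names are spelled the way Spark prints them, such as "array<string>". The function name and the sample schema are hypothetical.

```python
# Assumption: type names follow Spark's short spelling, e.g. "array<string>".
UNSUPPORTED_PREFIXES = ("binary", "array", "map", "struct")

def find_unsupported_columns(schema):
    """schema: dict of column name -> type string. Returns the offending columns."""
    return {
        name: dtype
        for name, dtype in schema.items()
        if dtype.strip().lower().startswith(UNSUPPORTED_PREFIXES)
    }

# Hypothetical FactSales schema with two problem columns.
schema = {
    "OrderID": "bigint",
    "SalesAmount": "decimal(18,2)",
    "Payload": "binary",
    "Tags": "array<string>",
}
print(find_unsupported_columns(schema))
# → {'Payload': 'binary', 'Tags': 'array<string>'}
```

Columns flagged this way are usually better flattened or dropped in the lake before the semantic model is built.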

Enforce good schema design

• Surrogate keys for dimensions
• Numeric keys whenever possible
• Date table using standard formatting
• Avoid heavy text fields in fact tables

Watch out for fallback mode

If the engine cannot mount a Delta table, it silently falls back to DirectQuery.
Your performance will tank.

Inline reference: Microsoft Fabric fallback mode behavior [13]


Workshop: Build a Direct Lake Semantic Model in Fabric

Step 1. Create a Lakehouse and Load Delta Data

• Open Fabric
• Create a Lakehouse
• Upload data to Files
• Convert to Delta using “Create table”
• Confirm Delta tables appear

Example tables:
• FactSales
• DimDate
• DimProduct
• DimCustomer
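If you prefer a notebook over the “Create table” menu, Step 1 can be sketched in PySpark. The example assumes it runs in a Fabric notebook with a default Lakehouse attached, where a `spark` session already exists and paths under Files/ resolve relative to that Lakehouse. The file path, table name, and function name are hypothetical.

```python
def csv_to_delta_table(spark, csv_path, table_name):
    """Read a CSV from the Lakehouse Files area and save it as a Delta table."""
    df = (
        spark.read
        .option("header", "true")       # first row holds column names
        .option("inferSchema", "true")  # let Spark guess column types
        .csv(csv_path)
    )
    # Overwrite keeps the example repeatable; use "append" for incremental loads.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
    return df

# In a Fabric notebook, where `spark` is predefined, the call might look like:
# csv_to_delta_table(spark, "Files/raw/fact_sales.csv", "FactSales")
```

Inferring the schema is convenient for a workshop. For production tables, an explicit schema avoids surprises such as numeric keys being read as text.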

Step 2. Create a Semantic Model

• Select “New semantic model”
• Choose the Delta tables
• Validate relationships

Step 3. Verify Direct Lake Mode

A working table should show:
Storage = Direct Lake

If not:
• Fix unsupported data types
• Regenerate Delta logs
• Validate folder structure

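Step 4. Add Measures

-- Sum of the sales amount in the current filter context
Total Sales :=
SUM(FactSales[SalesAmount])

-- Orders placed in the trailing 60 minutes (TIME takes hours, minutes, seconds)
Orders This Hour :=
VAR Cutoff = NOW() - TIME(1,0,0)
RETURN
    CALCULATE(
        COUNTROWS(FactSales),
        KEEPFILTERS(FactSales[OrderTimestamp] >= Cutoff)
    )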

Step 5. Build a Report

• Connect via Power BI Desktop
• Build visuals
• Publish

Step 6. Validate Performance

• Table visuals
• Grouping tests
• Distinct counts
• Slicing by category

On well-maintained Delta tables, expect sub-second results for these tests once the queried columns are in memory, even on large tables.


Common Direct Lake Mistakes

• Using unsupported sources
• Letting Lakehouse folder structure drift
• Too many small parquet files
• Modeling before cleaning the lake


Summary

Direct Lake changes the entire performance equation inside Power BI and Fabric. Instead of picking between Import and DirectQuery, Direct Lake gives you near in-memory performance directly from your lakehouse with no refresh cycles and no data movement. It supports massive fact tables, real time analytics, and SQL pipelines without introducing overhead or maintenance.

Next up: Part 3. Real Time Semantic Models


References

  1. Microsoft Fabric Semantic Model Overview
    https://learn.microsoft.com/fabric/
  2. Microsoft Learn – Fabric Capacity
    https://learn.microsoft.com/power-bi/enterprise/service-premium
  3. Direct Lake Engine Notes
    https://learn.microsoft.com/fabric/data-engineering/direct-lake-overview
  4. Fabric Architecture
    https://learn.microsoft.com/fabric/get-started/
  5. Delta Lake Docs
    https://delta.io
  6. OneLake Documentation
    https://learn.microsoft.com/fabric/onelake
  7. VertiPaq Engine Internals
    https://learn.microsoft.com/power-bi/guidance/
  8. Power BI Premium Documentation
    https://learn.microsoft.com/power-bi/enterprise/service-premium
  9. Fabric Real Time Analytics
    https://learn.microsoft.com/fabric/real-time
  10. Fabric Lakehouse Technical Overview
    https://learn.microsoft.com/fabric/data-engineering/lakehouse
  11. VertiPaq Compression Whitepaper
    https://learn.microsoft.com/power-bi/guidance/star-schema
  12. Lakehouse Best Practices
    https://learn.microsoft.com/fabric/data-engineering/
  13. Direct Lake Fallback Behavior
    https://learn.microsoft.com/fabric/data-engineering/direct-lake-fallback
