Part 9 — Designing the Semantic Layer and Metrics Layer

Part 9 — Designing the Semantic Layer and Metrics Layer – SQLYARD
Deep Technical Series — Building a Modern Data Warehouse and Lakehouse

Part 9 — Designing the Semantic Layer and Metrics Layer


Part 9 of 14 — Deep Technical Series: Building a Modern Data Warehouse and Lakehouse. Series index (Part 0) · ← Part 8: Implementing the Lakehouse

After building your Gold layer in Parts 7 and 8, data is physically available and queryable. But it is not yet ready for analysts and business users. Without a semantic layer, every dashboard developer defines “revenue” differently, every report uses slightly different date filters, and KPIs vary between teams. A semantic layer fixes this by defining business logic — measures, hierarchies, relationships, and security — once, in one place, and sharing it across every report and dashboard in the organization.

This part covers the semantic layer across all major platforms: Power BI and Fabric semantic models, Looker LookML, dbt Metrics with MetricFlow, Databricks SQL semantic views, and Snowflake dynamic tables. Every section includes working code and the design principles that make a semantic layer trustworthy and maintainable at scale.

How this connects to the series: The semantic layer sits between the Gold tables built in Parts 7 and 8 and the BI tools and dashboards consumed by analysts and business users. Orchestration (Part 10) ensures the Gold layer is refreshed on schedule. The semantic layer translates that refreshed Gold data into business-meaningful metrics that anyone in the organization can use without writing SQL.

1 What the Semantic Layer Does and Why It Matters Beginner

The semantic layer is the governed business intelligence surface that sits between your Gold tables and your BI tools. It defines what the data means in business terms — not just what columns exist, but what they represent, how they relate to each other, which ones analysts should see, and how measures should be calculated.

Without a semantic layer, business logic leaks into every dashboard. Developer A calculates revenue as SUM(unit_price * quantity). Developer B includes discounts: SUM(extended_amount). Developer C excludes returns. When leadership compares three dashboards and gets three different revenue numbers, trust in the data collapses. The semantic layer prevents this by making revenue a single, governed, reusable measure that every report uses.

Business logic must not live in dashboards. When measure calculations are embedded in individual Power BI reports, Tableau workbooks, or Looker explores, they diverge over time as different developers make slightly different assumptions. The semantic layer is the single source of truth for every KPI. Define it once, use it everywhere, version-control it like code.

What the Semantic Layer Provides

  • Consistent metric definitions — revenue means the same thing in every report
  • Business-friendly namingCustomerKey becomes “Customer”, ExtendedAmount becomes “Revenue”
  • Logical relationships — how facts join to dimensions, expressed once for all consumers
  • Hierarchies — Year → Quarter → Month → Day, Country → Region → City
  • Row-level security — which users see which rows, enforced at the semantic layer not in each dashboard
  • Aggregation tables — pre-computed summaries that accelerate common queries
  • Documentation — what each measure means, how it is calculated, who owns it

2 Where the Semantic Layer Lives Across Platforms Beginner

PlatformSemantic Layer ToolLanguage / FormatPrimary BI Consumer
Microsoft Fabric / Power BITabular Semantic ModelDAX measures, Power Query MPower BI reports and dashboards
LookerLookMLYAML-like declarative languageLooker explores, Looker Studio
dbtdbt Metrics / MetricFlowYAML metric definitionsAny BI tool via MetricFlow API
DatabricksSQL views + Unity CatalogSQL (CREATE VIEW)Power BI, Tableau, Looker via SQL Warehouse
SnowflakeDynamic Tables, Secure ViewsSQLTableau, Looker, Power BI via Snowflake connector
BigQueryLooker semantic layer or BI EngineLookML or native BigQueryLooker, Looker Studio, connected sheets
Cube.devCube semantic layer (standalone)JavaScript / YAMLAny BI tool or application via Cube API

3 Design Principles of a Strong Semantic Layer Beginner

Principle 1: Define Business Measures Once

Every KPI — revenue, gross margin, churn rate, monthly recurring revenue, customer acquisition cost — must be defined in exactly one place in the semantic layer. If the same metric is defined differently in two reports, one of them is wrong. Version control your semantic layer definitions the same way you version control your dbt models — in Git, with code review, with a change log.

Principle 2: Hide Technical Columns

Analysts and business users should never see surrogate keys, SCD Type 2 metadata columns, ETL audit timestamps, or natural keys. The semantic layer’s job is to expose only what is meaningful for analysis. Hiding CustomerKey, EffectiveFrom, EffectiveTo, IsCurrent, and LoadedAt dramatically simplifies the model from the analyst’s perspective.

Principle 3: Define Relationships Explicitly

Every relationship between a fact table and a dimension — which columns join them, what cardinality the relationship has, whether it is active or inactive — must be defined explicitly in the semantic layer. Do not rely on BI tools to infer relationships from column names. Inferred relationships are fragile and produce incorrect results when column names are ambiguous.

Principle 4: Build Hierarchies for Navigation

Hierarchies enable drill-down navigation in BI reports. Define them explicitly rather than expecting users to figure out which columns drill into which. Common hierarchies every warehouse needs:

  • Date: Year → Quarter → Month → Week → Day
  • Geography: Country → Region → State → City → Postal Code
  • Product: Category → Subcategory → Product → SKU
  • Organization: Company → Division → Department → Team

Principle 5: Security Belongs in the Semantic Layer

Row-level security — which users see which rows — must be enforced in the semantic layer, not in individual dashboards. If security is defined in a dashboard, a developer can accidentally bypass it by building a new dashboard on the same Gold tables without the security filter. Semantic-layer security is enforced for all consumers automatically.

4 Fabric and Power BI Semantic Models Intermediate

The Power BI / Fabric tabular semantic model is the most widely adopted semantic layer in the world. It uses the Tabular Analysis Services engine and DAX (Data Analysis Expressions) for measure definitions. In Microsoft Fabric, semantic models connect directly to Gold tables via Direct Lake mode — no data import, no scheduled refresh, always current.

Reference: Microsoft Semantic Models Documentation

Building the Semantic Model — Steps

  1. Import or connect Gold tables (DimCustomer, DimDate, DimProduct, FctOrderLine)
  2. Define relationships in Model view
  3. Hide surrogate keys and technical columns
  4. Create DAX measures for all KPIs
  5. Build date and geography hierarchies
  6. Apply Row-Level Security roles
  7. Publish to Fabric workspace

DAX Measures — Core KPI Definitions

-- DAX measures: defined in the semantic model, used by all Power BI reports
-- These are the single source of truth for each KPI

-- Basic revenue measure
Revenue =
    SUM ( FctOrderLine[ExtendedAmount] )

-- Revenue with currency formatting
Revenue $ =
    FORMAT ( [Revenue], "$#,##0" )

-- Year-over-year revenue comparison
Revenue YoY % =
VAR CurrentYear   = CALCULATE ( [Revenue], YEAR ( DimDate[FullDate] ) = YEAR ( TODAY () ) )
VAR PreviousYear  = CALCULATE ( [Revenue], YEAR ( DimDate[FullDate] ) = YEAR ( TODAY () ) - 1 )
RETURN
    DIVIDE ( CurrentYear - PreviousYear, PreviousYear, BLANK () )

-- Customer count (distinct -- not row count)
Customer Count =
    DISTINCTCOUNT ( FctOrderLine[CustomerKey] )

-- Average order value
Avg Order Value =
    DIVIDE ( [Revenue], DISTINCTCOUNT ( FctOrderLine[OrderLineKey] ), BLANK () )

-- Gross margin (requires cost in the fact or a separate cost table)
Gross Margin % =
    DIVIDE (
        SUM ( FctOrderLine[ExtendedAmount] ) - SUM ( FctOrderLine[CostAmount] ),
        SUM ( FctOrderLine[ExtendedAmount] ),
        BLANK ()
    )

-- Running total (year-to-date revenue)
Revenue YTD =
    CALCULATE (
        [Revenue],
        DATESYTD ( DimDate[FullDate] )
    )

-- Prior period comparison using time intelligence
Revenue Prior Month =
    CALCULATE ( [Revenue], PREVIOUSMONTH ( DimDate[FullDate] ) )

Revenue MoM Change =
    [Revenue] - [Revenue Prior Month]

Defining Relationships in Power BI

-- Relationships are defined in Power BI Model view (GUI)
-- or via Tabular Object Model (TOM) API for automation
-- The following represents the relationship structure:

-- FctOrderLine[OrderDateKey]  → DimDate[DateKey]         (Many-to-One, Active)
-- FctOrderLine[CustomerKey]   → DimCustomer[CustomerKey]  (Many-to-One, Active)
-- FctOrderLine[ProductKey]    → DimProduct[ProductKey]    (Many-to-One, Active)
-- FctOrderLine[StoreKey]      → DimStore[StoreKey]        (Many-to-One, Active)

-- IMPORTANT: filter direction should be Single (not Both) on all relationships
-- Bidirectional filtering causes ambiguous filter propagation in complex models
-- Use CROSSFILTER() in DAX when you need bidirectional filtering for a specific measure

Hiding Technical Columns

-- In Power BI Desktop: right-click column → Hide
-- Columns to ALWAYS hide from analysts:
-- Surrogate keys:    CustomerKey, ProductKey, DateKey, StoreKey, OrderLineKey
-- SCD2 metadata:     EffectiveFrom, EffectiveTo, IsCurrent
-- Natural keys:      CustomerNaturalKey, ProductNaturalKey
-- ETL metadata:      LoadedAt, _ingested_at, _source_file

-- Columns to EXPOSE:
-- All descriptive attributes: CustomerName, Email, City, LoyaltyTier
-- All measures: already defined as DAX measures
-- Date attributes: Year, Quarter, MonthName, WeekOfYear, IsWeekend

5 Looker and LookML Intermediate

LookML is Looker’s declarative modeling language and one of the most structured semantic layers available. Every dimension, measure, and join is defined in version-controlled YAML-like files. The LookML model is the single source of truth for all Looker explores and dashboards — no calculation can be defined in a dashboard that is not first defined in LookML.

Reference: LookML Introduction Documentation

LookML View — Dimensions and Measures

# LookML View: defines dimensions and measures for a single Gold table
# File: views/fct_order_line.view.lkml

view: fct_order_line {
  sql_table_name: gold.fct_order_line ;;

  # Dimensions -- descriptive attributes
  dimension: order_line_key {
    type:        number
    primary_key: yes
    hidden:      yes              # hide surrogate key from analysts
    sql:         ${TABLE}.order_line_key ;;
  }

  dimension: customer_key {
    type:    number
    hidden:  yes                  # hide foreign key -- exposed through join
    sql:     ${TABLE}.customer_key ;;
  }

  dimension_group: order_date {
    type:        time
    timeframes:  [date, week, month, quarter, year]
    datatype:    date
    sql:         ${TABLE}.order_date ;;
    # Generates: order_date_date, order_date_week, order_date_month, etc.
  }

  dimension: quantity {
    type: number
    sql:  ${TABLE}.quantity ;;
  }

  dimension: unit_price {
    type:        number
    value_format: "$#,##0.00"
    sql:          ${TABLE}.unit_price ;;
  }

  # Measures -- aggregations defined once, used everywhere
  measure: revenue {
    type:         sum
    sql:          ${TABLE}.extended_amount ;;
    value_format: "$#,##0"
    description:  "Total revenue after discounts. Source: FctOrderLine.ExtendedAmount"
  }

  measure: order_count {
    type:        count_distinct
    sql:         ${TABLE}.order_line_key ;;
    description: "Number of distinct order lines"
  }

  measure: avg_order_value {
    type:         number
    sql:          ${revenue} / NULLIF(${order_count}, 0) ;;
    value_format: "$#,##0.00"
    description:  "Average revenue per order line"
  }

  measure: customer_count {
    type:        count_distinct
    sql:         ${TABLE}.customer_key ;;
    description: "Number of distinct customers with at least one order"
  }
}

LookML Explore — Joining Views

# LookML Explore: defines how views join together
# This is the interface analysts use to build queries in Looker
# File: models/analytics.model.lkml

explore: orders {
  label:       "Order Analytics"
  description: "Analyze order revenue, customer behavior, and product performance"
  view_name:   fct_order_line

  join: dim_customer {
    type:        left_outer
    sql_on:      ${fct_order_line.customer_key} = ${dim_customer.customer_key}
                 AND ${dim_customer.is_current} = 1 ;;
    relationship: many_to_one
  }

  join: dim_product {
    type:        left_outer
    sql_on:      ${fct_order_line.product_key} = ${dim_product.product_key}
                 AND ${dim_product.is_current} = 1 ;;
    relationship: many_to_one
  }

  join: dim_date {
    type:        left_outer
    sql_on:      ${fct_order_line.order_date_key} = ${dim_date.date_key} ;;
    relationship: many_to_one
  }
}

# LookML Access Filter: row-level security within an Explore
# Restricts rows based on user attributes set in Looker user configuration
explore: orders {
  # ... joins above ...

  access_filter: {
    field: dim_customer.region
    user_attribute: allowed_region  # Looker user attribute -- set per user/group
  }
}

6 dbt Metrics and MetricFlow Intermediate

dbt Metrics provides a metric definition layer built in YAML within your dbt project. Metrics defined in dbt are version-controlled, testable, and compiled to SQL by MetricFlow — dbt’s semantic layer query engine. Any BI tool that integrates with the MetricFlow API can query dbt metrics without writing their own SQL.

Reference: dbt Metrics Documentation

# dbt metric definition in YAML
# File: models/metrics/revenue_metrics.yml

metrics:
  - name: revenue
    label:       "Revenue"
    description: "Total revenue after discounts. Sum of ExtendedAmount on FctOrderLine."
    type:        simple
    type_params:
      measure:
        name:        extended_amount
        agg:         sum
        expr:        extended_amount
    filter: |
      {{ Dimension('order_line__is_voided') }} = false
    meta:
      owner:       "data-team@company.com"
      source_table: "gold.fct_order_line"

  - name: customer_count
    label:       "Unique Customers"
    description: "Number of distinct customers who placed at least one order"
    type:        simple
    type_params:
      measure:
        name: customer_key
        agg:  count_distinct
        expr: customer_key

  - name: avg_order_value
    label:       "Average Order Value"
    description: "Revenue divided by number of distinct order lines"
    type:        ratio
    type_params:
      numerator:   revenue
      denominator: order_count
# MetricFlow query: compile metrics to SQL
# Run from the dbt CLI or via the MetricFlow API

# Query revenue by month using MetricFlow
mf query \
  --metrics revenue,customer_count \
  --group-by metric_time__month \
  --start-time 2025-01-01 \
  --end-time 2025-12-31

# MetricFlow generates the correct SQL for your target warehouse:
SELECT
    DATE_TRUNC('month', f.order_date)        AS metric_time__month,
    SUM(f.extended_amount)                   AS revenue,
    COUNT(DISTINCT f.customer_key)           AS customer_count
FROM gold.fct_order_line f
WHERE f.order_date BETWEEN '2025-01-01' AND '2025-12-31'
AND   f.is_voided = false
GROUP BY 1
ORDER BY 1;

7 Databricks SQL Semantic Layer Intermediate

In a Databricks environment, the semantic layer is built as SQL views in Unity Catalog on top of Gold Delta tables. These views hide surrogate keys, apply business naming conventions, pre-join frequently combined tables, and expose only the columns and measures that analysts need. BI tools connect to the Databricks SQL Warehouse and query these semantic views rather than the underlying Gold tables directly.

-- Databricks SQL: semantic view over Gold tables
-- Hides surrogate keys, pre-joins common dimensions, exposes business-friendly names

CREATE OR REPLACE VIEW prod.semantic.v_order_analytics AS
SELECT
    -- Date attributes (from DimDate -- analysts query by these)
    d.full_date                 AS order_date,
    d.month_name,
    d.quarter_name,
    d.year,
    d.is_weekend,

    -- Customer attributes (from DimCustomer current version)
    c.customer_name,
    c.city,
    c.state_province,
    c.country,
    c.loyalty_tier,

    -- Product attributes (from DimProduct current version)
    p.product_name,
    p.category,
    p.subcategory,

    -- Store attributes
    s.store_name,
    s.district,
    s.region,

    -- Measures (from FctOrderLine)
    f.quantity,
    f.unit_price,
    f.extended_amount           AS revenue
    -- Note: SUM() aggregations happen in BI tool -- view exposes row-level data

FROM prod.gold.fct_order_line   f
JOIN prod.gold.dim_date         d ON f.order_date_key = d.date_key
JOIN prod.gold.dim_customer     c ON f.customer_key   = c.customer_key
                                 AND c.is_current     = 1
JOIN prod.gold.dim_product      p ON f.product_key    = p.product_key
                                 AND p.is_current     = 1
JOIN prod.gold.dim_store        s ON f.store_key      = s.store_key
                                 AND s.is_current     = 1;

-- Grant access to analysts via Unity Catalog
GRANT SELECT ON VIEW prod.semantic.v_order_analytics TO `analysts`;
-- analysts see only this view -- not the underlying Gold tables
REVOKE SELECT ON SCHEMA prod.gold FROM `analysts`;
-- Pre-aggregated semantic view for common dashboard queries
-- Reduces query time for frequently run revenue-by-month reports

CREATE OR REPLACE VIEW prod.semantic.v_revenue_monthly AS
SELECT
    d.year,
    d.month_name,
    d.month_number,
    c.loyalty_tier,
    p.category,
    s.region,
    SUM(f.extended_amount)           AS revenue,
    COUNT(DISTINCT f.customer_key)   AS unique_customers,
    COUNT(*)                         AS order_line_count
FROM prod.gold.fct_order_line    f
JOIN prod.gold.dim_date          d ON f.order_date_key = d.date_key
JOIN prod.gold.dim_customer      c ON f.customer_key   = c.customer_key
                                  AND c.is_current     = 1
JOIN prod.gold.dim_product       p ON f.product_key    = p.product_key
                                  AND p.is_current     = 1
JOIN prod.gold.dim_store         s ON f.store_key      = s.store_key
                                  AND s.is_current     = 1
GROUP BY
    d.year, d.month_name, d.month_number,
    c.loyalty_tier, p.category, s.region;

8 Snowflake Dynamic Tables and Secure Views Intermediate

Snowflake’s semantic layer is built using a combination of Dynamic Tables for pre-aggregated materialized views and Secure Views for governed access to dimensional data. Dynamic Tables automatically refresh when upstream data changes, making them ideal for frequently queried aggregations that would be expensive to compute on every dashboard load.

Reference: Snowflake Dynamic Tables Documentation

-- Snowflake Secure View: governed access to Gold dimension
-- Secure views prevent analysts from inspecting the view definition
-- and hide surrogate key implementation details

CREATE OR REPLACE SECURE VIEW gold.semantic.customer_dim AS
SELECT
    customer_name,
    email,
    city,
    state_province,
    country,
    loyalty_tier
    -- Surrogate keys and SCD metadata NOT exposed
FROM gold.dim_customer
WHERE is_current = 1;  -- always show current records only

GRANT SELECT ON VIEW gold.semantic.customer_dim TO ROLE analyst_role;
-- Snowflake Dynamic Table: pre-aggregated revenue by month and loyalty tier
-- Automatically refreshes when gold.fct_order_line or gold.dim_customer changes
-- Target lag controls how current the data is (1 minute = near-real-time)

CREATE OR REPLACE DYNAMIC TABLE gold.semantic.revenue_by_month_tier
    TARGET_LAG = '1 hour'    -- refresh within 1 hour of upstream changes
    WAREHOUSE  = bi_warehouse
AS
SELECT
    d.year,
    d.month_name,
    d.month_number,
    c.loyalty_tier,
    SUM(f.extended_amount)          AS revenue,
    COUNT(DISTINCT f.customer_key)  AS unique_customers,
    COUNT(*)                        AS order_line_count,
    AVG(f.extended_amount)          AS avg_order_value
FROM gold.fct_order_line    f
JOIN gold.dim_date           d ON f.order_date_key = d.date_key
JOIN gold.dim_customer       c ON f.customer_key   = c.customer_key
                              AND c.is_current     = 1
GROUP BY d.year, d.month_name, d.month_number, c.loyalty_tier;

-- Monitor Dynamic Table refresh status
SELECT
    name,
    target_lag,
    last_completed_refresh,
    scheduling_state,
    rows
FROM information_schema.dynamic_tables
WHERE schema_name = 'SEMANTIC'
ORDER BY last_completed_refresh DESC;

9 Row-Level Security Across Platforms Advanced

Row-level security (RLS) restricts which rows each user can see based on their identity. A regional sales manager should see only their region’s data. A country-level analyst should see only their country. RLS must be enforced at the semantic layer — not in individual dashboards — so it applies automatically to every report built on top of the semantic layer.

Power BI RLS

-- Power BI RLS: defined in the semantic model using DAX filter expressions
-- In Power BI Desktop: Modeling → Manage Roles → Create Role

-- Role: "Regional Manager"
-- Table: DimStore
-- DAX filter: [Region] = USERPRINCIPALNAME()
-- This restricts the role to rows where Region matches the logged-in user's email

-- Better pattern: use a separate security table
-- Security table: maps user email to allowed regions/countries

-- Role DAX filter on FctOrderLine:
[StoreKey] IN
    CALCULATETABLE (
        VALUES ( SecurityTable[StoreKey] ),
        SecurityTable[UserEmail] = USERPRINCIPALNAME ()
    )

-- Test RLS in Power BI Desktop:
-- Modeling → View As → Select a role → Reports now show filtered data

Looker Access Filters

# Looker RLS via access_filter in LookML
# User attributes set in Looker Admin → User Attributes

explore: orders {
  # Restrict rows by region based on Looker user attribute
  access_filter: {
    field:          dim_store.region
    user_attribute: allowed_region
    # In Looker Admin: set allowed_region = "Northeast" for each user/group
  }

  # Multiple access filters AND together
  access_filter: {
    field:          dim_customer.country
    user_attribute: allowed_country
  }
}

Databricks Unity Catalog Row Filters

-- Databricks: row-level security via Unity Catalog row filters
-- Row filters apply to all queries on the table -- cannot be bypassed

-- Create a row filter function
CREATE OR REPLACE FUNCTION prod.security.filter_by_region(region STRING)
RETURNS BOOLEAN
RETURN region = current_user() OR IS_MEMBER('admin-group');

-- Apply the row filter to the Gold fact table
ALTER TABLE prod.gold.fct_order_line
SET ROW FILTER prod.security.filter_by_region ON (region_column);

-- Now every query on fct_order_line automatically applies the filter
-- Admin group members see all rows
-- Other users see only rows matching their username (replace with actual attribute logic)

10 Aggregation Tables for Performance Advanced

Large fact tables with billions of rows can make common dashboard queries slow even with good physical design. Aggregation tables — pre-computed summaries at common grain levels (daily, monthly, by region, by category) — eliminate the need to scan the full fact table for most dashboard queries. The semantic layer routes queries to the aggregation table automatically when the query grain matches.

-- Power BI aggregation tables: defined in the semantic model
-- Step 1: Create an aggregation table in your warehouse

CREATE TABLE gold.fct_order_line_monthly_agg AS
SELECT
    order_date_key,
    customer_key,
    product_key,
    store_key,
    SUM(quantity)          AS quantity_sum,
    SUM(extended_amount)   AS revenue_sum,
    COUNT(*)               AS order_line_count
FROM gold.fct_order_line
GROUP BY order_date_key, customer_key, product_key, store_key;

-- Step 2: In Power BI semantic model, import the aggregation table
-- and set up the aggregation mapping:
-- fct_order_line_monthly_agg[revenue_sum]     aggregates fct_order_line[extended_amount]
-- fct_order_line_monthly_agg[quantity_sum]    aggregates fct_order_line[quantity]
-- fct_order_line_monthly_agg[order_line_count] aggregates fct_order_line[OrderLineKey]

-- Step 3: Set the aggregation table to Import mode, detail table to DirectQuery
-- Power BI automatically uses the aggregation table for high-grain queries
-- Falls back to the detail table for row-level queries
-- Snowflake: materialized view as aggregation table
-- Automatically maintained by Snowflake when base tables change
CREATE OR REPLACE MATERIALIZED VIEW gold.fct_revenue_monthly_mv AS
SELECT
    date_trunc('month', d.full_date)    AS revenue_month,
    c.loyalty_tier,
    p.category,
    s.region,
    SUM(f.extended_amount)              AS revenue,
    COUNT(DISTINCT f.customer_key)      AS unique_customers
FROM gold.fct_order_line     f
JOIN gold.dim_date            d ON f.order_date_key = d.date_key
JOIN gold.dim_customer        c ON f.customer_key   = c.customer_key
                               AND c.is_current     = 1
JOIN gold.dim_product         p ON f.product_key    = p.product_key
                               AND p.is_current     = 1
JOIN gold.dim_store           s ON f.store_key      = s.store_key
                               AND s.is_current     = 1
GROUP BY 1, 2, 3, 4;
-- Reference: https://docs.snowflake.com/en/user-guide/views-materialized

11 Conformed Metrics — Defining KPIs Once Intermediate

Conformed metrics are the semantic layer equivalent of conformed dimensions — they are defined once and used consistently across all reports, dashboards, and analytics tools. Every KPI your organization uses should be conformed: revenue, gross margin, churn rate, customer acquisition cost, monthly recurring revenue, daily active users.

A Conformed Metrics Reference

MetricDefinitionSource TableNotes
RevenueSUM(ExtendedAmount)FctOrderLineAfter discounts, before returns unless returns are a separate fact
Gross Margin(Revenue – COGS) / RevenueFctOrderLine + DimProductRequires unit cost in dimension or separate cost fact
Customer CountCOUNT DISTINCT CustomerKeyFctOrderLineCustomers with at least one order in the period
Avg Order ValueRevenue / Order CountFctOrderLineRevenue divided by number of distinct orders (not lines)
Churn Rate(Churned Customers / Start Customers)FctSubscriptionRequires subscription start/end logic in the fact
MRRSUM(MonthlyAmount) WHERE ActiveFctSubscriptionMonthly Recurring Revenue — active subscriptions only

Document every metric. For each conformed metric, document: the exact calculation, which columns it uses, what filters apply, whether it includes or excludes specific scenarios (e.g., does “Revenue” include returns?), who approved the definition, and when it was last reviewed. dbt Metrics YAML provides a structured place to store this documentation alongside the definition — it lives in the same Git repository as the transformation code and is reviewed in the same pull request process.

12 Workshops

Novice

Build Your First Semantic Model (Fabric)

  • Import DimCustomer, DimDate, and FctOrderLine into a Fabric semantic model
  • Define relationships between all three tables
  • Hide all surrogate key columns
  • Create a Revenue DAX measure: SUM(FctOrderLine[ExtendedAmount])
  • Create a Customer Count measure: DISTINCTCOUNT(FctOrderLine[CustomerKey])
  • Build a simple Power BI report: Revenue by Month using a bar chart

Intermediate

Build a LookML Semantic Layer

  • Create LookML view files for FctOrderLine, DimCustomer, and DimDate
  • Define Revenue and Customer Count measures in LookML
  • Build an Explore joining all three views
  • Add an access_filter for regional security
  • Validate the Explore in the Looker UI
  • Compare Revenue from LookML vs your Power BI DAX measure — they should match

Advanced

Build a dbt Metrics Layer + Aggregation Table

  • Define Revenue and Customer Count metrics in dbt YAML
  • Run MetricFlow queries at day and month grain
  • Create a monthly aggregation table in your warehouse
  • Configure Power BI aggregation mapping to use the agg table for monthly queries
  • Verify: monthly dashboard queries use the agg table; row-level queries use the detail table
  • Compare Revenue from dbt Metrics vs DAX measure — validate consistency

References

Up next → Part 10: Orchestration and Automation — Airflow, Fabric Data Pipelines, dbt Cloud, Databricks Workflows, ADF, Glue Workflows, Step Functions, and Cloud Composer — scheduling, dependencies, retries, and SLA monitoring.

Discover more from SQLYARD

Subscribe to get the latest posts sent to your email.

Leave a Reply

Discover more from SQLYARD

Subscribe now to keep reading and get access to the full archive.

Continue reading