Microsoft Fabric Data Factory Pricing Explained - What You Actually Pay For

June 17, 2026 | 7 min read | Michael Ridland

The first question almost every client asks about Microsoft Fabric is "what's this going to cost us?" And the honest answer, before we've looked at their workloads, is "it depends." That's not a dodge. Fabric pricing genuinely works differently from the per-pipeline, per-activity model people were used to in the standalone Azure Data Factory, and if you carry the old mental model across, you'll either over-provision or get a surprise on the bill.

We've helped a fair few Australian organisations move data workloads onto Fabric, and the pricing conversation is one we have early and often. It's not complicated once you understand the unit of currency, but it's different enough that it catches people out. Let me walk through how it actually works.

Microsoft's official pricing overview for Fabric Data Factory is the source of truth for the numbers. What follows is the practical interpretation - how to think about it when you're planning a budget rather than reading a rate card.

The Big Shift - Capacity Units, Not Per-Activity Billing

In the old Azure Data Factory, you paid per pipeline activity run, per data integration unit hour, per pipeline orchestration. You could estimate a pipeline's cost by counting its activities. Granular, predictable, and a bit fiddly.

Fabric throws that model out. Instead, everything in Fabric - Data Factory pipelines, dataflows, notebooks, warehouses, Power BI, the lot - runs on a shared pool of compute called a capacity, measured in Capacity Units, or CUs. You buy a capacity at a certain size (F2, F4, F8, F64 and so on, where the number is the CU count, so an F8 provides 8 CUs), and every workload draws from that same pool.
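To make the unit concrete, here is a minimal sketch of the compute budget an F SKU implies. It assumes the number in the SKU name is the CU count and that consumption is tallied in CU seconds. The function names are illustrative only and are not part of any Fabric API.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def capacity_units(sku: str) -> int:
    """Return the CU count for an F SKU name such as 'F8' (assumes F<n> means n CUs)."""
    if not sku.upper().startswith("F"):
        raise ValueError(f"not an F SKU: {sku!r}")
    return int(sku[1:])


def cu_seconds_per_day(sku: str) -> int:
    """CU seconds the capacity can supply over a full 24 hours."""
    return capacity_units(sku) * SECONDS_PER_DAY


for sku in ("F2", "F8", "F64"):
    print(f"{sku}: {capacity_units(sku)} CUs, {cu_seconds_per_day(sku):,} CU seconds per day")
```

Every pipeline run, dataflow refresh and report query is effectively spending from that daily figure, which is why the rest of the budgeting conversation happens in CU seconds rather than in activity counts.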

This is the single most important thing to understand. You're not buying Data Factory in isolation. You're buying a slab of compute that Data Factory shares with every other workload assigned to that capacity. The implication is that your data pipelines and your Power BI reports are competing for the same resource, and you budget for the whole thing together rather than line by line.

What Drives Data Factory Cost Inside a Capacity

Within that shared capacity, Data Factory operations consume CUs based on what they do. The two main consumers are:

Pipeline orchestration and activity execution. Running pipelines, moving data between sources, executing copy activities - all of this burns CU seconds. The heavier the data movement, the more it consumes.

Dataflow Gen2 refreshes. Dataflows are compute-hungry. If you've got complex transformations running on a schedule, they can become the dominant cost in your Data Factory usage. We've seen dataflows quietly eat a big chunk of a capacity simply because they were set to refresh more often than the business actually needed.
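A rough sketch of why refresh frequency matters so much. The 12,000 CU seconds per refresh is a made-up figure for illustration; the real number for your dataflow would come from the Capacity Metrics app. The F8 capacity is likewise just an example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_share_of_capacity(cu_seconds_per_run: float, runs_per_day: int, capacity_cus: int) -> float:
    """Fraction of a capacity's daily CU seconds used by one scheduled job."""
    daily_usage = cu_seconds_per_run * runs_per_day
    return daily_usage / (capacity_cus * SECONDS_PER_DAY)


# Hypothetical dataflow: 12,000 CU seconds per refresh on an F8 capacity.
hourly = daily_share_of_capacity(12_000, runs_per_day=24, capacity_cus=8)
four_daily = daily_share_of_capacity(12_000, runs_per_day=4, capacity_cus=8)

print(f"Hourly refresh:      {hourly:.1%} of the capacity")
print(f"Four refreshes/day:  {four_daily:.1%} of the capacity")
```

Under these assumed numbers, the same dataflow goes from a large slice of the capacity to a small one purely by changing the schedule, with no change to the transformation logic.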

There's also a separate meter worth knowing about: data movement. Copy activity in particular has its own consumption rate tied to how much data you shift and how far, and that consumption comes out of the same capacity. High-volume ingestion jobs are where this shows up.

The key behaviour to grasp is that Fabric meters consumption against your capacity in near real time, and it smooths spikes through a mechanism called bursting and smoothing. If a pipeline run briefly demands more than your capacity provides, Fabric can burst to handle it, then spread that consumption out over the following minutes or hours. This is genuinely helpful - it stops a single heavy job from immediately throttling everything else - but it also means your usage and your billed consumption don't line up minute to minute, which confuses people the first time they look at the metrics.
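The effect of smoothing can be illustrated with a toy model. This is a deliberately simplified sketch, not Fabric's actual algorithm: it assumes fixed 30-second intervals, an even spread over a five-interval window, and an F8 capacity, all chosen purely for illustration.

```python
def smooth(usage: list[float], window: int) -> list[float]:
    """Spread each interval's consumption evenly over itself and the next window-1 intervals."""
    smoothed = [0.0] * (len(usage) + window - 1)
    for i, amount in enumerate(usage):
        for j in range(window):
            smoothed[i + j] += amount / window
    return smoothed


# Assumed setup: F8 capacity, 30-second intervals -> 240 CU seconds available per interval.
capacity_per_interval = 8 * 30

# One heavy pipeline run lands 900 CU seconds in a single interval.
raw = [0.0, 900.0, 0.0, 0.0, 0.0, 0.0]
smoothed = smooth(raw, window=5)

print("raw peak:     ", max(raw), "vs capacity", capacity_per_interval)
print("smoothed peak:", max(smoothed), "vs capacity", capacity_per_interval)
```

The total consumption is unchanged, but the raw peak exceeds the capacity while the smoothed peak does not. That gap between the two series is the reason usage and billed consumption look out of step in the metrics.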

Pay-As-You-Go vs Reserved Capacity

You've got two ways to pay for a Fabric capacity, and the choice matters for the budget.

Pay-as-you-go bills you by the hour for whatever capacity size you've provisioned. The nice feature here is that you can pause a capacity when you don't need it, and while it's paused you're not paying for compute. For dev and test environments, or workloads that only run during business hours, pausing can cut the bill substantially. We set this up routinely - there's no reason to pay for an idle test capacity overnight and on weekends.
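As a back-of-envelope check on what pausing saves, here is a small sketch. The 10-hours-a-day, five-days-a-week schedule is an assumed example, and the hourly rate is a placeholder rather than a real price; take the actual rate from Microsoft's pricing page for your region and SKU.

```python
HOURS_PER_WEEK = 24 * 7


def weekly_compute_cost(hourly_rate: float, running_hours: float) -> float:
    """Pay-as-you-go compute cost for the hours the capacity is not paused."""
    return hourly_rate * running_hours


def pause_saving_fraction(running_hours: float) -> float:
    """Fraction of the always-on compute bill avoided by pausing the rest of the week."""
    return 1 - running_hours / HOURS_PER_WEEK


PLACEHOLDER_RATE = 1.0                 # hypothetical currency units per hour
business_hours = 10 * 5                # assumed schedule: 10 hours a day, Monday to Friday

print("always on:", weekly_compute_cost(PLACEHOLDER_RATE, HOURS_PER_WEEK))
print("paused out of hours:", weekly_compute_cost(PLACEHOLDER_RATE, business_hours))
print(f"saving: {pause_saving_fraction(business_hours):.0%}")
```

The saving fraction does not depend on the rate at all, only on how many hours the capacity actually needs to be running.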

Reserved capacity is a one-year commitment that gives you a meaningful discount over pay-as-you-go, in the order of 40 per cent for the equivalent capacity. The trade-off is you commit to and pay for that capacity whether you use it or not, and you can't pause it for savings the way you can with pay-as-you-go.

The rule of thumb we give clients: run pay-as-you-go while you're still figuring out your steady-state usage, then move your stable production capacity to a reservation once you know your real baseline. Committing to a reservation before you understand your workload is how you end up paying for capacity you don't use.
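The rule of thumb has simple arithmetic behind it. The sketch below assumes a flat 40 per cent reservation discount (the approximate figure mentioned above) and compares costs relative to an always-on pay-as-you-go capacity. It ignores storage and any other charges.

```python
def relative_costs(running_fraction: float, reserved_discount: float = 0.40) -> dict[str, float]:
    """Cost of each option as a fraction of an always-on pay-as-you-go capacity.

    running_fraction: share of hours a pay-as-you-go capacity would be unpaused (0 to 1).
    reserved_discount: assumed reservation discount; a reservation is paid for every hour.
    """
    return {
        "pay-as-you-go": running_fraction,
        "reserved": 1.0 - reserved_discount,
    }


def cheaper_option(running_fraction: float, reserved_discount: float = 0.40) -> str:
    costs = relative_costs(running_fraction, reserved_discount)
    return min(costs, key=costs.get)


print(cheaper_option(50 / 168))   # dev/test capacity running business hours only
print(cheaper_option(1.0))        # production capacity that never pauses
```

With an assumed 40 per cent discount, the break-even point is a capacity that needs to run about 60 per cent of the time. Below that, pausing a pay-as-you-go capacity wins; above it, the reservation does.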

The Trap - Sizing Capacity Wrong

The most common pricing mistake we see isn't about the rate card at all. It's about capacity sizing.

People look at the F-SKU list, see that F64 is the threshold for certain Power BI features, and reflexively buy F64 even when their actual compute needs sit comfortably at F8 or F16. Or they go the other way, buy too small, and then everything throttles during the morning refresh window when half the business is hitting reports at once.

The right approach is to start with an honest estimate of concurrent workload, provision somewhere sensible, then watch the Fabric Capacity Metrics app for a couple of weeks. That app is your best friend for cost control. It shows you exactly which workloads are consuming CUs, when you're hitting capacity limits, and whether you're over-provisioned. We always stand this up early on a Fabric engagement, because guessing at capacity size without it is just expensive trial and error.

One thing to watch: the F64 threshold genuinely matters if you want Power BI report consumption included without per-user Pro licences. Below F64, report viewers need their own licences. So the "right" capacity size sometimes has more to do with how many people view Power BI reports than how much data your pipelines move. Factor that into the decision rather than sizing purely on Data Factory load.
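One way to frame the F64 decision is as a break-even viewer count. The sketch below uses placeholder monthly prices that are not real rates; substitute current figures for your region. It also assumes the smaller capacity is large enough for the compute load and only counts report viewers.

```python
import math


def break_even_viewers(small_capacity_monthly: float, f64_monthly: float, licence_per_viewer: float) -> int:
    """Smallest number of report viewers at which F64 costs no more than
    a smaller capacity plus one per-user licence for each viewer."""
    extra_capacity_cost = f64_monthly - small_capacity_monthly
    return math.ceil(extra_capacity_cost / licence_per_viewer)


# Placeholder prices for illustration only.
viewers_needed = break_even_viewers(
    small_capacity_monthly=1_000.0,
    f64_monthly=8_000.0,
    licence_per_viewer=15.0,
)
print(f"F64 breaks even at about {viewers_needed} report viewers")
```

If your viewer count is well below the break-even figure, sizing on compute need and buying licences is likely cheaper; well above it, the larger capacity starts to pay for itself regardless of pipeline load.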

Honest Take - What's Good and What's Rough

The good: the unified capacity model is genuinely simpler to reason about at the organisational level once it clicks. You buy one thing, everything draws from it, and you can scale or pause that one thing. For organisations consolidating a sprawl of separate Azure data services, the single-bill simplicity is a real win.

The rough edges: the smoothing and bursting behaviour makes real-time cost attribution harder than it should be. If your finance team wants to know "what did the marketing data pipeline cost us last month," the answer involves digging through the Capacity Metrics app and doing some interpretation, because everything shares the pool. Cost allocation across teams in a single capacity is a known pain point. If you need clean per-team chargeback, you may end up running separate capacities per team, which costs more in aggregate.
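If separate capacities are more than you need, one common workaround is a proportional split. The sketch below assumes you have already pulled CU seconds per team out of the Capacity Metrics app; the team names and all figures are hypothetical.

```python
def allocate_cost(capacity_cost: float, cu_seconds_by_team: dict[str, float]) -> dict[str, float]:
    """Split a capacity's cost across teams in proportion to the CU seconds each consumed."""
    total = sum(cu_seconds_by_team.values())
    if total == 0:
        raise ValueError("no consumption recorded, nothing to allocate against")
    return {team: capacity_cost * cu / total for team, cu in cu_seconds_by_team.items()}


# Hypothetical month: cost in placeholder currency units, consumption in CU seconds.
usage = {"marketing": 200_000, "finance": 300_000, "operations": 500_000}
for team, cost in allocate_cost(10_000.0, usage).items():
    print(f"{team}: {cost:,.2f}")
```

Note the design choice hiding in here: idle capacity is spread across teams in the same proportion as their usage. That is an accounting convention rather than a measurement, which is part of why the chargeback conversation needs interpretation.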

The other thing to watch is that it's easy to let consumption creep. Because individual pipelines don't each show a dollar figure, there's less natural pressure to optimise them than there was when every activity had a visible cost. We've reviewed Fabric tenants where a handful of badly-scheduled dataflows were driving most of the consumption, and nobody had noticed because the capacity just absorbed it until it started throttling. Regular review matters.
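A regular review can be as simple as ranking items by consumption and seeing how few of them account for most of it. This sketch assumes a per-item CU seconds export from the Capacity Metrics app; the item names, figures and the 80 per cent threshold are all hypothetical.

```python
def heavy_hitters(cu_seconds_by_item: dict[str, float], threshold: float = 0.8) -> list[str]:
    """Return the smallest set of top consumers that together reach `threshold` of total usage."""
    total = sum(cu_seconds_by_item.values())
    picked: list[str] = []
    running = 0.0
    ranked = sorted(cu_seconds_by_item.items(), key=lambda kv: kv[1], reverse=True)
    for name, cu in ranked:
        if running >= threshold * total:
            break
        picked.append(name)
        running += cu
    return picked


# Hypothetical export for one review period.
usage = {
    "df_sales_hourly": 500,
    "df_stock_hourly": 300,
    "pl_ingest_orders": 120,
    "pl_archive": 50,
    "nb_month_end": 30,
}
print(heavy_hitters(usage))
```

In this made-up example two dataflows cover 80 per cent of consumption, so those are the two schedules worth questioning first.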

Keeping the Bill Under Control

A few practical habits keep Fabric Data Factory costs sensible:

Pause non-production capacities outside business hours. Schedule dataflow and pipeline refreshes to match actual business need rather than defaulting to hourly. Watch the Capacity Metrics app and act on what it tells you. Move stable production workloads to reserved capacity once you know your baseline, but not before. And review your heaviest consumers every month or two, because the heavy hitters are usually a small number of jobs that can be optimised.

This is the sort of ongoing discipline that's easy to describe and easy to let slide. It's also exactly where good Microsoft Fabric consulting pays for itself, often several times over, by keeping a capacity right-sized rather than quietly bloated. When we run Data Factory work for clients, cost monitoring is part of the build, not an afterthought once the bill arrives.

If you're planning a move to Fabric and want a realistic cost picture before you commit, talk to our team. We can look at your actual workloads and give you a grounded estimate rather than a finger-in-the-air number. Getting the sizing right at the start saves a lot of money and a lot of awkward budget conversations later.