Back to Blog

Sizing the On-Premises Data Gateway for Power BI - What Actually Works

April 30, 20268 min readMichael Ridland

Sizing the On-Premises Data Gateway for Power BI - What Actually Works

The on-premises data gateway is one of those bits of Power BI infrastructure that nobody thinks about until something breaks. Refreshes start timing out. DirectQuery reports get sluggish. Someone in IT mentions the gateway server, and suddenly you've got finance asking why their Monday morning dashboard hasn't loaded.

We see this pattern a lot. Most Australian organisations we work with started with a tiny gateway server, sized for a handful of reports, and then their Power BI usage grew. The gateway never got resized. A few years later, the same VM is trying to push hundreds of millions of rows during refresh windows while also serving live queries to DirectQuery reports. It does not end well.

This post walks through how we actually think about gateway sizing when we're helping clients. Microsoft's documentation gives you the official guidance, and I'll link to it at the bottom. What I want to share here is the stuff you only learn after running gateways in production for a few years.

The Two Workloads That Behave Completely Differently

Before you can size a gateway, you have to understand what it's doing. The gateway runs two workloads, and they put very different pressure on the box.

The first is the cached data workload. This is what happens during a scheduled refresh of an Import semantic model. The gateway connects to your source, pulls the data, runs any Power Query transformations that couldn't be folded back to the source, and pushes the result up to the Power BI service. This workload is hungry for CPU and memory, and the more complex your Power Query logic, the worse it gets.

The second is the Live Connection and DirectQuery workload. Here the gateway is just a router. Power BI sends a query, the gateway forwards it to the source, the source returns a result, and the gateway sends it back. The work is mostly network and routing. CPU matters, but not nearly as much. Memory pressure is usually negligible.

If you have a single gateway doing both, you've got a problem. A heavy refresh job can starve the box of CPU at exactly the moment your sales team is loading a DirectQuery dashboard. We've watched this happen at clients where finance refreshes ran at 8am, right as the sales team was logging in for stand-up.

Why Query Folding Is The Single Biggest Lever

I want to flag this early because it gets overlooked constantly. Query folding is the difference between a gateway that runs comfortably on 16GB of RAM and one that needs 64GB.

When Power Query can fold a transformation back to the source database, the work happens on your SQL Server (or wherever). The gateway just pulls the already-transformed result. When folding fails, the gateway has to load all the raw data into memory and do the transformation itself. For a 50 million row table with a few merges and grouped aggregations, that's the difference between a refresh that takes 4 minutes and one that takes 45 minutes and OOM-kills your gateway service.

Before you spend money upgrading hardware, get someone to audit your Power Query steps. We've had clients save themselves a full hardware refresh by fixing five or six dataflows that were silently breaking folding. If you want help with this, our Power BI consultants do this kind of review regularly.

Where to Start - The Honest Numbers

Microsoft recommends starting at 8 cores and 8GB of RAM. That's a reasonable starting point if you have very light usage. For most production workloads at Australian mid-market and enterprise clients, we usually start higher.

For a typical mid-sized organisation with maybe 20 to 50 semantic models, a mix of Import and DirectQuery, and a few hundred report users, we tend to recommend:

  • 16 cores minimum
  • 32GB RAM minimum
  • SSD-backed storage (the gateway uses local disk for buffering)
  • Gigabit network, preferably with a clear path to your data sources

For larger workloads or where there's heavy DirectQuery against on-prem SQL Server or SSAS, we'll often go to 32 cores and 64GB. The CPU matters more than people expect because the gateway is doing a lot of compression and serialisation work, especially on the upload side.

The thing nobody warns you about is the network. We've debugged gateway performance issues for hours at a time only to find that the actual bottleneck was a hop through a firewall proxy that was rate-limiting outbound traffic to Azure. The Power BI service expects a fat, reliable pipe. If your gateway sits behind layers of network appliances, that's worth looking at before you throw more CPU at the problem.

When to Cluster, and When Not To

Gateway clusters give you two things. They remove a single point of failure, and they load balance traffic across multiple machines. Both are useful. Neither is necessary for every deployment.

We tend to recommend clustering when:

  • The organisation has any kind of SLA on Power BI availability (even an informal one)
  • Refresh schedules are tight enough that you can't afford a missed window
  • You're running enough concurrent users that a single gateway is genuinely the bottleneck
  • You want to separate cached refresh work from live query work onto different machines

That last point matters and it's where clusters earn their keep. If you set up two gateways and dedicate one to scheduled refreshes and the other to DirectQuery and Live Connection traffic, your live query users stop suffering when finance kicks off the monthly close refresh. You can do this by putting the gateways in separate clusters and pointing different data sources at each.

Where I'd push back on clustering is when an organisation thinks of it as a magic scalability button. If your gateway is slow because Power Query isn't folding, adding a second gateway doesn't fix anything. You've just doubled the cost of running slow refreshes.

The Capacity Ceiling Nobody Mentions

Here's something worth flagging. Your Power BI capacity SKU sets a hard ceiling on refresh parallelism and DirectQuery throughput. There's no point building a 64-core gateway monster if your capacity can only run 6 refreshes at once.

A P1 capacity (or its F-SKU equivalent) is going to use a very different gateway than an F64 or F256. We've had clients spend tens of thousands of dollars upgrading their gateway VMs only to discover the bottleneck was the capacity all along. The capacity throttles, the gateway sits idle, and nobody can figure out why refreshes are still slow.

Check your capacity utilisation in the Fabric admin portal before you change the gateway. If you're seeing capacity throttling, hardware spend on the gateway is wasted. If you want help working out where the actual bottleneck is, our team does this kind of analysis as part of our Microsoft Fabric consulting work.

Things That Reduce Gateway Load (And Almost Nobody Does)

Most of the conversation around gateway performance is about hardware. The cheaper conversation is about reducing what the gateway has to do in the first place.

Some things worth checking:

For Import models, are you running daily refreshes when weekly would do? Are you using incremental refresh on your large tables? A semantic model with 200 million rows that does a full refresh every morning will eat your gateway. The same model with incremental refresh on the date column might only need to pull last week's data, and finishes in two minutes.

For DirectQuery models, have you set up aggregations? A well-designed aggregation table caches the high-level summaries that most dashboard visuals use, and only falls through to DirectQuery for detailed drill-downs. The gateway load drops dramatically.

For real-time reports, check your automatic page refresh intervals. We've seen reports configured to refresh every 5 seconds when they're showing yesterday's data. Every refresh is a round trip through the gateway. Multiply that by a couple of hundred users and you've manufactured your own performance problem.

These changes cost nothing except the time to make them. If your gateway is groaning under load, this is where I'd start before opening a hardware ticket.

A Note on VM-Hosted Gateways

Most clients we work with run their gateway on a VM rather than physical hardware. That's fine and we don't usually push back on it. A couple of things to watch though.

If your data sources are also in Azure, put the gateway VM in the same region as your Power BI tenant. We've watched gateways run twice as slowly because someone provisioned the VM in a region with higher latency to the Power BI service. The data has to travel further than necessary.

ExpressRoute is worth thinking about for larger deployments, especially where you're pushing significant data volumes between on-prem and Azure. The connection is more consistent than public internet, which matters when you're refreshing 50GB of data and you can't afford retries.

And for SSAS Live Connection workloads with dynamic row-level security, make sure the gateway has a fast path back to your on-prem Active Directory. The gateway has to resolve user tokens, and AD latency adds up across hundreds of concurrent queries.

Wrapping Up

Gateway sizing is one of those problems that looks simple until you try to do it. The right answer depends on your workloads, your capacity, your data sources, and how much your business actually relies on Power BI being responsive.

If you start with the official Microsoft starting point of 8 cores and 8GB, monitor real usage for a few weeks, and then size up based on what you actually see, you'll do fine for most workloads. Just don't forget to look at query folding, capacity limits, and network paths before you spend money on bigger machines.

If you're trying to work out whether your current gateway setup is fit for purpose, or you're planning a Power BI rollout and want to get the foundations right, get in touch. We've done a lot of gateway optimisation work and it's usually one of the cheapest wins available.

Reference: On-premises data gateway sizing on Microsoft Learn