Securing Data Factory in Microsoft Fabric - A Practical Checklist

March 26, 2026 · 8 min read · Michael Ridland

Every data engineering project eventually hits the same conversation. You've built the pipelines, the data is flowing, stakeholders are happy with the output - and then the security team asks to review the deployment. If you haven't thought about security from the start, this conversation goes badly.

Data Factory in Microsoft Fabric moves data between sources and destinations. That means credentials, network connections, sensitive business data, and API keys are all in play. Getting this right isn't optional, and yet we regularly see deployments where the security configuration is basically "whatever the defaults were."

I've put together this guide based on what we recommend to our clients and what Microsoft's own security documentation outlines. Think of it as a checklist you can work through, whether you're setting up a new Fabric deployment or hardening an existing one.

Network Security - Keep Your Data Off the Public Internet

This is the foundation. If your data is travelling over the public internet when it doesn't need to, everything else you do is compensating for a problem you shouldn't have.

On-premises data gateway is the first thing to configure if you're connecting to on-premises data sources. It creates an encrypted channel through your firewall without exposing your internal network. We set this up on almost every client engagement because most Australian enterprises still have at least some data sitting on-premises - whether that's a SQL Server instance, a file share, or an ERP system.

The setup itself is straightforward, but there are a few things people get wrong. Put the gateway on a machine with decent specs - it's doing real work, not just proxying requests. Keep it updated - Microsoft releases patches regularly and falling behind can cause connectivity issues. And run it as a service, not interactively. I've seen production gateways that required someone to log into a server and click "start" after every reboot.

VNet data gateway is the better option when your data sources are Azure-based but behind private endpoints. It gives you the same secure connectivity without a gateway machine of your own to install, patch, and monitor. If all your sources are in Azure, this is the way to go.

Service tags are worth knowing about. A service tag represents a group of Microsoft-managed IP address ranges, so you can write network security group rules that allow Fabric connectivity to data sources in Azure virtual networks without maintaining IP address lists by hand. Less maintenance, same security outcome.

Private links at the tenant level force all traffic to Fabric through Microsoft's private network backbone. This is the "belt and braces" option that compliance teams love. It does add some complexity to the initial setup, but for organisations handling sensitive financial or health data, it's a reasonable trade-off.

Identity and Access Management

Getting network security right means nothing if everyone in the organisation has admin access to your Data Factory workspace. This sounds obvious, but you'd be surprised how often we inherit Fabric environments where half the company has Contributor access "because it was easier."

Workspace roles should follow least privilege. Fabric gives you Admin, Member, Contributor, and Viewer roles. Most users need Viewer. People building pipelines need Contributor. Admins should be a small, named list. We typically set this up during the initial workspace creation and review it quarterly.
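If you're reviewing role assignments across many workspaces, scripting against the Fabric REST API beats clicking through the portal. Below is a minimal sketch that builds the request for the workspace role-assignment endpoint; the endpoint path and payload shape follow the public Fabric REST API, but treat the exact field names as something to verify against the current docs, and the IDs here are illustrative.

```python
FABRIC_API = "https://api.fabric.microsoft.com/v1"

# The four Fabric workspace roles, per the least-privilege guidance above.
VALID_ROLES = {"Admin", "Member", "Contributor", "Viewer"}


def role_assignment_request(workspace_id: str, principal_id: str, role: str) -> dict:
    """Build the POST request for assigning a workspace role to a user.

    Rejects anything outside the four Fabric workspace roles, so a typo
    in an automation script can't silently grant the wrong access level.
    """
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role {role!r}; expected one of {sorted(VALID_ROLES)}")
    return {
        "method": "POST",
        "url": f"{FABRIC_API}/workspaces/{workspace_id}/roleAssignments",
        "body": {
            "principal": {"id": principal_id, "type": "User"},
            "role": role,
        },
    }
```

You'd send this with your HTTP client of choice, authenticated with an Entra token scoped to the Fabric API; the point is that role assignments become reviewable code rather than portal clicks.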

Microsoft Entra conditional access is something we push hard for. It lets you set policies based on location, device, and risk level. Practical example - if someone's accessing your Data Factory workspace from an unmanaged device in a country where you don't have staff, that request should be blocked or at minimum require step-up authentication.
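Conditional access policies can be managed as code through the Microsoft Graph `conditionalAccess/policies` endpoint. The sketch below builds a policy body for the "block everything outside trusted locations" pattern described above; the JSON shape follows the Graph conditional access schema as I understand it, the app and location IDs are placeholders, and you'd POST it to Graph with an appropriately scoped token.

```python
def fabric_location_lockdown(app_id: str, trusted_location_ids: list) -> dict:
    """Microsoft Graph conditional access policy body (sketch): block
    access to the given app from anywhere except named trusted locations.

    Deliberately starts in report-only mode - review sign-in logs for a
    week or two before flipping the state to "enabled".
    """
    return {
        "displayName": "Fabric - block access outside trusted locations",
        "state": "enabledForReportingButNotEnforced",  # report-only first
        "conditions": {
            "applications": {"includeApplications": [app_id]},
            "users": {"includeUsers": ["All"]},
            "locations": {
                "includeLocations": ["All"],
                # Named locations you've defined in Entra (offices, VPN ranges)
                "excludeLocations": trusted_location_ids,
            },
        },
        "grantControls": {"operator": "OR", "builtInControls": ["block"]},
    }
```

A companion policy requiring MFA for the trusted locations completes the pattern: block the untrusted, step-up the trusted.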

Multi-factor authentication should be non-negotiable for anyone accessing Fabric. Configure this through conditional access policies rather than per-user MFA settings - it's more manageable and gives you better control. Yes, it adds friction. That friction is the point.

Workspace identities are newer and worth understanding. They let you establish secure connections between Data Factory and firewalled data sources without using personal credentials. Instead of Bob from the data team having his credentials stored in a connection string (and then Bob leaves the company and everything breaks), the workspace itself has an identity that's managed centrally.

Data source access lists are easy to overlook. When you add a cloud data source to a gateway, you can control exactly who can use that data source in their pipelines. This matters: Contributor access to the workspace doesn't mean someone should be able to query every connected data source.

Row-level security controls what data users see in semantic models. If your pipelines feed Power BI reports and different teams should see different data, RLS is how you enforce that at the data layer rather than building separate reports for everyone.

Data Protection

This is where things get interesting from a compliance perspective, especially for Australian organisations dealing with the Privacy Act and industry-specific regulations.

Sensitivity labels from Microsoft Purview let you classify data as it flows through your pipelines. The useful thing is that these labels stick with the data - export it to Excel, the label follows. This gives you auditability. When the compliance team asks "how do we know our customer PII is being handled correctly?" you can point to the labelling and say "it's classified, tracked, and protected."

Data loss prevention policies monitor for sensitive data moving where it shouldn't. If someone builds a pipeline that exports customer financial data to an unprotected location, DLP policies can flag or block it. Setting these up requires some upfront work to define what counts as sensitive in your context, but it pays off.

Azure Key Vault for credential storage should be a hard requirement. Don't embed passwords, API keys, or connection strings directly in your pipeline configurations. Store them in Key Vault and reference them. This centralises your secrets management and means you can rotate credentials without touching every pipeline that uses them.

I've personally seen production environments where SQL passwords were hardcoded into pipeline connection strings. Changing the password meant updating dozens of pipelines manually. Key Vault makes this a non-issue.
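A minimal Python sketch of the pattern, assuming the azure-identity and azure-keyvault-secrets packages; the vault and secret names are illustrative:

```python
def vault_url(vault_name: str) -> str:
    """Key Vault URL for the public Azure cloud."""
    return f"https://{vault_name}.vault.azure.net"


def fetch_secret(vault_name: str, secret_name: str) -> str:
    """Fetch a secret at runtime instead of embedding it in the pipeline.

    Rotating the secret in Key Vault then requires no pipeline changes.
    Requires: pip install azure-identity azure-keyvault-secrets
    """
    # Imported here so the module still loads where the Azure SDKs
    # aren't installed (e.g. local unit tests of the surrounding code).
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=vault_url(vault_name),
        credential=DefaultAzureCredential(),
    )
    return client.get_secret(secret_name).value
```

In a notebook or pipeline script you'd call something like `fetch_secret("my-fabric-vault", "sql-conn-string")` - both names hypothetical - rather than pasting the connection string anywhere it can be committed or copied.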

Logging and Monitoring

If something goes wrong - and eventually, something will - you need to know about it quickly.

Audit logging tracks who did what and when. Pipeline creation, modification, execution, connection changes - it's all captured. Enable this from day one, not after an incident. We recommend setting up retention policies that match your organisation's compliance requirements. Microsoft Purview lets you manage these retention policies centrally.

Monitoring hub in Fabric gives you real-time visibility into pipeline executions. Failed runs, long-running operations, unusual patterns - all visible in one place. Make a habit of checking this regularly, not just when something breaks.

Notifications through Teams or Outlook are worth setting up for critical pipelines. If your nightly data load fails at 2am, you want to know before your stakeholders notice stale data in their morning reports. Fabric supports both Teams and Outlook activity notifications directly from pipeline activities.
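Fabric's built-in Teams and Outlook activities cover most cases. Where they don't fit - say, alerting from an external scheduler that watches pipeline runs - a plain Teams incoming webhook works too. A minimal sketch, with the webhook URL being whatever your channel's incoming-webhook connector gives you:

```python
import json
import urllib.request


def failure_payload(pipeline: str, run_id: str, error: str) -> dict:
    """Minimal message payload for a Teams incoming webhook."""
    return {"text": f"Pipeline **{pipeline}** failed (run {run_id}): {error}"}


def notify_teams(webhook_url: str, payload: dict) -> None:
    """POST the payload to the channel's incoming-webhook URL."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

Keep the message short and actionable - pipeline name, run ID, error - and link out to the monitoring hub for the details.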

Compliance and Governance

For larger organisations, especially those in regulated industries, the governance layer matters as much as the technical security.

Microsoft Purview Information Protection gives you the classification and protection framework. It works alongside the sensitivity labels mentioned earlier but extends into broader governance - data cataloguing, lineage tracking, and policy enforcement.

Microsoft Defender for Cloud Apps integration adds threat detection on top of your Fabric environment. It monitors for suspicious activities and provides alerting. If someone suddenly starts exporting large volumes of data at unusual hours, Defender can flag that.

Content endorsement is a governance feature that lets you mark Data Factory items as certified or promoted. In large organisations with dozens of workspaces, this helps users identify which pipelines and dataflows are official and trusted versus experimental or personal.

Data lineage tracking shows you where your data came from, what transformations it went through, and where it ended up. This is essential for impact analysis - when someone asks "if we change the schema on this source table, what breaks?" lineage tracking gives you the answer.

Backup and Recovery

Microsoft's documentation mentions Git integration for managing pipeline and dataflow development, and I'd say this is the most practical backup strategy available right now. Version control your Data Factory artifacts in Azure DevOps or GitHub. If something goes wrong - an accidental deletion, a bad configuration change - you can restore from source control.

This also gives you proper development workflows. Branching, pull requests, code review for pipeline changes. It sounds like overkill until the first time someone accidentally modifies a production pipeline and you can't figure out what changed.

Where to Start

If you're reading this and thinking "we haven't done half of this stuff" - don't panic. Start with the highest-impact items:

  1. Get credentials into Key Vault
  2. Set up proper workspace roles
  3. Enable audit logging
  4. Configure network security (gateways, private links)
  5. Roll out conditional access and MFA

That covers the biggest risks. Then layer on sensitivity labels, DLP policies, and governance features over time.

Microsoft's full security documentation for Data Factory in Fabric goes deeper on each of these areas.

How We Help

We do a lot of Microsoft Fabric consulting and Data Factory work for Australian businesses, and security configuration is always part of the conversation. It's not a separate project - it's built into how we set things up from the start.

If you're planning a Fabric deployment or want a security review of an existing one, reach out to us. We can run through your current configuration, identify gaps, and help you get to a state where your security team and your data team are both happy.