Data Factory for SAP Integration - Connecting SAP to Power BI
SAP is everywhere in Australian enterprise. Mining companies, manufacturers, utilities, large retailers, government departments - if you operate at scale in Australia, there's a good chance your core business processes run on SAP ECC, S/4HANA, or SAP BW.
And if you run SAP, you almost certainly have a reporting problem.
SAP's built-in reporting tools (SAP GUI reports, BEx queries, embedded analytics in Fiori) serve SAP users well enough, but they don't help the 80% of your organisation that doesn't live in SAP. Finance wants data in Excel. Operations wants dashboards. Executives want a single view across SAP and non-SAP systems. The sales team wants CRM data blended with SAP order data.
This is where Data Factory and Power BI come in. Data Factory extracts data from SAP, stages it in a modern data platform, and Power BI turns it into reports and dashboards that anyone in the organisation can use.
We've built this architecture for Australian organisations across mining, manufacturing, retail, and professional services. Here's how to do it right.
Why Not Connect Power BI Directly to SAP?
Power BI has native SAP connectors (SAP HANA and SAP BW), so you might wonder why you need Data Factory at all. Fair question.
Reasons to use Data Factory instead of direct Power BI-to-SAP connections:
Performance. Direct queries against SAP for large datasets are slow and put load on your SAP system. Your SAP admin will not be happy when 50 Power BI users all hit SAP at 9 AM.
Data blending. Most useful reports combine SAP data with non-SAP data (CRM, HR, external market data). Data Factory lets you land everything in one place and join it there.
Data transformation. SAP data structures are notoriously complex. Material master data alone can span 20+ tables. Flattening those structures in Power Query is painful and brittle; Data Factory handles the transformation much better.
History and snapshots. SAP is transactional - most tables hold current state, not history. Data Factory can capture daily snapshots to enable trend analysis that SAP alone can't provide.
Reduced SAP dependency. If SAP goes down for maintenance, your Power BI reports still work because they're reading from a separate data store, not from SAP directly.
Security separation. You can grant reporting access through the data platform without granting SAP access. This is a big deal in Australian enterprises where SAP licensing and access control are tightly managed.
SAP Connector Options in Data Factory
Data Factory offers several ways to connect to SAP. Choosing the right one depends on your SAP landscape and what data you need.
SAP Table Connector
Best for: Extracting raw table data from SAP ECC or S/4HANA.
The SAP Table connector reads directly from SAP tables using the SAP RFC protocol. It's the simplest approach and works well for:
- Master data extraction (materials, customers, vendors, cost centres)
- Transactional data (sales orders, purchase orders, invoices)
- Configuration tables
Requirements:
- SAP .NET Connector (NCo) 3.0 installed on the self-hosted integration runtime
- RFC user in SAP with appropriate authorisations (S_RFC, S_TABU_DIS)
- Network connectivity from SHIR to SAP application server
Limitations:
- You need to know which SAP tables contain the data you want (not always obvious)
- No built-in support for SAP's delta extraction mechanisms
- Large table extractions can be slow without proper filtering
SAP ODP (Operational Data Provisioning) Connector
Best for: Structured extraction with delta capability from SAP BW or S/4HANA.
The ODP connector accesses SAP's extraction framework, which supports:
- Delta (incremental) extraction
- DataSources from SAP BW extractors
- CDS views from S/4HANA
- SAP BW InfoProviders and queries
Requirements:
- SAP .NET Connector installed on the SHIR
- ODP framework activated in SAP (usually already active in recent versions)
- Appropriate SAP authorisations
This is our preferred connector for most SAP integration projects. It provides structured extraction with delta support, which means after the initial full load, subsequent extractions only pull changed records. This dramatically reduces extraction time and SAP system load.
SAP HANA Connector
Best for: Direct access to SAP HANA database tables or views.
If your SAP system runs on HANA (S/4HANA or Suite on HANA), you can bypass the application layer and read directly from the HANA database.
Requirements:
- SAP HANA ODBC driver installed on the SHIR
- HANA database user with SELECT permissions
- Network connectivity from SHIR to the HANA database port (3<instance>15, e.g. 30015 for instance 00)
When to use:
- You need high-performance extraction of large data volumes
- You want to access HANA calculation views
- Your SAP team has built data models in HANA that are ready for extraction
When to avoid:
- Your SAP system is not on HANA
- You need delta extraction (the HANA connector has no built-in delta mechanism; loads are full by default)
- SAP admin restricts direct HANA database access (common in large enterprises)
SAP BW Connector
Best for: Extracting data from SAP BW InfoProviders and queries.
If your organisation uses SAP BW or BW/4HANA as its data warehouse, this connector accesses the curated data models that your SAP BW team has already built.
Requirements:
- SAP .NET Connector installed on the SHIR
- MDX or BAPI access to BW
- Appropriate SAP BW authorisations
Advantage: You get data that's already been modelled and curated by your SAP team, reducing the transformation work in Data Factory.
Recommended Architecture
Here's the architecture we recommend for SAP-to-Power-BI via Data Factory:
SAP ECC/S/4HANA/BW
|
| (SAP ODP or Table connector via SHIR)
v
Azure Data Factory / Fabric Data Factory
|
| (Copy + Transform)
v
Azure Data Lake Gen2 / Fabric Lakehouse
|
| (Bronze -> Silver -> Gold layers)
v
Azure SQL / Fabric Warehouse (Gold/Serving layer)
|
v
Power BI (Reports and Dashboards)
The Three-Layer Pattern
Bronze layer: Raw SAP data, extracted as-is. No transformations. This is your safety net - you can always reprocess from bronze if something goes wrong in later layers.
Silver layer: Cleaned and standardised SAP data. Column names translated from SAP technical names to business-friendly names. Data types standardised. Basic quality checks applied. SAP tables joined into meaningful entities (e.g., combining MARA, MAKT, and MARC into a unified materials master).
Gold layer: Business-ready data models optimised for Power BI. Star schema or dimensional models with fact and dimension tables. Pre-calculated measures where appropriate. This is what Power BI connects to.
Why three layers instead of one? Because SAP data is messy. Tables have cryptic names (VBAK, VBAP, LIKP, LIPS). Fields use abbreviations and internal codes. Relationships between tables aren't always obvious. Processing in layers lets you separate concerns and makes troubleshooting much easier.
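The bronze-to-silver step for the materials example above can be sketched in a few lines. This is an illustrative pandas sketch with tiny sample rows, not production pipeline code; the table and column names (MARA, MAKT, MATNR, SPRAS, MAKTX) are standard SAP, but the sample data and output column names are assumptions.

```python
import pandas as pd

# Bronze: raw extracts, SAP technical names kept as-is (sample rows for illustration)
mara = pd.DataFrame({"MATNR": ["M-001", "M-002"], "MTART": ["FERT", "ROH"]})
makt = pd.DataFrame({"MATNR": ["M-001", "M-002"], "SPRAS": ["E", "E"],
                     "MAKTX": ["Finished pump", "Raw casting"]})

# Silver: filter texts to one language, join related tables into a single
# materials entity, and translate technical names to business-friendly names
materials = (
    mara.merge(makt[makt["SPRAS"] == "E"], on="MATNR", how="left")
        .rename(columns={"MATNR": "Material_Number",
                         "MTART": "Material_Type",
                         "MAKTX": "Material_Description"})
        [["Material_Number", "Material_Type", "Material_Description"]]
)
print(materials)
```

The same shape applies whether the transformation runs in a mapping data flow, a Dataflow Gen2, or a Spark notebook: language filter, join, rename, project.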
Step-by-Step Implementation
Step 1 - Identify the Data You Need
Start with the Power BI reports you want to build, and work backwards:
- What questions do the reports answer? (e.g., "What are our sales by product by region?")
- What data fields do you need? (sales order amount, product name, ship-to region, date)
- Which SAP tables or extractors contain those fields? (VBAK, VBAP, KNA1, MARA)
This is where SAP knowledge is essential. If you don't have someone on your team who understands SAP data structures, bring in help. Guessing at SAP table relationships is a fast way to produce wrong reports.
Common SAP data entities and their source tables:
| Business Entity | Key SAP Tables | Notes |
|---|---|---|
| Sales Orders | VBAK, VBAP, VBEP | Header, item, schedule line |
| Deliveries | LIKP, LIPS | Header and item |
| Invoices | VBRK, VBRP | Header and item |
| Purchase Orders | EKKO, EKPO | Header and item |
| Materials Master | MARA, MAKT, MARC, MARD | General, text, plant, storage |
| Customers | KNA1, KNB1, KNVV | General, company code, sales area |
| Vendors | LFA1, LFB1 | General, company code |
| GL Accounts | SKA1, SKAT | Chart of accounts, text |
| Financial Postings | BSEG, BKPF | Line items, document header |
| Cost Centres | CSKS, CSKT | Master, text |
Step 2 - Set Up the Self-Hosted Integration Runtime
Follow the setup instructions in our on-premises connectivity guide.
For SAP specifically:
- Install the SAP .NET Connector 3.0 on the SHIR server (download from the SAP Support Portal - requires an SAP S-user)
- Copy the NCo DLLs to the SHIR installation directory
- Restart the SHIR service
- Verify the SAP connector appears in the linked service options in Data Factory
Step 3 - Build Extraction Pipelines
For each SAP entity:
- Create a linked service for your SAP system (using the appropriate connector)
- Build a pipeline with a Copy activity
- Configure the source (SAP table, ODP extractor, or HANA query)
- Configure the destination (Data Lake or Lakehouse - bronze layer)
- Add appropriate filters (date ranges, company codes) to limit extraction scope
- Test with a small dataset first
Delta extraction pattern (using ODP):
For the initial load:
- Set the extraction mode to "Full"
- Extract all historical data
For subsequent loads:
- Set the extraction mode to "Delta"
- ODP tracks the last extraction point and only sends changed records
- Merge deltas into the bronze layer
Step 4 - Build Transformation Pipelines
Transform bronze data to silver and gold layers using:
- Mapping data flows (ADF) for visual, code-free transformations
- Dataflows Gen2 (Fabric) for Power Query-based transformations
- Spark notebooks (Fabric or Databricks) for complex transformations
Key transformations for SAP data:
- Rename columns from SAP technical names (MATNR) to business names (Material_Number)
- Convert SAP date formats (YYYYMMDD as string) to proper date types
- Decode SAP coded values (e.g., order status codes to readable descriptions)
- Join related tables into unified entities
- Apply currency conversion where needed
- Filter to relevant company codes, sales organisations, or plant codes
Step 5 - Build Power BI Reports
Connect Power BI to the gold layer (not to the bronze or silver layers, and definitely not directly to SAP).
Power BI configuration tips for SAP data:
- Use Import mode for the best end-user performance
- Schedule Power BI dataset refreshes after Data Factory pipelines complete
- Build a date dimension table in the gold layer (don't rely on SAP date fields directly)
- Use Power BI row-level security to replicate SAP authorisation logic where needed
Performance Considerations
SAP extractions can be slow. Here's how to improve performance:
1. Filter aggressively. Don't extract all data from all time. Use date filters, company code filters, and plant filters to reduce volume.
2. Use parallel extraction. The SAP Table connector supports partitioning - split large tables across multiple parallel reads. For MARA with 500,000 records, splitting into 10 partitions can speed up extraction by 5-8x.
3. Extract during off-peak hours. Schedule SAP extractions for nights and weekends (Australian time) when SAP system load is lowest. This improves extraction speed and keeps your SAP admin happy.
4. Use delta extraction. After the initial full load, incremental deltas reduce extraction volume by 90%+ for most entities.
5. Size the SHIR appropriately. SAP extraction is CPU-intensive due to data serialisation. For large SAP estates, use an 8-core, 32 GB SHIR server minimum.
Common SAP Integration Challenges
SAP authorisations. Getting the right RFC authorisations in SAP can take weeks in large enterprises. Start this process early. You need an SAP Basis team member to create the RFC user and assign appropriate roles.
SAP system availability. SAP systems have maintenance windows, transport schedules, and occasional unplanned downtime. Build retry logic into your pipelines and don't depend on 100% SAP availability.
Data volume management. SAP financial data (BSEG) can contain hundreds of millions of records. Extracting the full history requires planning - consider initial load in batches by fiscal year.
Currency and unit handling. SAP stores amounts and quantities with separate currency and unit fields. Make sure your transformation logic handles currency conversion correctly, especially for multi-currency organisations.
Language-dependent texts. SAP stores text descriptions in multiple languages. Make sure your extraction filters on the correct language key (E or EN for English, depending on whether the field uses the internal or ISO code, or your organisation's preferred language).
How Team 400 Can Help
SAP integration is one of the most technically demanding Data Factory projects you can undertake. It requires deep knowledge of both SAP data structures and Azure data platform architecture.
We're Microsoft Data Factory consultants with hands-on SAP integration experience. We've built SAP-to-Power-BI architectures for Australian enterprises across mining, manufacturing, and professional services.
Our team works across the entire stack - Data Factory for extraction, Microsoft Fabric for the data platform, and Power BI for reporting. That means one team, one architecture, and no finger-pointing between vendors when something needs to be fixed.
Contact us to discuss your SAP integration requirements, or explore our full range of AI and data consulting services.