Running Python Scripts in Power BI - A Practical Guide for Data Teams
Python in Power BI is one of those features that most teams know exists but few actually use. That's a missed opportunity, especially for Australian organisations sitting on data that doesn't fit neatly into Power BI's built-in connectors.
I've worked with data teams across manufacturing, finance, and logistics who all hit the same wall eventually. Power Query handles 80% of their data prep, but that last 20% - the API calls, the custom calculations, the file format conversions - needs something more flexible. That's where Python scripts come in, and honestly, once you set it up the first time, you wonder why you didn't do it sooner.
What Python in Power BI Actually Does
When you run a Python script in Power BI Desktop, you're using the "Get Data" pipeline to execute a .py file and pull the resulting data into your model. The key requirement is straightforward: your script needs to produce a pandas DataFrame. Power BI reads that DataFrame and treats it like any other data source.
That's it. No special SDK, no proprietary format. If you can get your data into a pandas DataFrame, Power BI can ingest it.
This opens up some genuinely useful workflows. I've seen teams use it to pull data from REST APIs that don't have a native Power BI connector, to run statistical models and import the results alongside raw data, and to parse weird file formats - think legacy CSVs with inconsistent delimiters or nested JSON from IoT sensors.
Setting It Up - The Bits That Trip People Up
Microsoft's documentation covers the basic setup process well, but let me highlight the things that actually cause problems in practice.
Install Python properly. Use the official distribution from python.org. I know Conda is popular, and for good reason, but Power BI calls python.exe directly. Conda environments need activation steps that Power BI doesn't handle, and you'll get mysterious failures with no useful error messages. If your data science team already uses Conda, install a separate vanilla Python alongside it for Power BI. Yes, it feels redundant. Do it anyway.
Install pandas and matplotlib. These are the two required packages. You'll use pandas constantly. Matplotlib you may never touch, but Power BI checks for it regardless. Open a command prompt and run:
pip install pandas
pip install matplotlib
Tell Power BI where Python lives. Go to File > Options and Settings > Options > Python scripting, and point it at your Python installation directory. If you've got multiple Python versions installed (and who doesn't), make sure you're pointing at the right one. This is where 30 minutes of troubleshooting usually lives.
Spaces in the path are fine now. Older versions of Power BI choked on installation paths with spaces. That's been fixed, but I still see blog posts from 2022 telling people to install Python in C:\Python39. You can use the default installation path without worry.
Writing Scripts That Work Well in Power BI
There are a few constraints to know about before you start writing scripts.
Only DataFrames get imported. If your script produces a string, a dictionary, or a NumPy array, Power BI won't see it. Everything must end up in a pandas DataFrame. This is actually a reasonable constraint because it forces you to think about your data structure upfront.
30-minute timeout. Scripts that run longer than 30 minutes get killed. Most data pulls won't hit this, but if you're calling a slow API with pagination, you might. The fix is usually to cache intermediate results or restructure your API calls to be more efficient.
No interactive input. Your script can't prompt for user input. No input() calls, no file dialogs. Everything needs to be hardcoded or parameterised through Power Query.
Use absolute paths. If your script reads files from disk, use full paths. Relative paths are unreliable because Power BI's working directory isn't always what you'd expect.
Here's a simple example that actually does something useful - pulling data from a CSV with some preprocessing:
import pandas as pd
# Read the raw data
df = pd.read_csv('C:/Data/sales_raw.csv')
# Clean up column names
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
# Convert date column
df['order_date'] = pd.to_datetime(df['order_date'], dayfirst=True)
# Filter to last 12 months
cutoff = pd.Timestamp.now() - pd.DateOffset(months=12)
df = df[df['order_date'] >= cutoff]
Power BI will see the df DataFrame in the Navigator window and let you load it into your model.
Where This Gets Really Useful
The basic "read a CSV" use case is fine, but the real value shows up in more sophisticated scenarios.
API integration. Got a data source with a REST API but no Power BI connector? Python handles it. I've built scripts that pull data from Xero, ServiceNow, and various proprietary platforms that Australian businesses use. The pattern is always the same: call the API with the requests library, parse the JSON response, reshape it into a DataFrame.
import pandas as pd
import requests

response = requests.get(
    'https://api.example.com/data',
    headers={'Authorization': 'Bearer YOUR_TOKEN'},
    timeout=30,  # don't let one hung request stall the whole refresh
)
response.raise_for_status()  # fail loudly instead of parsing an error page
data = response.json()
df = pd.json_normalize(data['results'])
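When the API is paginated, the same pattern just runs in a loop. Here's a minimal sketch of that loop. The `get_page` callable and the `{'results': ..., 'next': ...}` payload shape are illustrative assumptions, not any particular API's contract - in a real report, `get_page` would wrap `requests.get` against your endpoint:

```python
import pandas as pd

def fetch_all_pages(get_page, max_pages=100):
    """Collect paginated API results into a single DataFrame.

    get_page(page_number) is assumed to return a dict like
    {'results': [...], 'next': bool} -- a hypothetical payload shape.
    max_pages is a safety cap so a misbehaving API can't run forever
    (remember the 30-minute limit).
    """
    frames = []
    for page in range(1, max_pages + 1):
        payload = get_page(page)
        frames.append(pd.json_normalize(payload['results']))
        if not payload.get('next'):
            break
    # One flat DataFrame is what Power BI needs to see at the end
    return pd.concat(frames, ignore_index=True)
```

Keeping the HTTP call behind a small function like `get_page` also makes the script testable outside Power BI, which matters given how little the error dialogs tell you.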
Statistical preprocessing. One manufacturing client needed quality control metrics calculated before loading into Power BI. Their quality team knew Python but not DAX. We built the statistical calculations in Python - standard deviations, control limits, anomaly flags - and imported the results. The Power BI report just displays what Python already computed. Much simpler than trying to replicate scipy functions in DAX.
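As a rough illustration of that kind of preprocessing, here's a sketch of a 3-sigma control-limit calculation. The column names and the helper itself are made up for the example - the client's actual metrics were more involved - but the shape of the work is the same: compute in Python, hand Power BI a flat DataFrame to display:

```python
import pandas as pd

def add_control_limits(df, value_col, k=3):
    """Add mean, k-sigma control limits, and an anomaly flag.

    Illustrative only: real control charts usually work per-batch or
    per-machine, not over the whole column at once.
    """
    mean = df[value_col].mean()
    std = df[value_col].std()
    out = df.copy()
    out['mean'] = mean
    out['ucl'] = mean + k * std  # upper control limit
    out['lcl'] = mean - k * std  # lower control limit
    out['anomaly'] = (out[value_col] > out['ucl']) | (out[value_col] < out['lcl'])
    return out
```

The report then just plots the value against `ucl` and `lcl` and filters on `anomaly` - no DAX statistics required.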
Data blending from disparate sources. Sometimes you need to combine data from sources that Power Query can't easily join. Python lets you merge, reshape, and clean data from multiple sources in a single script before Power BI ever sees it.
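A blending script is usually just a merge plus some derived columns. This sketch assumes a hypothetical actuals-versus-targets schema (`region`, `month`, `actual`, `target`) purely for illustration:

```python
import pandas as pd

def blend_sales_and_targets(sales, targets):
    """Join actuals to targets and compute variance.

    Column names are illustrative. A left join keeps every sales row
    even where no target exists, which is usually what reporting wants.
    """
    merged = sales.merge(targets, on=['region', 'month'], how='left')
    merged['variance'] = merged['actual'] - merged['target']
    return merged
```

In practice the two inputs might come from a CSV and an API call in the same script - Power BI only ever sees the one merged DataFrame.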
The Gotchas Nobody Warns You About
After deploying Python-in-Power-BI solutions for dozens of clients, here are the things that catch people off guard.
Refresh in the Power BI Service is tricky. Python scripts work beautifully in Power BI Desktop. But when you publish to the Power BI Service and want scheduled refresh, you need an on-premises data gateway with Python installed on the gateway machine. The gateway runs the script, not the service. This means managing a Python installation on a server, keeping packages up to date, and dealing with permissions. It's doable but adds operational overhead.
Package management gets messy. Your script works on your laptop because you've got numpy 1.24 and requests 2.31 installed. Your colleague opens the same .pbix file and gets an error because they've got different versions. Document your requirements. Better yet, include a requirements check at the top of your script.
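A requirements check can be as simple as this sketch, which uses the standard library's `importlib.metadata` so it needs no extra packages. The minimum versions shown are placeholders - set them to whatever your script actually needs:

```python
import importlib.metadata as md

# Placeholder minimums -- replace with the versions your report needs
REQUIRED = {'pandas': '1.5'}

def _major_minor(version):
    """Crude major.minor parse; good enough for a sanity check."""
    return tuple(int(p) for p in version.split('.')[:2])

def check_requirements(required=REQUIRED):
    """Raise early, with a readable message, if packages are missing
    or too old -- far better than Power BI's generic error dialog."""
    problems = []
    for pkg, min_ver in required.items():
        try:
            installed = md.version(pkg)
        except md.PackageNotFoundError:
            problems.append(f'{pkg} is not installed')
            continue
        if _major_minor(installed) < _major_minor(min_ver):
            problems.append(f'{pkg} {installed} is older than required {min_ver}')
    if problems:
        raise RuntimeError('; '.join(problems))
```

Call `check_requirements()` at the top of the script: a colleague with the wrong environment then gets "pandas 1.3.5 is older than required 1.5" instead of a mystery failure.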
Error messages are unhelpful. When a Python script fails in Power BI, you get a generic error dialog that tells you almost nothing. The fix is to test your script thoroughly outside Power BI first. Get it running perfectly in a terminal, then paste it into Power BI. Debug outside, execute inside.
Nested DataFrames break things. If your DataFrame contains columns that hold lists or dictionaries, Power BI won't handle it well. Flatten your data before the import step. Use pandas' json_normalize or explode methods to get everything into simple rows and columns.
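Here's what that flattening looks like in miniature, on a made-up IoT-style payload. `json_normalize` unfolds the nested dict into dotted columns, and `explode` turns the list column into one row per reading:

```python
import pandas as pd

# Hypothetical nested payload, like what an IoT API might return
records = [
    {'device': 'A', 'readings': [1, 2], 'meta': {'site': 'Perth'}},
    {'device': 'B', 'readings': [3], 'meta': {'site': 'Sydney'}},
]

# Flatten nested dicts into columns like 'meta.site'...
df = pd.json_normalize(records)
# ...then turn the list column into one row per element
df = df.explode('readings', ignore_index=True)
```

After this, every cell holds a scalar, which is exactly the shape Power BI can ingest cleanly.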
When to Use Python vs Power Query
This comes up constantly: should we do this in Power Query or Python?
Use Power Query when: The transformation is straightforward - filtering, pivoting, merging, basic type conversions. Power Query is visual, auditable, and doesn't require a Python installation. For most standard ETL tasks, it's the right choice.
Use Python when: You need libraries that don't exist in Power Query, like statistical packages, API clients, or specialised parsers. Or when the transformation logic is complex enough that expressing it in M (Power Query's language) would be painful. I've seen Power Query scripts that span hundreds of lines doing what Python could do in twenty.
Use both. This is the approach I recommend most often. Use Python to handle the ingest and heavy transformation, then use Power Query for the final shaping and type casting. Power BI handles this workflow well, and it means your Power Query steps are simple and readable.
Making It Production-Ready
If you're going to use Python scripts in a production Power BI deployment, treat them like production code.
Version control your scripts. Don't just embed them in the .pbix file and forget about them. Keep the Python files in a Git repository. When something breaks six months later, you'll want the commit history.
Add logging. Even simple print statements help. When a script fails during a scheduled refresh at 3am, you need some breadcrumbs to figure out what went wrong.
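Since `print` output vanishes during a gateway refresh, writing to a file is more useful. This is a minimal sketch; the log path is a placeholder for wherever your gateway's service account can write:

```python
import logging

def setup_refresh_logging(log_path):
    """Send log records to a file you can read after a failed refresh.

    log_path is illustrative -- point it at a directory the gateway's
    service account can write to.
    """
    logging.basicConfig(
        filename=log_path,
        level=logging.INFO,
        format='%(asctime)s %(levelname)s %(message)s',
        force=True,  # replace any handlers the host process configured
    )
    logging.info('Refresh script started')
```

Then sprinkle `logging.info(...)` calls at each stage of the script - after the API call, after each transformation - so the 3am failure leaves a trail showing how far it got.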
Document the dependencies. Maintain a requirements.txt for every Python-enabled Power BI report. Include it in your deployment documentation.
Test with realistic data volumes. A script that works fine with 1,000 rows might timeout with 100,000. Test at production scale before deploying.
Getting Started
If your team is new to this, start small. Pick one report that has a data source Power Query can't handle well, and build a Python script for just that source. Get comfortable with the workflow - write the script, test it outside Power BI, paste it in, load the data. Once that's working, you'll naturally find more use cases.
We help Australian businesses with exactly this kind of work - connecting data sources, building proper data models, and getting Power BI reports that actually answer business questions. If your team is hitting the limits of Power Query, or if you're dealing with data sources that need custom integration, our Power BI consulting team can help you set it up properly.
For organisations looking at broader data strategy - where Python in Power BI fits alongside your data warehouse, your AI initiatives, and your reporting infrastructure - we offer business intelligence consulting that looks at the full picture, not just individual reports.
And if you want to build internal capability so your team can handle this independently, check out our AI and data training programmes - we run practical workshops that focus on the tools your team actually uses.
Python in Power BI isn't a silver bullet, but it's a genuinely useful tool that fills a real gap. The setup has some rough edges, the service-side refresh story needs improvement, and error handling could be much better. But for teams that need flexibility beyond what Power Query offers, it's well worth the investment.