Getting CSV Files into Power BI - The Boring Format That Runs Half the Country
Nobody gets excited about a .csv file. It's the plainest data format there is - rows of values, commas in between, no formatting, no formulas, no cleverness. And yet if you sit with the reporting setup of almost any Australian business, you'll find CSVs everywhere. The bank export that reconciles the accounts. The nightly dump out of an old line-of-business system that nobody wants to touch. The report someone downloads from a SaaS tool every Monday and drops in a shared folder. CSV is the format that quietly holds a lot of the country's data together.
So when a business starts building reports in Power BI, getting CSV data in is one of the first jobs. The good news is Power BI handles CSV well. The slightly annoying news is that "handles it well" hides a few traps that will waste your afternoon if you don't know they're there. This is the walkthrough I give clients before they load their first file, based on the ones we've watched go sideways.
What a CSV actually is, and why that matters
A comma separated value file is just text. Each line is a row, and within a row the values are separated by a delimiter, usually a comma. There's no data types, no formatting, no notion of what's a date and what's a number. It's all just characters until something reads it and decides how to interpret them.
That last part is the whole story. Power BI reads the file and makes guesses. Is this column a date? A whole number? Text? Most of the time it guesses right. When it guesses wrong, that's when you get a report showing invoice numbers as scientific notation, or dates that flip day and month, or a leading zero on a product code that has silently vanished. The file isn't broken. Power BI just interpreted the plain text differently to how you meant it.
Two ways in, and they behave differently
In Power BI you can bring a CSV in through two doors, and it's worth knowing which you're using.
The first is Power BI Desktop with Get Data, choosing Text/CSV. This is the one you want for anything real. You point at the file, Power BI shows you a preview, and you land in Power Query where you can shape the data before it loads. This is the proper path.
The second is the Power BI service, where in some experiences you can bring a local CSV straight in to create a dataset without touching Desktop. It's quick, and for a genuine one-off it's fine. But you lose the shaping step, and worse, a local file uploaded to the service is a point-in-time snapshot. It does not refresh. If the data changes, your report doesn't, and nobody finds out until the numbers look wrong in a meeting.
My rule with clients is simple. If the CSV will ever be loaded more than once, build it in Desktop through Power Query and point it at a file location the service can reach on a schedule. If it's a throwaway, do whatever's fastest and don't feel guilty. The mistake is treating a recurring feed like a throwaway.
The traps that actually bite
Here's where the boring format earns its reputation.
Delimiters that aren't commas. Plenty of "CSV" files exported from Australian and European systems use semicolons, because in a lot of locales the comma is the decimal separator. Open one of those expecting commas and every row lands in a single column. Power Query lets you set the delimiter explicitly, so when the auto-detect gets it wrong, you override it. Always glance at the preview before you trust it.
Encoding. If you've ever seen a stray character where an apostrophe or a dollar sign should be, that's an encoding mismatch. The file was saved in one encoding and read in another. UTF-8 is the sane default, but older systems love to export in things like Windows-1252. Power Query lets you pick the file origin/encoding. If names with accents or currency symbols look mangled, that's the first knob to turn.
Type detection on the first rows. Power BI decides column types by sampling the top slice of the file. If the first few hundred rows of a column are all whole numbers and row 900 has a decimal or a bit of text, the type it locked in early can throw an error or drop data. On big or messy files, don't just accept the auto-detected types. Set them yourself in Power Query so the whole column is treated the way you intend.
Leading zeros and long numbers. Product codes, phone numbers, BSBs, anything with a leading zero or more than 15 digits will get mangled if Power BI treats it as a number. The fix is to set that column to text early. If you let it become a number, the zero's gone and you can't get it back without re-importing.
The file that moves. Someone renames the file, or the export starts landing in a different folder, or the columns come out in a new order after a system update. Any of these breaks the refresh, and the error message rarely points at the real cause. CSV feeds are fragile precisely because they depend on a human process staying consistent, and human processes drift.
Getting refresh to actually work
This is the part people underestimate. Loading a CSV once is trivial. Keeping a report fed from CSVs up to date without someone manually re-importing is where the real design decisions live.
If the file sits on someone's laptop or a local drive, the Power BI service can't reach it, so scheduled refresh needs an on-premises data gateway installed on a machine that can see the file and stays on. If the file lives in SharePoint or OneDrive for Business, the service can often get to it directly, which is cleaner. Either way, the folder path and the file name have to stay stable, because Power Query is looking for that exact location.
A pattern that holds up much better than a single file is the folder approach. Instead of pointing at one CSV, you point Power Query at a folder and tell it to combine every file inside. Now when a new daily or weekly export drops in, it gets picked up automatically and appended. This is genuinely useful for those "new file every Monday" feeds, and it turns a manual re-import chore into something that just runs. It does depend on every file having the same structure, so it rewards a tidy export process and punishes a sloppy one.
When CSV is the right answer, and when it isn't
I want to be fair to the humble CSV here, because it gets a bad rap it doesn't fully deserve. For getting data out of a system that has no proper API, no database access, and no modern connector, a scheduled CSV export is often the most pragmatic option on the table. It's universal. Every system on earth can produce one. There's a lot to be said for a format that just works everywhere.
The problem is when CSV quietly becomes the backbone of reporting the business actually depends on. That's when the fragility starts to cost real money. A column gets renamed upstream and three reports break. The export process depends on one person running a manual download, and they go on leave. Nobody validates the file, so a partial export silently produces a report that's confidently wrong. We get called in for a fair few of these, and the underlying issue is almost never Power BI. It's that a serious reporting need is being held up by a data pipeline made of good intentions and a shared folder.
The move that fixes it is landing the data somewhere structured before Power BI ever sees it. A proper database, or a Fabric Lakehouse, or a Data Factory pipeline that pulls the source once, validates it, and serves clean tables to every report downstream. This is bread-and-butter work for our Power BI consulting team, and the shift from "Power BI reads a pile of CSVs" to "Power BI reads a governed source" is usually the single biggest reliability upgrade a reporting setup can get. Where the volumes or the number of feeds justify it, our Microsoft Fabric consultants build out the pipeline properly so nobody's babysitting a folder at 7am.
What I'd tell you to do
If you're loading a CSV today, use Power BI Desktop, go through Power Query, and set your column types deliberately instead of trusting the auto-detect. Check the delimiter and the encoding in the preview before you commit. Put the file somewhere the service can reach, and keep the path and filename boringly consistent. If you've got a recurring set of files, use the folder combine pattern rather than re-importing by hand.
And be honest with yourself about how important the report is. If it's a quick look, CSV is fine, load it and move on. If a decision worth real money hangs on those numbers, treat the CSV as a temporary bridge, not the foundation. The format will happily carry more weight than it should, right up until the day it doesn't.
If you want the mechanics straight from Microsoft, their documentation on getting data from comma separated value files covers the steps. And if you're staring at a folder full of exports wondering how to turn them into reporting you can actually trust, that's daily work for us. Get in touch and we'll give you a straight answer on whether you need a quick import or a proper rethink of where the data comes from.