Excel Unplugged

Making Importing from Folder in Power Query (or Power BI) Bulletproof

I’ve always been a big fan of the Import from Folder feature in Power Query. So much so, that I wrote my first post on the topic way back in February of 2015. It’s still one of most popular posts on Excel Unplugged and has been commented on 57 times and counting. I also did a follow up post in May of 2016 about Importing from multiple Excel files in one folder, and this one also quickly took it’s place among the 10 most read posts on Excel Unplugged. Just like with the first post, this one also got commented on quite a lot.
Since consolidating many files from a folder into one consolidated range is obviously a hot topic, but even more so because I keep getting questions on the same snags of the process, I decided to write a post on what I do to make the process “bulletproof”. This is paramount when you plan to load the result to the Power Pivot Data Model where a failed query could result in a corrupted Data Model.

1.Making sure only the right files get transformed

Imagine you get monthly reports in Excel format. To consolidate them, you plan to create a Power Query function as I described in the articles mentioned above and then apply that function to every file in a folder that you would then monthly add files to. This all sounds good in theory but in practice many things can happen in a folder. Someone might put their favourite song by the Beatles in that folder, or maybe a picture of themselves on vacation… Imagine applying your function to any of those. So, bellow is our folder, with the monthly reports but notice the irregularities.
Import Data From Folder Power Query and Power BI
1. The three non-excel files
2. The XLSX (capital letters) extension of the February report
When we do a Get Data/From File/From Folder and select the folder above, we get something like this
Import Data From Folder Power Query and Power BI
Of course, we choose Edit and enter the Query Editor. We focus our attention on the Extension column.
Import Data From Folder Power Query and Power BI
Now I’m guessing a large percentage of users would just go wild with the filters, but the smart move would be to first select the Extension column, then choose Transform/Format/UPPERCASE (could have gone for the lowercase too). With this we solve the XLSX vs. xlsx problem. The reason this is a problem that needs solving, is because Power Query is case sensitive and treats XLSX and xlsx as two different extensions. Now we know that whatever the case was, now all extensions are uppercase.
Import Data From Folder Power Query and Power BI
Now we are ready to do some filtering. And here again, things could get tricky. One might only deselect the Select All and then check the XLSX extension but that could be perceived wrongly by Power Query, so the right way is Filter/Text Filters/Equals…
Import Data From Folder Power Query and Power BI
We write .XLSX. Be careful of the dot preceding the extension.
Import Data From Folder Power Query and Power BI
And here we go. The list of our Reports. We could also Filter the Name column and say we only want files whose name contains Report and get rid of other unwanted Excel files.
Import Data From Folder Power Query and Power BI

2.Fixing transformation errors before loading

Now we are at the “Applying the function to all remaining files in the Folder” step. And since the Function is nothing more than a series of transformation steps that worked on one file, there is no guarantee that they would work on other files in a folder. All automation projects tend to become a hunt for sporadic changes in file contents. I had quite a few great VBA projects fail over one invisible space added to the name of the Sheet or something even more bizarre. So if we have a function there is a chance that it might fail to do the desired transformations on one file or the other. Let’s simulate this and show how to handle this sort of errors.
To keep things simple, we will have an anomaly of one report that will have a data on a Sheet named differently than in all the other reports. Since our function will be written to work with the regular reports it will return an error when applied to the “black sheep” file.
Just to keep things “organized” the “black sheep” will once again be the February Report. And to keep things simple, all our function is supposed to do is to get the value of cell A1 from sheet called Special Sheet from all Reports in the target folder. Invoking the function will get me this.
Import Data From Folder Power Query and Power BI
All A-Okay except for the February Report. Now there are two ways how to handle this. Number one is a Crude but Effective solution.

2.1 Removing errors

Warning: using this method will result in data loss! (Sounds almost like Power Query when you change a load destination ???? )
At this point we know that there are rows that result in an error. To make matters worse, this error is returned by one whole file. But at this point we can select the Column with errors and select Home/Remove Rows/Remove Errors.
Import Data From Folder Power Query and Power BI
And there it is…
Import Data From Folder Power Query and Power BI
No more errors and the data is ready to load. But again, this means that we lost the February data.

2.2 try … otherwise

The lower case in the title is intentional and very important since Power Query is case sensitive.
This will be a different approach. Sort of a IFERROR from Excel. We know why the function we had failed. It was because of the Sheet name. So we create a copy of that function and change the name of the Sheet to whatever it is in the February file.
Import Data From Folder Power Query and Power BI
At this point we have two functions, each fitted to a specific case.
Import Data From Folder Power Query and Power BI
But now we want to use them both in the same step. This will require some custom code change. The right tool for this is the Advanced Editor. You can find the command on the Home tab, but it seems way cooler to select it from the View tab ????
Import Data From Folder Power Query and Power BI
Now we see the entire code of the Import From Folder query. The highlighted line in the image bellow is the one returning errors.
Import Data From Folder Power Query and Power BI
This is the code

#"Invoked Custom Function" = Table.AddColumn(#"Merged Columns", "fnGetReportSpecialSheet", each fnGetReportSpecialSheet([Merged]))

It basically tells Power Query to add a column by invoking the function fnGetReportSpecialSheet (not the best name for a function I know) on column Merged that has our Filepaths to all the files in the folder. Now we will customise this by changing the last part to

Import Data From Folder Power Query and Power BI

each try fnGetReportSpecialSheet([Merged]) otherwise fnGetReportSheet1([Merged]))

So try to do the first and if it produces an error, use the second function.

Import Data From Folder Power Query and Power BI

Pure Brilliance!

Comments 4

  1. David Hoppe says:

    Great post Gasper. I am wondering, how are you liking the new Combine Files experience in Power Query? It got skippng over error files baked into the main dialog box, for example.

    1. To be honest, I still go for the “manual” way… I do like some of the process but if the files are not all the same, I haven’t had luck with it 🙂

  2. Piaskun says:

    Very nice explained 🙂
    From my side, I would add also for bullet-proof filtering:
    – filter out filename starting with “~” – this will ensure, that you will not get temporary files (if somebody will e.g. open some of source file)
    – exclude subfolders (filter on path), by default PQ list files from subfolders too

    1. Spot on! I’m almost a little ashamed, I didn’t include the ~… I will definitely edit the post as soon as I get the chance. Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.