Excel Unplugged

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

I recently posted on how to make a process of getting data from a folder bulletproof. You can read it here. Whereas that post was about getting data from multiple TXT files within one folder, this one will tell you how to  get data from folder full of Excel files. There is a key difference here since Excel files are different than TXT files in a way that they can contain multiple Sheets and Tables. So the final step of Get Data From Folder will have to use a function of more than just one parameter.

This post is obviously not regular, since it’s not Tuesday (now I feel like Julianne Moore in Laws of Attraction) J but it’s a spinoff of a post I wrote in February of 2015 called Get Data From Folder with Power Query. Almost all comments referring to the mentioned blog post asked for exactly this, how do I do it with Excel files. This requires a few extra techniques and some high level Power Query stuff, so I decided that rather than trying to fit it all in a comment section of that post, I would rather do an incognito blog post.

I will draw inspiration from the last comment posted bellow the post mentioned above that goes like this

Is there any chance to pull the values from a specific range of equally looking Excel files (all forms) and list them?

Example: The cell C9 of all forms contains one product characteristic. This characteristic shall be now listed as “Characteristic A”.

We start with a folder of Excel files (I also planted a txt document in there just for fun)

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

That look like this

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

All Files have a different number in cell C9 of a sheet called Special Sheet. If you want to follow along, you can get the files here. Unzip all files to one folder and remember the folder location since we will need that further along the line.

And just for reference, and so we can keep our bearing at all times, here are all the steps needed for this to work

  1. We must create a Query based on one of our files that will “teach” Excel how to work with our Excel Files (how to extract the right value from the desired Sheet).
  2. Next we turn that Query into a Power Query function that will take two parameters (the filepath and sheet name)
  3. That is followed by creating a new query using the Get Data From Folder command and getting to our Excel files and the right sheets by using filters
  4. To wrap things up, we use our function on the right Sheets of the right files and create the desired column.

The Base Query

So we start in a blank Excel file and launch the Power Query (Data/New Query in Excel 2016)/From File/From Excel (From Workbook in Excel 2016)

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

 

Then find the folder and choose any of the Excel files. I will choose File1.xlsx, so if you chose differently some steps may differ in view.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Then choose the Special Sheet in the Navigator followed by the Edit command on the bottom right.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

And now the fun begins J

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

So just to keep the goal in mind, we need Cell C9 which at this point is number 64 in column Column3. We will achieve this by pressing the fx button in the formula bar (if you don’t see the formula bar, you can turn it on by ticking the formula bar on the View tab)

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

And then adding the [Column3]{8} to what was there by default.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

After you enter this, you will get the desired value. The syntax is classic M, but the 8 might take you by surprise, since it was obvious that this value was in row 9. To understand this, you should know that Power Query has a 0 based system. This means that 0 parameter gives you the first value, 1 gives you the second and so on…

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Now that we have the value we need, we’re just going to rename the last step (not necessary, but will make things easier). This is done by right clicking the last step in the APPLIED STEPS window on the right and selecting Rename. The name I will give to this step is Value.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

To wrap this Base Query up, we choose the fx again and this time we write a simple formula

= Table.FromValue(Value)

Where Value in the parenthesis points to the name of the last step. If you didn’t change that, you must input the name of the last step that you see in the APPLIED STEPS.

At this point you should see this.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

This last step completes the Base Query. Don’t close the Query we created because we’re going to need it in step two, where we’re going to change this Base Query into a Power Query function.

Creating the import function

This brings us to phase 2, creating the function with two parameters, filepath and Sheet Name. To start this off, we go to View/Advanced Editor and this will open up the Advanced Editor Window.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Since our goal is to create a function that could be applied to all files, we need to remove the specific data and make it general. This specific data is highlighted below.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Ok, the first thing we do, is to add a blank row above the let command and write

(filepath,SheetName)=>

So Parenthesis, the name of the first parameter followed by comma and the name of the second parameter and closed parenthesis. Then an equal sign followed by the “greater-than” sign.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Then we need to change the filepath in line three and the name of the sheet with the parameters we named. This part is CASE SENSITIVE!!!!! So be careful to replace the parts correctly.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Now press Done in the Advanced Editor Window, and just change the name of the Query to fnImportExcel. Now you should see this:

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

After completing that, you choose Home/Close and Load,

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

You will see the function you created

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

This brings us to Step 3.

Get Data From Folder To get Excel Files

So we’re still within the same Workbook, and we’re going to make another Query to get data from folder. This time we choose Power Query (Data/New Query in Excel 2016)/From File/From Folder

And then choose the folder where you have your Excel files and choose Ok. This is what you get:

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

At this point the Get Data From Folder lists all the files within a selected folder.

First off, we need to make sure, that all further steps will only be applied Excel files as input. Remember the txt file from the beginning. It’s time to say farewell to it. To do this we do two things. First we select the Extension column and choose Transform/Format/UPPERCASE

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

This looks like a redundant step, but it’s needed to be able to process xlsx and XLSX equally. This is followed by Filtering the Extension Column to only keep the Excel Files. So we choose The Filter, then select Text Filters and Contains. Now we write XLS and with this we enable the Query to keep the xls, xlsx, xlsm and all other Excel formats.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

So now we get:

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Even if this last process seems redundant at this point, keep in mind that we want to create a query that will work in the future. Remember that Get Data From Folder will list all the files in that folder and all the steps that are about to come, will only work on Excel files. So id someone was to put a jpeg of themselves on vacation within our folder, the Get Data From Folder will pick that up. But the steps we just did will make it disappear. I wrote a whole Post on making your Get Data From Folder queries bulletproof. You can read it here.

Now you select the Name and Folder Path columns, right click on one of them and choose Remove Other Columns.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

At this point we have

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Now we need to create a filepath column that will concatenate the two columns. Go to Add
Column/Add Custom Column

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

We name the column filepath and write a simple formula same as in Excel.

= [Folder Path]&[Name]

You can get the Column Names into the formula by simply clicking them in the Available Columns list on the right. (Try to ignore the strangest of phenomenon, where the dialog box is half English and half Slovenian).

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

At this point we can get rid of all columns except for filepath. We do this the same as before, i.e. right clicking the filepath column and choosing Remove Other Columns. Now for a bit of stardust, we choose Add
Column/Add Custom Column again and this time we write a formula

= Excel.Workbook(File.Contents([filepath]))

So we insert the filepath column into the Excel.Workbook(File.Contents()). This will give you the following column

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

And by clicking the double headed arrow next to the Column name you get a few columns to choose from. We need columns…

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Name and Kind, so deselect all others and also untick the Use original column name as Prefix. After pressing Ok you should see

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Now this is great, we see all the sheets in those workbooks and the names of those Sheets. You can easily guess what comes next. We apply the filter to the Name Column selecting only those called “Special Sheet”. Now we’ve got everything we need. We have the function that requires filepath and SheetName and now we have that data for all Excel files in the folder.

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

Now for the last time we choose Add
Column/Add Custom Column and add a special formula

= fnImportExcel([filepath],[Name])

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

This means we will use the function we created on the two parameters and we get

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

And after pressing the two headed arrow, we only have the one column called value…

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

…that gives us exactly what we need:

USE POWER QUERY’S GET DATA FROM FOLDER TO GET DATA FROM MULTIPLE EXCEL FILES

So the value of cell C9 on the Special Sheet of those files in the folder. Purely brilliant! At this point we could get rid of all the columns we don’t need, and then load the data to wherever we want.