Excel Unplugged

Table.Join – The Power Tool of Power Query

The basic explanation of Table.Join function in Power Query is to use Merge Queries on two steps of one single query.
This will be a step by step tutorial on how to use Power Query to get a list of duplicate files that could be interesting for System Administrators and the main step of the process will include the Table.Join step.

So let’s just dig in…

How to

Power Query Table Join

We start with a folder called Finding duplicate Files that contains 12 files and three subfolders. The files that are duplicated within those folders are highlighted in the picture above. Some appear only in one folder, and some in several. Now let’s create a new Workbook and start writing a new query to find these duplicates.

Creating a query that finds duplicates

The path is different depending on your version of Excel.
In 2016 the path is Data/Get Data/From File/From Folder

Power Query Table Join

In 2013 and 2010 the Path is Power Query/From File/From Folder

Power Query Table Join

The first step is to pick a folder and we select the Finding Duplicate Files folder.

Power Query Table Join

At this point we get a result preview that lists all the files in the selected folder including many file properties that could come in handy in other scenarios.

Power Query Table Join

But this list of files is not the desired result, so as in 99.9% of Queries, we choose the Edit button

Power Query Table Join

This opens the Query Editor window where we will apply the desired transformations

Power Query Table Join

Editing the query

First, we get rid of the unnecessary columns. Now instead of selecting all the unwanted columns and choosing Remove Columns command, we will rather apply the Remove Other Columns command. And we use another feature of Power Query here, we select the Folder Path column first and the Name column second. Then we right click and choose Remove Other Columns.

Power Query Table Join

And there it is, notice the order of the columns… Quite ingenious right? Next up we select the Name column and apply the Transform/Format/Uppercase command. This is not necessary but it’s good practice since we are searching for duplicates and Power Query is case sensitive, this will make them “look the same” in the eyes of Power Query.

Power Query Table Join

Renaming the last step

Now an important step! We must make sure that the resulting table after we apply the last transformation will be something that we can call upon later, so for clarity purposes and practical reasons, we rename the last step to Table1. For this we right click the last step in the Applied Steps sidebar and choose the Rename command.

Power Query Table Join

Getting list of duplicate files

Now that we have the filenames and Paths of all the files, we must get a list of duplicate files. For this, we will use the Transform/Group By command on the Name Column.

Power Query Table Join

In the Group By dialog box, we choose to group by Name, the operation is Count Rows and we type the NumberOfAppearances for the name of the resulting new column where the result of the Count Rows operation will go.

Power Query Table Join

And here is the result.

Power Query Table Join

Since we are only interested in the files that appear multiple times, we use the filter get rid of those that only appear once.

Power Query Table Join

And there it is. The list of files that appear more than once. This is an OK result since we now know which files are duplicated, but we don’t know in which folders they are. That information is still saved in the Table1 step. Since we want to merge the two together (the use of the word Merge is not incidental ???? ), we should rename this last step

Power Query Table Join

Same as before, right click on the last step and choosing Rename

Power Query Table Join

And let’s rename it to Table2

Power Query Table Join

Use Table.Join 

Now the stage is set for the use of M function Table.Join. We can do this two ways, either in the Advanced editor or by simply inserting a new step via the Formula Bar. If the Formula Bar is not visible in your Query Editor, you can turn it on via the View tab as shown in a picture

Power Query Table Join

Once the Formula Bar is visible, you simply click the fx icon to insert a new step and write the following formula.
=Table.Join(Table1,”Name”,Table2,”Name”)
The syntax would be Table.Join(
Followed by the name of the first table (in our case we renamed that step to Table1 but generaly that would be
StepResult1,
This is followed by the column name by which you wish to merge the two (there could be more than one column here)
“NameOfTheColumnInStepResult1”,
After that you repeat the process for the second step (we renamed that into Table2
StepResult2,”NameOfTheColumnInStepResult2″)

This is all you need for the function to work, but there are a few additional parameters that the function can take. The most interesting is Join.Kind where you can choose from these types of joins
• JoinKind.Inner (the default type)
• JoinKind.LeftOuter
• JoinKind.RightOuter
• JoinKind.FullOuter
• JoinKind.LeftAnti
• JoinKind.RightAnti

And if these remind you of the Query Merge command, you are right, they are exactly those! If this optional is omitted, then the default value of Inner join is passed to the function. This is how we wrote the function, but imagine the scenarios with Right Anti Join.

Power Query Table Join

After we write the function, this is the result…

Power Query Table Join

Now all we have to do is to sort the file list by File Names…

Get the details

Power Query Table Join

And for each file that has duplicates, we get the list of all folders where this file is.

Power Query Table Join

A bit of rearranging of columns

Power Query Table Join

And we can load the result into Excel by using the Close & Load command

Power Query Table Join

Next week I will write about using Conditional Formatting to do a sort of Anti Fill and turning the above table into this:

Power Query Table Join

If you want more information on the Table.Join command, this is a good place to start.

Learn more

Check out our YouTube channel and subscribe for more amazing Excel tricks!

Follow us on LinkedIn.

Check out our brand new R Academy!