Excel Unplugged

Importing CSV Files with Changing Start Row

Importing-Data-with-a-Changing-Start-Row-in-Power-Query

In this post, we’re going to talk about importing CSV files with power query! This is a guest post written by John MacDougall, an Excel MVP from Canada. John runs a famous Excel blog called How To Excel where you can find tons of useful tips and tricks. In this post, he will explore the use of Power Query for Importing CSV files into Excel. But not just plain CSV files. CSV files that have different number of rows of metadata.

You might be getting your data from another system like a general ledger, CRM, admin system or any other system that has data you need to report on.

A lot of the time, these will export data as CSV files.

If the CSV files you’re importing are nicely formatted with column headers in the first row of the file, then it’s an easy process to import into Excel with power query. We can select a From Text/CSV query from the Data tab, and pick the file from the resulting file picker menu. From there we can either load the data or further edit it in power query.

If you’ve never seen or heard of power query, then you can learn about power query here. It’s essentially a tool you can use to import data into Excel and then transform the data.

Problems Importing Files with Metadata

Importing CSV Files into Excel with Power Query and Changing Metadata

Sometimes, it might be the case that these files contain metadata. This is just data about the file such as when the report was run or the period it covers. Usually, these will be in the first couple of rows of your CSV files.

In the above example, it’s not until the 10th row that we see the columns headers and the report data because in rows 1 through 9 we have our metadata information.

This is easy enough to deal with if we always have 9 rows before our column headers start. We can remove the top 9 rows in our query then promote the first row to the column headers.

But what if this changes? Maybe some CSV files will have only 8 rows because there is no account contact information in the system.

Importing CSV Files into Excel with Power Query and Changing Metadata

If we try to run our query on a file with less than 9 rows of metadata, then we’ll end up promoting the wrong row as column headers and our query will throw out an error in the next step where we try and reference any column.

Can we avoid this and make it dynamic so that no matter what row our data starts on the query will remove the correct number of rows?

Import the CSV

Importing CSV Files into Excel with Power Query and Changing Metadata

The first step is to import a single CSV file. Go to the Data tab and select a From Text/CSV. In the resulting file menu picker, navigate to the folder where your reports are stored and select one of the CSV files.

You can select any of the reports regardless of which row the column headers appear in.

Press the Edit button to open the query editor.

Importing CSV Files into Excel with Power Query and Changing Metadata

Power query will also add a Change Type step to guess the data types of the different columns. We only want the imported data as a source without any other transformation steps applied to it , so we can delete this step with a left click on the X next to the step under the Applied Steps area.

Now we can Close & Load the query as a connection only. I’ve named the query ImportData.

Get the Row Number of the Column Headings

Now we will create a query to get the row number which contains the column headings. Remember, this was something that might change depending on the file.

Importing CSV Files into Excel with Power Query and Changing Metadata

In the Queries & Connections window pane, right click on the ImportData query and select Reference from the menu. This will create a new query with the ImportData query as the source.

Importing CSV Files into Excel with Power Query and Changing Metadata

Now we should be in the query editor with our raw data that includes the metadata rows. Go to the Add Column tab and press the Index Column button. This will add an index row starting at 0 in the first row and incrementing by 1 each row.

This index column will tell us the row number that our column headers are in.

Importing CSV Files into Excel with Power Query and Changing Metadata

We can now filter the first column of the data. Left click on the filter icon in the column heading and then deselect everything except for Product. Our column headers are always named the same across our different files, so this will always result in the row containing the column headers being left.

Importing CSV Files into Excel with Power Query and Changing Metadata

Now our column headers will be in the first row. We can then right click on the first row value in the Index column and select Drill Down from the menu.

This will turn our query results into a single value which corresponds to the number of rows of metadata we need to remove from our file. We can name the query GetFirstRow and then Close & Load it as a connection only.

Remove the Metadata

We have obtained the row number which contains the column headings in the data using an index column, a filter and then drilling down on a single value. Now we can use this value to remove rows from our file.

Importing CSV Files into Excel with Power Query and Changing Metadata

We can reference the ImportData query again with a right click and selecting Reference from the menu.

Importing CSV Files into Excel with Power Query and Changing Metadata

Now we are ready to remove our metadata rows. From the Home tab press the Remove Rows button and select Remove Top Rows from the menu. When the pop up appears that asks how many rows to remove, we can enter any value as a temporary placeholder. Let’s enter 9 for now.

Importing CSV Files into Excel with Power Query and Changing Metadata

Now we need to edit the M code for the last step in the formula bar. We need to replace the placeholder 9 value that we entered with GetFirstRow. This was the name of our query that gave us the single value of the row number containing our column headers. Press Enter to confirm the edit to the formula.

Importing CSV Files into Excel with Power Query and Changing Metadata

Now we can promote the first row to column headers. Go to the Home tab and press the Use First Row as Headers button.

Change any of the data types as needed, then name the query Report and Close & Load it to a table in the Excel.

Conclusion

Within our changing data format, we were able to find and use the column heading names which remained constant to dynamically get the number of rows we needed to remove.

Whenever you’re faced with changing data formats look for the constants (I like to call these data anchors) and use them to get the results you’re after.

Now if we change the path and file name in the ImportData query to point to another report and it has a different number of metadata rows, our query will still work! We could even parameterize our query with a path and file name parameter so we can easily import any file from any folder, and we don’t have to worry about the changing rows of metadata.

Comments 4

  1. John says:

    Famous in my dreams 😂

  2. Ivan M. says:

    Thank you, Gašper and John, for for sharing of this useful information
    That is another excellent approach (along with Ken Puls’s one: “https://www.excelguru.ca/blog/2015/08/12/remove-all-rows-up-to-a-specific-value/ “)

  3. MF says:

    I had a similar situation. What I did was to “Remove Empty” in Column 3. 🙂
    Quick and Dirty, I knew. And it won’t work if my dataset contains empty rows in column 3. Luckily enough, my dataset was “good” enough for me to do this in such a “dirty” way.
    Your approach is more robust! Thanks for sharing. 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *

%d bloggers like this: