Power BI – Extract Function
BY: David Rohlfs
When you are learning how to use Microsoft Power BI and Power Query, learning about all the different features can be exhausting. So, there are a couple of features that you should learn earlier than others and the Extract function is one of these.
There are a lot of similar reasons why you would want to use the extract function vs. the split column function. So, if you are finding that the extract function isn’t working out like you want it to, I suggest reading through EPM Strategy’s blog on the Split Column function linked below.
In this blog we will go over some of the basic ideas with the extract function and when to use it and we will go through a quick walkthrough demonstrating how the extract function works.
When to use it
The extract function has many uses, but the main objective of the function is to pull out a section of data from a cell. In this demonstration I did an Extract After Delimiter on the Full Name column to obtain only last names.
You will find that you use this a lot if there is too much information in one column and you want to focus on just a portion of that information. For instance, if you have a string of text that has the customer’s name, date of sale, and product identifier, you may want to use the extract function to grab only the data of sale from that text.
This process is very similar to the Split Column function. So, it begs the question of what the difference between the two is? The difference is that the extract function only keeps the information in the cell that you want based on the rules you apply. While a split keeps all the information from the cell but separates it into multiple columns.
When you choose the extract function dropdown, there are several options. These options are to:
- Extract text before delimiter
- Extract text after delimiter
- Extract text between delimiters
- Extract First Characters
- Extract Last Characters
- Extract Range
- Extract Length
Between these seven options you can work with your data in a way that opens a lot of options to the report builder and end user.
Inside the advanced options you might get some unique options. For instance, on the Delimiter options, you will have the option to skip delimiters. This could be helpful if you want to keep a certain amount of text. For this your data will need to be standardized.
In this walkthrough we are only going to go through the Power Query function and not any of the M Code. We will be looking at a simple example of extracting an email domain from a list of emails. This is a common scenario and a simple one to demonstrate the extract function.
First you are going to need to go to Power Query by clicking the Transform Data button on the Home Ribbon.
Now you need to select the column that you want to work in, click on the Transform ribbon, and select the dropdown arrow for Extract.
This is where you will get to choose how you want the function to react to the delimiter. For this demonstration I am just going to choose Extract Text After Delimiter.
Now we will enter a pop-out where we can specify our delimiter. Because I am working with emails I am going to put in the “@” symbol.
Now we can click OK in the bottom right corner and go to the transformed column. In here you see that only the text after my delimiter has been applied to the column, and if a row didn’t have that delimiter, it appears null. I did make a copy of the original column so that you can compare the two and what happened.
There are many reasons why you would want to use the extract function on a column, and because of this, the extract function becomes vital to learn early on. If you didn’t find that the extract function worked like you wanted it to, I suggest reading through EPM Strategy’s blog on the Split Column function. The link to that blog and a few similar EPM Strategy blogs will be below.
Links Related to this Article: