Hello,

Sign up to join our community!

Welcome Back,

Please sign in to your account!

Forgot Password,

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Sorry, you do not have permission to ask a question, You must login to ask a question.

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Home Latest Topics

  • 47k
  • 47k
Kevin Stratvert

How to use Microsoft Power Query

video
play-rounded-fill
16:35

Hi everyone, Kevin here. Today I want to show you how you can use Power Query that comes with Microsoft Excel and Power BI. First off, what is Power Query? Well, it helps you both get and also clean up your data. What do I mean by that? Well, let’s pretend that you’re a business analyst, and every month you’re on point to analyze the monthly sales data. Now one of the pains of the job is, well, first off, you have to get the data, so you go to some database, or you find the latest Excel sheet, then maybe you bring it into Microsoft Excel, and then you have to clean it up. Maybe you have to remove some columns, add some columns, run some calculations, and then finally, you’re ready to analyze it. Instead, you can let Power Query do all of that for you, and you can automate that entire process, and best of all, you don’t even have to know any coding to do this. I’ve included some sample data in the description if you’d like to follow along. All right, let’s jump on the PC and let’s see how we can use Power Query. Here I am on my PC and at the Kevin Cookie Company, I’m on point for analyzing our order data. Here you can see three years’ worth of order data and very soon 2020 will be dropping in this folder. Once again, if you want to follow along, I’ve included sample files in the description. Let’s click into one of these files to see what work is involved. Here’s the Excel sheet and it’s a pretty standard order data sheet. I see order ID, I see the customer, how many cookies were shipped, and some additional information. Now the pain is at the Kevin Cookie Company, they’re paying me the big bucks to do analysis, but I end up spending most of my time just cleaning up data. Here for example, in column B, I don’t know why they combine the customer ID together with the customer name. Also, when I look at cookies shipped, they throw a dollar sign in front of this. It should be a unit count — not a dollar amount. Revenue, cost look fine, but I need to add another column for profit. I also want to know how many days it takes to actually ship these cookies out of our warehouse. I have to go through, and I typically add columns. It’s a lot of manual effort, and worst of all, every time a new sheet comes out, I have to do all this cleanup again and again. I’m really hoping Power Query can help me today. To access Power Query in the latest version of Excel, simply go up to the top ribbon and click on data. If you have Excel 2010 or Excel 2013, you need to install an add-in to be able to use Power Query. I’ve included a link in the description where you can access that. Over on the left-hand side, Power Query falls within this category the get and transform data. If we click over on get data, we see all the different sources where we can pull data in from. It’s a vast collection of different places. Basically, if you have data somewhere, you’ll be able to pull it into Power Query. When I hover over from file, here you’ll see a few examples. One of the neat ones to call out is you can now pull in data from a PDF. You could pull in data from databases, from Kusto, from Azure, the list goes on and on of all the different places where you can get data. I mentioned that I have my three files all in a folder and very soon the 2020 data will also fall in that folder, so within the from file list, I’m going to click on the option that says from folder. Next, I see a prompt where I can specify the folder path where I have all of my Excel sheets. I could click on browse, that’ll open up a file picker where I can navigate to where those files are. I’ve already copied and pasted the folder location, so I’ll simply paste that in and then click on OK. Next, this opens up another prompt where I can see that Excel or Power Query has found three different Excel files in that folder. I had the 2017 through 2019 data, so that all looks good. Down below there are several different buttons. What can I do with them? Well, I have three different sheets here and I want to combine all of the data from those sheets, so I’ll click on the combine option, and I want to transform the data. What does transform mean and why would I click on that? Well, transform allows me to modify the data before pulling it into Microsoft Excel or into Power BI. So, as I was walking through the data, there were a few different items, for instance I wanted to get rid of the dollar sign in front of a column, I wanted to split out the customer name, that’s what I could do with transform data. I could also simply combine all the data and load it into Excel, or I could choose where I want to load it to, but then I wouldn’t have the opportunity to make changes to the data before I load it in. Here too I also have the option to simply load the data as is or I can simply transform the data, but because I have multiple sheets and I want to combine them, I’ll click on combine and transform data. This opens up another prompt where I can now combine all of my files. Up above, it allows me to choose a sample file, and by default, it simply selects the first file, which is the 2017 data. I could also change it to any other year, but I think the 2017 data is fine, so I’ll simply leave it set to the first file. Down below, I also see the worksheet within all of the different workbooks. Now all of my different workbooks have a sheet called Order Data, so I’m going to select that. Over on the right-hand side, here I can see a preview of all of my data, and it looks like it identifies it properly. So now let’s click on OK. This drops us in Power Query, and if you’ve never used Power Query before and you’re on point for analyzing data and getting data, you’re going to learn to love this interface. It might look a little bit intimidating at first, but I’m going to walk you through step-by-step how you can use it, and first I want to orient you to the experience. Let’s start off over on the left-hand side, where we see all of our different queries. You have a set of helper queries and here we see a sample file that we can transform. Down below, you also have the final query that will apply against all of our data, so when we bring it into Microsoft Excel, it’ll all be transformed, and as we walk through this example, that’ll make a lot more sense in a moment. Over on the right-hand side, we also have all of our different query settings. Down below, we see something called applied steps. The way to think of this is this is kind of like running a macro and Power Query remembers every single transformation that we make on our data. Up on top, just like with most Office applications, we see a ribbon with several different options. We’re going to use this to transform our data. In the center of the screen, I can see a sample of my data. At the Kevin Cookie Company, we sell a lot of cookies, so it’s only showing me the top 1,000 rows of our order information. Once I make all of my changes to this data and I transform it, then we’ll close it and load it and that’ll apply these changes to all of my data beyond just the top 1,000 rows. Let’s go through and let’s start transforming this data. Now I don’t need the source name for the year that it comes from, so I’m simply going to right click on this column, and I’ll click on remove. That removes that column. Here now I see the order ID and for the most part it all looks good, but there’s also an XXXXXX that appears, now it might seem a little odd to have that in here, but anytime that Kevin Cookie Company, say, we send free cookies to different events or different promotions, and so we mark that down in our order data, but when I do my analysis, I don’t want to look at this, so I usually go through Excel, and I filter it out. Let me click on this filter icon up on top, and because it’s a letter, I’ll scroll to the bottom of the list and then uncheck this. This will remove it from my data set. Next, I was complaining about the customer ID being combined with the customer name. I want to split these two. I’m going to click on this column and then when I right click on it, I can go down to the option that says split column, and I want to split it by delimiter. The delimiter is the space, the dash, and the space. So, let’s click on this. This opens up a prompt where I can specify a custom delimiter. Here it looks like there’s a dash, but I also want to insert a space in front and a space after. That looks good. Let’s now click on OK. This now split those into two separate columns. If you look at the column title though, one of them is CustomerID.1 and the other is CustomerID.2. That’s not a very descriptive name. Let me double click to change the column header name and I’ll change this to customer ID because that truly is the customer ID, and then this next column, well, that’s the customer name, so let me double click on that and I’ll type in customer name. As we continue moving over to the right in the data, I’m going to minimize the queries on the left-hand side to allow us to see more of the data here. Next I want to calculate the profit and profit is the revenue minus the cost. I’m going to select the revenue column and also select the cost column, and then let’s go up to the top ribbon, and click on add column. Within add column in the center, there’s the option for standard and I have different types of calculations that I can apply. I want to take the revenue and subtract the cost, so I’ll click on the option that says subtract. This adds another column all the way to the right-hand side of my sheet called subtraction. I want to give this a better name, so I’ll double click on the column header and this is our profit and I’ll type in profit. I want this column to appear alongside the revenue and the cost, so I’ll simply drag this column over and place it right after the cost. My revenue, cost, and profit columns, well, these are all currency and I want to apply that data type. I’m going to click on control and select the cost column and the revenue column and then right click. Within this menu, let’s go down and click on change type and I’ll change it to currency. You now see this icon here indicates what the data type is. Over on the right-hand side, you’ll see our ship date and the order date, and there’s always a little bit of lag between when a set of cookies is ordered to when we’re able to ship it out of our warehouse. I’ve heard some news that perhaps it’s taking a little bit long in some cases, so another column would help me know how many days it takes. I’ll select the ship date column and then I’ll press control and select the order date column. I selected ship date first because I want to take this ship date and then I want to subtract the order date from the ship date column. Once I have both columns selected, let’s go up to add column on the ribbon and over on the right-hand side, I can click on the option that says date. Within the date menu, there’s the option to subtract days. Let’s click on this. This adds another column over on the right-hand side and it subtracted the dates, so here I could see how long it takes, and wow, five days to get cookies out of our warehouse, that’s unacceptable, I’m going to have to talk to our warehouse team to see why it’s taking so long. Now right now this column is simply called subtraction. I’m going to double click on this and give this column a name. I called this column days to ship and that looks good. Just like before, I’m going to click on this column, and I’ll drag it over, so it sits right up alongside order date and ship date. As I’m walking through every single step here over on the right-hand side, you can see a record or a history of every single step that I took. In a sense, it’s like recording a macro in Microsoft Excel. Let’s say for instance that I have this order status column and maybe I don’t want it, I can right click on this column and then remove it. Here you see down below that it says removed columns1, but let’s say that really, I want to keep the order status there. I can click on this X to remove that step. Here now if I go over to the right-hand side, you see that the column has returned. Along with simply removing a step that I recorded, you could also click on an item, and you could move it around, if you want that transform step to happen earlier in the steps. I’ve now gone through, and I’ve made a number of transformations to my data. I’m going to go back over to the left-hand side and once again, there are all sorts of different transformations you can make. The way to think of this is as you’re going through and you’re transforming your data, you’re in a sense recording steps that will automate bringing data like this in in the future. Now that I’m done with all of my transformations, I’m going to click on home up on the top ribbon, and next let’s click on close and load. This now drops me in Microsoft Excel and if I click on the order date, here you can see that all three Excel sheets have been combined into one. It brought all three years’ worth of data together, and the great thing is I didn’t have to go through each individual sheet to make all of these changes. Here you see the customer ID is separate from the customer name. Here you see that all of them contain the days to ship. So here it’s simplified merging all of this data into one sheet. Over on the right-hand side, I also have a pane for queries and connections and here I see Power Query. If I right click on this, I can go down to show the peek and here I can see a sample of the data once again, but the thing that’s really interesting is it’s looking at this data source or this folder, so if I bring additional files into this folder, it’ll bring that data in automatically and it’ll apply all of the transformations to that data, so I no longer have to get the data or clean it up. All of that happens for me automatically. Before I pull in additional data, first I want to insert a pivot table and a pivot chart so I can start analyzing my data. To do that, I’ll select within the table, go up to insert, and then I’ll click on pivot table. I get a dialog telling me to create a pivot table. I’ll stick with all the defaults and click on OK. This opens up a pivot table. If you’ve never worked with pivot tables before, they make analysis so much easier. I’ve included a link in the description that’ll get you up to speed on how you can use pivot tables. I want to track our profit by year and once again Pivot Tables make that so easy. I can simply click on profit, and I’ll drag it down to values. Next, I want to track the profit by the order date, so I’ll drag that into rows. Here immediately I can see what our profit was by year. I now want to insert a visual that shows me what our profit is by year. I’ll click into the table and an option that says PivotTable Analyze appears on the top ribbon. Over on the right-hand side, I can insert a PivotChart. Let’s click on that. For the chart, I want to insert a line chart. This one looks good. Let’s click on OK. Here now I can visualize what our profit is by year. I was just notified that our 2020 data has arrived, and it’s true, when I go into my folder where I had all of my previous data, here I see 2020. Now, in the past I would have had to go through, and I’d have to do all those same changes and transformations to the 2020 file, but because I used Power Query, and because I’m looking at this folder, I don’t have to do anything extra. Instead, I’m going to minimize the folder view. Here I see the three years. I’ll simply click into this table, go up to PivotTable Analyze, and then under this, I can click on refresh all. When I click on refresh all, here I see that it automatically pulls in 2020. So, all of the changes that I did to 2017, 2018, and 2019, it automatically takes that Power Query, and it runs it against 2020. Here too, if I go back to the original sheet with all of my data, if I click on the order date, here you see now that 2020 data has been included. So Power Query not only gets my data, but it makes all of the same changes and transformations to that data. Now I could spend my time on more valuable things like analyzing this data and trying to make sense of it. All right, that was a quick look at how you can both get and also clean up your data using Power Query. If this video helped you, please give it a thumbs up. To see more videos like this, hit that subscribe button. If you want to see me cover any other topics, leave a comment down below. All right, that’s all I had for you today, I hope you enjoyed, and I hope to see you next time. Bye.

Related Topics

You must login to add an answer.

Hide picture