in this lecture we're going to be covering off power query power query is
a powerful tool for transforming all of your data however even if your data
requires no transformation it can save you a lot of time if you have to refresh
any of your data inside your Excel workbooks the two most typical ways of
refreshing data in Excel is either copy and paste or opening your files and
repeating all of your steps by setting up your workbooks with power query
you'll be able to simply click a refresh button to update everything in your
models in Excel 2016 power query has now been built into Excel has moved to the
data tab and has been renamed to get data in Excel 2010 and Excel 2013 power
query came as a free add-in from Microsoft and is downloadable from their
site if you're using Excel 2010 or 2013 you can get a link to the Microsoft site
for the download in the lecture notes to use power query we simply come over to
get data and we can see here that we can extract data from a number of file
formats here we have an Excel workbook a number of different text file formats
databases from Facebook or from a number of other sources for this example we're
going to use a simple text file which is stored in a CSV file format we're going
to come over to our example file here employees 2017 and here we can see a
preview of our data to carry out any cleaning or transformation operations on
our data we can click the edit button but we're going to cover that off in a
separate lecture for now we're just going to go ahead and click load you can
see here that our data has automatically been added to a table for easy use our
table has been given a name employees 2017 which was the name of our file if I
want to generalize this a little bit more I could rename this so that this
table could be reused for data from different periods say employees 2016 or
employees 2018 and still have our name reference make sense throughout our
different formulas over here we the query this contains the import
specifications for getting our data into Excel the nice thing about this is that
anytime all data gets updated all I have to do is come over here and click the
refresh button and that will refresh now the things to keep in mind here are that
first of all the data is pointed to a specific file so if your file name
changes you're gonna have to update this now it's quite common in various
workflows that your files get renamed with either versions or dates attached
to them to simplify your workflow I recommend always outputting the file
with the same file name into the same location and simply adding an extra copy
of the file often to an archive folder which does contain your date for easy
reference to the different files now if you do need to change where this file is
pointed to you can come over here and click Edit this will take us to the
query editor most of these options to do with cleaning and transformation so
we'll cover that a little bit later but for now you'll notice over here that we
have some properties including the name of our query again I can adjust this to
be something a little bit more generic down here we have all the different
steps which have taken place on our data so the first one here is specifying
where our data comes from promoted headers basically means that the first
row of data contains the column names and change type adds these 1.2 ABC which
basically specifies this is a numeric number with decimal places and these are
text fields so to update the settings for our file location I can come back up
to source I can click this little cog icon here which is the settings for this
parameter and I can click browse here to change the file that I'm pointing to
after this has all been adjusted you can come over here and click OK now if this
data doesn't refresh straight away you can come over here and click refresh
preview now there's one other setting I want to talk about here and this is
sitting under the advanced editor clicking on advanced
editor will take us to a script which has been created for us one of the
settings here columns equals to 14 basically meant that when we imported
our original dataset it contained 14 columns now the problem
with this being part of our import specification is that if we have less
columns or more columns then our data is not going to import correctly it's quite
often where you have a data set and then you realize that you need to add
additional columns to it you go back at the fields now here if you ended up with
say 15 or 16 columns those would be then truncated down to 14 columns again this
was not actually a problem in power query back in Excel 2010 in Excel 2013
however in Excel 2016 Microsoft has decided to hard-code this value as part
of the script that's created to make power query more flexible and more
easily refreshable when you update the number of columns and your data sources
come over to here select columns equals 14 make sure that you also select the
comma and simply press Delete I'm gonna click done and then close and load to
exit power query now anytime our data is updated I can simply come and click the
refresh button here or the refresh button at the top here in another
lecture we're gonna cover off data cleaning and transformations using power
query however this right here is the
predominant way that I use power query because I do my data transformations
predominantly in R I think power query can be a really great way of loading
that data into Excel so that users who only have Excel are able to easily
consume and pivot that data while also being able to refresh the workbook when
the data gets updated
Không có nhận xét nào:
Đăng nhận xét