Data Manipulation With Excel Author Degroote School of Business
August 2014
www.MarqueeGroup.ca
Table of Contents
Important Settings
Data Manipulation
Basic Tools
Functions
Auto Filter
Subtotal Function
Advanced Filter
VLookup Function
Absolute References
Conditional Formatting
Text Functions
Paste Special
Grouping Rows and Columns
Database Functions
Editing Options
Calculation Options
– Check the “Automatic” box in the Calculation section.
– This setting ensures that all formulas calculate automatically.
– If this option is set to “Manual”, nothing will happen when you enter a
formula until you press the F9 key to manually recalculate the spreadsheet.
– You may want to set this option to “Manual” if you are working on an
extremely large file because it could take Excel a few seconds to recalculate
the spreadsheet every time you make a change.
– Turn off the “Iteration” check box (the “Iteration” check box should only be
turned on when you are working on a spreadsheet that contains an
intentional circular reference).
• The settings on the “Excel Options” menu affect all spreadsheets that
are currently open. Therefore, if a setting is changed, the change will
take place on all the Excel files that are open.
Sorting Data
– The Sorting Tool is an important, albeit overused, tool.
– Many people sort their data numerous times when there are often more
effective ways to extract the desired output.
Excel 2003: Data > Sort
Excel 2010: Data [A] > Sort [SS], or Home [H] > Sort & Filter [S]
– Select any cell within your data and then run the Sort tool.
– The sort tool can also be found on the Home tab
– When a Filter is on, the triangle on the drop down button will turn blue
– The row numbers on the left side of the screen will also turn blue when a
filter is on
– To reset the filters:
Excel 2003: Data > Filter > Show All
Excel 2010: Data [A] > Clear [C]
Common Pitfalls
– There are two common pitfalls with the AutoFilter tool:
1. The data is filtered in place. Every time the drop-down
menus are changed, the filtered subset changes. If you want to
keep a subset created with an AutoFilter, you need to manually
copy the subset elsewhere.
2. The AutoFilter cannot apply sophisticated criteria.
– If either of the two pitfalls above becomes an impediment to using the
AutoFilter tool, the solution is to use Excel’s Advanced Filter (see
subsequent pages).
• The Advanced Filter is a very important tool because it allows you to:
– filter data based on more elaborate criteria than the AutoFilter; and
– place the filtered subset in a new location to keep the original data intact.
Table 1 is a list of raw data that contains the name, university and age of various students.
In Table 2, a few blank rows were added above the data and the column headings were repeated.
Table 3
    A         B           C
1   Name      University  Age
2             Western     > 24
3             Queens      > 20
4
5   Name      University  Age
6   Michelle  Western     24
7   Paul      McGill      32
8   Ken       Queens      21
9   Cindy     Western     26
10  Susie     McGill      29
11  Sophie    Queens      33

In Table 3, the user has made two separate requests. The user would like to see a list of all
students who are either 1. at Western and >24; OR 2. at Queens and >20. The filtered
subset will contain Ken, Cindy and Sophie.

Table 4
    A         B           C
1   Name      Age         Age
2             > 24        <= 32
3
4
5   Name      University  Age
6   Michelle  Western     24
7   Paul      McGill      32
8   Ken       Queens      21
9   Cindy     Western     26
10  Susie     McGill      29
11  Sophie    Queens      33

In Table 4, the user has requested a list of all students who are >24 AND <=32. To filter
between a range, repeat the column heading in the criteria section twice. The filtered subset
will contain Paul, Cindy and Susie.
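The criteria logic in Tables 3 and 4 can be sketched outside Excel to check the expected subsets. The Python below is only an illustration of the AND/OR rules, not how Excel implements the Advanced Filter. The names cell_matches and advanced_filter are hypothetical, and text criteria are compared by exact equality here, which is a simplification of Excel's own text matching.

```python
import operator

# Comparison prefixes a criteria cell may start with (longest first so ">=" is tried before ">").
OPS = [(">=", operator.ge), ("<=", operator.le), ("<>", operator.ne),
       (">", operator.gt), ("<", operator.lt), ("=", operator.eq)]

def cell_matches(value, criterion):
    """Test one data value against one criteria cell such as '> 24' or 'Western'."""
    criterion = criterion.strip()
    for prefix, op in OPS:
        if criterion.startswith(prefix):
            # Convert the criterion text to the same type as the data value.
            return op(value, type(value)(criterion[len(prefix):].strip()))
    return value == type(value)(criterion)  # assumption: exact match for plain text

def advanced_filter(records, criteria_rows):
    """Keep a record if ANY criteria row matches; within a row ALL cells must match."""
    return [rec for rec in records
            if any(all(cell_matches(rec[heading], crit) for heading, crit in row)
                   for row in criteria_rows)]

students = [
    {"Name": "Michelle", "University": "Western", "Age": 24},
    {"Name": "Paul", "University": "McGill", "Age": 32},
    {"Name": "Ken", "University": "Queens", "Age": 21},
    {"Name": "Cindy", "University": "Western", "Age": 26},
    {"Name": "Susie", "University": "McGill", "Age": 29},
    {"Name": "Sophie", "University": "Queens", "Age": 33},
]

# Table 3: two criteria rows (OR), each with two cells side by side (AND).
table3 = [[("University", "Western"), ("Age", "> 24")],
          [("University", "Queens"), ("Age", "> 20")]]
# Table 4: one criteria row with the Age heading repeated to express a range.
table4 = [[("Age", "> 24"), ("Age", "<= 32")]]

names3 = [s["Name"] for s in advanced_filter(students, table3)]
names4 = [s["Name"] for s in advanced_filter(students, table4)]
print(names3)  # ['Ken', 'Cindy', 'Sophie']
print(names4)  # ['Paul', 'Cindy', 'Susie']
```

Each criteria row is written as a list of (heading, criterion) pairs rather than a dictionary so that a heading such as Age can appear twice in the same row, as it does in Table 4.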
© 2006 The Marquee Group Inc.
VLookup Function
• Lookup Functions are very useful for extracting information from large
tables
= VLOOKUP(Lookup_value, Table_array, Col_index_num, Range_lookup)
– VLOOKUP (vertical lookup) searches for a value in the leftmost column of a
table, and then returns a value in the same row from a column you specify
– To insert the dollar signs more quickly while editing a formula, use the F4
key to toggle between the various absolute referencing options
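The VLOOKUP behaviour described above (search the leftmost column, return a value from the same row) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Excel's implementation: it covers only the exact-match case (Range_lookup set to FALSE), it compares text case-sensitively, and the sample rows are borrowed from the student list used in the Advanced Filter tables.

```python
def vlookup(lookup_value, table_array, col_index_num, range_lookup=False):
    """Exact-match lookup: scan the leftmost column and return the cell
    in column col_index_num (1-based) of the first matching row."""
    if range_lookup:
        # Approximate matching is deliberately left out of this sketch.
        raise NotImplementedError("only exact match is sketched here")
    for row in table_array:
        if row[0] == lookup_value:
            return row[col_index_num - 1]
    return "#N/A"  # stand-in for the error Excel shows when nothing is found

# Sample table: Name, University, Age (leftmost column is the lookup column).
students = [
    ["Michelle", "Western", 24],
    ["Paul", "McGill", 32],
    ["Ken", "Queens", 21],
    ["Cindy", "Western", 26],
]

print(vlookup("Cindy", students, 3))   # 26
print(vlookup("Paul", students, 2))    # McGill
print(vlookup("Nobody", students, 2))  # #N/A
```

The column index counts from the leftmost column of the table, so 1 returns the lookup column itself and 3 returns the third column.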
= Right(text,num_chars)  Returns the specified number of characters from a cell, starting from the right
= Find(find_text,within_text,start_num)  Finds one text string within another text string and returns the number of the starting position of the found string (case sensitive)
= Proper(text)  Converts a text string to proper case so that the first letter in each word is uppercase and all other letters are lowercase
= Trim(text)  Removes all spaces from a text string except for single spaces between words
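As a rough cross-check of what these four text functions return, the sketch below mimics them in Python. The function names simply mirror the Excel ones. The behaviour is an approximation: str.title() is assumed to be close enough to Proper for ordinary words, and a failed Find is reported with a Python exception rather than an Excel error value.

```python
def right(text, num_chars):
    """Last num_chars characters of text (the whole text if it is shorter)."""
    return text[-num_chars:] if num_chars > 0 else ""

def find(find_text, within_text, start_num=1):
    """1-based position of find_text inside within_text, case sensitive."""
    pos = within_text.find(find_text, start_num - 1)
    if pos == -1:
        raise ValueError("#VALUE!")  # stand-in for Excel's error value
    return pos + 1

def proper(text):
    """First letter of each word uppercase, all other letters lowercase."""
    return text.title()

def trim(text):
    """Drop leading/trailing spaces and collapse runs of spaces to one."""
    return " ".join(part for part in text.split(" ") if part)

print(right("Degroote", 3))                            # ote
print(find("School", "Degroote School of Business"))   # 10
print(proper("degroote SCHOOL of business"))           # Degroote School Of Business
print(trim("  Data   Manipulation "))                  # Data Manipulation
```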
• Instead, you should use the GROUPING function if you don’t want
certain rows or columns to be visible
– The GROUPING function puts handles with buttons around each of the
grouped rows or columns so that you can instantly realize that certain rows
or columns are not visible
− For example, if a spreadsheet contained a listing of all the
employees at a given company, it might be useful to know the following:
1. What was the total compensation to all employees in the finance department?
2. What is the average age of employees in the marketing department?
− Most Excel users solve these types of questions by first sorting or filtering
the data and then calculating a sum or average on the subset.
− If there are a lot of queries, it can take a long time to first sort or filter the
data before answering each question.
● The Database functions all use the same syntax, as described in the
following DSUM Function:
=DSUM(Database,Field,Criteria)
− Database: This is the entire range of data. This range must have column
headings.
− Field: This is the column heading of the data that you would like to
perform the mathematical operation on (e.g. sum, count,
average). You can either click on the cell that contains the
column heading, or retype the column heading in the function
surrounded by quotation marks.
− Criteria: This is another set of cells that tells the database function
which criteria to use in order to calculate the correct answer.
The criteria section should be set up exactly the same way as
the criteria section for an Advanced Filter (see next 2 pages).
3. Copy the column headings to a blank row a few rows above the raw data.
This will be the criteria section for the database functions.
4. Within the criteria section, place criteria on top of one another to use the
OR logical operator. Place criteria beside one another to use the AND
logical operator.
Table 1 is a list of raw data that contains the name, university and age of various students.
In Table 2, a few blank rows were added above the data and the column headings were repeated.
Table 3
    A         B           C
1   Name      University  Age
2             Rotman
3
4   Name      University  Age
5   John      Rotman      22
6   Peter     McMaster    25
7   Miranda   Ivey        27
8   David     Rotman      21
9   Katie     McMaster    23
10  Sarah     Ivey        24

Table 4
    A         B           C
1   Name      University  Age
2             Ivey        >24
3
4   Name      University  Age
5   John      Rotman      22
6   Peter     McMaster    25
7   Miranda   Ivey        27
8   David     Rotman      21
9   Katie     McMaster    23
10  Sarah     Ivey        24
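The Database, Field and Criteria arguments can be illustrated with a small Python sketch that applies the criteria sections of Tables 3 and 4 to the student list. This is a sketch under stated assumptions, not Excel's implementation: the helper names are hypothetical, text criteria use exact equality, and "Age" is chosen as the Field purely for illustration, since the tables themselves do not say which column is being summed.

```python
import operator

# Comparison prefixes a criteria cell may start with (longest first).
OPS = [(">=", operator.ge), ("<=", operator.le), ("<>", operator.ne),
       (">", operator.gt), ("<", operator.lt), ("=", operator.eq)]

def cell_matches(value, criterion):
    """Test one data value against one criteria cell such as '>24' or 'Ivey'."""
    criterion = criterion.strip()
    for prefix, op in OPS:
        if criterion.startswith(prefix):
            return op(value, type(value)(criterion[len(prefix):].strip()))
    return value == type(value)(criterion)  # assumption: exact match for plain text

def select(database, criteria_rows):
    """Rows stacked on top of one another are OR; cells beside one another are AND."""
    return [rec for rec in database
            if any(all(cell_matches(rec[h], c) for h, c in row) for row in criteria_rows)]

def dsum(database, field, criteria_rows):
    """Sum of the field column over the records that satisfy the criteria."""
    return sum(rec[field] for rec in select(database, criteria_rows))

def daverage(database, field, criteria_rows):
    """Average of the field column; raises ZeroDivisionError if nothing matches."""
    values = [rec[field] for rec in select(database, criteria_rows)]
    return sum(values) / len(values)

database = [
    {"Name": "John", "University": "Rotman", "Age": 22},
    {"Name": "Peter", "University": "McMaster", "Age": 25},
    {"Name": "Miranda", "University": "Ivey", "Age": 27},
    {"Name": "David", "University": "Rotman", "Age": 21},
    {"Name": "Katie", "University": "McMaster", "Age": 23},
    {"Name": "Sarah", "University": "Ivey", "Age": 24},
]

table3 = [[("University", "Rotman")]]                 # criteria section of Table 3
table4 = [[("University", "Ivey"), ("Age", ">24")]]   # criteria section of Table 4

print(dsum(database, "Age", table3))      # 43
print(daverage(database, "Age", table3))  # 21.5
print(dsum(database, "Age", table4))      # 27
```

Blank criteria cells are simply left out of a row, which matches the idea that an empty cell places no restriction on that column.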