Lab 02 Manual
Business Intelligence
Learning Objectives:
Outcomes:
Upon successful completion of this lab, students will be able to:
➢ Import Data Effectively
➢ Execute basic data transformation tasks such as filtering, sorting, and merging using
Power Query.
➢ Cleanse and Shape Data
➢ Identify and address common data quality issues, including handling missing
values and data formatting, to prepare data for analysis.
Introduction:
One of the key features of Power BI is its ability to connect to a wide range of data sources,
enabling you to gather data for analysis and reporting. To do this, Power BI provides several "Get
Data" options, each tailored to specific data source types. In this lab, we will explore these options
and understand when to use them.
Here's how Power Query Editor appears after a data connection is established:
1. In the ribbon, many buttons are now active to interact with the data in the query.
2. In the left pane, queries are listed and available for selection, viewing, and shaping.
3. In the center pane, data from the selected query is displayed and available for shaping.
4. The Query Settings pane appears, listing the query's properties and applied steps.
Each of these four areas will be explained later: the ribbon, the Queries pane, the Data view, and
the Query Settings pane.
Data Cleaning:
Data cleaning is an essential step in working with data, which often arrives with
inconsistencies, missing values, and errors. The fundamental data cleaning operations you should
master in Power BI include removing duplicates, filling in missing values, and correcting
data types.
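Power Query performs these operations through its menus, but the underlying logic can be sketched in a few lines of plain Python. The sketch below is only an illustration, not how Power BI implements the steps: the sample rows, the ProductKey column, and the "Unknown" placeholder are assumptions made up for the example.

```python
# Hypothetical sample rows: the values and the ProductKey column are made up.
rows = [
    {"ProductKey": 1, "ProductColor": "Red", "ProductSize": "M"},
    {"ProductKey": 1, "ProductColor": "Red", "ProductSize": "M"},    # exact duplicate
    {"ProductKey": 2, "ProductColor": None, "ProductSize": "L"},     # missing color
    {"ProductKey": 3, "ProductColor": "Blue", "ProductSize": None},  # missing size
]


def remove_duplicates(records):
    """Keep the first occurrence of each fully identical row, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = tuple(record.items())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def fill_missing(records, column, default):
    """Return new rows where None or empty values in `column` become `default`."""
    return [
        {**record, column: default if record[column] in (None, "") else record[column]}
        for record in records
    ]


cleaned = remove_duplicates(rows)
cleaned = fill_missing(cleaned, "ProductColor", "Unknown")  # placeholder is an assumption
cleaned = fill_missing(cleaned, "ProductSize", "Unknown")
```

Whether a fixed placeholder, the previous row's value, or some other rule is the right way to fill a gap depends on the column, so treat the placeholder here as one possible choice.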
Column Renaming:
Renaming columns can be tedious, but it significantly improves the clarity of your
reports. In Power Query Editor, rename columns to make them more comprehensible: right-click
a column header and select "Rename..." (refer to the image above for guidance).
Data Type Conversion:
Power BI generally does a good job of detecting data types when a data source is first
loaded. When a detected type is incorrect, select the "Changed Type" step under Applied Steps in
the Query Settings pane, on the right side of the screen. With that step selected, update any
incorrect data types. This is especially important for numeric-looking fields that should be
stored as text, for example when you need to retain leading zeros. Use the "Change Type" feature
in Power Query Editor to convert columns to the appropriate data types, such as changing a text
column to a date.
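To see why the data type matters, the following plain-Python sketch converts the same kind of values by hand. It is an illustration only: the ProductCode and StartDate fields and all sample values are hypothetical, and the date format is an assumption.

```python
from datetime import datetime

# Hypothetical raw values as they might arrive from a text file.
raw = {"ProductCode": "00123", "ProductPrice": "49.99", "StartDate": "2024-03-15"}

# Treating a code as a whole number silently drops its leading zeros.
as_number = int(raw["ProductCode"])

# Keeping the code as text preserves it exactly as it was entered.
as_text = raw["ProductCode"]

# Text to decimal number, so the value can be summed or averaged later.
price = float(raw["ProductPrice"])

# Text to date, assuming the year-month-day format shown above.
start = datetime.strptime(raw["StartDate"], "%Y-%m-%d").date()
```

Here `as_number` holds 123 while `as_text` still holds "00123", which is the leading-zero problem described above.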
Sorting Data:
Arrange data in ascending or descending order to enhance readability and analysis. You can sort
columns by clicking the dropdown menu for the specific column and choosing "Sort Ascending"
or "Sort Descending."
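As a rough illustration of what the two sort options do, here is a plain-Python sketch. The product rows are hypothetical sample values, and the code is an analogy rather than Power Query's own implementation.

```python
# Hypothetical sample rows; only the column names come from the lab dataset.
products = [
    {"ProductModel": "A", "ProductPrice": 120.0},
    {"ProductModel": "B", "ProductPrice": 540.5},
    {"ProductModel": "C", "ProductPrice": 35.25},
]

# "Sort Ascending": smallest price first.
ascending = sorted(products, key=lambda row: row["ProductPrice"])

# "Sort Descending": largest price first, so the most expensive product is on top.
descending = sorted(products, key=lambda row: row["ProductPrice"], reverse=True)

for row in descending:
    print(row["ProductModel"], row["ProductPrice"])
```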
Aggregating Data:
Group data by specific columns and perform aggregations like sum, average, or count using the
"Group By" function. This is particularly useful for summarizing data. You can access the "Group
By" feature either in the Ribbon or by right-clicking on a column. If you need to group by multiple
columns, either highlight all of them or add them to your grouping in the pop-up window.
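The grouping logic can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the sample rows are made up, and `group_by`, `AvgCost`, and `AvgPrice` are hypothetical names chosen for the example, not Power Query identifiers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; only the column names come from the lab dataset.
products = [
    {"ProductSubcategoryKey": 1, "ProductStyle": "U", "ProductCost": 10.0, "ProductPrice": 20.0},
    {"ProductSubcategoryKey": 1, "ProductStyle": "M", "ProductCost": 30.0, "ProductPrice": 40.0},
    {"ProductSubcategoryKey": 2, "ProductStyle": "U", "ProductCost": 5.0, "ProductPrice": 8.0},
]


def group_by(records, key_columns):
    """Collect rows that share the same values in all of `key_columns`."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[column] for column in key_columns)].append(record)
    return groups


# Average cost and price per subcategory (one grouping column).
avg_by_subcategory = {
    key[0]: {
        "AvgCost": mean(r["ProductCost"] for r in members),
        "AvgPrice": mean(r["ProductPrice"] for r in members),
    }
    for key, members in group_by(products, ["ProductSubcategoryKey"]).items()
}

# Row count per style.
count_by_style = {
    key[0]: len(members)
    for key, members in group_by(products, ["ProductStyle"]).items()
}

# Grouping by several columns produces one group per distinct combination.
by_subcategory_and_style = group_by(products, ["ProductSubcategoryKey", "ProductStyle"])
```

Each group collapses to a single summary row, which is why the grouped result has fewer rows than the original table.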
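Replace Values:
Replace one value in a column with another, for example to standardize entries or to label
placeholder values. Right-click a column header and select "Replace Values...", then enter the
value to find and the value to replace it with.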
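The effect of replacing a value can be sketched in plain Python. The sketch is an illustration with hypothetical sample rows and a hypothetical `replace_values` helper, and it assumes that only cells whose entire content equals the search value should change.

```python
# Hypothetical sample rows; only the column names come from the lab dataset.
rows = [
    {"ProductModel": "A", "ProductDescription": "Lightweight frame"},
    {"ProductModel": "B", "ProductDescription": "0"},
    {"ProductModel": "C", "ProductDescription": "10 speed"},
]


def replace_values(records, column, old, new):
    """Return new rows where cells in `column` that equal `old` exactly become `new`."""
    return [
        {**record, column: new if record[column] == old else record[column]}
        for record in records
    ]


replaced = replace_values(rows, "ProductDescription", "0", "No description given")
```

Matching the whole cell matters here: a plain substring replacement would also alter the "0" inside "10 speed". In the Replace Values dialog, the "Match entire cell contents" advanced option corresponds to the whole-cell behaviour sketched above.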
Lab Tasks:
• Load the dataset named "Sales Dataset for lab 2".
• Open your dataset in Power Query Editor and perform the following data cleaning tasks:
• Remove any duplicate rows.
• Remove any blank rows.
• Fill in missing values (e.g., for ProductColor or ProductSize if applicable).
• Correct any data type mismatches, such as ensuring ProductCost and ProductPrice are set as
numeric values.
• Rename Column 1 to something more meaningful based on its contents.
• Rename the Model column to ProductModel.
• Sort the dataset by ProductPrice in descending order to see the most expensive products at the
top.
• Use Group By on ProductSubcategoryKey to calculate the average ProductCost and ProductPrice
for each subcategory.
• Aggregate the total count of products per ProductStyle.
• In the ProductDescription column, replace all instances of 0 with ‘No description given’.