Lab 02 Manual

Uploaded by

us5669126

Lab 02: Data Warehousing and Business Intelligence

Faculty of Computer Science and Engineering (FCSE), Ghulam Ishaq Khan Institute of Engineering and Technology, Pakistan
Import data from different sources into Power BI Desktop
Perform data transformation using Power Query

Learning Objectives:

➢ Understand Data Import in Power BI
➢ Navigate Power BI Desktop Interface
➢ Import Data from Multiple Sources
➢ Perform Basic Data Transformations
➢ Cleanse and Shape Data

Outcomes:
Upon successful completion of this lab, students will be able to:
➢ Import Data Effectively
➢ Execute basic data transformation tasks such as filtering, sorting, and merging using
Power Query.
➢ Cleanse and Shape Data
➢ Identify and address common data quality issues, including handling missing
values and data formatting, to prepare data for analysis.
Introduction:
One of the key features of Power BI is its ability to connect to a wide range of data sources,
enabling you to gather data for analysis and reporting. To do this, Power BI provides several "Get
Data" options, each tailored to specific data source types. In this lab, we will explore these options
and understand when to use them.

"Get Data" Options:


Excel: You can import data from Excel workbooks, which is useful when working with structured
data saved in Excel files.
CSV: Use this option to import data from Comma-Separated Values (CSV) files, a common format
for data exchange.
Folder: Import multiple files from a folder. This is handy for scenarios where data is spread across
multiple files in a consistent format.
SQL Server Database: Connect to an SQL Server database to access data stored in relational
databases.
Oracle Database: Import data from Oracle databases, commonly used in enterprise settings.
Web: Retrieve data from web sources such as HTML tables, web services, or data accessible
through URLs.
SharePoint Online: Connect to SharePoint Online lists and libraries to access data stored on
SharePoint.
Azure: Import data from various Azure services like Azure SQL Database, Azure Blob Storage,
and Azure Data Lake Storage.
Online Services: Access data from online services such as Dynamics 365, Salesforce, or Google
Analytics.
More: Explore additional data sources, including databases, file formats, and online services,
through the "More" option.
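The idea behind the Folder option, stacking several files that share the same columns into one table, can be sketched in plain Python. This is only an illustration of the logic, not how Power BI implements it; the file names, the columns, and the `SourceFile` helper column are made up for the example.

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_folder(folder):
    """Read every .csv file in `folder` and stack the rows into one list."""
    combined = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["SourceFile"] = path.name  # remember which file each row came from
                combined.append(row)
    return combined


# Hypothetical exports with identical columns, written to a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "sales_01.csv").write_text(
        "ProductKey,Quantity\n214,2\n215,1\n", encoding="utf-8"
    )
    Path(folder, "sales_02.csv").write_text(
        "ProductKey,Quantity\n214,5\n", encoding="utf-8"
    )
    rows = combine_csv_folder(folder)

print(len(rows))               # 3
print(rows[2]["SourceFile"])   # sales_02.csv
```

Note that every value arrives as text (for example `"5"` rather than `5`), which is why a type-conversion step normally follows an import.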
Selecting the Right "Get Data" Option:
• The choice of "Get Data" option depends on the source of your data. Always select the
option that corresponds to the type of data you are working with.
• Consider data format, location, and accessibility when choosing an option.
• Power BI provides intuitive dialog boxes for each data source type to guide you through
the connection process.
Import vs. DirectQuery:
• When connecting to your chosen data source, you'll also need to decide between "Import"
and "DirectQuery."
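• "Import" loads a copy of the data into Power BI's internal data model, so reports reflect the
source as of the last refresh, while "DirectQuery" leaves the data in the source and queries it
each time a report needs it.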
• Consider your data characteristics, performance, storage, and data modeling requirements
when making this choice.
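The practical difference between the two modes can be illustrated with a small Python sketch. It uses an in-memory SQLite database as a stand-in for the real data source; the table and values are hypothetical, and the code only mimics the behavior (a local copy versus a fresh query), not Power BI's internals.

```python
import sqlite3

# Hypothetical in-memory database standing in for the real data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (product TEXT, amount REAL)")
source.execute("INSERT INTO sales VALUES ('Helmet', 35.0)")
source.commit()

# "Import" style: copy the rows once and work from the local copy.
imported_rows = source.execute("SELECT product, amount FROM sales").fetchall()


def direct_query_total():
    # "DirectQuery" style: ask the source again every time a result is needed.
    return source.execute("SELECT SUM(amount) FROM sales").fetchone()[0]


# The source changes after the import has happened.
source.execute("INSERT INTO sales VALUES ('Gloves', 25.0)")
source.commit()

import_total = sum(amount for _, amount in imported_rows)
print(import_total)           # 35.0 (stale until the next refresh)
print(direct_query_total())   # 60.0 (reflects the new row)
```

The imported copy is fast to work with but goes stale; the direct query is always current but depends on the source for every result.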

Here's how Power Query Editor appears after a data connection is established:
1. In the ribbon, many buttons are now active to interact with the data in the query.
2. In the left pane, queries are listed and available for selection, viewing, and shaping.
3. In the center pane, data from the selected query is displayed and available for shaping.
4. The Query Settings pane appears, listing the query's properties and applied steps.

Each of these four areas will be explained later: the ribbon, the Queries pane, the Data view, and
the Query Settings pane.
Data Cleaning:
Data cleaning is an essential step in working with data, because source data often arrives with
inconsistencies, missing values, and errors. The fundamental data cleaning operations to master
in Power BI include removing duplicates, filling in missing values, and correcting
data types.
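As a rough illustration of what removing duplicates and filling in missing values do to a table, here is a minimal Python sketch. Power Query performs these steps through its menus; the sample rows and the `"NA"` default below are assumptions made for the example.

```python
# Hypothetical sample rows; column names follow the lab dataset.
rows = [
    {"ProductKey": 1, "ProductColor": "Red", "ProductSize": "M"},
    {"ProductKey": 1, "ProductColor": "Red", "ProductSize": "M"},  # exact duplicate
    {"ProductKey": 2, "ProductColor": "", "ProductSize": None},    # missing values
]


def remove_duplicates(records):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = tuple(record[name] for name in sorted(record))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


def fill_missing(records, column, default):
    """Replace empty or missing values in one column with a default."""
    return [
        {**record, column: default} if record[column] in ("", None) else record
        for record in records
    ]


cleaned = remove_duplicates(rows)
cleaned = fill_missing(cleaned, "ProductColor", "NA")
cleaned = fill_missing(cleaned, "ProductSize", "NA")

print(len(cleaned))   # 2
print(cleaned[1])     # {'ProductKey': 2, 'ProductColor': 'NA', 'ProductSize': 'NA'}
```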
Column Renaming:
Renaming columns can be tedious, but it significantly improves the clarity of your
reports. In Power Query Editor, rename columns to make them easier to understand. Simply
right-click on a column header and select "Rename..." (Refer to the image above for guidance).
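Conceptually, a rename changes only the column's label and leaves every value untouched. A minimal Python sketch, using a made-up row and the `Model` to `ProductModel` rename from the lab tasks:

```python
def rename_column(records, old_name, new_name):
    """Return new rows in which `old_name` is relabeled as `new_name`."""
    return [
        {(new_name if name == old_name else name): value for name, value in record.items()}
        for record in records
    ]


# Hypothetical row for illustration.
rows = [{"ProductKey": 214, "Model": "Sport-100"}]
renamed = rename_column(rows, "Model", "ProductModel")

print(renamed)   # [{'ProductKey': 214, 'ProductModel': 'Sport-100'}]
```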
Data Type Conversion:
Power BI generally does a good job of recognizing data types when a data source is first
loaded. Where a detected type is wrong, select the "Changed Type" step under Applied Steps in
the Query Settings pane, located on the right side of the screen. Make sure this step is
selected, then update any incorrect data types. This is especially important for numeric-looking
fields that should be text, such as codes that must retain leading zeros. Use the
"Change Type" feature in Power Query Editor to convert columns to the appropriate data types,
for example changing a text column to a date.
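The leading-zero problem and the text-to-date conversion can be seen in a few lines of Python. This is an illustrative sketch only; the `PostalCode` and `OrderDate` fields and all values are hypothetical.

```python
from datetime import date, datetime

# Hypothetical raw record: everything arrives as text.
raw = {"PostalCode": "00501", "OrderDate": "2024-03-15", "ProductPrice": "34.99"}

# Treating a code as a number silently drops its leading zeros.
as_number = int(raw["PostalCode"])

typed = {
    "PostalCode": raw["PostalCode"],                                      # keep as text
    "OrderDate": datetime.strptime(raw["OrderDate"], "%Y-%m-%d").date(),  # text -> date
    "ProductPrice": float(raw["ProductPrice"]),                           # text -> number
}

print(as_number)            # 501
print(typed["PostalCode"])  # 00501
print(typed["OrderDate"])   # 2024-03-15
```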
Sorting Data:
Arrange data in ascending or descending order to enhance readability and analysis. You can sort
columns by clicking the dropdown menu for the specific column and choosing "Sort Ascending"
or "Sort Descending."
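A descending sort on one column can be sketched in Python as follows. The model names and prices are invented for the example.

```python
# Hypothetical products; column names follow the lab dataset.
products = [
    {"Model": "Sport-100", "ProductPrice": 34.99},
    {"Model": "Road-150", "ProductPrice": 3578.27},
    {"Model": "Mountain-200", "ProductPrice": 2319.99},
]

# Equivalent of "Sort Descending" on ProductPrice: most expensive first.
by_price_desc = sorted(products, key=lambda row: row["ProductPrice"], reverse=True)

print([row["Model"] for row in by_price_desc])
# ['Road-150', 'Mountain-200', 'Sport-100']
```

Sorting only works as expected once the column has a numeric type; prices stored as text would be ordered character by character.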
Aggregating Data:
Group data by specific columns and perform aggregations like sum, average, or count using the
"Group By" function. This is particularly useful for summarizing data. You can access the "Group
By" feature either in the Ribbon or by right-clicking on a column. If you need to group by multiple
columns, either highlight all of them or add them to your grouping in the pop-up window.
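The grouping logic itself, one output row per distinct key with aggregates computed over that group's rows, can be sketched in Python. The sample values and the output names `Count`, `AvgCost`, and `AvgPrice` are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; column names follow the lab dataset.
rows = [
    {"ProductSubcategoryKey": 1, "ProductCost": 10.0, "ProductPrice": 20.0},
    {"ProductSubcategoryKey": 1, "ProductCost": 30.0, "ProductPrice": 40.0},
    {"ProductSubcategoryKey": 2, "ProductCost": 5.0, "ProductPrice": 8.0},
]

# Step 1: collect the rows that share the same grouping key.
groups = defaultdict(list)
for row in rows:
    groups[row["ProductSubcategoryKey"]].append(row)

# Step 2: compute one set of aggregates per group.
summary = {
    key: {
        "Count": len(members),
        "AvgCost": mean(member["ProductCost"] for member in members),
        "AvgPrice": mean(member["ProductPrice"] for member in members),
    }
    for key, members in groups.items()
}

print(summary[1])   # {'Count': 2, 'AvgCost': 20.0, 'AvgPrice': 30.0}
print(summary[2])   # {'Count': 1, 'AvgCost': 5.0, 'AvgPrice': 8.0}
```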
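Replace Values:
Substitute one value for another throughout a column, for example to swap a placeholder for a
readable label. Right-click the column header, select "Replace Values...", and enter the value
to find and its replacement.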
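Replacing a value across a column can be sketched in plain Python. This hypothetical example matches the whole cell rather than part of it, so a description that merely contains the character is left alone; the sample descriptions are made up.

```python
def replace_values(records, column, old, new):
    """Replace cells in `column` whose entire content equals `old`."""
    return [
        {**record, column: new} if record[column] == old else record
        for record in records
    ]


# Hypothetical rows; "0" is used as a placeholder for a missing description.
rows = [
    {"ProductKey": 1, "ProductDescription": "0"},
    {"ProductKey": 2, "ProductDescription": "Lightweight frame"},
    {"ProductKey": 3, "ProductDescription": "10 speed"},
]

fixed = replace_values(rows, "ProductDescription", "0", "No description given")

print(fixed[0]["ProductDescription"])   # No description given
print(fixed[2]["ProductDescription"])   # 10 speed
```

A substring replacement would instead turn "10 speed" into "1No description given speed", so it is worth checking how a replacement is matched before applying it.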
Lab Tasks:
• Load the dataset named "Sales Dataset for lab 2".
• Open your dataset in Power Query Editor and perform the following data cleaning tasks:
• Remove any duplicate rows.
• Remove any blank rows.
• Fill in missing values (e.g., for ProductColor or ProductSize if applicable).
• Correct any data type mismatches, such as ensuring ProductCost and ProductPrice are set as
numeric values.
• Rename Column 1 to something more meaningful based on its contents.
• Rename the column Model to ProductModel.
• Sort the dataset by ProductPrice in descending order to see the most expensive products at the
top.
• Use "Group By" on ProductSubcategoryKey to calculate the average ProductCost and ProductPrice
for each subcategory.
• Aggregate the total count of products per ProductStyle.
• In the ProductDescription column, replace all instances of 0 with "No description given".
