Kasi Puneeth Ram Report
(AUTONOMOUS)
R.V.S Nagar, Chittoor – 517 127. (A.P)
(Approved by AICTE, New Delhi, Affiliated to JNTUA, Anantapur)
(Accredited by NBA, New Delhi & NAAC A+, Bangalore)
(An ISO 9001:2000 Certified Institution)
2023-2024
INTERNSHIP REPORT
A report submitted in partial fulfilment of the requirements for the Award of Degree of
BACHELOR OF TECHNOLOGY
IN
CERTIFICATE
I wish to convey my gratitude and sincere thanks to all members for their support and
cooperation rendered for the successful submission of this report.
Finally, I would like to express my sincere thanks to all teaching and non-teaching faculty
members, our parents, friends, and all those who have supported us to complete
the internship successfully.
The last reported AGM (Annual General Meeting) of Ybi Foundation, per our records, was held on
30 September 2022.
Ybi Foundation has two directors: Alok Yadav and Arushi Yadav.
Chapter 4: Python for Everyone and Database Feedback and Review Assignment
4.1 Feedback and Review
Learning Objectives / Internship Objectives
• Internships are generally thought of as being reserved for college students looking to gain experience
in a particular field.
• However, a wide array of people can benefit from training internships in order to receive real-world
experience and develop their skills. An objective for this position should emphasize the skills you
already possess in the area and your interest in learning more.
• Internships are utilized in a number of different career fields, including architecture, engineering,
healthcare, economics, advertising, and many more.
• Some internships are used to allow individuals to perform scientific research, while others are
specifically designed to give people first-hand experience of working.
• When applying for a training internship, make sure to highlight any special skills or talents that can
make you stand apart from the rest of the applicants, so that you have an improved chance of landing
the position. Internships are a great way to build your resume and develop skills that can be
emphasized in your resume for future jobs.
WEEKLY OVERVIEW OF INTERNSHIP ACTIVITIES
1ST WEEK
DATE DAY NAME OF THE MODULE/TOPICS COMPLETED
27-05-2024 Monday Introduction to Artificial Intelligence (AI)
28-05-2024 Tuesday Introduction to Generative AI (GenAI)
29-05-2024 Wednesday Instructions to Complete Internship
30-05-2024 Thursday Scope of AI and Data Skills
31-05-2024 Friday Upgrade Your Internship
01-06-2024 Saturday Review and Practice
02-06-2024 Sunday Review and Practice
2ND WEEK
DATE DAY NAME OF THE MODULE/TOPICS COMPLETED
03-06-2024 Monday Plans and Upgrades
04-06-2024 Tuesday Upgrade your Course
05-06-2024 Wednesday Frequently Asked Questions
06-06-2024 Thursday Review and Practice
07-06-2024 Friday Review and Practice
08-06-2024 Saturday Complete any pending tasks and review
09-06-2024 Sunday Complete any pending tasks and review
3RD WEEK
DATE DAY NAME OF THE MODULE/TOPICS COMPLETED
10-06-2024 Monday Essential Learning Module: Introduction
11-06-2024 Tuesday Essential Learning Module: Deep Dive
12-06-2024 Wednesday Practice and Application
13-06-2024 Thursday Practice and Application
14-06-2024 Friday Practice and Application
15-06-2024 Saturday Internship Class 1: Introduction to Python
16-06-2024 Sunday Internship Class 2: Introduction to Google Colab
4TH WEEK
5TH WEEK
6TH WEEK
DATE DAY NAME OF THE MODULE/TOPICS COMPLETED
01-07-2024 Monday Project Hub: Introduction
02-07-2024 Tuesday Project Work
03-07-2024 Wednesday Project Work
04-07-2024 Thursday Project Work
05-07-2024 Friday Project Work
06-07-2024 Saturday Project Work
7TH WEEK
8TH WEEK
DATE DAY NAME OF THE MODULE/TOPICS COMPLETED
08-07-2024 Monday FINAL EXAM
Data Collection
Data Cleaning and Preprocessing
Exploratory Data Analysis (EDA)
Feature Engineering
Machine Learning
Model Evaluation and Validation
Big Data and Distributed Computing
Deep Learning
Data Visualization
Ethics and Privacy
Domain Knowledge
Communication Skills
1.2 What are the job roles/domains in data science, and what skills should you learn?
JOB ROLES:
EXAMPLE: DATA ANALYST. DOMAIN/TOOLS: Data analysts play a critical role in
organizations by collecting, analyzing, and interpreting data to provide insights and support decision-
making. To excel as a data analyst, you should possess a combination of technical, analytical, and
communication skills. Here are some key skills that data analysts typically need.
Web Query
You can import data directly from a website using the Web Query feature. Go to the "Data"
tab, select "Get Data," and choose "From Web." Enter the URL of the website and follow the
wizard to select the data you want to import.
Database Connection
If your data is in a database, you can establish a connection to the database and import data. Go
to the "Data" tab, select "Get Data," and choose "From Database." Follow the wizard to
connect to your database and import the desired data.
Power Query:
Power Query is a powerful tool for data transformation and can be used to import data from
various sources. Go to the "Data" tab, select "Get Data," and choose "Get Data" again to open
the Power Query Editor. In the Power Query Editor, you can connect to various data sources,
perform transformations, and load the data into Excel.
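For comparison, the same kinds of imports can be scripted in Python with pandas, which later chapters of this report rely on. A minimal sketch, assuming a placeholder URL and a hypothetical SQLite file:
import pandas as pd
import sqlite3

# "From Web": read the first HTML table on a page
# (URL is a placeholder; pd.read_html needs an HTML parser such as lxml)
tables = pd.read_html("https://example.com/data.html")
web_df = tables[0]

# "From Database": connect and pull rows with a SQL query
conn = sqlite3.connect("sales.db")  # hypothetical database file
db_df = pd.read_sql_query("SELECT * FROM sales", conn)
conn.close()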
Pivot Tables:
Create Pivot Table: Use the Pivot Table function to create a summary of a large dataset, allowing you
to rearrange, summarize, and analyse data dynamically.
Pivot Charts: Once you create a Pivot Table, you can create a Pivot Chart based on that table to
visualize the summarized data.
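An analogous summary can be built in Python with pandas' pivot_table; a minimal sketch on a made-up sales table:
import pandas as pd

df = pd.DataFrame({
    "Region": ["East", "East", "West", "West"],
    "Product": ["A", "B", "A", "B"],
    "Sales": [100, 150, 200, 250],
})
# Rows = Region, Columns = Product, Values = sum of Sales
summary = pd.pivot_table(df, index="Region", columns="Product",
                         values="Sales", aggfunc="sum")
print(summary)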
Charts and Graphs:
Insert Chart: Use the Insert Chart function to create various types of charts such as bar charts, line
charts, pie charts, etc.
Customize Charts: Customize the appearance and formatting of charts to better represent data using
the Chart Tools in Excel.
Combo Chart: Create a combo chart to display multiple sets of data in one chart, with different types
of data represented on different axes.
Data Validation:
Data Validation: Use Data Validation to set restrictions on what type of data can be entered into a cell,
ensuring data consistency and accuracy.
Text Functions: LEFT, RIGHT, MID: Use these functions to extract specific portions of text from
cells.
CONCATENATE, CONCAT: Combine text from multiple cells into one cell using these functions.
IF Function:
IF: Use the IF function to perform conditional operations based on specified criteria.
Logical Functions:
AND, OR, NOT: Use these functions to perform logical operations.
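These Excel functions map directly onto plain Python; a small sketch of the text and logic operations above (the values are illustrative):
text = "DataAnalysis"
left3 = text[:3]       # LEFT(text, 3)   -> "Dat"
right3 = text[-3:]     # RIGHT(text, 3)  -> "sis"
mid4 = text[4:8]       # MID(text, 5, 4) -> "Anal"
combined = "Data" + " " + "Analysis"   # CONCATENATE / CONCAT

score = 75
grade = "Pass" if score >= 50 else "Fail"      # IF(score >= 50, "Pass", "Fail")
in_range = (score > 50) and not (score > 100)  # AND / NOT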
Conditional Formatting:
New Rule: For more advanced or custom rules, you can select "New Rule" in the "Conditional
Formatting" menu. This allows you to define your own formatting rule using formulas.
Data Validation: Data validation is a feature that helps control what type of data can be entered
into a cell or range. It ensures data accuracy and consistency by restricting the input to specific
criteria. To use it, select the cells, go to the "Data" tab, choose "Data Validation," and define the
allowed values or criteria.
Relational Databases:
Tables: Data in a relational database is organized into tables. Each table consists of rows and columns,
where each column represents an attribute and each row represents a record.
Relationships: Tables in a relational database can be related to each other through common columns,
creating relationships.
SQL Commands:
ALTER: Modifies the structure of an existing database object (e.g., adding a new column to a table).
Data Types:
SQL supports various data types, such as INTEGER, VARCHAR (variable-length character strings),
DATE, FLOAT, etc. These data types define the kind of data that can be stored in a column.
Constraints: Constraints are rules defined on a column or a table to enforce data integrity. Common
constraints include PRIMARY KEY (uniquely identifies each record), FOREIGN KEY (establishes
a link between tables), NOT NULL, CHECK, etc.
Indexes:
Indexes improve the speed of data retrieval operations on a database table. They provide a quick
lookup mechanism for specific columns.
Querying:
SQL uses the SELECT statement to query data from one or more tables. It supports filtering, sorting,
and grouping of data using clauses like WHERE, ORDER BY, and GROUP BY.
Joins:
Joins are used to combine rows from two or more tables based on related columns. Common types
include INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN.
Transactions:
A transaction is a sequence of one or more SQL statements that are executed as a single unit.
Transactions ensure data consistency and integrity. Common transaction commands include
COMMIT and ROLLBACK.
Views:
A view is a virtual table based on the result of a SELECT query. It allows users to query and
manipulate data as if it were a regular table.
SQL is a powerful language for managing and interacting with relational databases, and it's widely
used in software development, data analysis, and various other fields. Understanding these
fundamental concepts will help you work effectively with SQL databases.
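As a concrete illustration of these fundamentals, here is a minimal, self-contained sketch using Python's built-in sqlite3 module (the tables and data are made up for the example):
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# Tables with PRIMARY KEY, NOT NULL, and FOREIGN KEY constraints
cur.execute("CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, "
            "dept_id INTEGER REFERENCES dept(id))")

# A transaction: the inserts below are committed as one unit
cur.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "Sales"), (2, "IT")])
cur.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(1, "Asha", 1), (2, "Ravi", 2), (3, "Meena", 1)])
conn.commit()

# Querying with INNER JOIN, WHERE, and ORDER BY
cur.execute("SELECT emp.name, dept.name FROM emp "
            "INNER JOIN dept ON emp.dept_id = dept.id "
            "WHERE dept.name = 'Sales' ORDER BY emp.name")
print(cur.fetchall())  # [('Asha', 'Sales'), ('Meena', 'Sales')]
conn.close()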
1. Normalization:
Purpose: Normalization is the process of organizing data to reduce redundancy and improve data
integrity.
Normal Forms: A database is typically designed to conform to certain normal forms (e.g., First
Normal Form, Second Normal Form, Third Normal Form) to ensure efficient data storage and
minimize data anomalies.
2. Stored Procedures and Functions:
Stored Procedures: These are precompiled SQL statements that can be executed with a single call.
They are stored in the database and can accept parameters, making them reusable and efficient.
Functions: Similar to stored procedures, functions return a value, but they are often used for
calculations and are part of SQL expressions.
3. Triggers:
Definition: Triggers are sets of instructions that are automatically executed ("triggered") in response
to certain events on a particular table or view. Use Cases: Triggers are commonly used for enforcing
business rules, auditing changes, or maintaining data integrity.
4. ACID Properties:
Atomicity, Consistency, Isolation, Durability (ACID):
Atomicity: Ensures that a transaction is treated as a single, indivisible unit, either fully completed or
fully rolled back.
Consistency: Guarantees that a transaction brings the database from one valid state to another.
Isolation: Ensures that the execution of transactions is independent of each other.
Durability: Once a transaction is committed, its effects are permanent and survive subsequent system
failures.
5. Transaction Management:
Transaction Control Statements: In addition to COMMIT and ROLLBACK, SQL provides statements
like SAVEPOINT to manage transactions more flexibly.
Concurrency Control: Mechanisms like locks and isolation levels help manage concurrent access to
the database, preventing issues such as lost updates or inconsistent reads.
6. Security:
User Accounts and Roles: SQL databases have a security model that includes user accounts and roles.
Users can be granted specific privileges to control access to data and functionality.
Authentication Methods: Databases support various authentication methods, such as
username/password, integrated security, and more.
7. Dynamic SQL:
Dynamic SQL: In some scenarios, you may need to dynamically generate and execute SQL
statements within your applications. Dynamic SQL allows you to construct SQL statements at
runtime.
8. NoSQL Databases: In contrast to traditional SQL databases, NoSQL databases offer different data
models (e.g., document-oriented, key-value, graph) and are designed to handle large volumes of
unstructured or semi-structured data.
9. SQL Standards: ANSI SQL: SQL is based on standards defined by the American National
Standards Institute (ANSI). Different database management systems (DBMS) may implement SQL
in slightly different ways, but adherence to ANSI SQL ensures a degree of portability.
3.3 SQL Clause, SQL Functions
Aggregate Functions: Perform calculations across multiple rows and return a single result, such as
COUNT, SUM, AVG, MIN, and MAX.
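A short sketch of these clauses and aggregates, again using Python's sqlite3 with a hypothetical orders table:
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (region TEXT, amount REAL)")
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [("East", 100), ("East", 150), ("West", 200)])

# GROUP BY with COUNT, SUM, and AVG; HAVING filters the groups
cur.execute("SELECT region, COUNT(*), SUM(amount), AVG(amount) "
            "FROM orders GROUP BY region "
            "HAVING SUM(amount) > 0 ORDER BY region")
print(cur.fetchall())  # [('East', 2, 250.0, 125.0), ('West', 1, 200.0, 200.0)]
conn.close()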
Tuple Characteristics:
Immutable: Once a tuple is created, you cannot modify its elements.
Indexing and Slicing: Similar to lists, you can access elements by index and perform slicing
operations.
my_tuple = (1, 2, 3, 4, 5)
print(my_tuple[1]) # Result: 2
print(my_tuple[2:4]) # Result: (3, 4)
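Attempting to modify a tuple raises an error, which demonstrates the immutability described above:
my_tuple = (1, 2, 3, 4, 5)
try:
    my_tuple[0] = 10   # tuples do not support item assignment
except TypeError as err:
    print(err)         # 'tuple' object does not support item assignment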
Dictionary Methods:
keys(): Returns a view object containing all keys in the dictionary.
my_dict = {"name": "John", "age": 25, "city": "New York"}
keys = my_dict.keys()
# Result: dict_keys(['name', 'age', 'city'])
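A few other commonly used dictionary methods, shown on the same my_dict:
values = my_dict.values()    # dict_values(['John', 25, 'New York'])
items = my_dict.items()      # dict_items([('name', 'John'), ('age', 25), ('city', 'New York')])
age = my_dict.get("age", 0)  # 25; returns the default 0 if the key is missing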
4.3 OOPs
Object-Oriented Programming (OOP) is a programming paradigm that uses objects (instances of
classes) as the fundamental building blocks for creating and organizing software. Here's an overview
of key concepts in OOP:
Classes and Objects:
Class:
A class is a blueprint or template for creating objects.
It defines the attributes (data) and methods (functions) common to all objects of that type.
class Car:
    def __init__(self, make, model):
        self.make = make
        self.model = model

    def start_engine(self):
        print("Engine started!")
Object:
An object is an instance of a class. It represents a real-world entity and is created from the class
blueprint.
my_car = Car("Toyota", "Camry")
Encapsulation:
Encapsulation is the bundling of data (attributes) and the methods that operate on that data into a
single unit (class).
It helps in hiding the internal details of the object and exposing only what is
necessary.
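A minimal sketch of encapsulation in Python, using a leading underscore for an internal attribute and a property for controlled access (the Account class is illustrative, not from the course material):
class Account:
    def __init__(self, balance):
        self._balance = balance   # internal detail, by convention

    @property
    def balance(self):            # controlled, read-only access
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("Deposit must be positive")
        self._balance += amount

acct = Account(100)
acct.deposit(50)
print(acct.balance)  # Output: 150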
Inheritance:
Inheritance allows a class (subclass or derived class) to inherit the properties and methods of another
class (superclass or base class).
It promotes code reusability and establishes an "is-a" relationship between classes.
class ElectricCar(Car):
    def __init__(self, make, model, battery_capacity):
        super().__init__(make, model)
        self.battery_capacity = battery_capacity

    def charge_battery(self):
        print("Battery charging...")
Polymorphism:
Polymorphism allows objects of different classes to be treated as objects of a common base class.
It enables a single interface to represent different types of objects.
def describe_vehicle(vehicle):
    print(f"{vehicle.make} {vehicle.model}")

car = Car("Toyota", "Camry")
electric_car = ElectricCar("Tesla", "Model S", "100 kWh")
describe_vehicle(car)           # Output: Toyota Camry
describe_vehicle(electric_car)  # Output: Tesla Model S
Abstraction:
Abstraction involves simplifying complex systems by modeling classes based on the essential
properties and behaviors.
It hides the unnecessary details while revealing the necessary ones.
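Python expresses abstraction with the abc module; a brief illustrative sketch (the Shape hierarchy is hypothetical):
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        pass                     # essential behavior, no implementation here

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2

print(Circle(2).area())  # Output: 12.56636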
6. Composition:
Composition is a way to combine objects of different classes to create more complex objects.
It allows for creating relationships between objects without the need for inheritance.
class Engine:
    def start(self):
        print("Engine started!")

class Car:
    def __init__(self):
        self.engine = Engine()

    def start_engine(self):
        self.engine.start()
Encapsulation, Inheritance, and Polymorphism (EIP):
These three principles together are often referred to as EIP and are considered the three main pillars
of OOP.
They guide the design and implementation of classes and their relationships.
Example: OOP in Python
class Animal:
    def __init__(self, name):
        self.name = name

    def make_sound(self):
        pass

class Dog(Animal):
    def make_sound(self):
        return "Woof!"

class Cat(Animal):
    def make_sound(self):
        return "Meow!"

dog = Dog("Buddy")
cat = Cat("Whiskers")
print(dog.make_sound())  # Output: Woof!
print(cat.make_sound())  # Output: Meow!
In this example, Animal is the base class, and Dog and Cat are subclasses. They demonstrate
inheritance and polymorphism.
Understanding and applying OOP concepts can lead to more organized, modular, and maintainable
code.
6.4 Python Strings
String Basics:
Strings are created using single (') or double (") quotes.
my_string_single = 'Hello, World!'
my_string_double = "Hello, World!"
Triple-quoted strings can be used for multiline strings.
multiline_string = '''This is a
multiline
string.'''
2. String Operations:
Concatenation:
str1 = "Hello"
str2 = "World"
result = str1 + " " + str2
# Result: "Hello World"
Repetition:
repeated_str = "abc" * 3
# Result: "abcabcabc"
Indexing and Slicing:
my_string = "Python"
print(my_string[0])   # Result: 'P'
print(my_string[1:4]) # Result: 'yth'
3. String Methods:
len(): Returns the length of the string.
length = len("Hello")
# Result: 5
strip(): Removes leading and trailing whitespace.
my_string = " Hello "
stripped_string = my_string.strip()
# Result: "Hello"
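A few more widely used string methods, for reference:
s = "Hello, World!"
print(s.upper())                     # Result: "HELLO, WORLD!"
print(s.lower())                     # Result: "hello, world!"
print(s.replace("World", "Python"))  # Result: "Hello, Python!"
print(s.split(", "))                 # Result: ['Hello', 'World!']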
4. String Formatting:
Using % Operator:
name = "John"
age = 25
message = "My name is %s and I am %d years old." % (name, age)
Using format():
name = "John"
age = 25
message = "My name is {} and I am {} years old.".format(name, age)
Using f-strings (Python 3.6 and above):
name = "John"
age = 25
message = f"My name is {name} and I am {age} years old."
3. Array Operations:
Element-wise Operations:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Element-wise addition
result = a + b
Matrix Operations:
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])
# Matrix multiplication
result_matrix = np.dot(matrix_a, matrix_b)
8. Random Module:
NumPy has a random module for generating random numbers and arrays.
rand_arr = np.random.rand(2, 3) # Random values in a given shape
9. Statistical Functions:
NumPy provides various functions for calculating statistics on arrays.
mean_val = np.mean(arr)
std_dev = np.std(arr)
Operations on NumPy arrays encompass a wide range of functionalities for data manipulation,
computation, and analysis. Here's a comprehensive overview of various operations you can perform
on NumPy arrays:
2. Statistical Operations: NumPy provides functions for calculating various statistics from
arrays.
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
mean_value = np.mean(arr)
median_value = np.median(arr)
std_dev = np.std(arr)
variance = np.var(arr)
sum_value = np.sum(arr)
product = np.prod(arr)
3. Aggregation Functions:
These functions perform operations on entire arrays or along a particular axis.
import numpy as np
arr = np.array([[1, 2], [3, 4]])
total_sum = np.sum(arr)
column_sum = np.sum(arr, axis=0)
row_sum = np.sum(arr, axis=1)
4. Array Comparison and Boolean Operations:
Performing comparisons and generating boolean arrays.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([2, 2, 2])
element_comparison = arr1 == arr2
logical_and = np.logical_and(arr1 > 1, arr2 == 2)
logical_or = np.logical_or(arr1 > 2, arr2 == 2)
logical_not = np.logical_not(arr1 > 1)
5. Reshaping:
Changing the shape of an array without changing its data.
import numpy as np
arr = np.array([[1, 2], [3, 4]])
reshaped_arr = arr.reshape(4)  # Convert to 1D array
flattened_arr = arr.flatten()  # Flatten to 1D array
6. Sorting:
import numpy as np
arr = np.array([3, 1, 2])
sorted_arr = np.sort(arr)
reverse_sorted_arr = np.sort(arr)[::-1]
These are fundamental operations on NumPy arrays. Utilizing these operations allows for efficient
and powerful data manipulation and computation, making NumPy a cornerstone in scientific
computing and data analysis.
Indexing and slicing in NumPy allow you to access and manipulate specific elements or ranges of
elements in an array. Here's a comprehensive guide to indexing and slicing in NumPy:
1. Basic Indexing:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[0, 1])  # Output: 2 (row 0, column 1)
2. Slicing: Slicing allows you to extract a portion of an array. The basic syntax is start:stop:step.
1D Array:
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
print(arr[1:4]) # Output: [20 30 40]
print(arr[::2]) # Output: [10 30 50]
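Slicing extends naturally to two dimensions, with one start:stop:step per axis:
import numpy as np
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[0:2, 1:3])  # Output: [[2 3] [5 6]] (rows 0-1, columns 1-2)
print(arr2d[:, 0])      # Output: [1 4 7] (first column)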
3. Boolean Indexing: Boolean indexing allows you to filter elements based on a condition.
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
# Boolean condition
condition = arr > 30
# Applying the condition
filtered_arr = arr[condition]
print(filtered_arr) # Output: [40 50]
4. Integer Array Indexing: You can use integer arrays to extract specific elements.
import numpy as np
arr = np.array([10, 20, 30, 40, 50])
# Integer array for indexing
indices = np.array([1, 3])
# Accessing elements using the integer array
selected_elements = arr[indices]
print(selected_elements) # Output: [20 40]
6.1 Tableau: Tableau is a powerful and popular data visualization tool that allows you to create
interactive and shareable dashboards and reports. Here are some basic concepts and steps to get started
with Tableau:
Installing and Setting Up Tableau: Installation: Download and install Tableau Desktop from the
Tableau website. You can choose a trial version or a licensed version based on your needs.
Connecting to Data:
Data Sources: Tableau can connect to various data sources like Excel, CSV, databases (SQL, MySQL,
etc.), cloud-based sources, and more.
Connecting to Data: Open Tableau and click on "Connect to Data." Choose the appropriate data
source, and follow the prompts to connect.
Data Preparation and Cleaning: Tableau allows for basic data preparation, cleaning, and
transformation within the tool itself. You can rename fields, create calculated fields, pivot data, etc.
Creating a Visualization: Drag and drop dimensions and measures into the Rows and Columns shelves
to create a visualization.
Common Visualizations: Tableau offers various visualization options like bar charts, line charts,
scatter plots, maps, and more.
Formatting: Customize the appearance of your dashboard by formatting colors, fonts, and layout.
Interactivity:
Filtering: Allow users to interact with your data by adding filters.
Actions: Create interactive actions between different sheets or dashboards.
Advanced Features:
Parameters and Calculations: Use parameters and calculated fields for more complex analyses.
Advanced Visualizations: Explore advanced visualizations like dual-axis charts, trend lines, and
forecasting.
Mapping: Utilize Tableau's mapping capabilities for geographic data visualization.
Tableau is a versatile tool with a lot of capabilities. Starting with the basics and gradually exploring
its features and functionalities will help you create powerful and insightful data visualizations.
In Tableau, metadata refers to the information about the data itself. It includes details about the
structure, properties, and characteristics of the data you're working with. Understanding metadata is
crucial for effective data analysis and visualization. Here's how metadata is handled in Tableau:
Viewing Metadata for Data Sources: When you connect to a data source in Tableau, you can view
metadata related to that data source. This includes information about tables, columns, data types, and
other properties.
Data Pane: The Data pane in Tableau displays the metadata of the connected data source, including
dimensions (categorical data), measures (quantitative data), and other relevant information.
Field Metadata: Within Tableau, you can access metadata for each field (column) in your data source.
Field Properties: Right-click on a field in the Data pane and select "Describe" to view field properties.
This provides details like the data type, minimum and maximum values, and more.
Custom Field Names and Aliases: You can customize field names and aliases to make them more
descriptive and meaningful. This doesn't change the actual data but provides a clear representation.
Data Types and Roles: Tableau assigns data types and roles to each field based on the initial analysis
of the data source. However, you can manually override these assignments based on your
understanding of the data.
Data Type: You can change the data type assigned to a field. For example, you can change a numerical
field to a date field if required.
Role: Assign roles like dimension (discrete data) or measure (continuous data) to fields.
Calculated Fields and Metadata: When creating calculated fields in Tableau, metadata plays a role in
defining the properties of the new field.
Field Properties in Calculations: When creating a calculated field, Tableau allows you to specify the
field's properties, including data type and aggregation.
Metadata Grid: The Metadata Grid in Tableau allows you to view and modify field properties and
aliases for a data source.
Chart Suggestions: Tableau offers chart suggestions based on the metadata. For example, it might
suggest a bar chart for a categorical variable and a line chart for a time series.
Tableau offers a wide range of charts and visualization options to help users represent their data in a
meaningful and insightful way. Here are some common types of Tableau charts:
Bar Chart: Bar charts represent data using rectangular bars, with the length of each bar proportional
to the value it represents. Bar charts are effective for comparing discrete categories.
Line Chart: Line charts display data points connected by lines, useful for showing trends or changes
over a continuous range, such as time.
Area Chart: Area charts are similar to line charts, but the area under the line is filled, making it useful
for comparing proportions over time.
Scatter Plot: Scatter plots represent individual data points with dots on a graph, making them useful
for showing relationships or correlations between two numerical variables.
Histogram: Histograms provide a visual representation of the distribution of a dataset, showing the
frequency of values within specific ranges (bins).
Pie Chart: Pie charts display data as a circular graph divided into slices, where each slice represents
a proportion of the whole.
Heat Map: Heat maps use color to represent data values in a matrix, making it easier to identify
patterns and variations.
Tree Map: Tree maps represent hierarchical data in a nested, rectangular layout. The size of each
rectangle is proportional to the data it represents.
Bubble Chart: Bubble charts display data points using bubbles, where the size of the bubble represents
a third numerical variable.
Gantt Chart: Gantt charts visualize project timelines, showing the start and end times of various tasks
or activities.
Box Plot (Box and Whisker Plot): Box plots display the distribution of data based on quartiles, helping
to identify outliers and distribution patterns.
Bullet Graph: Bullet graphs are used to display performance data, comparing a primary measure to a
target measure and additional measures.
Waterfall Chart: Waterfall charts show how an initial value is increased or decreased by a series of
intermediate values, often used for financial data analysis.
Packed Bubble Chart: Packed bubble charts are similar to bubble charts but with bubbles packed
tightly to visualize hierarchical data.
Dual-Axis Chart: Dual-axis charts combine two different chart types in a single chart, allowing for
better comparison of data.
Radar Chart: Radar charts display data in a circular pattern, useful for comparing multiple quantitative
variables.
Map Chart: Tableau offers different map charts, including symbol maps, filled maps, and heat maps,
to visualize data geographically.
Visual analytics in Tableau involves using Tableau's powerful features and tools to visually explore,
analyse and gain insights from data. It allows users to create interactive and insightful visualizations
that help in understanding complex data patterns, trends, and relationships. Here are the key aspects
of visual analytics in Tableau:
Drag-and-Drop Interface: Tableau offers an intuitive drag-and-drop interface, allowing users to easily
connect to data sources and drag dimensions and measures onto the canvas to create visualizations.
Quick Visualization Creation: Users can quickly create various types of visualizations like bar charts,
line charts, pie charts, scatter plots, and more by simply dragging and dropping data fields onto the
canvas.
Interactive Dashboards: Users can create interactive dashboards by combining multiple visualizations
onto a single canvas. Interactivity allows users to filter and highlight specific data points dynamically.
Filters and Highlighting: Tableau provides options to filter data based on dimensions or measures,
enabling users to focus on specific subsets of data for analysis. Users can also highlight data points
or groups.
Parameters and Calculated Fields: Tableau allows users to create parameters and calculated fields to
perform complex calculations and customize visualizations dynamically.
Data Blending and Joining: Users can blend or join data from different sources, allowing for a unified
view of disparate datasets and facilitating comprehensive analysis.
Annotations and Annotations Pane: Users can add annotations to visualizations to provide additional
context or explanations. The Annotations pane allows for easy management and customization of
annotations.
Tableau Story Points: Tableau Story Points enable users to create a sequence of visualizations that
tell a story or present a narrative, providing a guided analytical experience.
Mapping and Geospatial Analysis: Tableau allows users to plot geographical data on maps, enabling
geospatial analysis and insights.
Integration with Advanced Analytics: Tableau integrates with advanced analytics platforms and tools,
allowing users to incorporate predictive analytics, machine learning models, and statistical analysis
into their visualizations.
Publishing and Sharing: Users can publish their visualizations and dashboards to Tableau Server or
Tableau Online, making them accessible to others for viewing and interaction.
Data Alerts and Subscriptions: Users can set up data alerts to receive notifications when specific
conditions in the data are met. Subscriptions allow scheduled delivery of dashboards via email.
Real-Time Data Analysis: Tableau supports real-time data analysis, allowing users to visualize and
analyze streaming data for timely decision-making.
Chapter 7: Project Hub (Power BI)
7.1 POWER BI
Power BI is a popular business intelligence tool developed by Microsoft that allows users to visualize
and share insights from their data. Here are the basic concepts and features of Power BI:
Power BI Desktop: Power BI Desktop is a free application that you install on your local machine. It's
used to create reports and visualizations from various data sources.
Data Sources and Connectors: Power BI can connect to a wide array of data sources, including
databases (SQL Server, MySQL, Oracle), Excel files, SharePoint lists, Salesforce, Google Analytics,
and more. These connections are facilitated through connectors.
Data Transformation and Modelling: Power BI Desktop allows you to clean, transform, and model
your data using Power Query and Power Pivot. You can shape your data, create relationships between
tables, and define measures.
Data Visualization: Power BI offers a wide range of visualization options such as bar charts, line
charts, pie charts, maps, tables, matrices, and custom visuals. Users can drag and drop data fields onto
the canvas to create interactive visualizations.
Reports and Pages: Reports in Power BI are collections of visuals that are displayed together on a
page. You can have multiple pages within a report to organize your visuals.
Dashboards: Dashboards in Power BI are a collection of visuals from a single report or multiple
reports. They provide a consolidated view of important metrics and KPIs.
Power Query (Get & Transform Data): Power Query is a powerful data connection and transformation
tool in Power BI. It allows you to shape and clean your data before loading it into Power BI.
Power Pivot (Data Modelling): Power Pivot is an in-memory data modeling engine. It enables users
to model large sets of data, create relationships, and define calculated columns and measures.
DAX (Data Analysis Expressions): DAX is a formula language used in Power BI to create calculated
columns and measures. It's similar to Excel functions but tailored for Power BI's tabular modelling.
Row-Level Security (RLS): RLS allows you to restrict access to rows of data based on the viewer's
role or identity, ensuring data security and privacy.
Q&A (Natural Language Processing): Power BI has a Q&A feature that allows users to ask questions
about their data using natural language and receive visualizations as answers.
Power BI Service: The Power BI service (PowerBI.com) is a cloud-based platform where you can
publish, share, and access Power BI reports and dashboards. It allows collaboration and real-time
updates.
Gateway: Power BI Gateway allows for a secure connection between Power BI services and on-
premises data sources, enabling data refreshes and real-time dashboards.
Power BI Mobile: Power BI Mobile enables users to view and interact with their Power BI content
on mobile devices, making it accessible anytime, anywhere.
Power BI's user-friendly interface and powerful features make it a popular choice for data analysts,
business analysts, and decision-makers to derive valuable insights from their data and drive informed
business decisions.
Power BI Interface: Power BI provides an intuitive and user-friendly interface designed to streamline
the process of creating, analyzing, and visualizing data. Here are key components:
Ribbon: Similar to Microsoft Office applications, Power BI has a ribbon at the top providing access
to various tools and features.
Canvas: This is the central area where you create visualizations by dragging fields from the data pane.
Visualizations Pane: On the right side, you have the visualizations pane, where you can select and
configure the type of visualization you want to create.
Fields Pane: Also on the right, the fields pane displays the fields available from your data source. You
can drag and drop these fields to create visualizations.
Pages Tab: You can have multiple pages within a report, allowing you to organize your visuals
effectively.
Filters Pane: Allows you to add filters to your report to interactively slice and dice your data.
Visualization Tools: Various visualization tools are available to enhance and customize your visuals,
such as formatting options, analytics, and more.
Modelling Tools: These tools enable data modelling operations such as creating relationships,
defining measures, and managing data categories.
Data Sources: Power BI can connect to a wide range of data sources including databases (SQL Server,
MySQL, Oracle), files (Excel, CSV), cloud-based sources (Azure SQL Database, Google Analytics),
and more.
Power Query (Get Data): Power Query is a tool used to connect, transform, and clean data from
various sources. It helps prepare the data for analysis.
Data Load: Once the data is transformed, you load it into Power BI for modeling and visualization.
Power BI Desktop keeps a model of the data in memory.
Data Modelling (Power Pivot): Power BI allows you to create relationships between tables, define
hierarchies, create calculated columns, and write DAX expressions to enhance the data model.
Data Refresh: After loading the data, you can configure refresh settings to keep your data up-to-date
by scheduling regular refreshes. This is crucial for live or frequently updated data.
DirectQuery: Power BI supports DirectQuery mode where it queries the underlying data source in
real-time instead of importing data. This is useful for large datasets.
Power BI Gateway: The Power BI Gateway allows for secure data refreshes for on-premises data
sources and live connections to data models in the Power BI service.
7.3 Data Transformation: Data transformation in Power BI involves cleaning, shaping, and
organizing data to make it suitable for analysis and visualization. Power BI provides a powerful tool
called Power Query to perform these transformation tasks.
Connecting to Data:
a. Data Source Settings: Review and modify data source settings like server details, authentication,
and database selection.
b. Navigator: In the Navigator window, choose the specific data tables or views you want to load.
Data Cleaning and Transformation Steps: In the Power Query Editor, you'll find various options for
data transformation and cleaning:
Removing Columns or Rows: Right-click on a column or row header and choose to remove.
Changing Data Types: Select a column, right-click, and choose "Change Type" to change data types.
Filtering Rows: Use filter options to remove unwanted rows based on conditions.
Splitting and Merging Columns: Split a column into multiple columns based on a delimiter, or merge
multiple columns into one.
Grouping and Aggregating Data: Group rows to perform aggregations (sum, average, etc.) on grouped
data.
Pivoting and Unpivoting Data: Change the structure of the data by pivoting columns or unpivoting
rows.
Duplicating or Reference Data: Create a duplicate of a query or create a reference to the same data
source.
Handling Null or Blank Values: Replace, remove, or fill null and blank values, for example with
Power Query's "Replace Values" option or by removing blank rows.
Once the necessary transformations are applied, click "Close & Apply" to load the transformed data
into Power BI.
After loading the data into Power BI, you can refresh it to reflect any changes in the source data.
Data Transformation with DAX (Data Analysis Expressions): In the data model, you can further
transform data using DAX formulas, creating calculated columns and measures.
Power BI is a business analytics service provided by Microsoft that lets you visualize your data and
share insights. It converts data from different sources to build interactive dashboards and Business
Intelligence reports.
Power BI can access vast volumes of data from multiple sources. It allows you to view, analyze, and
visualize vast quantities of data that cannot be opened in Excel. Some of the important data sources
available for Power BI are Excel, CSV, XML,
JSON, PDF, etc. Power BI uses powerful compression algorithms to import and cache the data within
the .PBIX file.
Power BI makes things visually appealing. It has easy drag-and-drop
functionality, with features that allow you to copy all formatting across similar visualizations.
Power BI helps to gather, analyze, publish, and share Excel business data. Anyone familiar with Office
365 can easily connect Excel queries, data models, and reports to Power BI Dashboards.
CHAPTER 8: FINAL PROJECT
The same applies to your team. If your team isn’t hitting the company’s revenue goals, you can use
sales reports to find gaps to improve your sales process.
With regular sales reporting, your C-suite or managers can quickly iterate on what drives the
company's growth. You can also track and adjust sales tactics that are performing below par.
Monitoring and showing the sales performance of each team member motivates them to do more.
Gamifying performance results can challenge other team members to quit settling for average
performance. Put another way, sales reporting can create healthy competition and push your sales
team to aim for the “best” outcomes.
That’s more fun than relying on clunky spreadsheets, right?
It’s true, especially with sales reporting. When you create attractive visuals, your audience won’t have
to wade through spreadsheets with lots of numbers. This saves their time and allows you to quickly
communicate the insights in your report.
The best part? You can generate engaging visuals directly on HubSpot. Think pie charts, bar charts,
line charts, and more.
To accurately forecast these, ensure your reps are doing their due diligence to guarantee a realistic
sales pipeline.
This is an example of what a pipeline report looks like in HubSpot Sales Hub. You’ll notice each
stage of the pipeline and where opportunities are within it. You can even add forecasted deal amounts
to see the worth of each deal and its proximity to closing.
Understanding the sales pipeline stages where your team excels and needs help. You can also identify
the specific actions your reps should take to move prospects through each stage of your pipeline, the
number of prospects in the pipeline, and how close your team is getting to their targets.
By monitoring your conversion rate, you can identify where your team excels or underperforms in
the sales lifecycle. If your team consistently has a high conversion rate of turning leads into
opportunities, you can scale the strategies that are already working. Otherwise, you can start finding
areas for improvement.
This report is also a litmus test for the strengths and weaknesses of individual reps. If a rep is
performing below par, looking into their conversion rate helps you uncover why.
Revealing the efficacy of your overall sales strategy on an operational or team-wide scale. It also
measures the effectiveness of your sales team at converting leads into customers.
Average Deal Size Report
Your average deal size helps in predicting revenue. For instance, if your revenue target is $200k per
quarter and your average deal size is $20k, it means you have to land 10 deals to hit your quarterly
target.
The average deal size report provides the basis for your reps' quotas and lets them know how many
deals they're expected to land. It also allows you to set expectations and milestones for your sales
cycle. Ultimately, it might seem like a no-brainer, but it's still worth a reminder — always monitor
your average deal size because it’s vital to your sales operations.
Setting expectations for each rep, creating weekly and monthly milestones, tracking the performance
of each rep, and gauging the overall success of your company’s sales strategy.
When considering the metric, establish an ideal timeframe to use as a benchmark. One of those
benchmarks is how long it takes a rep to work through your sales cycle. If you find some reps with
much longer sales cycles compared to their peers, you can evaluate their efforts and identify areas for
coaching.
If all your reps can’t keep pace with your target average sales cycle length, then it's probably time to
take an objective look at your operations. You might find flaws in your approach, training, or
management style, and these insights can help you fix the issues. To enable your reps to see how
they're performing with real-time visualization dashboards, tools like Datapine can help.
Knowing if your reps are closing deals at a similar rate as their peers. You can also create contests to
foster healthy competition and unify your team to work towards a common goal.
That said, some marketing collateral may be irrelevant to your rep’s prospects. With this report, you’ll
know which marketing content works. Communicating this information to your marketing team gives
them the insights they need to create more useful content.
Sales enablement platform SoloFire tracks how many people have used a piece of collateral, how
many times they’ve interacted with it, and for how long.
Determining which marketing collateral gets the most traction with prospects and collateral that could
use a refresh.
Won and Lost Deals Analysis Report
To understand the state of your business, you shouldn’t track only deals in progress. You should track
deals you win and lose.
Perhaps prospects go crazy for specific features that you offer. Or, you notice that there’s a preference
for a competitor’s product. Both trends provide an overall picture of your product’s overall strengths
and weaknesses.
This is also a good way to spot under- and over-performers. For example, two reps who have the same
average quota attainment could both appear to be stellar but differ wildly in actual performance.
If your data reveals that one rep spends a lot of time helping others get deals across the finish line
while still maintaining high attainment, you have a great manager candidate on your hands.
On the flip side, records could reveal that a second rep has the same attainment as the first, but relies
on other teammates to run demos or closing calls.
There’s always a story behind the numbers. Analyzing won and lost deals by rep will reveal it.
Evaluating performance against variables like company size, product type, sales reps, and sales teams.
There might be an issue with your pricing, service, product quality, product features, or delivery. You
may also identify misalignment during the sales process, or some other aspect of the customer
experience.
If your report shows higher than normal churn, speak to your customers to understand their challenges
and fix them. This can improve your customer retention rate and overall customer experience.
Closely monitoring trends in churned customers so you empower your team to fix bad patterns
throughout the sales process.
Ideally, you want your reps to close a healthy number of deals compared to the number of prospects
they meet with. If they meet with ten per day, but close none, this report will allow you to understand
why and propose better closing techniques. If the opposite is true, you can find what’s working and
share those tactics with the team.
The sales call report can also help you segment data. For example, if a certain industry is responding
well to your products and services, you could advise your team to narrow down their call list. You
can then prioritize the highest converting segment.
Identifying the most effective tactics for closing deals, setting daily call benchmarks for new hires,
and iterating on your sales closing techniques.
Five minutes is short, and if you’re far from meeting this time, the best thing to do is track your
progress. You won’t move from a 48-hour lead response time to five minutes overnight. But by
making strategic decisions and prioritizing your team’s workload, you can attain this goal.
Measuring the average time it takes sales reps to follow up with a lead. Plus, you can compare this
metric to industry benchmarks.
Revenue Report
As a nice complement to the average deal size report, a revenue report can help you and your reps
see how their work impacts the bottom line.
Seeing a breakdown of new business and renewals, as well as the reps who contributed to each. To
get the most out of this report, you’ll want to first set your sales and revenue goals.
Many sales teams focus on identifying potential clients and closing deals, leaving little time for
detailed reporting. The good news is that your team can use several powerful templates to expedite
your sales reporting.
What you can calculate with the Sales Metrics Calculator:
Average Deal Size
Win Rate
Demo: Close Ratio
Quota Setting Calculator
Commission Calculator
Customer Acquisition Cost (CAC)
Customer Lifetime Value (CLV)
CAC-to-CLV
Revenue by Product
Customer Retention Rate
Revenue Churn
Employee Turnover Rate
So the store earns a higher absolute profit from consumer product sales, but corporate product sales
perform better on the sales-to-profit ratio. Let’s have a look at the data to validate our findings:
CHAPTER 9: CONCLUSION
In conclusion, the utilization of Python and data analytics for store sales and profit analysis emerges
as a transformative strategy for businesses navigating today's dynamic market landscape. This
analytical approach becomes a cornerstone for informed decision-making, enabling organizations to
optimize operations, refine pricing strategies, enhance marketing efforts, and improve overall
efficiency in inventory management.
Python, with its rich ecosystem of libraries and automation capabilities, plays a pivotal role in this
process. From data manipulation using Pandas to visualization with Matplotlib and Seaborn, and
numerical operations facilitated by NumPy, Python provides a robust toolkit for businesses to glean
actionable insights. The emphasis on key metrics such as average deal size, win rate, and customer
acquisition cost offers a nuanced understanding of market dynamics, allowing businesses to align
their strategies for maximum impact.
The interconnected nature of internal and external factors, as evidenced by metrics like employee
turnover and customer retention rates, underscores the holistic approach required for sustained
success. By embracing Python-driven analysis, businesses position themselves not just to navigate
uncertainties but to thrive amidst them. This integration of technology with decision-making becomes
a transformative journey, where data-driven insights serve as a guide for adaptive strategies and well-
informed choices.
As businesses continue to face the challenges of an evolving landscape, the synergy between data
analytics and strategic decision-making becomes increasingly indispensable. The future belongs to
those who can harness the power of Python and data analytics to adapt, evolve, and prosper based on
the valuable insights derived from their store sales and profit analysis. In this era of data-driven
excellence, Python serves not only as a technological enabler but as a strategic imperative for
businesses aiming for sustainable growth and resilience.