Aparna INTERN REPORT 12
Under Supervision of
Prof. R. A. Ghadage
SHRI CHHATRAPATI
SHIVAJI MAHARAJ
COLLEGE OF ENGINEERING
(Duration: 1st Jan 2024 to 15th Feb 2024)
DEPARTMENT OF COMPUTER ENGINEERING
SHRI CHHATRAPATI SHIVAJI MAHARAJ COLLEGE
OF ENGINEERING, NEPTI
CERTIFICATE
This is to certify that the “Internship Report” submitted by Aparna Gangishetty is work
done by her and submitted during 2023 – 2024 academic year (Duration: 1st Jan, 2024 to
15th Feb 2024), in partial fulfillment of the requirements for the award of the degree of
BACHELOR OF ENGINEERING in COMPUTER ENGINEERING, at SHRI
CHHATRAPATI SHIVAJI MAHARAJ COLLEGE OF ENGINEERING, NEPTI,
AHMEDNAGAR.
INTERNSHIP CERTIFICATE
ACKNOWLEDGEMENT
First, I would like to thank Innover Infotech for giving me the opportunity
to do an internship within the organization.
I would also like to thank all the people who worked alongside me at Innover
Infotech; with their patience and openness they created an enjoyable working
environment.
I would like to thank my Head of the Department, Prof. V. V. Jagtap, for her
constructive criticism throughout my internship.
Aparna Gangishetty
ABSTRACT
This internship in data analytics with Python provides hands-on experience in exploring and
interpreting data using Python programming language. Participants will delve into the world of
data to extract valuable insights and make informed decisions. Throughout the internship, you
will learn to manipulate and analyze datasets, uncover patterns, and create visualizations to
communicate findings effectively.
The program begins with a solid foundation in Python, ensuring participants are comfortable
with programming basics. As the internship progresses, emphasis is placed on applying these
skills to real-world data scenarios. You will gain proficiency in popular data analytics libraries
such as Pandas and NumPy, enabling you to clean, transform, and manipulate data efficiently.
By the end of the internship, participants will have developed a portfolio showcasing their ability
to analyze data and derive actionable insights. This hands-on experience with Python in the
context of data analytics equips interns with valuable skills sought after in today's data-driven
industries. Whether you are a beginner or have some experience in programming, this internship
provides a supportive environment to enhance your data analytics capabilities using Python. Join
us to unlock the power of data and become proficient in leveraging Python for effective data
analysis.
Methodologies:
Data Collection:- Gather relevant data from various sources, ensuring data
quality and integrity.
Data Cleaning:- Identify and rectify errors, missing values, and inconsistencies in
the dataset. Clean data is crucial for accurate analysis.
Learn Relevant Tools:- Master data analytics tools like Python (pandas, numpy), R,
or SQL. Familiarize yourself with data visualization tools such as Tableau or Power
BI.
Machine Learning:- Depending on the role, learn the basics of machine learning.
Scikit-learn for Python is a good starting point. Understand algorithms like
regression, clustering, and classification.
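The regression basics mentioned above can be sketched in a few lines with scikit-learn; the toy dataset and model choice here are my own illustration, not part of the internship material:

```python
# Minimal scikit-learn regression sketch on illustrative data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy dataset following y = 3x + 2, with a little noise added.
rng = np.random.default_rng(0)
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3 * X.ravel() + 2 + rng.normal(0, 0.1, size=10)

# Fit a linear model and inspect the learned slope and intercept.
model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)
```

The fitted coefficients should come out close to the true slope (3) and intercept (2), which is a simple way to check that the workflow is wired up correctly.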
Benefits of Internship
INDEX
1. Introduction..........................................................................................10
2. Analysis...................................................................................................11
3. Software Requirements Specifications......................................................12
4. Technology.............................................................................................13
SQL.....................................................................................................13
PYTHON...............................................................................................14
PANDAS LIBRARY........................................................................15
SEABORN LIBRARY......................................................................15
MATPLOTLIB LIBRARY................................................................16
TABLEAU.........................................................................................16
5. Project Description....................................................................................17
6. Screenshots................................................................................................29
7. Conclusion..................................................................................................36
Learning Objectives/Internship Objectives
Internships are generally thought to be reserved for college students looking to gain
experience in a particular field. However, a wide array of people can benefit from training
internships in order to receive real-world experience and develop their skills.
An objective for this position should emphasize the skills you already possess in the area and
your interest in learning more.
Internships are utilized in a number of different career fields, including architecture,
engineering, healthcare, economics, advertising and many more.
Some internships are used to allow individuals to perform scientific research, while others are
specifically designed to give people first-hand work experience.
Utilizing internships is a great way to build your resume and develop skills that can be
emphasized in your resume for future jobs. When you are applying for a Training Internship,
make sure to highlight any special skills or talents that can make you stand apart from the
rest of the applicants so that you have an improved chance of landing the position.
1. INTRODUCTION
In the dynamic landscape of today's data-driven world, the ability to harness the power of
information is a key driver of success. This report encapsulates my enriching experience during
the Data Analytics with Python internship, where I embarked on a journey to explore, analyze,
and extract meaningful insights from diverse datasets.
Over the course of the internship, I delved into the realm of Python programming,
leveraging powerful libraries such as Pandas, NumPy, and Matplotlib to manipulate, visualize,
and analyze data. From the intricacies of data cleaning to the art of exploratory data analysis
(EDA), each phase of the internship contributed to my growth as a data analyst.
This report aims to provide a comprehensive overview of the learning objectives achieved,
the skills acquired, and the practical applications encountered during the internship. It showcases
the journey from foundational concepts to advanced analytics techniques, illustrating how
Python became the language of choice for unraveling patterns, making predictions, and
communicating findings.
Join me as I navigate through the intricacies of statistical analysis, delve into the world of
machine learning applications, and reflect on the ethical considerations inherent in the field of
data analytics. Through this report, I aim to convey not just the technical aspects of the internship
but also the holistic development of problem-solving skills, teamwork, and effective
communication.
May this documentation serve as a testament to the transformative power of Python in the
realm of data analytics and inspire others to embark on their own data-driven explorations.
2. ANALYSIS
Existing System:
A data science project may involve automating data collection and integrating data from
different sources. Advanced machine learning algorithms can be used to analyze the data and
generate insights that are more accurate and actionable. Interactive dashboards or visualizations
can be developed to present the results in a more user-friendly format, allowing stakeholders to
explore the data and gain deeper insights.
Overall, the goal of a data science project is to improve the existing system by leveraging the
latest technology and techniques in data science to generate more accurate insights, streamline
processes, and ultimately drive better decision-making.
3. SOFTWARE REQUIREMENTS SPECIFICATIONS
• System : Laptop
• RAM : 8 GB
4. TECHNOLOGY
4.1 SQL:
Structured Query Language (SQL) is a powerful domain-specific language used for managing
and manipulating relational databases. It serves as the standard language for interacting with
relational database management systems (RDBMS) and is essential for tasks related to data
definition, data manipulation, and data control.
Key Concepts:
CREATE: Used to create database objects like tables, indexes, and views.
Query Language: SQL is particularly known for its query capabilities, allowing users to retrieve
and filter data based on specified conditions.
Joins:
SQL supports various types of joins (e.g., INNER JOIN, LEFT JOIN, RIGHT JOIN) to
combine rows from different tables based on related columns.
Constraints:
SQL enables the definition of constraints such as PRIMARY KEY, FOREIGN KEY, UNIQUE,
and CHECK to maintain data integrity.
Functions:
SQL provides a rich set of functions for manipulating data during queries, including
mathematical functions, string functions, and date functions.
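The concepts above can be tried out directly from Python using the built-in sqlite3 module; the tables and rows below are illustrative, not from the internship project:

```python
# Demonstrating CREATE, constraints, a filtered query, and an INNER JOIN
# with Python's built-in sqlite3 module (illustrative tables and data).
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# CREATE with PRIMARY KEY and FOREIGN KEY-style constraints.
cur.execute("CREATE TABLE dept(deptid INTEGER PRIMARY KEY, dname TEXT)")
cur.execute("""CREATE TABLE emp(
    empid INTEGER PRIMARY KEY,
    ename TEXT,
    deptid INTEGER REFERENCES dept(deptid))""")

cur.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "HR"), (2, "IT")])
cur.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(10, "asha", 1), (11, "ravi", 2), (12, "meena", 2)])

# Query with a filter condition (WHERE).
cur.execute("SELECT ename FROM emp WHERE deptid = 2")
print(cur.fetchall())

# INNER JOIN combining rows from both tables on the related column.
cur.execute("""SELECT e.ename, d.dname
               FROM emp e INNER JOIN dept d ON e.deptid = d.deptid""")
print(cur.fetchall())
```

Running the script prints the IT-department employees and then each employee paired with their department name.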
4.2 PYTHON:
Python is a general-purpose language that is designed to be easy to read and write, which
makes it a popular choice for beginners as well as experienced programmers. It has a simple
syntax and is easy to learn, which allows developers to quickly prototype and test ideas. Python
also has a vast library of pre-built modules and packages, which makes it easy to implement
complex algorithms and data structures.
One of the key strengths of Python is its flexibility and versatility. It can be used for a wide
range of applications, from simple scripts to complex applications. It is also
platform-independent, which means that Python code can be run on different operating systems,
such as Windows, Linux, and macOS.
4.3 PANDAS LIBRARY:
Pandas is a powerful and popular open-source library for data manipulation and analysis in
Python. It provides easy-to-use data structures, tools for data analysis, and data cleaning
functions. The library is built on top of the NumPy library and provides more high-level data
manipulation functionalities that allow for more efficient and fast analysis.
Pandas provides two main data structures, the Series and DataFrame objects. A Series is a
one-dimensional array-like object that can hold any data type, while a DataFrame is a
two-dimensional table-like data structure that can store data of different types. These two data
structures are the building blocks of pandas and are widely used in data manipulation and
analysis.
Pandas also provides many powerful functions and methods for data manipulation, such as
filtering, merging, grouping, and pivoting data. It also offers tools for data cleaning, including
handling missing data, data imputation, and data normalization.
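A minimal sketch of these operations, using toy data of my own rather than the internship datasets:

```python
# Small pandas sketch: missing-data handling, filtering, and grouping.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["pune", "pune", "nanded", "sangli"],
    "price": [400, 799, np.nan, 589],
})

# Handling missing data: fill the missing price with the column mean (596.0).
df["price"] = df["price"].fillna(df["price"].mean())

# Filtering rows by a condition.
expensive = df[df["price"] > 500]

# Grouping and aggregating: total price per city.
by_city = df.groupby("city")["price"].sum()
print(by_city)
```

The same fillna/filter/groupby pattern scales from this four-row frame to the full datasets used during the internship.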
4.4 SEABORN LIBRARY:
Seaborn is a Python data visualization library built on top of Matplotlib, focused on
statistical graphics.
One of the key strengths of Seaborn is its ability to create complex visualizations with ease.
It provides a range of built-in functions for creating statistical graphics, such as bar plots, scatter
plots, heatmaps, and more. These functions allow for the creation of informative and
aesthetically pleasing visualizations that can help to uncover patterns and relationships in data.
Another strength of Seaborn is its integration with Pandas data structures. Seaborn can
directly accept Pandas data frames as input, making it easy to use with data science workflows.
It also provides many options for customization and formatting of visualizations.
4.5 MATPLOTLIB LIBRARY:
Matplotlib is a Python library for creating visualizations and plots. It is one of the most
widely used visualization libraries in the Python ecosystem and provides a comprehensive set
of tools for creating static, animated, and interactive visualizations.
Matplotlib provides a wide range of plot types, including line plots, scatter plots, bar plots,
histograms, and more. It also provides many customization options, allowing users to adjust plot
properties such as color, font size, axis limits, and labels. This makes it easy to create
high-quality, publication-ready visualizations.
One of the strengths of Matplotlib is its integration with NumPy, which allows for efficient
handling and plotting of large datasets. Matplotlib can also be used in conjunction with other
libraries such as Pandas and Seaborn to create even more sophisticated visualizations.
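A small illustrative example (toy data of my own; the "Agg" backend just renders off-screen so no window is needed):

```python
# Minimal Matplotlib sketch: a labelled line plot saved to a file.
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; no display required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")   # a simple line plot
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("A simple line plot")
ax.legend()
fig.savefig("sine.png")                  # write the figure to disk
```

The same Axes methods (set_xlabel, set_title, legend, and so on) carry over to bar plots, scatter plots, and histograms.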
4.6 TABLEAU:
Tableau is a powerful data visualization and business intelligence tool that allows users to
connect, visualize, and share data in a compelling and interactive way. It supports various data
sources, offers a drag-and-drop interface for creating dashboards and reports, and enables users
to explore insights from their data. Tableau is widely used for its user-friendly design, flexibility,
and ability to handle large datasets, making it a popular choice for data analysis and decision
making.
5. Project Description:
Myntra
Myntra is a one-stop shop for all your fashion and lifestyle needs. As India's largest
e-commerce store for fashion and lifestyle products, Myntra aims to provide a hassle-free and
enjoyable shopping experience to shoppers across the country, with the widest range of brands
and products on its portal. The brand is making a conscious effort to bring the power of fashion
to shoppers with an array of the latest and trendiest products available in the country.
Key Fields:
Project Goals:
• Efficient Order Management: Develop a database schema that allows for the efficient
recording and retrieval of customer orders, facilitating quick and accurate order
processing.
• Price and Inventory Tracking: Enable real-time tracking of prices for different
products and maintain inventory levels to avoid discrepancies.
• Analytics and Reporting: Implement SQL queries to derive insights such as popular
products, customer preferences, and order trends, supporting informed
decision-making for business growth.
• Data Security and Integrity: Implement robust data security measures to protect
customer information and ensure the integrity of the database.
This project aims to demonstrate the power of SQL in creating a comprehensive and
well-organized product delivery system, fostering efficiency and enhancing the overall
experience for both customers and the business.
Project 1 by SQL :-
create table cust(Custid int primary key, Custname varchar(50), Custcity varchar(50),
Phoneno int, Gender text, Pincode int, Custreview text);
insert into cust values
(140,'hema','nanded',897765,'F',431601,'good'),
(141,'riya','pune',997834,'F',443201,'satisfactory'),
(142,'aliya','sangli',998534,'F',415301,'best'),
(143,'akshay','solapur',986643,'M',413001,'poor'),
(144,'riya','kolapur',887654,'F',415101,'worst'),
(145,'shahid','pune',976653,'M',413502,'satisfactory');
Questions :-
select count(Custid)
from cust
where Gender='F';

select Custreview, count(Custreview)
from cust
group by Custreview;

select Gender, count(*)
from cust
group by Gender;
select *,
CASE
    WHEN Custcity = 'mumbai' THEN 'Tier 1'
    WHEN Custcity = 'pune' THEN 'Tier 1'
    WHEN Custcity = 'nagpur' THEN 'Tier 2'
    WHEN Custcity = 'nashik' THEN 'Tier 2'
    WHEN Custcity = 'ahmednagar' THEN 'Tier 3'
    ELSE 'Tier 4'
END as city_Rank
from cust;
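As a quick sanity check, the same CASE logic can be run through Python's built-in sqlite3 module on a few illustrative rows (written with IN here to group the Tier 1 and Tier 2 cities):

```python
# Verifying the city-tier CASE expression with sqlite3 (illustrative rows).
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE cust(Custid INTEGER PRIMARY KEY, Custcity TEXT)")
cur.executemany("INSERT INTO cust VALUES (?, ?)",
                [(141, "pune"), (143, "solapur"), (146, "nagpur")])

cur.execute("""
    SELECT Custid,
           CASE WHEN Custcity IN ('mumbai', 'pune')   THEN 'Tier 1'
                WHEN Custcity IN ('nagpur', 'nashik') THEN 'Tier 2'
                WHEN Custcity = 'ahmednagar'          THEN 'Tier 3'
                ELSE 'Tier 4'
           END AS city_Rank
    FROM cust""")
print(cur.fetchall())
```

Each customer is bucketed by city: pune falls in Tier 1, nagpur in Tier 2, and solapur lands in the ELSE branch as Tier 4.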
product table
insert into product(prid, Custid, ordered, prname, prdetails, orderdate, size, quantity,
price, totamt, status, prreturn, payment, ratings)
values
(1001,102,21103,'kurti','printed three quarter sleeves','23-02-23','XL',2,200,400,'Delivered','no','online',4.3),
(1004,105,21233,'facewash','vitamin c face serum','21-06-21','300ml',1,400,400,'Delivered','no','cash',3.3),
(1011,103,21234,'curtains','polyester single door curtain','11-01-22','2.15m*1.3m',1,799,799,'Delivered','no','cash',4.1),
(1111,103,21024,'saree','saree with zari border silk','01-11-21','onesize',2,527,1054,'Pending','no','online',4.1),
(1213,111,20124,'T-shirt','cotton printed','11-10-22','XXL',3,319,957,'Delivered','no','cash',4.5),
(1141,109,21145,'Airpods','white bluetooth headset','09-03-22','onesize',1,899,899,'Delivered','returned','cash',2.1),
(1434,117,20435,'hair serum','anti frizz hair serum','04-05-23','150ml',2,300,600,'Pending','no','cash',4.8),
(1124,140,23414,'flats','open toe women flats','11-08-23','38',1,589,589,'Delivered','no','online',3.5),
(1432,122,21432,'jeans','men slim fit stretchable jeans','05-03-23','36',1,799,799,'Delivered','returned','cash',2.9),
(1552,114,21984,'t-shirt','women printed casual tshirt','21-07-23','3XL',1,550,550,'Delivered','returned','online',2.2),
(1110,100,10424,'bangles','oxidised beaded bangles','17-06-23','2.8',2,200,400,'Delivered','no','online',4.5),
(1324,118,19654,'trousers','women parallel trouser','13-01-24','32',1,720,720,'Pending','no','online',3.8),
(1230,137,11756,'cushion cover','cotton square cushion cover','18-06-22','X16',2,340,680,'Delivered','no','online',5.5),
(1432,130,23412,'dinner set','27pcs printed dinnerset','20-08-22','onesize',1,1704,1704,'Delivered','returned','online',2.5),
(1332,123,39452,'idol set','silver brass radhakrishna idol','25-10-23','onesize',1,850,850,'Delivered','no','cash',4.9),
(1240,142,18764,'necklace','gold plated layered','17-09-23','onesize',2,350,700,'Delivered','no','cash',4.3),
(1111,149,39874,'saree','zari border silk saree','27-12-23','onesize',1,550,550,'Delivered','no','online',5.5),
(1563,122,15432,'ethnic wear','printed kurti with dupatta','24-05-23','M',1,1759,1759,'Delivered','no','online',2.5),
(1650,116,14374,'men kurta','mirror work cotton kurta','08-11-23','XL',2,1200,2400,'Pending','no','online',4.9),
(1124,131,22134,'flats','open toe women flats','27-11-23','36',1,589,589,'Delivered','returned','cash',2.5),
(1010,129,21765,'bedsheet','floral printed','02-03-23','double XL',3,350,1050,'Delivered','no','cash',4.9);
select min(price)
from product;

select max(price)
from product;
select sum(price)
from product;

select avg(price)
from product;

select prname, quantity
from product
where quantity > 2;
select c.*,p.prname,ratings
from cust c
join product p
on c.Custid=p.Custid;
select c.*,p.quantity,status
from cust c
left join product p
on c.Custid=p.Custid;
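The same joins can be expressed in pandas with merge; the toy frames below stand in for the cust and product tables:

```python
# Inner and left joins in pandas, mirroring the SQL queries above.
import pandas as pd

cust = pd.DataFrame({"Custid": [140, 141, 142],
                     "Custname": ["hema", "riya", "aliya"]})
product = pd.DataFrame({"Custid": [140, 142],
                        "prname": ["flats", "necklace"]})

# INNER JOIN: only customers with a matching product row.
inner = cust.merge(product, on="Custid", how="inner")

# LEFT JOIN: all customers, with NaN where no product matches.
left = cust.merge(product, on="Custid", how="left")
print(len(inner), len(left))
```

The inner join keeps two rows (Custid 140 and 142), while the left join keeps all three customers, leaving prname empty for Custid 141.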
PYTHON PROJECT
In this Python project, we delve into the fascinating world of exploratory data analysis (EDA)
using the Anaconda distribution.
This dataset unveils the statistics of the most-subscribed YouTube channels. A collection of
YouTube giants, it offers a perfect avenue to analyze and gain valuable insights from the
luminaries of the platform, with comprehensive details on top creators, subscriber counts,
video views, upload frequency, country of origin, earnings, and more.
Dataset Overview:
The dataset consists of information on YouTube channels and YouTubers. It was first explored
with Pandas operations such as head(), tail(), describe(), and the index, which are crucial for
understanding the dataset's structure.
Operations Conducted:
Unique operation:
Generate an array that includes only the unique elements of a Series.
Index Operation:
Explore the dataset's index to understand its organization and uniqueness.
Boxplot Operation:
Implement boxplot visualizations to identify the spread, central tendency, and outliers in
numerical variables, providing a robust understanding of the dataset's statistical characteristics.
plt.show()
Display the plot of a specific column.
pivot_table()
A pivot table is used to reshape data, turning column values into rows and summarizing them.
Heatmap
A heatmap is used to display relationships between variables in a tabular dataset.
sns.jointplot
Displays the relationship between two variables along with the distribution of each individual
variable.
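A short pandas sketch of the unique and pivot table operations described above, on toy data standing in for the YouTube dataset:

```python
# unique() and pivot_table() on a small illustrative frame.
import pandas as pd

df = pd.DataFrame({
    "country": ["US", "IN", "US", "IN"],
    "category": ["Music", "Music", "Gaming", "Music"],
    "subscribers": [100, 80, 60, 40],
})

# unique(): array of the distinct values in a column.
print(df["country"].unique())

# pivot_table(): reshape rows into a country-by-category summary.
pivot = df.pivot_table(index="country", columns="category",
                       values="subscribers", aggfunc="sum")
print(pivot)
```

The pivot table sums subscribers per country and category, so for example the two IN/Music rows (80 and 40) collapse into a single cell of 120.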
6. Screenshots
Tableau Project
The "IPL Dashboard" is an immersive data visualization project crafted using Tableau, aimed
at unlocking valuable insights within the realm of IPL team data. This project focuses on
harnessing the power of visual analytics to provide a comprehensive overview of various facets
of team statistics and operations.
Objective:
The primary goal of the project is to transform raw IPL data into an interactive and visually
appealing dashboard that allows stakeholders to gain meaningful insights. From match trends
and team performance to player demographics, the dashboard provides a holistic view of the
teams' landscape.
Introduction:
The Indian Premier League (IPL) is a Twenty20 cricket league founded in 2008 and held
annually. The league features participation from national and international players, with eight
teams representing eight Indian cities that compete in a double round-robin format during the
league stages, followed by playoffs. Over the years, the IPL has emerged as one of the most-
watched and most-attended live sporting events globally.
Business Objective:
As a data analyst at IPL, I create Tableau dashboards for news reports and feeds. Recently,
the Sports Editor asked me to build an interactive dashboard featuring IPL statistics for their
upcoming newsletter. The dashboard will provide customizable filters for interactivity and
display visual representations created in Tableau.
matches.csv: contains match-level information for every IPL match held from 2008 to 2017.
SCREENSHOT OF PROJECT
IPL DATASET
7. CONCLUSION
My data analytics internship equipped me with valuable skills in SQL, Python, and
Tableau. I gained hands-on experience in querying databases, manipulating data using Python,
and creating insightful visualizations in Tableau. This internship not only enhanced my technical
proficiency but also provided me with a practical understanding of how these tools synergize to
derive meaningful insights from data. I gained a great deal of knowledge from this internship
and am now well-prepared to apply these skills in real-world scenarios, making a meaningful
contribution to data-driven decision-making processes.
WEEKLY DIARY
FOR
INDUSTRIAL TRAINING
Week 1 :- From 01 Jan To 06 Jan 2024
Monday, 01 Jan 2024: On this day, we were introduced to the company and all the
activities to be carried out over the entire internship.
Tuesday, 02 Jan 2024: On this day, we received an explanation of the company's projects and
the overall internship work.
Wednesday, 03 Jan 2024: On this day, we learned the concepts of Excel relevant to
data analytics.
Thursday, 04 Jan 2024: On this day, we learned the aggregation functions (Min, Max, Sum,
Count, Average) and advanced formulas in Excel.
Friday, 05 Jan 2024: On this day, we learned the statistics required for data analytics.
Saturday, 06 Jan 2024: On this day, we learned about the concept of outliers and skewness
(left skewness and right skewness).
Summarize At The Week End :- Introduction to the company and its work; gained knowledge
of Excel and data analytics.
Week 2 :- From 08 Jan To 13 Jan 2024
Monday, 08 Jan 2024: On this day, we got a brief introduction to SQL (Structured
Query Language) and how it is helpful in data analytics.
Wednesday, 10 Jan 2024: On this day, we worked on the select statement, the Where clause,
and the Having clause, and learned the difference between the Where and Having clauses.
Thursday, 11 Jan 2024: On this day, we worked on the group by and order by clauses
in SQL.
Friday, 12 Jan 2024: On this day, we got an overview of LIMIT and the aggregation functions
(Min, Max, Sum, Count, Average).
Summarize At The Week End :- We implemented all the SQL concepts (aggregation functions,
group by, limit) used for data analytics.
Week 3 :- From 15 Jan To 20 Jan 2024
Monday, 15 Jan 2024: On this day, we gained knowledge about subqueries in SQL for data
analytics.
Tuesday, 16 Jan 2024: On this day, we worked practically on subqueries (a query inside
another query).
Wednesday, 17 Jan 2024: On this day, we learned about joins and the types of joins in SQL.
Thursday, 18 Jan 2024: On this day, we worked on the concept of joins (inner join, outer join,
left join, right join).
Friday, 19 Jan 2024: On this day, we worked on operators (arithmetic, bitwise, logical,
increment, and decrement operators).
Saturday, 20 Jan 2024: On this day, we worked on case statements (If, If-Else, Else-If).
Summarize At The Week End :- We successfully implemented subqueries, joins (inner, outer,
left, and right), operators, and case statements.
Week 4 :- From 22 Jan To 28 Jan 2024
Monday, 22 Jan 2024: On this day, we got an introduction to Python (variables, classes,
functions, loops, OOP).
Tuesday, 23 Jan 2024: On this day, we got an overview of Anaconda Navigator and Jupyter
Notebook and completed the installation of Anaconda Navigator.
Wednesday, 24 Jan 2024: On this day, we learned about Python collection objects
(strings, lists, tuples, and dictionaries).
Thursday, 25 Jan 2024: On this day, we downloaded the dataset required for data analytics
in Python.
Saturday, 27 Jan 2024: On this day, we worked on datasets using the Pandas and NumPy
libraries.
Summarize At The Week End :- We successfully gained knowledge of Python and its
libraries (NumPy, Pandas, and Matplotlib).
Week 5 :- From 30 Jan To 04 Feb 2024
Monday, 30 Jan 2024: On this day, we got an overall introduction to Tableau and
downloaded Tableau on our systems.
Tuesday, 31 Jan 2024: On this day, we imported different datasets used for data analytics
into Tableau.
Wednesday, 01 Feb 2024: On this day, we got to know the Tableau interface and different
chart types (pie chart, histogram, boxplot).
Thursday, 02 Feb 2024: On this day, we implemented mapping of datasets and performed
visual analytics on them.
Friday, 03 Feb 2024: On this day, we worked on different questions and calculations on the
dataset.
Summarize At The Week End :- We successfully gained knowledge of Tableau and created
our own dashboard to publish our work.
Week 6 :- From 06 Feb To 10 Feb 2024
Monday, 06 Feb 2024: On this day, we started our SQL project and performed all the research
for it.
Tuesday, 07 Feb 2024: On this day, we implemented that project (Food Delivery System) and
completed it successfully.
Wednesday, 08 Feb 2024: On this day, we started our Python project and performed all the
research for it.
Thursday, 09 Feb 2024: On this day, we implemented that project (Vehicle System) and
completed it successfully.
Friday, 10 Feb 2024: On this day, we started our Tableau project, performed all the research
for it, and published the project dashboard to our Tableau account.
https://public.tableau.com/views/IPLDashboard_17093605462120/Dashboard1?:language=en-US&:sid=&:display_count=n&:origin=viz_share_link
Summarize At The Week End :- We successfully completed our Python, Tableau, and SQL
projects.