Data Science Lab Manual

The document discusses working with NumPy arrays and Pandas dataframes in Python. It shows how to create arrays and dataframes from lists, dictionaries, other arrays and series. Array slicing and basic operations like shape and size are demonstrated. Dataframes can be constructed from 2D arrays, dictionaries, other dataframes and series.

Uploaded by

HANISHA SAALIH


Ex no: 1 Download, install and explore the features of NumPy, SciPy, Jupyter and Pandas packages

Date:

AIM

PROCEDURE
1. Setting up your machine for data science in Python
2. Download and install Anaconda
3. Installing Anaconda on Windows

This section details the installation of the Anaconda distribution of Python on Windows 10. I think the
Anaconda distribution is the best option for problem solvers who want to use Python. Anaconda is free
(although the download is large and can take time) and can be installed on school or work computers
where you don't have administrator access or the ability to install new programs. Anaconda comes
bundled with about 600 pre-installed packages, including NumPy, Matplotlib and SymPy.

Follow the steps below to install the Anaconda distribution of Python on Windows.

1. Visit Anaconda.com/downloads
2. Select Windows
3. Download the .exe installer
4. Open and run the .exe installer
5. Open the Anaconda Prompt and run some Python code
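
As a possible way to carry out step 5, the sketch below checks which of the packages named in this exercise are installed and prints their versions. The helper name `installed_versions` is hypothetical (it is not part of Anaconda or of any package), and the package list is only an assumption about what you want to check.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages named in this exercise; edit the list as needed.
for name, version in installed_versions(["numpy", "scipy", "pandas", "jupyter"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it from the Anaconda Prompt by starting `python` and pasting the code, or save it to a file and run that file.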

Features of Python packages:

1. Pandas

Pandas is a free Python software library for data analysis and data handling. It was created as a
community library project and initially released around 2008. Pandas provides high-performance,
easy-to-use data structures and operations for manipulating numerical tables and time series. It also
has tools for reading and writing data between in-memory data structures and different file formats.
In short, it is well suited to quick data manipulation, data aggregation, reading and writing data, and
data visualization. Pandas can take in data from files such as CSV and Excel, or from a SQL database,
and create a Python object known as a data frame. A data frame contains rows and columns and can be
manipulated with operations such as join, merge, groupby and concatenate.
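
A minimal sketch of the data frame operations named above (merge, groupby, concatenate). The tables and their values are toy data invented for illustration; only `pd.DataFrame`, `merge`, `groupby` and `pd.concat` are real pandas calls.

```python
import pandas as pd

# Toy data invented for illustration only.
students = pd.DataFrame({"id": [1, 2, 3], "name": ["Asha", "Ben", "Chen"]})
marks = pd.DataFrame({"id": [1, 1, 2, 3], "score": [80, 90, 70, 60]})

# merge: combine the two tables on their shared "id" column.
merged = students.merge(marks, on="id")

# groupby: aggregate the scores per student.
mean_score = merged.groupby("name")["score"].mean()

# concat: stack two frames with the same columns on top of each other.
more_students = pd.DataFrame({"id": [4], "name": ["Dev"]})
all_students = pd.concat([students, more_students], ignore_index=True)

print(merged)
print(mean_score)
print(all_students)
```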

2. NumPy

NumPy is a free Python software library for numerical computing on data that can be in the form of
large arrays and multi-dimensional matrices. These multidimensional arrays are the main objects in
NumPy; their dimensions are called axes, and the number of axes is called the rank (available as the
`ndim` attribute). NumPy also provides tools to work with these arrays and high-level mathematical
functions for linear algebra, Fourier transforms, random number generation, etc. Basic array operations
that can be performed using NumPy include adding, slicing, multiplying, flattening, reshaping and
indexing. More advanced functions include stacking arrays, splitting them into sections, and
broadcasting.

3. SciPy

SciPy is a free software library for scientific and technical computing. It was created as a community
library project and initially released around 2001. SciPy is built on the NumPy array object and is part
of the NumPy stack, which also includes other scientific computing libraries and tools such as
Matplotlib, SymPy and pandas. Users of this stack often also use comparable applications such as GNU
Octave, MATLAB and Scilab. SciPy supports scientific computing tasks such as optimization, integration,
interpolation and data modification, using linear algebra, Fourier transforms, random number
generation, special functions, etc. Just as in NumPy, the main objects in SciPy are multidimensional
arrays, which are provided by the NumPy module itself.
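
To make two of the tasks mentioned above concrete, here is a small sketch of optimization and integration with SciPy. The functions being minimised and integrated are chosen only because their exact answers are known; they are not from the manual.

```python
import numpy as np
from scipy import integrate, optimize

# Optimization: minimise f(x) = (x - 2)^2, whose minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Integration: integrate sin(x) from 0 to pi; the exact value is 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

print("argmin:", result.x)
print("integral:", area)
```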

Python Libraries for Data Visualization

1. Matplotlib

Matplotlib is a data visualization and 2-D plotting library for Python. It was initially released in
2003 and is one of the most widely used plotting libraries in the Python community. It comes with
an interactive environment across multiple platforms. Matplotlib can be used in Python scripts, the
Python and IPython shells, the Jupyter notebook, web application servers, etc. It can be used to embed
plots into applications using GUI toolkits such as Tkinter, GTK+, wxPython and Qt. You can use
Matplotlib to create line plots, bar charts, pie charts, histograms, scatterplots, error charts, power
spectra, stemplots and many other kinds of chart. The Pyplot module also provides a MATLAB-like
interface while being free and open source.
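
A minimal sketch of the Pyplot interface, drawing a histogram and a line plot side by side. The data is synthetic, and the `Agg` backend is an assumption made so that the code runs without a display; in a Jupyter notebook you would normally drop that line and call `plt.show()` instead of saving.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)     # seeded so the figure is reproducible
values = rng.normal(size=500)      # synthetic data for illustration only

fig, (ax_hist, ax_line) = plt.subplots(1, 2, figsize=(8, 3))
ax_hist.hist(values, bins=20)
ax_hist.set_title("Histogram")

x = np.linspace(0, 2 * np.pi, 100)
ax_line.plot(x, np.sin(x))
ax_line.set_title("Line plot")

# Render the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```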

2. Seaborn

Seaborn is a Python data visualization library that is based on Matplotlib and closely integrated with
NumPy and pandas data structures. Seaborn has dataset-oriented plotting functions that operate on data
frames and arrays containing whole datasets, and it internally performs the statistical aggregation and
mapping needed to produce informative plots. It is a high-level interface for creating attractive
statistical graphics that help in exploring and understanding data. Seaborn graphics include bar
charts, histograms, scatterplots, error charts, etc. Seaborn also has tools for choosing color palettes
that can reveal patterns in the data.

3. Plotly

Plotly is a free open-source graphing library that can be used to form data visualizations. Plotly
(plotly.py) is built on top of the Plotly JavaScript library (plotly.js) and can be used to create
web-based data visualizations that can be displayed in Jupyter notebooks or web applications using
Dash, or saved as individual HTML files. Plotly provides more than 40 chart types, such as scatter
plots, histograms, line charts, bar charts, pie charts, error bars, box plots, multiple axes,
sparklines, dendrograms and 3-D charts. Plotly also provides contour plots. In addition, Plotly can be
used offline with no internet connection.

RESULTS
Ex no: 2 Working with Numpy arrays
Date:

AIM

ALGORITHM
PROGRAM
import numpy as np
# Creating array object
arr = np.array([[1, 2, 3], [4, 2, 5]])
# Printing type of arr object
print("Array is of type: ", type(arr))
# Printing array dimensions (axes)
print("No. of dimensions: ", arr.ndim)
# Printing shape of array
print("Shape of array: ", arr.shape)
# Printing size (total number of elements) of array
print("Size of array: ", arr.size)
# Printing type of elements in array
print("Array stores elements of type: ", arr.dtype)
OUTPUT
Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int32
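
The element type printed above can differ between machines: depending on the platform and NumPy version, the default integer type may be `int32` or `int64`. The sketch below is one way to make the result independent of that default by passing `dtype` explicitly, and it also shows the flattening and reshaping operations mentioned in Exercise 1. The choice of `int64` is only an assumption for illustration.

```python
import numpy as np

# Request the element type explicitly instead of relying on the platform default.
arr = np.array([[1, 2, 3], [4, 2, 5]], dtype=np.int64)
print(arr.dtype)          # int64

flat = arr.flatten()      # copy of the data as a 1-D array
reshaped = arr.reshape(3, 2)   # same 6 elements, viewed as 3 rows x 2 columns

print(flat)
print(reshaped.shape)
```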

RESULTS
Ex no: 2a Working with Numpy arrays
Date:

AIM

ALGORITHM
PROGRAM
Program to Perform Array Slicing
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print(a)
print("After slicing")
print(a[1:])
OUTPUT:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
After slicing
[[3 4 5]
 [4 5 6]]

RESULTS
Ex no: 2b Working with Numpy arrays
Date:

AIM

ALGORITHM
PROGRAM

Program to Perform Array Slicing

import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print('Our array is:')
print(a)
# This returns an array of the items in the second column
print('The items in the second column are:')
print(a[...,1])
print('\n')
# Now we will slice all items from the second row
print('The items in the second row are:')
print(a[1,...])
print('\n')
# Now we will slice all items from column 1 onwards
print('The items column 1 onwards are:')
print(a[...,1:])
OUTPUT:
Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]
The items in the second column are:
[2 4 5]
The items in the second row are:
[3 4 5]
The items column 1 onwards are:
[[2 3]
 [4 5]
 [5 6]]
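
For a two-dimensional array, the ellipsis (`...`) used in this program selects the same elements as the more common colon (`:`) notation. The sketch below checks that equivalence on the same array; it is an illustration, not part of the original program.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])

# For a 2-D array, "..." stands for the one remaining axis, so it matches ":".
assert np.array_equal(a[..., 1], a[:, 1])     # second column
assert np.array_equal(a[1, ...], a[1, :])     # second row
assert np.array_equal(a[..., 1:], a[:, 1:])   # columns 1 onwards

print(a[:, 1])   # [2 4 5]
```

The ellipsis becomes more useful with higher-dimensional arrays, where it stands for as many full axes as are needed.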

RESULT:
Ex no: 3 Create a Dataframe using a list of elements
Date:

AIM

ALGORITHM
PROGRAM

import numpy as np
import pandas as pd

data = np.array([['','Col1','Col2'],['Row1',1,2],['Row2',3,4]])
print(pd.DataFrame(data=data[1:,1:], index=data[1:,0], columns=data[0,1:]))
# Take a 2D array as input to your DataFrame
my_2darray = np.array([[1, 2, 3], [4, 5, 6]])
print(pd.DataFrame(my_2darray))
# Take a dictionary as input to your DataFrame
my_dict = {1: ['1', '3'], 2: ['1', '2'], 3: ['2', '4']}
print(pd.DataFrame(my_dict))
# Take a DataFrame as input to your DataFrame
my_df = pd.DataFrame(data=[4,5,6,7], index=range(0,4), columns=['A'])
print(pd.DataFrame(my_df))
# Take a Series as input to your DataFrame
my_series = pd.Series({"United Kingdom":"London", "India":"New Delhi",
                       "United States":"Washington", "Belgium":"Brussels"})
print(pd.DataFrame(my_series))
df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]))
# Use the `shape` property
print(df.shape)
# Or use the `len()` function with the `index` property
print(len(df.index))
OUTPUT:

     Col1 Col2
Row1    1    2
Row2    3    4

   0  1  2
0  1  2  3
1  4  5  6

   1  2  3
0  1  1  2
1  3  2  4
   A
0  4
1  5
2  6
3  7
                         0
United Kingdom      London
India            New Delhi
United States   Washington
Belgium           Brussels
(2, 3)
2
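
The title of this exercise refers to a list of elements, while the program builds its data frames from arrays, dictionaries, data frames and Series. As a small complementary sketch, this is how a data frame could be created directly from Python lists. The values and column names are invented for illustration.

```python
import pandas as pd

# A flat list becomes a single column.
fruits = ["apple", "banana", "cherry"]            # illustrative values
df_single = pd.DataFrame(fruits, columns=["fruit"])

# A list of lists becomes one row per inner list.
rows = [["apple", 3], ["banana", 5], ["cherry", 7]]
df_rows = pd.DataFrame(rows, columns=["fruit", "count"])

print(df_single)
print(df_rows)
```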

RESULT:
Ex no: 4 Descriptive analytics on the Iris data set
Date:

AIM

ALGORITHM
PROGRAM
# Download iris.csv from https://datahub.io/machine-learning/iris
import pandas as pd
# Reading CSV file
df = pd.read_csv('iris_csv.csv')
# In a Jupyter notebook, run each expression below in its own cell to display its result
# Printing top 5 rows
df.head()
df.shape
df.info()
df.describe()
df.isnull().sum()
df.value_counts('class')
OUTPUT:
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype
 0   sepallength  150 non-null    float64
 1   sepalwidth   150 non-null    float64
 2   petallength  150 non-null    float64
 3   petalwidth   150 non-null    float64
 4   class        150 non-null    object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
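
Descriptive analytics on this data set often continues with per-class summaries. The sketch below is one way to do that; as an assumption, it loads the copy of the Iris data bundled with scikit-learn instead of the CSV file, so its column names (for example `sepal length (cm)`) differ from those in the CSV.

```python
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame.copy()            # 150 rows: 4 measurements plus a numeric "target"

# Replace the numeric target codes with the species names.
df["class"] = df["target"].map(dict(enumerate(iris.target_names)))
df = df.drop(columns="target")

print(df["class"].value_counts())     # number of rows per class
print(df.groupby("class").mean())     # mean of each measurement per class
```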

RESULT:
Ex no: 5(a) Univariate analysis on UCI diabetes dataset
Date:

AIM

ALGORITHM
PROGRAM
# Univariate analysis of diabetes dataset
import pandas as pd
import numpy as np
import statistics as st
# Load the dataset
df = pd.read_csv("diabetes_csv.csv")
print("MEAN:\n", df.mean(numeric_only=True))
print("MEDIAN:\n", df.median(numeric_only=True))
print("MODE:\n", df.mode(numeric_only=True))
print("STANDARD DEVIATION:\n", df.std(numeric_only=True))
print("VARIANCE:\n", df.var(numeric_only=True))
print("SKEWNESS:\n", df.skew(numeric_only=True))
print("KURTOSIS:\n", df.kurtosis(numeric_only=True))
OUTPUT:

MEAN:
preg      3.845052
plas    120.894531
pres     69.105469
skin     20.536458
insu     79.799479
mass     31.992578
pedi      0.471876
age      33.240885
dtype: float64

MEDIAN:
preg      3.0000
plas    117.0000
pres     72.0000
skin     23.0000
insu     30.5000
mass     32.0000
pedi      0.3725
age      29.0000
dtype: float64

MODE:
   preg  plas  pres  skin  insu  mass   pedi   age
0   1.0    99  70.0   0.0   0.0  32.0  0.254  22.0
1   NaN   100   NaN   NaN   NaN   NaN  0.258   NaN

STANDARD DEVIATION:
preg      3.369578
plas     31.972618
pres     19.355807
skin     15.952218
insu    115.244002
mass      7.884160
pedi      0.331329
age      11.760232
dtype: float64

VARIANCE:
preg       11.354056
plas     1022.248314
pres      374.647271
skin      254.473245
insu    13281.180078
mass       62.159984
pedi        0.109779
age       138.303046
dtype: float64

SKEWNESS:
preg    0.901674
plas    0.173754
pres   -1.843608
skin    0.109372
insu    2.272251
mass   -0.428982
pedi    1.919911
age     1.129597
dtype: float64

KURTOSIS:
preg    0.159220
plas    0.640780
pres    5.180157
skin   -0.520072
insu    7.214260
mass    3.290443
pedi    5.594954
age     0.643159
dtype: float64
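
To show what the variance and skewness figures measure, the sketch below recomputes them by hand for a small sample and compares the results with pandas. The sample values are invented for illustration; the formulas are the sample variance (divisor n - 1) and the bias-corrected sample skewness, which is what `Series.var()` and `Series.skew()` return.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 2.0, 3.0, 9.0])   # toy sample, invented for illustration
n = len(s)
d = s - s.mean()                            # deviations from the mean

# Sample variance: sum of squared deviations divided by (n - 1).
variance = (d ** 2).sum() / (n - 1)

# Skewness: third central moment scaled by the 1.5 power of the second,
# followed by the small-sample bias correction.
m2 = (d ** 2).mean()
m3 = (d ** 3).mean()
g1 = m3 / m2 ** 1.5
G1 = np.sqrt(n * (n - 1)) / (n - 2) * g1

print(variance, s.var())
print(G1, s.skew())
```

A positive skewness, as for this sample, indicates a longer tail on the right; this matches the reading of the `insu` and `pedi` columns in the output above.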

RESULT:
Ex.No: 5(b) Bivariate analysis: Linear regression modeling
Date:

AIM:

ALGORITHM:
PROGRAM:
# Importing all the libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error
# Loading dataset
diabetes = datasets.load_diabetes()
diabetes.keys()
# To find the content of data
df = pd.DataFrame(diabetes['data'], columns=diabetes['feature_names'])
# Putting our data in a DataFrame
x = df
y = diabetes['target']
from sklearn.model_selection import train_test_split
# To split our data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)
# Importing model
from sklearn import linear_model
model = linear_model.LinearRegression()
model.fit(x_train, y_train)
# Only the training data is used to fit the model
# Prediction of test set result of the prepared model:
# predict() takes the test feature values and returns the label values predicted by the model
y_pre = model.predict(x_test)
# Cross-validation scores
from sklearn.model_selection import cross_val_score
scores = cross_val_score(model, x, y, scoring="neg_mean_squared_error", cv=10)
# Mean of the root mean squared errors of the 10 folds
rmse_scores = np.sqrt(-scores).mean()
print("Cross validation", rmse_scores)
# Checking prediction accuracy by r2 score (at most 1; can be negative for a poor fit)
from sklearn.metrics import r2_score
print("r2:", r2_score(y_test, y_pre))
# Calculating root mean square error
mse = mean_squared_error(y_test, y_pre)
rmse = np.sqrt(mse)
print("RMSE:", rmse)
# Getting weights and intercept of model
print("Weights:", model.coef_)
print("\nIntercept", model.intercept_)
OUTPUT:
Cross validation 54.40461553640237
r2: 0.4576767417719556
RMSE: 58.009275047552
Weights: [ -8.02566358 -308.83945001 583.63074324 299.9976184 -360.68940198
95.14235214 -93.03306818 118.15005596 662.12887711 26.07401648]

Intercept 153.72029738615726
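
The program above fits the target against all ten features at once. A strictly bivariate version would use a single predictor; the sketch below does this with the `bmi` feature. The choice of `bmi` is an assumption made for illustration, not something the manual specifies, and no output is quoted here.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes(as_frame=True)
x = diabetes.data[["bmi"]]     # one predictor, kept 2-D as scikit-learn expects
y = diabetes.target

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=101)

model = LinearRegression().fit(x_train, y_train)
y_pre = model.predict(x_test)

print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
print("r2:", r2_score(y_test, y_pre))
```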

RESULT:
Ex.No: 5(c) Bivariate analysis: Logistic regression modeling
Date:

AIM:

ALGORITHM:
PROGRAM:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn import datasets  # , linear_model
from sklearn.metrics import mean_squared_error
diabetes = datasets.load_diabetes()
diabetes.keys()
# To find the content of data
df = pd.DataFrame(diabetes['data'], columns=diabetes['feature_names'])
# Putting our data in a DataFrame
x = df
y = diabetes['target']
from sklearn.model_selection import train_test_split
# To split our data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)
# Importing model
from sklearn.linear_model import LogisticRegression
# Build model
model = LogisticRegression()
model.fit(x_train, y_train)
# Prediction of test set result of the prepared model
y_pre = model.predict(x_test)
# Checking prediction accuracy by r2 score (at most 1; negative when the fit is poor)
from sklearn.metrics import r2_score
print('r^2:', r2_score(y_test, y_pre))
# Calculating root mean square error
mse = mean_squared_error(y_test, y_pre)
rmse = np.sqrt(mse)
print('RMSE:', rmse)
OUTPUT:

r^2: -0.44401265478624397
RMSE: 94.65723681369009
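
The negative r^2 above is what one might expect here: logistic regression is a classifier, so it treats every distinct target value as a separate class instead of predicting a continuous quantity. As a hedged alternative, the sketch below first turns the target into two classes and then scores the classifier by accuracy. Splitting at the median is an assumption made purely for illustration, as is `max_iter=1000`.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()
x = diabetes["data"]
# Assumption for illustration: class 1 if the target is above the median, else 0.
y = (diabetes["target"] > np.median(diabetes["target"])).astype(int)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=101)

model = LogisticRegression(max_iter=1000)
model.fit(x_train, y_train)
y_pre = model.predict(x_test)

accuracy = accuracy_score(y_test, y_pre)
print("accuracy:", accuracy)
```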

RESULT:
Ex.No: 6 Explore Various Plotting Functions
Date:

AIM:

ALGORITHM:
PROGRAM:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('Heart.csv')
df

# Normal curve: Age variable
f, ax = plt.subplots(figsize=(10, 6))
x = df['Age']
# histplot with kde=True is used because sns.distplot is deprecated in recent seaborn releases
ax = sns.histplot(x, bins=10, kde=True, stat='density', ax=ax)
plt.title('Normal curve')
plt.show()

# Density and contour plots
x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)
z = np.sin(X)**10 + np.cos(10 + Y*X) * np.cos(X)
plt.contour(X, Y, z)
plt.title('Density and Contour Plots')
plt.show()

# Correlation plot: pair plot
sns.pairplot(data=df, vars=['Age', 'RestBP', 'Chol'])
plt.title('Correlation Plot Pair plot')
plt.show()

# Histogram
# df.hist(figsize=(12, 12), layout=(5, 3))
data = np.random.randn(1000)
plt.hist(data, bins=10)
plt.title('Histogram')
plt.show()

# Scatterplot to visualize the relationship between the Age and RestBP variables
f, ax = plt.subplots(figsize=(8, 6))
ax = sns.scatterplot(x='Age', y='RestBP', data=df)
plt.title('Scatter plot')
plt.show()
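
The contour plot in this program draws only the level lines. A filled version with a colour bar often reads more like a density plot; the sketch below is one way to draw it for the same function. The number of levels, the colour map and the `Agg` backend (used so the code runs without a display) are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 50)
y = np.linspace(0, 5, 40)
X, Y = np.meshgrid(x, y)                      # 2-D coordinate grids, shape (40, 50)
Z = np.sin(X) ** 10 + np.cos(10 + Y * X) * np.cos(X)

fig, ax = plt.subplots()
filled = ax.contourf(X, Y, Z, levels=20, cmap="RdGy")   # filled contours
fig.colorbar(filled, ax=ax, label="z")                  # legend for the colours
ax.set_title("Filled contour (density) plot")
```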
OUTPUT
RESULT:
Ex.No: 7 Three Dimensional Plotting
Date:

AIM:

ALGORITHM:
PROGRAM:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('Heart.csv')
fig = plt.figure()
# Syntax for 3-D projection
ax = plt.axes(projection='3d')
# Defining all 3 axes
x = df['Age']
x = pd.Series(x, name='Age Variable')
y = df['Sex']
y = pd.Series(y, name='sex variable')
z = df['Chol']
z = pd.Series(z, name='cholesterol variable')
# Plotting
ax.plot3D(x, y, z, 'green')
ax.set_title('3D line plot Heart disease dataset')
plt.show()
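
A line plot joins the points in row order, which can be hard to read when the rows are not sorted. As a possible alternative, the sketch below draws a 3-D scatter plot. It uses synthetic stand-ins for the `Age`, `Sex` and `Chol` columns so that it runs without `Heart.csv`; those values, the colour map and the `Agg` backend are all assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the Age, Sex and Chol columns; not real patient data.
age = rng.integers(30, 80, size=100)
sex = rng.integers(0, 2, size=100)
chol = rng.normal(240, 50, size=100)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(age, sex, chol, c=chol, cmap="viridis")   # colour points by the z value
ax.set_xlabel("Age")
ax.set_ylabel("Sex")
ax.set_zlabel("Chol")
ax.set_title("3D scatter plot (synthetic data)")
```

With the real data, replace the three synthetic arrays with `df['Age']`, `df['Sex']` and `df['Chol']`.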
OUTPUT:

RESULT:
Ex.No: 8 Visualizing Geographic Data with Basemap
Date:

AIM:

ALGORITHM:
PROGRAM:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution=None,
            width=8E6, height=8E6,
            lat_0=20, lon_0=78)
m.etopo(scale=0.7, alpha=0.7)
# Map (long, lat) to (x, y) for plotting
x, y = m(80, 13)
plt.plot(x, y, 'ok', markersize=5)
plt.text(x, y, ' Chennai', fontsize=20)
plt.show()
OUTPUT:

RESULT:
