EDA Code Snippets

The document provides a comprehensive collection of code snippets for Exploratory Data Analysis (EDA) using Pandas, NumPy, Matplotlib, and Seaborn. It includes over 30 operations for each library, covering data manipulation, statistical analysis, and various visualizations. Each snippet is accompanied by a brief description of its function, making it a useful reference for data analysis tasks.

Uploaded by

Nakshi Shah

Comprehensive EDA Code Snippets with Descriptions

Pandas Snippets (30+ Operations)

• Display first 5 rows of DataFrame:

df.head()

• Display last 5 rows of DataFrame:

df.tail()

• Get summary statistics:

df.describe()

• Find the mean of a column:

df['column_name'].mean()

• Find the median of a column:

df['column_name'].median()

• Find the mode of a column:

df['column_name'].mode()[0]

• Calculate the variance of a column:

df['column_name'].var()

• Find the standard deviation of a column:

df['column_name'].std()

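• Find the covariance matrix:

df.cov(numeric_only=True)  # numeric_only=True skips non-numeric columns

• Calculate the correlation matrix:

df.corr(numeric_only=True)  # without it, pandas 2.x raises on non-numeric columns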

• Find unique values in a column:

df['column_name'].unique()

• Find value counts in a column:

df['column_name'].value_counts()

• Rename a column:

df.rename(columns={'old_name': 'new_name'}, inplace=True)

• Filter rows based on condition:

df[df['column_name'] > 10]

• Group by a column and compute mean:

df.groupby('column_name').mean(numeric_only=True)  # average only the numeric columns

• Create a new column based on operation:

df['new_col'] = df['col1'] + df['col2']

• Drop rows with missing values:

df.dropna(inplace=True)

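• Fill missing values with mean:

# Assign the result back; calling fillna(..., inplace=True) on a selected
# column is deprecated in pandas 2.x and may not modify df.
df['column_name'] = df['column_name'].fillna(df['column_name'].mean())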

• Filter rows by multiple conditions:

df[(df['col1'] > 10) & (df['col2'] == 'value')]

• Reset index of DataFrame:

df.reset_index(drop=True, inplace=True)

• Sort DataFrame by column:

df.sort_values(by='column_name', ascending=False)

• Check for missing values:

df.isnull().sum()

• Convert column to datetime:

df['column_name'] = pd.to_datetime(df['column_name'])

• Create pivot table:

df.pivot_table(values='col1', index='col2', columns='col3')

• Find duplicates in a column:

df[df.duplicated(['column_name'])]

• Drop duplicates:

df.drop_duplicates(inplace=True)

• Apply function to a column:

df['new_column'] = df['column_name'].apply(lambda x: x * 2)

• Create dummy variables for categorical columns:

pd.get_dummies(df['category_column'], drop_first=True)
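
The snippets above assume an existing DataFrame named df. As a minimal sketch of how several of them fit together, the example below builds a small toy DataFrame (the column names "city" and "sales" and all values are hypothetical, chosen only for illustration) and then checks for missing values, fills them with the mean, filters, groups, and creates dummy variables.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "city": ["A", "B", "A", "B", "A"],
    "sales": [10.0, 20.0, np.nan, 40.0, 30.0],
})

# Count missing values per column.
missing = df.isnull().sum()

# Fill the missing sales value with the column mean (assignment form, no inplace).
df["sales"] = df["sales"].fillna(df["sales"].mean())

# Filter rows by a condition.
high = df[df["sales"] > 15]

# Group by a column and compute the mean of one numeric column.
mean_by_city = df.groupby("city")["sales"].mean()

# One-hot encode the categorical column, dropping the first category.
dummies = pd.get_dummies(df["city"], drop_first=True)

print(missing)
print(df)
print(mean_by_city)
```

Selecting the "sales" column before calling mean() is one way to avoid averaging non-numeric columns; passing numeric_only=True to the grouped mean() is another.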

NumPy Snippets (30+ Operations)

• Create an array:

np.array([1, 2, 3])

• Create a zeros array:

np.zeros((3, 3))

• Create an identity matrix:

np.eye(3)

• Generate random numbers:

np.random.rand(3, 3)

• Generate random integers:

np.random.randint(0, 100, size=(5, 5))

• Find the mean of an array:

np.mean(arr)

• Find the median of an array:

np.median(arr)

• Find the variance of an array:

np.var(arr)

• Find the standard deviation:

np.std(arr)

• Reshape an array:

np.reshape(arr, (rows, cols))

• Find the dot product of two arrays:

np.dot(arr1, arr2)

• Transpose an array:

arr.T

• Find the inverse of a matrix:

np.linalg.inv(arr)

• Find eigenvalues and eigenvectors:

np.linalg.eig(arr)

• Find the determinant of a matrix:

np.linalg.det(arr)

• Sort an array:

np.sort(arr)

• Concatenate arrays:

np.concatenate((arr1, arr2), axis=0)

• Find the cumulative sum:

np.cumsum(arr)

• Find the cumulative product:

np.cumprod(arr)

• Get array of unique values:

np.unique(arr)

• Find indices of non-zero elements:

np.nonzero(arr)

• Check if any values in array are true:

np.any(arr)

• Check if all values in array are true:

np.all(arr)

• Find max element in array:

np.max(arr)

• Find min element in array:

np.min(arr)

• Randomly permute an array:

np.random.permutation(arr)

• Generate random samples from normal distribution:

np.random.normal(loc=0.0, scale=1.0, size=(3, 3))

• Find the 90th percentile of an array:

np.percentile(arr, 90)
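
The NumPy snippets above assume that arr, arr1, and arr2 already exist. The sketch below uses two small hand-picked arrays (hypothetical values) to show a few of the calls on concrete input, including a check that a matrix times its inverse gives the identity. It also shows that np.var defaults to the population variance (ddof=0), whereas the pandas .var() method uses the sample variance (ddof=1).

```python
import numpy as np

# Hypothetical 2x2 symmetric matrix for the linear-algebra calls.
arr = np.array([[2.0, 1.0], [1.0, 3.0]])

det = np.linalg.det(arr)              # 2*3 - 1*1 = 5
inv = np.linalg.inv(arr)
identity = np.dot(arr, inv)           # numerically close to the identity matrix

# Columns of `eigenvectors` pair with the entries of `eigenvalues`.
eigenvalues, eigenvectors = np.linalg.eig(arr)

# Hypothetical 1-D data for the descriptive calls.
data = np.array([1, 2, 2, 3, 4, 0])

uniq = np.unique(data)                # sorted unique values
running = np.cumsum(data)             # running total
p50 = np.percentile(data, 50)         # 50th percentile (the median)

pop_var = np.var(data)                # population variance, ddof=0
sample_var = np.var(data, ddof=1)     # sample variance, matches pandas .var()

print(det, uniq, running, p50, pop_var, sample_var)
```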

Pandas Snippets (continued)

• Select specific columns:

df[['column1', 'column2']]

• Calculate cumulative sum:

df['column_name'].cumsum()

• Create a rolling average:

df['column_name'].rolling(window=5).mean()

• Join two DataFrames:

pd.merge(df1, df2, on='common_column')

• Concatenate two DataFrames side by side (column-wise):

pd.concat([df1, df2], axis=1)  # use axis=0 to stack rows instead
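
To make the difference between merging and concatenating concrete, here is a small sketch on two toy DataFrames (the names df1, df2, "id", "value", and "label" are hypothetical). pd.merge with its default settings performs an inner join on the key, while pd.concat with axis=1 simply aligns rows by index. The rolling mean leaves NaN in the first window - 1 positions.

```python
import pandas as pd

# Hypothetical toy frames sharing an "id" column.
df1 = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
df2 = pd.DataFrame({"id": [2, 3, 4], "label": ["x", "y", "z"]})

# Default merge is an inner join: only ids present in both frames survive.
merged = pd.merge(df1, df2, on="id")

# Column-wise concatenation aligns on the index, not on "id".
side_by_side = pd.concat([df1, df2], axis=1)

s = pd.Series([1, 2, 3, 4, 5, 6])
running_total = s.cumsum()
rolling_mean = s.rolling(window=3).mean()  # first two entries are NaN

print(merged)
print(side_by_side)
print(rolling_mean)
```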

NumPy Snippets (continued)

• Find the index of the maximum value:

np.argmax(arr)

• Find the index of the minimum value:

np.argmin(arr)

• Create an array of ones:

np.ones((3, 3))

• Flatten an array:

arr.flatten()

• Find the shape of an array:

arr.shape

• Find the rank of a matrix:

np.linalg.matrix_rank(arr)

• Find the trace of a matrix:

np.trace(arr)

• Tile (repeat) a whole array:

np.tile(arr, (2, 2))  # use np.repeat to repeat individual elements instead

• Slice an array:

arr[1:3]
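
Two details in the snippets above are easy to miss: np.argmax returns an index into the flattened array unless an axis is given, and np.tile repeats the whole array whereas np.repeat repeats each element. The sketch below illustrates both on a small hypothetical matrix.

```python
import numpy as np

# Hypothetical 2x2 matrix.
arr = np.array([[1, 5], [7, 3]])

# argmax without an axis indexes the flattened array (values 1, 5, 7, 3).
flat_index = np.argmax(arr)
row, col = np.unravel_index(flat_index, arr.shape)  # convert back to (row, col)

flat = arr.flatten()               # 1-D copy of the data
tiled = np.tile(arr, (2, 2))       # whole array repeated in a 2x2 grid
repeated = np.repeat(flat, 2)      # each element repeated twice

rank = np.linalg.matrix_rank(arr)  # full rank, since the determinant is nonzero
trace = np.trace(arr)              # sum of the diagonal: 1 + 3

print(flat_index, (row, col), tiled.shape, repeated, rank, trace)
```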

Matplotlib Snippets (30+ Visualizations)

• Create a simple line plot:

plt.plot(x, y)
plt.show()

• Set plot title and labels:

plt.title('Title')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

• Create a bar chart:

plt.bar(x, y)

• Create a scatter plot:

plt.scatter(x, y)

• Create a histogram:

plt.hist(data, bins=10)

• Create a box plot:

plt.boxplot(data)

• Set axis limits:

plt.xlim(0, 10)
plt.ylim(0, 100)

• Display grid on the plot:

plt.grid(True)

• Create a subplot:

plt.subplot(2, 1, 1)
plt.plot(x, y)

• Save a plot as image:

plt.savefig('plot.png')

• Change line style and color:

plt.plot(x, y, linestyle='--', color='r')

• Create a pie chart:

plt.pie(sizes, labels=labels)

• Change figure size:

plt.figure(figsize=(8, 6))

• Create a filled plot:

plt.fill_between(x, y1, y2)

• Create a heatmap:

plt.imshow(data, cmap='hot')

• Add legend to the plot:

plt.legend(['Label1', 'Label2'])

• Annotate a point on plot:

plt.annotate('Point', xy=(x, y), xytext=(x + 1, y + 10),
             arrowprops=dict(facecolor='black'))

• Create a violin plot:

plt.violinplot(data)

• Create a stacked bar chart:

plt.bar(x, y1, label='Y1')
plt.bar(x, y2, bottom=y1, label='Y2')
plt.legend()

• Set logarithmic scale:

plt.xscale('log')

• Change marker style:

plt.plot(x, y, marker='o')

• Plot a function:

x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))

• Set axis aspect ratio:

plt.gca().set_aspect('equal', adjustable='box')

• Fill under a line plot:

plt.fill_between(x, y)  # fills between the curve and y=0; plt.fill(x, y) fills a closed polygon

• Create a polar plot:

plt.subplot(projection='polar')
plt.plot(theta, r)

• Create a quiver plot:

plt.quiver(x, y, u, v)

• Create a contour plot:

plt.contour(X, Y, Z)

• Add text to plot:

plt.text(1, 1, 'Text', fontsize=12)

• Draw a horizontal line:

plt.axhline(y=0.5, color='r')

• Draw a vertical line:

plt.axvline(x=0.5, color='g')

• Create a 3D plot:

ax = plt.axes(projection='3d')
ax.plot3D(x, y, z)
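
The Matplotlib snippets above assume that plt, x, y, and data are already defined. The following is a minimal sketch that combines several of them into one self-contained figure with two subplots. The Agg backend and the in-memory buffer are assumptions made so the example runs without a display or a writable directory; in an interactive session you would typically call plt.show() or pass a filename such as 'plot.png' to plt.savefig.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

fig = plt.figure(figsize=(8, 6))

# Top panel: styled line plot with title, labels, grid, reference line, legend.
plt.subplot(2, 1, 1)
plt.plot(x, np.sin(x), linestyle="--", color="r", label="sin(x)")
plt.axhline(y=0.5, color="g")
plt.title("Sine")
plt.xlabel("x")
plt.ylabel("sin(x)")
plt.grid(True)
plt.legend()

# Bottom panel: histogram of the same values; hist returns counts and bin edges.
plt.subplot(2, 1, 2)
counts, bin_edges, _ = plt.hist(np.sin(x), bins=10)
plt.title("Histogram of sin(x)")

plt.tight_layout()

# Save to an in-memory buffer; replace `buffer` with a filename to write a file.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
```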

Seaborn Snippets (30+ Visualizations)

• Create a seaborn scatter plot:

sns.scatterplot(x='col1', y='col2', data=df)

• Create a seaborn line plot:

sns.lineplot(x='col1', y='col2', data=df)

• Create a seaborn bar plot:

sns.barplot(x='col1', y='col2', data=df)

• Create a seaborn box plot:

sns.boxplot(x='col1', y='col2', data=df)

• Create a seaborn histogram:

sns.histplot(df['column'], kde=True)

• Add a legend:

plt.legend(['Label1', 'Label2'])

• Save a figure:

plt.savefig('figure.png')

• Create a pie chart:

plt.pie(sizes, labels=labels, autopct='%1.1f%%')

• Create a 3D plot:

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z)

• Create a heatmap:

plt.imshow(data, cmap='hot', interpolation='nearest')
plt.colorbar()

• Change figure size:

plt.figure(figsize=(10, 5))

• Add annotations:

plt.annotate('Point', xy=(x, y), xytext=(x + 1, y + 1),
             arrowprops=dict(facecolor='black', arrowstyle='->'))

• Create a pair plot using Seaborn:

import seaborn as sns
sns.pairplot(df)

• Customize tick marks:

plt.xticks(rotation=45)

• Create a polar plot:

plt.polar(theta, r)

• Create a histogram with density:

plt.hist(data, density=True, bins=10)

• Create a filled area plot:

plt.fill_between(x, y1, y2)

• Overlay multiple plots:

plt.plot(x, y1, label='Y1')
plt.plot(x, y2, label='Y2')
plt.legend()

• Show the plot:

plt.show()
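
As a short illustration of two of the snippets above, the sketch below draws a density histogram and an overlay of two labelled lines. With density=True the bar heights are scaled so that the total bar area is 1, which the code verifies by multiplying each height by its bin width. The random data, the seed, and the Agg backend are assumptions made so the example is reproducible and runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data from a seeded generator for reproducibility.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=500)

# Density histogram: heights are normalized so the bars integrate to 1.
plt.figure(figsize=(10, 5))
densities, edges, _ = plt.hist(data, density=True, bins=10)
area = np.sum(densities * np.diff(edges))

# Overlay two lines on one set of axes and label them in a legend.
plt.figure(figsize=(10, 5))
x = np.linspace(0, 10, 50)
plt.plot(x, np.sin(x), label="Y1")
plt.plot(x, np.cos(x), label="Y2")
plt.xticks(rotation=45)
legend = plt.legend()
labels = [text.get_text() for text in legend.get_texts()]

print(area, labels)
```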
