Neel

The document outlines a series of tasks related to data manipulation and analysis using NumPy and Pandas in Python. It includes generating movie IDs, creating user rating matrices, handling employee data with email and password generation, and performing operations on a DataFrame. Additionally, it covers saving data to Excel and accessing specific records from the dataset.

Uploaded by

NITHEESH KUMAR REDDY

Bachelor of Technology in

Computer Science and Engineering

Scripting Language Laboratory
18CS58L
NAME:G.VENKATA NEELESH
USN NO: 20BTRCS078
SECTION: B
YEAR / SEM: 5th / 3rd
BRANCH: CSE GENERAL

USE CASE - 1 (NUMPY)

Tasks To Perform

1. Generate 1000 Movies IDs starting from 1301.

import numpy as np
import random
movie_id=np.arange(1301,2301) #generating 1000 movie ids
print(movie_id.shape)
print(movie_id[0:5])
print(movie_id[500:506])

2. Create a movie matrix, to store user rating such that:

a. There are 100 users.
b. Each user can review as many movies as they wish.
c. Each review should be between 0 and 10 (inclusive).

def createMovieMatrix(numUser, numMovies) : # creating a matrix
    movie_matrix=[]
    for user in range(numUser):
        movies_rated_by_a_user= np.full(numMovies,-1) # default -1 means not rated
        num_movies_rated=random.randint(0,numMovies-1) # number of movies rated
        # Not every user rates every movie, so pick a random subset of column
        # indices (0 to numMovies-1) of size num_movies_rated, without repeats.
        movies_that_user_rates=random.sample(range(numMovies),num_movies_rated)
        for i in movies_that_user_rates :
            movies_rated_by_a_user[i]=random.randint(0,10) # rating from 0 to 10 inclusive
        movie_matrix.append(movies_rated_by_a_user)
    movie_matrix=np.array(movie_matrix)
    return movie_matrix
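
The loop above fills the matrix one user at a time. As a rough sketch of the same idea without Python loops, the variant below draws all ratings at once and then masks out the unrated cells. The function name create_movie_matrix_vectorized and the seed parameter are hypothetical, and the number of movies each user rates follows a slightly different distribution than in the loop version, so treat it as an illustration, not a drop-in replacement.

```python
import numpy as np

def create_movie_matrix_vectorized(num_users, num_movies, seed=None):
    """Hypothetical vectorized variant: -1 marks 'not rated', ratings are 0..10."""
    rng = np.random.default_rng(seed)
    # Draw a rating for every (user, movie) cell; the upper bound is exclusive.
    ratings = rng.integers(0, 11, size=(num_users, num_movies))
    # Give each user an individual probability of rating any given movie.
    rate_prob = rng.random((num_users, 1))
    rated = rng.random((num_users, num_movies)) < rate_prob
    # Keep the rating where the user rated the movie, otherwise store -1.
    return np.where(rated, ratings, -1)

demo = create_movie_matrix_vectorized(100, 1000, seed=0)
print(demo.shape)
```

Passing a seed makes the matrix reproducible, which can help when checking the later statistics by hand.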

#displaying the matrix

numMovies=1000
numUser=100
movie_matrix=createMovieMatrix(numUser,numMovies)
print('Movie Matrix Details : ')
print('Shape : ', movie_matrix.shape)
print('10, 10 slice of movie matrix : ')
print(movie_matrix[30:40,500:510])

3. We have 10 movie experts, so let us take their reviews too. Also, 50 new movies have to be added to the
matrix along with their reviews.

movie_matrix=createMovieMatrix(100,1000) # base matrix: 100 users, 1000 movies

expert_matrix=createMovieMatrix(10,1000) # adding 10 new users (the experts)
movie_matrix=np.vstack([movie_matrix,expert_matrix])
print(movie_matrix.shape)

newMovie_matrix=createMovieMatrix(110,50) # adding 50 new movies

movie_matrix=np.hstack([movie_matrix,newMovie_matrix])
print(movie_matrix.shape)
print(movie_matrix)

print(movie_matrix.shape)
col=5 #cols are movie id in matrix
m=movie_matrix[:,col]
print(m)
m=m[m>=0]
print(m)
print(len(m))
print(m.shape[0]) # same as previous line #num of rating
print(round(m.mean(),2)) #mean
print(round(m.std(),2)) # standard deviation

4. Create a final movie rating matrix with 4 columns, i.e. ‘Movie ID’, ‘Avg-Rating’, ‘Number Of
Ratings’, ‘Standard Deviation Of Ratings’.

movie_id=np.arange(1301,2351)
movie_stats=[]
for col in range(1050):
    m=movie_matrix[:,col]
    m=m[m>=0] # keep only real ratings, drop the -1 placeholders
    movie_stats.append([movie_id[col], round(m.mean(),2), m.size ,round(m.std(),2)])
movie_stats=np.array(movie_stats)
print(movie_stats.shape)
print(movie_stats[:5,:])
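
The per-column loop can also be written without a loop by turning the -1 placeholders into NaN and using NumPy's NaN-aware reductions. The sketch below uses a small made-up matrix (toy, toy_ids) so the numbers are easy to verify by hand; it is one possible approach, not the method used in this record.

```python
import numpy as np

# Small made-up matrix: rows are users, columns are movies, -1 means "not rated".
toy = np.array([[ 5, -1, 10],
                [ 7,  2, -1],
                [-1,  4,  6]])
toy_ids = np.arange(1301, 1304)

# Replace the placeholders with NaN so they are ignored by the nan* functions.
rated = np.where(toy >= 0, toy.astype(float), np.nan)
avg = np.nanmean(rated, axis=0)     # average rating per movie
count = (toy >= 0).sum(axis=0)      # number of ratings per movie
std = np.nanstd(rated, axis=0)      # standard deviation per movie

# Columns: Movie ID, Avg-Rating, Number Of Ratings, Standard Deviation Of Ratings
toy_stats = np.column_stack([toy_ids, avg.round(2), count, std.round(2)])
print(toy_stats)
```

A column with no ratings at all would produce NaN and a runtime warning here, so a real matrix may need a check for that case.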

5. Also convert the final movie ratings to the range 0 to 10, such that the minimum rating
becomes 0, the maximum becomes 10, and the other values fall in between.

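x=movie_stats[:,1] # average ratings column (assumed definition, missing in the original record)
startOfNewRange=0
originalRatingRange=10-startOfNewRange # width of the target 0 to 10 scale (assumed definition)
avgRatingRange=x.max()-x.min() # width of the observed average ratings (assumed definition)
for i in range(1050):
    movie_rating=movie_stats[i,1]
    print('Movie',i,'old rating',movie_rating)
    print('Distance from minimum',movie_rating-x.min())
    print('ratio of range',originalRatingRange/avgRatingRange)
    newRating=(movie_rating-x.min())*(originalRatingRange/avgRatingRange)+startOfNewRange
    print('Movie',i,'new rating :',newRating)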
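
The conversion asked for in this task is a min-max rescaling: new = (old - min) * (10 - 0) / (max - min) + 0. A minimal sketch on a few made-up averages (avg_ratings is illustrative data, not output from the matrix above) shows the whole column being rescaled in one expression:

```python
import numpy as np

# Made-up average ratings, only for illustration.
avg_ratings = np.array([4.2, 5.0, 5.8, 4.6])

lo, hi = avg_ratings.min(), avg_ratings.max()
new_lo, new_hi = 0, 10

# Shift so the minimum becomes 0, stretch to the new width, then shift to new_lo.
scaled = (avg_ratings - lo) * (new_hi - new_lo) / (hi - lo) + new_lo
print(scaled.round(2))
```

The smallest average maps to 0 and the largest to 10. If all averages were equal, hi - lo would be zero, so that case needs separate handling.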

6. Display the films by rating, highest to lowest.

movie_ratings=np.array(movie_stats)
order=np.argsort(movie_ratings[:,1])[::-1] # row order by Avg-Rating, highest first
movie_ratings=movie_ratings[order] # reorder whole rows so each movie keeps its own stats
print(movie_ratings)
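
When ranking rows of a statistics table, note that np.sort with axis=0 sorts every column independently, which separates a movie ID from its own rating, whereas np.argsort on the rating column gives a row order that keeps rows intact. The small table below is made up to show the difference:

```python
import numpy as np

# Made-up rows: Movie ID, Avg-Rating, Number Of Ratings, Standard Deviation
stats = np.array([[1301, 4.5, 60, 3.1],
                  [1302, 6.2, 55, 2.9],
                  [1303, 5.1, 70, 3.3]])

# Row indices ordered by the rating column, reversed for highest first.
order = np.argsort(stats[:, 1])[::-1]
ranked = stats[order]

# Column-wise sort: each column is sorted on its own, so rows get mixed up.
column_sorted = np.sort(stats, axis=0)

print(ranked)
print(column_sorted)
```

In ranked, movie 1302 still carries its own count and deviation; in column_sorted, the first row combines values from different movies.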

USE CASE - 2 (PANDAS)

Tasks to Perform.

!wget https://www.dropbox.com/s/onl5ac2ea3v11aw/names.txt

import pandas as pd
import random as r
import numpy as np

f=open('names.txt')
allnames= f.read()
f.close()
print(allnames)

names = allnames.split('\n')
print(type(names))
print(len(names))
print(names[100:110])

removedName=names.pop()

print(len(names))

1. We have hired 1000 new employees. Here are the names in a text file, each name separated by a
new line. Can you generate the following:
● Employee ID: Starting from 2929999 Example 2029001, 2039002 and so on
● Email ID: [email protected]
● Password: It should be an alphanumeric value and must contain capital letters, small letters, and
special symbols.

def emailgen(name):
    namesp=name.split() # split the full name on whitespace
    emailid = '.'.join(namesp) + '@jainuniversity.ac.in'
    return emailid.lower()

emailgen(names[100])

def pwd_gen():
    caps_alpha = r.sample('ABCDEFGHIJKLMNOPQRSTUVWXYZ', r.randint(1,3)) # 1 to 3 capital letters
    small_alpha = r.sample('abcdefghijklmnopqrstuvwxyz', r.randint(2,5)) # 2 to 5 small letters
    num = r.sample('0123456789', r.randint(1,3)) # 1 to 3 digits
    sp_chr = r.sample('!@#$%^&*_.', 1) # exactly 1 special symbol
    pwdlist=caps_alpha+small_alpha+num+sp_chr
    r.shuffle(pwdlist) # shuffles in place; r is the standard-library random module, not NumPy
    pwd=''.join(pwdlist)
    return pwd

pwd_gen()
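
The random module is fine for a lab exercise, but it is not designed for security-sensitive values. As a hedged sketch, the variant below uses the standard-library secrets module and still guarantees at least one capital letter, small letter, digit, and special symbol. The name pwd_gen_secure, the SPECIALS constant, and the default length are assumptions made for this example.

```python
import secrets
import string

SPECIALS = '!@#$%^&*_.'  # same symbol set as in pwd_gen above

def pwd_gen_secure(length=10):
    """Hypothetical variant of pwd_gen built on the secrets module."""
    if length < 4:
        raise ValueError('length must be at least 4 to cover all character classes')
    # One guaranteed character from each required class.
    chars = [
        secrets.choice(string.ascii_uppercase),
        secrets.choice(string.ascii_lowercase),
        secrets.choice(string.digits),
        secrets.choice(SPECIALS),
    ]
    # Fill the remaining positions from the combined pool.
    pool = string.ascii_letters + string.digits + SPECIALS
    chars += [secrets.choice(pool) for _ in range(length - 4)]
    # Shuffle with an OS-backed generator so the guaranteed characters move around.
    secrets.SystemRandom().shuffle(chars)
    return ''.join(chars)

sample_pwd = pwd_gen_secure(12)
print(sample_pwd)
```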

# List Comprehension

emailIDs = [emailgen(name) for name in names]

passwords = [pwd_gen() for _ in names] # one password per employee
empIDs = list(range(2020001, 2020001+len(names))) # consecutive IDs, one per employee

print(names)
print(emailIDs)
print(passwords)
print(empIDs)

2. I need you to perform some operations on the data, so can we load the data in a Pandas DataFrame.

df = pd.DataFrame({'Name': names, 'Employee ID': empIDs, 'Email ID': emailIDs, 'Password' : passwords})

3. Just to cross-check, can you show me the first 2 and the last 3 rows? Also, let's check the shape of
the DataFrame and print the data types of each column.

df.head(2)

df.tail(3)

df.shape
df.columns
type(df.Name)
df.dtypes

4. Print the employee ID, email ID, and password of ‘Nancy Zediker’.

df[df.Name=='Nancy Zediker']
Or
df[df['Name']=='Nancy Zediker']

5. Can we check that the Email ID column does not contain any duplicate emails? Also check its size and
print the first 10 values.

serEmail = df['Email ID']
type(serEmail)
print(serEmail.shape)
print(serEmail.head(10))

emailDub = df['Email ID'].duplicated()

df['Email ID'].duplicated().any()

df[emailDub]

6. If a duplicate email is found, add a number to the duplicated email. We have planned to invite them
to lunch in batches.

df[df['Name']=='Lois Thompson']

df.loc[850,'Email ID']='[email protected]'

df[df['Name']=='Lois Thompson']

df['Email ID'].duplicated().any()
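
Editing one row by hand works when there is a single duplicate. For a general case, a possible sketch is to number the repeats within each group of equal emails and insert that number before the @ sign. The addresses below are made up on example.com, and the variable names are illustrative only.

```python
import pandas as pd

# Made-up addresses with one repeated entry.
emails = pd.Series(['lois.thompson@example.com',
                    'john.brown@example.com',
                    'lois.thompson@example.com'])

# 0 for the first occurrence of an address, 1 for the second, and so on.
dup_rank = emails.groupby(emails).cumcount()

parts = emails.str.split('@')
local = parts.str[0]
domain = parts.str[1]

# Empty suffix for first occurrences, the repeat number for later ones.
suffix = dup_rank.astype(str).where(dup_rank > 0, '')
unique_emails = local + suffix + '@' + domain

print(unique_emails.tolist())
print(unique_emails.duplicated().any())
```

This assumes the numbered address does not already exist in the column; a real data set would need a second duplicate check after renaming.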

7. Let's create the first batch of all people whose names start with A. Also give me their count.

# Create a batch of employees Starting with 'A'

A = df[df.Name.str.startswith("A")]
A.shape

fb = pd.DataFrame(A)
fb

8. I just got to know, that the people at index 10, 130, and 560 will not be joining so please remove
their records.

print(df.iloc[10,0])
print(df.iloc[130,0])
print(df.iloc[560,0])

df.drop([10,130,560], axis=0, inplace=True)

df.head(12)

9. We also need to share the data with the finance department, so can you create a new DataFrame
without the password column and save it in an Excel file for sharing?

# Using DataFrame.copy() to create a new DataFrame.

df2 = df[['Name', 'Employee ID','Email ID']].copy()
print(df2)

file_name = 'Data.xlsx'
# saving the excel
df2.to_excel(file_name)
print('DataFrame is written to Excel File successfully.')

10. Can you tell me how I can access the data from the Excel file using names? For example, show me the
data for ‘John Brown’ and ‘Michael Combes’.

df3=pd.read_excel('Data.xlsx')
print(df3.shape)
print(df3.columns)
df3.head()

wanted=['John Brown','Michael Combes'] # look the rows up by name, not by position
print(df3.loc[df3['Name'].isin(wanted),['Name', 'Employee ID', 'Email ID']])
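
Another way to access records by name is to make the Name column the index, after which .loc accepts names directly. The sketch below builds a tiny made-up DataFrame in memory instead of reading Data.xlsx, so the rows, IDs, and addresses are illustrative only.

```python
import pandas as pd

# Made-up rows standing in for the data read from the Excel file.
staff = pd.DataFrame({
    'Name': ['John Brown', 'Asha Rao', 'Michael Combes'],
    'Employee ID': [2020001, 2020002, 2020003],
    'Email ID': ['john.brown@example.com',
                 'asha.rao@example.com',
                 'michael.combes@example.com'],
})

# Use the names as row labels so .loc can select by name.
by_name = staff.set_index('Name')
picked = by_name.loc[['John Brown', 'Michael Combes']]
print(picked)
```

With a name index, .loc raises a KeyError for a name that is missing, while the isin filter simply returns fewer rows, so the choice depends on which behaviour is preferred.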
