Data Handling with Python Pandas Basics
Series                                   DataFrame
One-dimensional                          Two-dimensional
Value mutable: an element's value        Value mutable: an element's value
can be changed                           can be changed
Size immutable: once created, the size   Size mutable: the size can be changed
of a Series cannot be changed            after creation
Series Data Structure
<series object>=pd.Series()
eg: s1=pd.Series()
Creating Series from a List/Tuple
eg:
import pandas as pd
s1=pd.Series([12,10,14,16])
s2=pd.Series([12,10,14,16],index=['a','b','c','d'])
print("Series object with default index")
print(s1)
print("Series object with specified index")
print(s2)
Output:
Series object with default index
0 12
1 10
2 14
3 16
Series object with specified index
a 12
b 10
c 14
d 16
eg:
import pandas as pd
import numpy as np
ar1=np.arange(10,20,3)
ar2=np.array([20,25,30])
s1=pd.Series(ar1)
s2=pd.Series(ar2,index=('Mark1','Mark2','Mark3'))
print('Series object from ndarray with default index')
print(s1)
print('Series object from ndarray with specified index')
print(s2)
Output:
Series object from ndarray with default index
0 10
1 13
2 16
3 19
Series object from ndarray with specified index
Mark1 20
Mark2 25
Mark3 30
Note: The index argument is optional. If it is not given, the keys of the dictionary
become the index values.
eg:
import pandas as pd
dict1={"Name":"Rajeev","Age":17,"Class":"XII"}
s1=pd.Series(dict1)
print('Series object from dictionary with keys as index')
print(s1)
Output:
Creating Series from a scalar value
eg:
import pandas as pd
s1=pd.Series(15,index=['Mark1','Mark2','Mark3'])
print('Series object from scalar value')
print(s1)
Output:
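The printed output of the dictionary and scalar examples is not reproduced in these notes. As a minimal sketch, the two constructions can be run together as below; the variable names s_dict and s_scalar are chosen here only for illustration.

```python
import pandas as pd

# Series from a dictionary: the keys become the index labels
dict1 = {"Name": "Rajeev", "Age": 17, "Class": "XII"}
s_dict = pd.Series(dict1)
print(s_dict)

# Series from a scalar: the value is repeated once for every index label
s_scalar = pd.Series(15, index=["Mark1", "Mark2", "Mark3"])
print(s_scalar)
```

Because the dictionary mixes strings and an integer, the first Series is stored with dtype object.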
MCQ questions
Section A
a) install pandas
b) pandas install python
c) python install pandas
d) pip install pandas
Ans: d
2 Pandas Series is a -----------------------------array
a) one dimensional
b) two dimensional
c) three dimensional
d) None of the above
Ans: a
3 Which of the following is the purpose of Python Pandas?
c) To create a High level array
d) All the above
Ans: c
4 Identify the correct statement
Ans: c
5 Minimum number of arguments required to pass in pandas Series
function for creating a non-empty series-----------------
a) 0
b) 1
c) 2
d) 3
Ans: b
6 Pandas is a/an -------------------python library
a) proprietary
b) open source
c) shareware
d) None of the above
Ans: b
7 Which of the following is not a feature of pandas series?
Ans: d
8 The label associated with a particular data value in Series is
called…………
a) Item
b) Index
c) Column
d) Values
Ans: b
9 Tabular data can be processed using-------------------
a) Numpy
b) Pandas
c) Matplotlib
d) All of these
Ans: b
10 Which of the following datatype can be given as data in a pandas Series
function?
a) a python dictionary
b) an ndarray
c) a scalar value
d) All the above
Ans: d
11 Pandas series is a combination of________
Ans: b
12 Which of the following is correct statement for creating empty series?
(Assume that pandas library is already imported as pd)
a) ser = pd.Series(NaN)
b) ser = pd.Series
c) ser = pd.Series()
d) None of the above
Ans: c
13 Which of the following conditions raises a ValueError while creating a
series?
Ans: c
14 How many values will be there in array1, if given code is not returning any
error?
>>> series4 = pd.Series(array1, index = ["Jan", "Feb", "Mar", "Apr"])
a) 1
b) 2
c) 3
d) 4
Ans: d
15 When we create a series from dictionary then the keys of dictionary
become ________________
Ans: a
Section B
1 For creating the below series, S1, which of the following command(s) can
be used?
Series(S1)
0 10
1 12
2 14
a) S1=pd.Series([10,12,14])
b) S1=pd.Series([10,12,14],index=[0,1,2])
c) S1=pd.Series(index=[0,1,2],data=[10,12,14])
d) All of the above
Ans: d
a)
One Hello
Two Hello
Three Hello
b)
One Hello
c) Error
d) None of the above
Ans: a
a) line1
b) line2
c) line3
d) line4
Ans: line3
a) import pandas as pd
S1 = pd.Series(data = [31,28,31],
index=["January","February","March"])
print(S1)
b) import pandas as pd
S1 = pd.Series([31,28,31], index=["January","February","March"])
print(S1)
c) Both of the above
d) None of the above
Ans: c
5 Read the statements given below and identify the right option
Statement 1: Series is a one-dimensional labeled array capable of
holding any data type
Statement 2: If data is an ndarray, index must be the same length as
data.
Ans: c
6 Read the statements given below and identify the right option
Assertion (A): You need to install the pandas library using the pip install
command.
Reason (R): You can also access pandas without installation.
Ans: c
7 Read the statements given below and identify the right option
Ans: d
8 Ananya wants to store her Term-I mark in a Series which is already stored
in a NumPy array. Choose the statement which will create the series with
Subjects as indexes and Marks as elements.
import pandas as pd
import numpy as np
Marks = np.array([30,32,34,28,30])
subjects = ['English','Maths','Chemistry','Physics','IP']
Series1= _______________________________
a) pd.Series(Marks,index=subjects)
b) pd.Series([Link],index=subjects)
c) pd.Series(index=Marks, subjects)
d) pd.Series(Marks,index)
Ans: a
a) a 31
e 25
i 19
o 13
u 7
dtype: int64
b) a 31
e 25
i 19
o 13
dtype: int64
c) Error
d) None of the above
Ans: a
10 Tushar is a new learner for the python pandas series. He learned some of
the concepts of python in class 11 with NumPy module. He wants to
create a series with the following code. The index should be from 20 to 30
and data value is obtained by multiplying each index value by 7. Help him
to create series by following code:
import pandas as pd
import numpy as np
s=np.arange(20,30)
Ans: b
Section C
1 Ms. Priya is a python developer and she created a series using the
following code, but she missed some of the lines given as blank. Fill the
blanks and help her to complete the code:
Output:
0 3
1 4
2 NaN
3 44
4 67
a) p
b) pd
c) pandas
d) pdy
Ans: b
ii) Name the library to be imported in statement2 for the code to execute
correctly
a) numpy
b) pandas
c) matplotlib
d) pyplot
Ans: a
iii) Complete statement 3 to obtain the output shown in the code
a) NaN
b) [Link]
c) [Link]
d) none of the above
Ans: b
Ans: c
Any operation on Series object will be applied to each item of the Series. This is
known as Vector Operation
>>> S1
0 5
1 10
2 11
3 25
>>> S1+3
0 8
1 13
2 14
3 28
>>> S1*2
0 10
1 20
2 22
3 50
>>> S1%2
0 1
1 0
2 1
3 1
All arithmetic operations like addition, subtraction, multiplication, division etc. can
be done on Series objects
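The vector operations shown above can be reproduced with the short sketch below. It assumes S1 holds the values 5, 10, 11 and 25 with the default index, as in the table; the result names are illustrative.

```python
import pandas as pd

S1 = pd.Series([5, 10, 11, 25])

# Each operation is applied to every element; no explicit loop is needed
plus3 = S1 + 3
doubled = S1 * 2
remainder = S1 % 2

print(plus3)
print(doubled)
print(remainder)
```

Each result is a new Series with the same index; S1 itself is left unchanged.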
Eg:
import pandas as pd
s1=pd.Series([15,20,21], index=['A','B','C'])
s2=pd.Series([10,10,6], index=['A','B','D'])
print('Series object 1(s1)')
print(s1)
print('Series object 2(s2)')
print(s2)
Output
Series object 1(s1)
A 15
B 20
C 21
Series object 2(s2)
A 10
B 10
D 6
Addition + or add >>> s1+s2 or >>> s1.add(s2)
Output
A 25.0
B 30.0
C NaN
D NaN
Multiplication * or mul >>> s1*s2 or >>> s1.mul(s2)
Output
A 150.0
B 200.0
C NaN
D NaN
Division / or div >>> s1/s2 or >>> s1.div(s2)
Output
A 1.5
B 2.0
C NaN
D NaN
Modulus % or mod >>> s1%s2 or >>> s1.mod(s2)
Output
A 5.0
B 0.0
C NaN
D NaN
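The NaN entries in these outputs come from index alignment: labels C and D each occur in only one of the two Series. The sketch below reproduces the addition case and then shows one possible way to avoid the NaN values, using the fill_value parameter of add(). Treating a missing partner as 0 is an assumption made here for illustration and is not part of the notes above.

```python
import pandas as pd

s1 = pd.Series([15, 20, 21], index=['A', 'B', 'C'])
s2 = pd.Series([10, 10, 6], index=['A', 'B', 'D'])

# Plain addition: unmatched labels (C and D) produce NaN
total = s1 + s2
print(total)

# add() with fill_value: a missing partner is treated as 0 (illustrative choice)
filled = s1.add(s2, fill_value=0)
print(filled)
```

The result index is the union of both indexes in either case; only the handling of unmatched labels differs.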
MCQ
Section A
1 The result of an operation between unaligned Series will have the ________
of the indexes involved
a) intersection
b) union
c) total
d) all of the above
Ans: b
2 We can perform _____________ on two series in Pandas
a) Addition
b) Subtraction
c) Multiplication
d) All of the above
Ans: d
3 Which of the following method is used to add two series?
a) sum( )
b) addition( )
c) add( )
d) None of the above
Ans: c
4 Which of the following statement will display the difference of two Series
‘A’ and ‘B’?
a) A – B
b) A.sub(B)
c) Both a and b
d) None of the above
Ans: c
Series - S2
a 5
b 6
Ans: a
2 Choose the correct option:
Assertion (A): We can add two series objects using addition operator (+)
or calling explicit function add() .
Reason (R): While adding two series objects index matching is
implemented and missing values are filled with NaN by default.
Ans: a
3 Assume there is a series S1 having data elements 11, 12 and 13
respectively. Programmer 'Ravi' wrote print(S1*2) in his Python program.
Statement 1: A series with data elements 22, 24, 26 will get printed.
Ans: d
Reason (R) : if two series are not aligned NaN are generated but in case
of arrays no concept of NaN and hence operations fail to perform.
Ans: a
5 Assuming the given series, named Salary, which command will be used
to increase 2000 in every employee’s salary?
Om 35000
Vinay 35000
Simi 50000
Nitin 54000
Nandi 60000
dtype: int64
a) Salary*2000
b) [Link](2000)
c) Salary+2000
d) [Link]()
Ans: c
6 Write the output of the given program:
import pandas as pd
S1=pd.Series([3,6,9,12],index=['a','b','c','e'])
S2=pd.Series([2,4,6,8],index=['c','d','b','f'])
print(S1*S2)
a) a 6.0
b 24.0
c 54.0
d 96.0
e NaN
f NaN
dtype: float64
b) a NaN
b 36.0
c 18.0
d NaN
e NaN
f NaN
dtype: float64
c) a 6.0
b 36.0
c 18.0
d 24.0
e NaN
f NaN
dtype: float64
d) Error
Ans: b
7 Predict the output of the following code:
import pandas as pd
stationary=['pencils','notebooks','scales','erasers']
S1=pd.Series([20,33,52,10],index=stationary)
S2=pd.Series([17,13,31,32],index=stationary)
S1=S1+S2
print(S1+S2)
a) pencils 37
notebooks 46
scales 83
erasers 42
dtype: int64
b) pencils 54
notebooks 59
scales 114
erasers 74
dtype: int64
c) pencils 20
notebooks 33
scales 52
erasers 10
dtype: int64
d) Error
Ans: b
a) 0 31
1 2
2 -6
3 31
4 2
dtype: int64
b) 0 31
1 2
2 -6
dtype: int64
c) 0 62
1 4
2 -12
dtype: int64
d) Error
Ans: c
9 Write the output of the following :
import pandas as pd
S1=pd.Series([1,2,3,4])
S2=pd.Series([7,8,9,10])
[Link]=['a','b','c','d']
print((S1+S2).count())
a) 8
b) 4
c) 0
d) 6
Ans: c
10 What will be the output of the following code?
import pandas as pd
s1=pd.Series([4,5,7,8,9],index=['a','b','c','d','e'])
s2=pd.Series([1,3,6,4,2],index=['a','p','c','d','e'])
print(s1-s2)
a) a 3.0
b 0
c 1.0
d 4.0
e 7.0
p 0
dtype: float64
b) a 3.0
b NaN
c 1.0
d 4.0
e 7.0
p NaN
dtype: float64
c) a 3.0
c 1.0
d 4.0
e 7.0
dtype: float64
d) a 3.0
b –
c 1.0
d 4.0
e 7.0
p –
dtype: float64
Ans: b
Section C
1 Answer the following questions(i to iv) based on the series given below:
a) numpy
b) pandas
c) matplotlib
d) math
Ans: b
ii) Complete code in statement2 to obtain the following output:
swimming 6
chess 12
football 10
a) school2 * 2
b) school1 * 2
c) school1+2
d) school1+school2
Ans: a
iii) Predict the output of statement 3
a) swimming 10
skating 2
kho kho 6
chess 4
football 5
swimming 3
chess 6
football 5
b) chess True
football True
kho kho False
skating False
swimming True
c) chess 10.0
football 10.0
kho kho NaN
skating NaN
swimming 13.0
d) Error
Ans: c
iv) Which method is to be used in statement4 to produce the following
output?
chess 24.0
football 25.0
kho kho NaN
skating NaN
swimming 30.0
a) add
b) sub
c) div
d) mul
Ans: d
>>> seriesCapCntry
India NewDelhi
USA WashingtonDC
UK London
France Paris
dtype: object
21
Attribute Name: name
Purpose: assigns a name to the Series
Syntax: <Seriesname>.name = <"name">
Example:
>>> seriesCapCntry.name = 'Capitals'
>>> print(seriesCapCntry)
India NewDelhi
USA WashingtonDC
UK London
France Paris
Name: Capitals, dtype: object

Attribute Name: index.name
Purpose: assigns a name to the index of the series
Syntax: <Seriesname>.index.name = <"name">
Example:
>>> seriesCapCntry.index.name = 'Countries'
>>> print(seriesCapCntry)
Countries
India NewDelhi
USA WashingtonDC
UK London
France Paris
Name: Capitals, dtype: object

Attribute Name: values
Purpose: prints a list of the values in the series
Syntax: <Seriesname>.values
Example:
>>> print(seriesCapCntry.values)
['NewDelhi' 'WashingtonDC' 'London' 'Paris']

Attribute Name: size
Purpose: prints the number of values in the Series object
Syntax: <Seriesname>.size
Example:
>>> print(seriesCapCntry.size)
4

Attribute Name: empty
Purpose: prints True if the series is empty, and False otherwise
Syntax: <Seriesname>.empty
Example:
>>> seriesCapCntry.empty
False
# Create an empty series
seriesEmpt=pd.Series()
>>> seriesEmpt.empty
True

Attribute Name: ndim
Purpose: prints the dimension of the Series object
Syntax: <Seriesname>.ndim
Example:
d1={'a':9, 'b':1, 'c':7, 'd':2}
s1=pd.Series(d1)
print(s1.ndim)
o/p:
1

Attribute Name: shape
Purpose: the shape property returns a tuple (n,) containing a single element,
which is the number of elements in the Series object.
Syntax: <Seriesname>.shape
Example:
d1={'a':9, 'b':1, 'c':7, 'd':2}
s1=pd.Series(d1)
print(s1.shape)
o/p:
(4,)
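The Series attributes described above can be checked together in one short script. This is a sketch that rebuilds seriesCapCntry from the values shown earlier; the explicit dtype passed to the empty Series is an addition made here to avoid a warning in some pandas versions.

```python
import pandas as pd

seriesCapCntry = pd.Series(
    ['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
    index=['India', 'USA', 'UK', 'France'])

# name and index.name are assignable attributes
seriesCapCntry.name = 'Capitals'
seriesCapCntry.index.name = 'Countries'

# The remaining attributes are read-only descriptions of the object
print(seriesCapCntry.values)
print(seriesCapCntry.size)
print(seriesCapCntry.ndim)
print(seriesCapCntry.shape)
print(seriesCapCntry.empty)

# An empty Series reports empty == True
seriesEmpt = pd.Series(dtype='float64')
print(seriesEmpt.empty)
```

Note that these are attributes, not methods, so they are written without parentheses.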
MCQ TYPE QUESTIONS
SECTION A
1 Which of the following is not an attribute of pandas Series?
[Link]
[Link]
[Link]
[Link].T
Ans.d
2 …………………..attribute will display the total number of elements in a given Series.
[Link]
[Link]
[Link]
[Link]
Ans c
3 Which of the following attributes is used to assign a name to the index of the Series?
[Link]
[Link]
[Link]
[Link] of the above
Ans c
4 ……………………………property returns a tuple (n,) containing a single element which
is the number of elements in the Series object.
[Link]
[Link]
[Link]
[Link]
Ans shape
5 Choose the correct syntax to get the dimension of series named SR:
[Link]
[Link]
[Link]
[Link]
Ans b
SECTION B
1 Assuming the given series, named stud, which command will be used to print 5 as
output?
Amit 90
Ramesh 100
Mahesh 50
John 67
Abdul 89
Name: Student, dtype: int64
a. [Link]
b. [Link]
c. [Link]
d. [Link]
Ans d
2 What will be the output of the following code?
import pandas as pd
seriesEmpt=pd.Series()
>>> seriesEmpt.empty
[Link]
b.0
[Link]
[Link]
Ans c
3 Assuming the given series,named ‘capital’,which command will be used to print the
following output?
[‘NewDelhi’ ‘WashingtonDC’ ‘London’,‘Paris’]
India NewDelhi
USA WashingtonDC
UK London
France Paris
[Link]
[Link]
[Link]
[Link]
Ans c
4 Choose the correct name of Series from the given python code.
import pandas as pd
dict1 = {'India': 'NewDelhi', 'UK':'London', 'Japan': 'Tokyo'}
series8 = pd.Series(dict1)
print(series8) #Display the series
series8.name='capital'
a.dict1
b.series8
[Link]
[Link]
Ans.c
5 Write the correct python statement to assign name to the index of the given series to
‘State’.
import pandas as pd
dict1 = {'India': 'NewDelhi', 'UK':'London', 'Japan': 'Tokyo'}
series8 = pd.Series(dict1)
print(series8)
series8. _______________ =’state’
[Link]
[Link]
[Link]
d All of the above.
Ans.b
import pandas as p1
import numpy as np
a1=np.arange(2,11,2)
s1=p1.Series(a1,index=list('ABCDE'))
print([Link])
Ans:d
SECTION C
1 Nidhi has created the Series S1 shown below. Help her perform the following tasks by
choosing the correct code.
S1
India NewDelhi
USA WashingtonDC
UK London
France Paris
dtype: object
a Display the number of values in the series s1
[Link]([Link])
[Link]([Link])
[Link]([Link])
[Link]([Link])
b. Returns True/False depending on whether the Series S1 is empty
[Link]([Link]())
[Link]([Link])
[Link]([Link])
[Link]([Link])
c Displays the list of values in the series S1
[Link]([Link])
[Link]([Link])
[Link]([Link]())
[Link] of the above
d Display the output as (1,)
[Link]([Link])
[Link]([Link])
[Link]([Link])
[Link]([Link]())
e The command which will change the name of Series S1 to States.
[Link]=’state’
[Link].S1=’state’
[Link](state)
[Link] of the above.
TOPIC:Methods of Series
Head and Tail functions
LET US CONSIDER THE FOLLOWING EXAMPLE.
>>> seriesTenTwenty=pd.Series(np.arange( 10, 20, 1 ))
>>> print(seriesTenTwenty)
0 10
1 11
2 12
3 13
4 14
5 15
6 16
7 17
8 18
9 19
dtype: int32
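The head and tail calls on this Series are not reproduced in these notes, so the following is a minimal sketch of how head(), tail() and count() behave on seriesTenTwenty. The result names (first5, last3 and so on) are illustrative.

```python
import numpy as np
import pandas as pd

seriesTenTwenty = pd.Series(np.arange(10, 20, 1))

first5 = seriesTenTwenty.head()    # no argument: first 5 elements
first2 = seriesTenTwenty.head(2)   # first 2 elements
last5 = seriesTenTwenty.tail()     # no argument: last 5 elements
last3 = seriesTenTwenty.tail(3)    # last 3 elements
n = seriesTenTwenty.count()        # number of non-NaN values

print(first5)
print(last3)
print(n)
```

Both functions keep the original index labels, so tail(3) here is labelled 7, 8 and 9 rather than 0, 1 and 2.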
Ans c
2 Which of the following returns number of non-NaN values of Series?
a. count
b. size
c. index
d. values
Ans a
3 Which of following statement will return 10 values from the end of the Series ‘S1’?
a. [Link]( )
b. [Link](10)
c. [Link](10)
d. S1(10)
Ans b
4 Function to display the first n rows in the Series:
a. tail (n)
b. head (n)
c. top (n)
d. first (n)
Ans b
5 To get bottom three rows of a Series, you may use _________ function: 1
a. tail()
b. bottom(3)
c. bottom(3)
d. tail(3)
Ans d
SECTION B
1 Write the output of the following:
import pandas as pd
S1=pd.Series([1,2,3,4])
S2=pd.Series([7,8])
print((S1+S2).count())
a. 6
b. 4
c. 2
d. 0
Ans c
2 Which of the following returns number of non-NaN values of Series?
a. count
b. size
c. index
d. values
Ans a
3 Write the output of the following:
import pandas as pd
S1=pd.Series([1,2,3,4])
S2=pd.Series([7,8])
S3=S1+S2
print(S3.head(3))
a 0 8.0
1 10.0
2 NaN
b. 0 1.0
1 2.0
2 NaN
c. 0 7.0
1 8.0
2 NaN
d 0 1.0
1 7.0
2 NaN
Ans a
4 Write the output of the following:
import pandas as pd
S1=pd.Series([1,2,3,4])
S2=pd.Series([7,8])
print((S1+S2).tail(2))
a 2 NaN
3 NaN
b 0 8.0
1 10.0
c 2 3
3 4
d 0 7
1 8
Ans a
# Indexing a Series object using a single label
import pandas as pd
o/p: 102
ii) Using multiple labels- We can pass multiple labels, in any order, provided they are
present in the Series object. The labels must be passed as a list, i.e. they must be
separated by commas and enclosed in double square brackets. Avoid passing a label
that is not present in the Series object: older pandas versions returned NaN as the
value for such a label, but current versions raise a KeyError.
# Indexing a Series object using multiple labels
import pandas as pd
o/p:
b 102
a 101
f 106
dtype: int64
iii) Using slice notation startlabel:endlabel- Inside the index operator we
can pass startlabel:endlabel. Unlike ordinary positional slicing, a label slice
returns all the items from startlabel up to and including endlabel.
Output
b 102
c 103
d 104
e 105
dtype: int64
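The code for the three label-based forms is not preserved in these notes. The sketch below reproduces the outputs shown, assuming a Series built from the labels a to f with the values 101 to 106; that data is inferred from the printed outputs.

```python
import pandas as pd

# Assumed data: inferred from the outputs shown above
d = {'a': 101, 'b': 102, 'c': 103, 'd': 104, 'e': 105, 'f': 106}
s = pd.Series(d)

single = s['b']               # i) single label returns one value
multi = s[['b', 'a', 'f']]    # ii) list of labels, in the order requested
sliced = s['b':'e']           # iii) label slice, end label included

print(single)
print(multi)
print(sliced)
```

The label slice returns four items (b, c, d and e), one more than a positional slice with the same endpoints would.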
b) Slicing a Series object using Integer Index positions-
The concept of slicing a Series object is similar to that of slicing python lists, strings
etc. Even though the data type of the labels can be anything each element of the
Series object is associated with two integer numbers:
In forward indexing method the elements are numbered from 0,1,2,3, …
with 0 being assigned to the first element, 1 being assigned to the second
element and so on.
In backward indexing method the elements are numbered from -1,-2, -3,
… with -1 being assigned to the last element, -2 being assigned to the
second last element and so on.
For example consider the following Series object-
d={'a':101, 'b':111, 'c':121, 'd':131, 'e':141, 'f':151}
s=pd.Series(d)
forward indexing --->     0    1    2    3    4    5
                          a    b    c    d    e    f
                        101  111  121  131  141  151
<--- backward indexing   -6   -5   -4   -3   -2   -1
Slice concept-
The basic concept of slicing using integer index positions is common to Python
objects such as strings, lists, tuples, Series, DataFrames etc. A slice creates a new
object using elements of an existing object. It is written as
ExistingObjectName[start : stop : step], where start, stop and step are integers.
print('y=\n', y)
z=s[1: -2: 2]
print('z=\n', z)
o/p:
x=
b 111
d 131
f 151
dtype: int64
y=
f 151
e 141
d 131
c 121
b 111
a 101
dtype: int64
z=
b 111
d 131
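The statements that produce x and y are not preserved in these notes. The sketch below assumes they are the slices [1::2] and [::-1], which match the printed output, and writes all three slices with .iloc so that the positional intent is explicit.

```python
import pandas as pd

d = {'a': 101, 'b': 111, 'c': 121, 'd': 131, 'e': 141, 'f': 151}
s = pd.Series(d)

x = s.iloc[1::2]      # every second element, starting at position 1
y = s.iloc[::-1]      # all elements in reverse order
z = s.iloc[1:-2:2]    # positions 1 and 3; the stop position -2 is excluded

print('x=\n', x)
print('y=\n', y)
print('z=\n', z)
```

As with lists, the stop position is never included in a positional slice, which is why z ends at d rather than e.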
Modifying elements of Series object-
The elements of a Series object can be modified using any of the following methods-
a. Using index [] operator to modify single/multiple values
__________________________________________________________________
# Modifying a Series object using the index [] operator
import pandas as pd
d={'a':101, 'b':111, 'c':121, 'd':131, 'e':141, 'f':151}
s=pd.Series(d)
s['c'] = 555
s[['f','a']] = [666,777]
print('s=\n', s)
s['b':'d']=[0,1,2]
print('s=\n', s)
Output
s=
a 777
b 111
c 555
d 131
e 141
f 666
dtype: int64
s=
a 777
b 0
c 1
d 2
e 141
f 666
dtype: int64
b. Using at/iat property to modify a single value
# Modifying a Series object using the at and iat properties
import pandas as pd
d={'a':101, 'b':111, 'c':121, 'd':131, 'e':141, 'f':151}
s=pd.Series(d)
s.at['d'] = 999
s.iat[-1] = 777
print('s=\n', s)
Output
s=
a 101
b 111
c 121
d 999
e 141
f 777
dtype: int64
c. Using loc, iloc property to modify single /multiple values
Output
s=
a 101
b 9
c 121
d 131
e 8
f 7
dtype: int64
s=
a 101
b 33
c 121
d 44
e 8
f 55
dtype: int64
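The code for the loc/iloc example is not preserved in these notes. The sketch below is one possible set of statements that leads to the two outputs shown above; the exact assignments are an assumption reconstructed from those outputs.

```python
import pandas as pd

d = {'a': 101, 'b': 111, 'c': 121, 'd': 131, 'e': 141, 'f': 151}
s = pd.Series(d)

# loc modifies by label: one value, then several values at once
s.loc['b'] = 9
s.loc[['e', 'f']] = [8, 7]
first = s.copy()      # snapshot matching the first output
print('s=\n', s)

# iloc modifies by integer position
s.iloc[[1, 3, 5]] = [33, 44, 55]
print('s=\n', s)
```

loc accepts labels and iloc accepts integer positions; either can take a single key or a list of keys.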
d. Using slice method to modify multiple values
# Modifying a Series object using the slice method
import pandas as pd
d={'a':101, 'b':111, 'c':121, 'd':131, 'e':141, 'f':151}
s=pd.Series(d)
s[1: :2] = [1,2,3]
print('s=\n', s)
Output
s=
a 101
b 1
c 121
d 2
e 141
f 3
dtype: int64
MCQ
1 What will be the output of the given code?
import pandas as pd
s = pd.Series([1,2,3,4,5],
index=['akram','brijesh','charu','deepika','era'])
print(s['charu'])
a. 1
b. 2
c. 3
d. 4
Ans C
2 Consider the following series named animal:
L Lion
B Bear
E Elephant
T Tiger
W Wolf
dtype: object
Write the output of the command:
print(animal[::-3])
a L Lion
T Tiger
dtype: object
b. B Bear
E Elephant
dtype: object
c. W Wolf
B Bear
dtype: object
d. W Wolf
T Tiger
dtype: object
Ans C
3 Write the output for the following Python code.
import pandas as pd
s=pd.Series([1,2,3,4,5,6],index=['A','B','C','D','E','F'])
print(s[s%2==0])
a. B 2
D 4
F 6
b. A 1
C 3
E 5
c. B 2
D 4
F 5
d. B 3
D 4
F 6
Ans a
4 Write the output of the following code ?
import pandas as pd
seriesMnths=pd.Series([2,3,4],index=['Feb','Mar','Apr'])
print(seriesMnths[1])
a. 2
b. Mar
c. Feb
d. 3
Ans d
5 Choose the correct output of the following code?
import pandas as pd
seriesCapCntry=pd.Series(['New Delhi','WashingtonDC','London','Paris'],index=
['India','USA','UK','France'])
print(seriesCapCntry[[3,2]])
a. France Paris
France Paris
b. USA WashingtonDC
France Paris
c. France Paris
UK London
d. USA WashingtonDC
UK London
Ans c
6 Assertion (A) : We cannot access more than one element of Series without slicing .
Reason (R) :More than one element of series can be accessed using a list of positional
index or labeled index.
(A) Both A and R are true and R is the correct explanation of A.
(B) Both A and R are true and R is not the correct explanation of A.
(C) A is true but R is false.
(D) A is false but R is true.
(E) Both A and R are false.
Ans D
7 Assertion (A) : Elements of Series can be accessed using positional index.
Reason (R) : positional index values ranges from 1 to n if n is the size of the series.
(A) Both A and R are true and R is the correct explanation of A.
(B) Both A and R are true and R is not the correct explanation of A.
(C) A is true but R is false.
(D) A is false but R is true.
(E) Both A and R are false
Ans C
8 Answer the following based on the series given below.
import pandas as pd
list1=[1,2,3,4,5,6,7,8]
list2=['swimming','tt','skating','kho kho','bb','chess','football','cricket']
school=pd.Series(list1,index=list2)
school.name=("little")
print(school*2) # statement 1
print([Link](3)) # statement 2
print(school['tt']) # statement 3
print(school[2:4])
Ans b
iii Choose the correct output of the statement
print(school['tt']) # statement 3
a. 2
b. 3
c. tt 2
d. true
Ans a
9 Write the output of the following:
import pandas as pd
S1 = pd.Series(['NewDelhi', 'WashingtonDC', 'London', 'Paris'],
index=['India', 'USA', 'UK', 'France'])
print(S1['India', 'UK'])
a.
India NewDelhi
UK London
dtype: object
b.
India NewDelhi
UK Washington
dtype: object
c. Error
d. None of the above
Ans c
10 What will be the output of the following code?
import pandas as pd
s=pd.Series([1,2,3,4,5],index=["ajay", "pankaj","deepti","rajesh","ritika"])
print(s["rajesh"])
a) 1
b) 2
c) 3
d) 4
Ans d
UNIT I- DATA FRAMES
Empty DataFrame
import pandas as pd
df=pd.DataFrame()
print(df)
We can specify our own index too by using the index argument.
df2=pd.DataFrame(dict1,index=['I','II','III'])
print(df2)
The number of labels given in the index sequence must match the length of the
dictionary's values; otherwise pandas raises a ValueError.
smarks=pd.Series({'Neha':80,'Maya':90,'Reena':70})
sage=pd.Series({'Neha':25,'Maya':30,'Reena':29})
dict={'Marks':smarks,'Age':sage}
df3=pd.DataFrame(dict)
print(df3)
or
smarks=pd.Series([80,90,70],index=['Neha','Maya','Reena'])
sage=pd.Series([25,30,29],index=['Neha','Maya','Reena'])
dict={'Marks':smarks,'Age':sage}
df3=pd.DataFrame(dict)
print(df3)
DataFrame object created has columns assigned from the keys of the
dictionary object and its index assigned from the indexes of the Series
object which are the values of the dictionary object.
student=[{'Neha':50,'Manu':40},{'Neha':60,'Maya':45}]
df4=pd.DataFrame(student,index=['term1','term2'])
print(df4)
NaN is automatically added in missing places.
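A short runnable sketch of the list-of-dictionaries case follows. It reuses the student data from above and checks where the NaN values appear.

```python
import pandas as pd

# Each dictionary becomes one row; the union of all keys becomes the columns
student = [{'Neha': 50, 'Manu': 40}, {'Neha': 60, 'Maya': 45}]
df4 = pd.DataFrame(student, index=['term1', 'term2'])
print(df4)

# Keys missing from a dictionary show up as NaN in that row
print(df4.isna())
```

Here 'Maya' is missing from the first dictionary and 'Manu' from the second, so those two cells are NaN and their columns are stored as floats.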
Selecting or Accessing Data
import pandas as pd
dict={'BS':[80,98,100,65,72],'ACC':[88,67,93,50,90],
'ECO':[100,75,89,40,96],'IP':[100,98,92,80,86]}
df5=pd.DataFrame(dict,index=['Ammu','Achu','Manu','Anu','Abu'])
print(df5)
print(df5.BS)
or
print(df5['BS'])
print(df5[['BS','IP']])
print(df5.loc['Ammu', :])
To access multiple rows:
<dataframe object>.loc[<start row>:<end row> , : ]
pandas returns all rows falling between start row and end row, including both the
start row and the end row.
print(df5.loc['Ammu':'Manu', : ])
print(df5.loc[:,'ACC':'IP'])
print(df5.loc['Manu':'Abu','ACC':'ECO'])
Sometimes the dataframe object does not contain row or column labels, or we may not
remember them. To extract a subset of the dataframe in that case, we can use iloc,
which selects by integer position.
print(df5.iloc[1:3,1:3])
print(df5.ACC['Achu'])    # 67
or
print(df5.ACC[1])
(ii) Using at or iat
<dataframe object>.at[<row label>,<column label>]
Or
<dataframe object>.iat[<numeric row index>,
<numeric column index>]
print(df5.at['Achu','ACC'])    # 67
or
print(df5.iat[1,1])
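The four accessors can be compared side by side in the sketch below, which rebuilds df5 from the data above. The dictionary is named marks here, rather than dict, only to avoid shadowing the Python built-in; the result names are illustrative.

```python
import pandas as pd

marks = {'BS': [80, 98, 100, 65, 72], 'ACC': [88, 67, 93, 50, 90],
         'ECO': [100, 75, 89, 40, 96], 'IP': [100, 98, 92, 80, 86]}
df5 = pd.DataFrame(marks, index=['Ammu', 'Achu', 'Manu', 'Anu', 'Abu'])

# loc: label-based, both ends of a slice are included
by_label = df5.loc['Ammu':'Manu', 'ACC':'IP']

# iloc: position-based, the stop position is excluded
by_position = df5.iloc[1:3, 1:3]

# at / iat: a single value by label or by position
one_by_label = df5.at['Achu', 'ACC']
one_by_position = df5.iat[1, 1]

print(by_label)
print(by_position)
print(one_by_label, one_by_position)
```

The loc slice returns three rows and three columns, while the iloc slice with the "same-looking" bounds 1:3 returns only two of each.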
df5['ENG']=60
print(df5)
If you want to add a column that has different values for all its rows, then we
can assign the data values for each row of the column in the form of a list.
df5['ENG']=[50,60,40,30,70]
There are some other ways of adding a column to a dataframe.
<dataframe object>.at[ : , <column name>]=value
Or
<dataframe object>.loc[ : ,<column name>]=value
df5.at[ : ,'ENG']=60
print(df5)
or
df5.loc[ : ,'ENG']=60
print(df5)
df5.at['Sabu', : ]=50
print(df5)
or
df5.loc['Sabu', : ]=50
print(df5)
If there is no row with such a row label, pandas adds a new row with this row label
and assigns the given value to all its columns.
df5.loc['Ammu']=100
print(df5)
or
df5.iloc[0]=100
print(df5)
e.g.: del df5['ENG']
Or
df5=df5.drop(columns=['ECO','IP'])
We can use pop() to delete a column. The deleted column is returned as a Series
object.
bstud=df5.pop('BS')
print(bstud)
df5=df5.drop(['Ammu','Achu'])
or
df5=df5.drop(index=['Ammu','Achu'])
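The add and delete operations can be tried end to end with the sketch below. It uses a reduced two-row version of df5, made up here for brevity, and the plain form df5.loc['Sabu'] to add the row.

```python
import pandas as pd

# Reduced sample data (illustrative, not the full df5 from the notes)
df5 = pd.DataFrame({'BS': [80, 98], 'ACC': [88, 67], 'ECO': [100, 75]},
                   index=['Ammu', 'Achu'])

df5['ENG'] = [50, 60]       # add a column with one value per row
df5.loc['Sabu'] = 50        # label not present, so a new row is added

del df5['ENG']              # delete a column in place
bstud = df5.pop('BS')       # delete a column and get it back as a Series
df5 = df5.drop(index=['Ammu'])   # drop() returns a new dataframe

print(bstud)
print(df5)
```

del and pop() change the dataframe in place, whereas drop() leaves the original untouched unless its result is assigned back.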
Iterating over a DataFrame
Using the iterrows() Function
The method <DF>.iterrows() views a dataframe as a sequence of horizontal subsets,
i.e. row-wise.
Each horizontal subset is a pair (row-index, Series), where the Series
contains all column values for that row index.
We can iterate over a Series object just as we iterate over other sequences.
import pandas as pd
dict={'BS':[80,98],'ACC':[88,67]}
df5=pd.DataFrame(dict,index=['Ammu','Achu'])
print(df5,"\n")
import pandas as pd
dict={'BS':[80,98],'ACC':[88,67]}
df5=pd.DataFrame(dict,index=['Ammu','Achu'])
print(df5,"\n")
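The loop itself is not preserved in these notes. A minimal sketch of row-wise iteration with iterrows() could look like the following; the totals dictionary is an illustrative addition.

```python
import pandas as pd

marks = {'BS': [80, 98], 'ACC': [88, 67]}
df5 = pd.DataFrame(marks, index=['Ammu', 'Achu'])

totals = {}
for row_label, row in df5.iterrows():
    # row is a Series of this row's values, indexed by column label
    print("Row index:", row_label)
    print(row)
    totals[row_label] = int(row.sum())

print(totals)
```

Each pass yields one (row-index, Series) pair, so individual values can be read as row['BS'] or row['ACC'].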
Renaming index / column labels
rename() renames the existing index or column labels in a dataframe/series.
The old and new index/column labels are to be provided in the form of a dictionary
where keys are the old indexes/row labels and the values are the new names
for the same.
Syntax:
<DF>.rename(index=None, columns=None, inplace=False)
where index and columns are dictionary like.
inplace is a boolean, False by default, in which case rename() returns a new dataframe
with the renamed index/column labels.
If it is True, the changes are made in the current dataframe.
import pandas as pd
dict={'p_id':[101,102],'p_name':['Hard disk','Pen Drive']}
df=pd.DataFrame(dict)
print(df,"\n")
#df.rename(columns={'p_id':'Product_ID','p_name':'product_name'},inplace=True)
#or
df=df.rename(columns={'p_id':'Product_ID','p_name':'product_name'})
print(df)
Reindexing
reindex() is used to change the order of the rows or columns in a DataFrame/Series;
it returns the DataFrame/Series after the changes.
Syntax:
<DF>.reindex(index=None, columns=None, fill_value=NaN)
df=df.reindex(columns=['product_name','Product_ID'])
print(df)
If the mentioned indexes/columns do not exist in the
dataframe, they are added in the mentioned order with NaN values.
df=df.reindex(columns=['product_name','Product_ID','product_category'])
print(df)
By using fill_value, we can specify the value that will be filled in the newly added
row/column.
df=df.reindex(columns=['product_name','Product_ID','product_category'],
index=[1,0],fill_value='Home')
print(df)
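The rename and reindex steps above can be combined into one runnable sketch. Intermediate results are kept in separate variables (renamed, reordered), which are names chosen here so that each step can be inspected.

```python
import pandas as pd

df = pd.DataFrame({'p_id': [101, 102],
                   'p_name': ['Hard disk', 'Pen Drive']})

# rename() with inplace=False (the default) returns a new dataframe
renamed = df.rename(columns={'p_id': 'Product_ID', 'p_name': 'product_name'})

# reindex() reorders rows and columns; a column that does not exist
# is created and filled with fill_value
reordered = renamed.reindex(
    columns=['product_name', 'Product_ID', 'product_category'],
    index=[1, 0], fill_value='Home')

print(df)
print(reordered)
```

The original df still has the columns p_id and p_name, because neither call was made in place.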
Boolean indexing
Like default indexing (0,1,2…) or labeled indexing , there is one more way to index –
Boolean Indexing (Setting row index to True/ False etc.) .
This helps in displaying the rows of Data Frame, according to True or False as
specified in the command.
import pandas as pd
dict={'p_id':[101,102,103],'p_name':['Hard disk','Pen Drive','Camera']}
df=pd.DataFrame(dict)
df.index=[True,False,True]
print(df,"\n")
print(df.loc[True])
DataFrame attributes
All information related to a DataFrame object is available through attributes.
<DataFrame object> . <attribute name>
Attribute Description
index Returns the index (row labels) of the DataFrame
columns Returns the column labels of the DataFrame
axes Returns a list representing both the axes of the Data
Frame (axis=0 i.e. index and axis=1 i.e. columns)
values Returns a Numpy representation of the DataFrame
dtypes Returns the dtypes of data in the DataFrame
shape Returns tuple of the shape of the DataFrame
ndim Returns number of dimensions of the dataframe
size Returns the number of elements in the dataframe
empty Returns True if the DataFrame object is empty, otherwise
False
T Transpose index and columns of DataFrame
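The attributes in the table can be inspected on a small dataframe. The sketch below uses sample data made up for illustration.

```python
import pandas as pd

# Illustrative sample data
df = pd.DataFrame({'BS': [80, 98], 'ACC': [88, 67], 'ECO': [100, 75]},
                  index=['Ammu', 'Achu'])

print(df.index)      # row labels
print(df.columns)    # column labels
print(df.axes)       # [row labels, column labels]
print(df.dtypes)     # dtype of each column
print(df.values)     # NumPy representation of the data
print(df.shape, df.ndim, df.size, df.empty)
print(df.T)          # rows and columns swapped
```

For this dataframe shape is (2, 3), ndim is 2 and size is 6, i.e. rows multiplied by columns.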
vii. Write code to rename column 'A' to 'D' without affecting the original
dataframe
viii. Write code to add a column E with values [CS, 104,XYZ, 300000]
ix. Write code to add a row COMM with values [3000,4000,5000]
x. Write code to rename DEPT to DEPARTMENT so that the original
dataframe is changed
xi. Write code to display DEPT in A
i. print(df.A[‘DEPT’])
ii. print(df[‘A’,’DEPT’])
iii. print([Link][1:2,1:2])
iv. print([Link][3,2])
Answers :=
i. del df['A']
ii. A B C
ENAME ABC PQR LMN
SALARY 200000 100000 20000
iii. df=df.drop(['SALARY'],axis=0)
iv. df['A']=100
v. df.B['DEPT']='MECH'
vi. print(df.loc[['DEPT','SALARY'],["A","B"]])
vii. df.rename(columns={"A":"D"},inplace=False)
viii. df['E']=["CS",104,"XYZ",300000]
ix. df.loc['COMM']=[3000,4000,5000]
x. df.rename(index={"DEPT":"DEPARTMENT"},inplace=True)
xi. print(df.A['DEPT'])
xii. 4
2. Consider the following Data Frame df and answer questions
iii. Find the lowest marks scored by student s1
iv. Find the highest marks in ACC
v. Find the lowest marks in IP
Answers:=
i. df['TOT']=df['ACC']+df['BST']+df['ECO']+df['IP']
ii. print(max(df.loc['S1',:]))
iii. print(min(df.loc['S1',:]))
iv. print(max(df['ACC']))
v. print(min(df['IP']))
Answers:=
i. print(df[['delhi','chennai']])
ii. print(df.loc['hospitals'])
iii. print([Link])
iv. df.loc['population']=50
v. df.rename(index={"population":"pop"},inplace=True)
i. Display the name of city whose population >=20
range of 12 to 20
ii. Write command to set all values of df as 0
iii. Display the df with rows in the reverse order
iv. Display the df with only columns in the reverse order
v. Display the df with rows & columns in the reverse order
answers:-
i. print(df[[Link]>=20])
ii. df[:]=0
iii. print(df.iloc[::-1])
iv. print(df.iloc[:,::-1])
v. print(df.iloc[::-1,::-1])
5. Consider the following Data Frame df and answer questions
Answers
i. 4
ii. A 4
B 4
C 4
dtype: int64
iii. DEPT 3
EMPNO 3
ENAME 3
SALARY 3
dtype: int64
iv. 20000
v. PQR
Sl
MCQ QUESTIONS
No
To display the 3rd, 4th and 5th columns from the 6th to 9th rows of a dataframe
you can write
The head() function of dataframe will display how many rows from the top if no
parameter is passed?
(i) 1
(ii) 3
3
(iii) 5
(iv) None of these
ANS : (iii) 5
To change the 5th column's value at 3rd row as 35 in dataframe DF, you can
write
(a) DF[4, 6] = 35
4 (b) [Link][4, 6] = 35
(c) DF[3, 5] = 35
(d) [Link][3, 5] = 35
ANS:- d) [Link][3, 5] = 35
Which function is used to find values from a DataFrame D using the index
number?
a) [Link]
b) [Link]
5
c) [Link]
d) None of these
ANS: b) [Link]
In a DataFrame, Axis= 0 represents the elements
[Link]
[Link]
6
[Link]
[Link] of these.
ANS: [Link]
In DataFrame, by default new column added as the _____________ column
(i) First (Left Side)
(ii) Second
7 (iii)Last (Right Side)
(iv) Any where in dataframe
a.Df2=[Link](Df1)
b. Df2=Df2+Df1
9
c. Df2=[Link].Df1
d. Df2=[Link](Df1)
ANS: a.Df2=[Link](Df1)
When we create a DataFrame from a list of dictionaries, the number of columns in
the DataFrame is equal to the _______
a. maximum number of keys in first dictionary of the list
b. maximum number of different keys in all dictionaries of the list
10
c. maximum number of dictionaries in the list
d. None of the above
ANS: (iii)DF.T
In DataFrame, by default new column added as the _____________ column
(i) [Link]( )
16 (ii) readcsv( )
(iii) read_csv( )
(iv) Read_csv( )
ANS: a) T
20. When we create a DataFrame from a list of dictionaries, the number of columns in
the DataFrame is equal to the _______
(i) maximum number of keys in first dictionary of the list
(ii) maximum number of different keys in all dictionaries of the list
(iii) maximum number of dictionaries in the list
(iv) None of the above
ANS: (ii) maximum number of different keys in all dictionaries of the list
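The answer above can be checked with two dictionaries whose keys only partly overlap. This is a minimal sketch with hypothetical records; the key names a, b and c are made up for illustration.

```python
import pandas as pd

# Hypothetical records: the second dictionary has a key ("c") the first lacks.
records = [{"a": 1, "b": 2}, {"a": 3, "c": 4}]

df = pd.DataFrame(records)

# One column is created per distinct key across all dictionaries.
print(df.columns.tolist())
# Keys missing from a dictionary show up as NaN in that row.
print(df)
```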
Which of the following is/are characteristics of DataFrame?
a) Columns are of different types
b) Can Perform Arithmetic operations
21 c) Axes are labeled (rows and columns)
d) All of the above
(a) print(SHOP[City==’Delhi’])
22 (b) print(SHOP[[Link]==’Delhi’])
(c) print(SHOP[SHOP.’City’==’Delhi’])
(d) print(SHOP[SHOP[City]==’Delhi’])
ANS: [Link]
(i) 3, 3
27
(ii) 3, 4
(iii)3, 5
(iv)None of the above
ANS: (iii)3, 5
To delete a row from a DataFrame, you may use
(a) remove
(b) del
28 (c) drop
(d) cancel
(a) skip_rows = 5
30 (b) skiprows = 5
(c) skip - 5
(d) noread - 5
Which of the following statements is false?
(i) Dataframe is size mutable
(ii) Dataframe is value mutable
32 (iii) Dataframe is immutable
(iv) Dataframe is capable of holding multiple type of data
ANS: (i) 0
34. Which of the following functions is used to load the data from a CSV file into a
DataFrame?
(i) [Link]( )
(ii) readcsv( )
(iii) read_csv( )
(iv) Read_csv( )
ANS: (iii) read_csv( )
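As a rough illustration of read_csv(), the sketch below reads CSV text from memory instead of a file on disk, so it runs without any external file. The column names and values are hypothetical. It also shows the skiprows argument that appears in other questions of this set.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = "name,salary\nAmit,5000\nRita,7000\nJohn,6500\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# skiprows=[1] skips the line at 0-based position 1 of the file,
# i.e. the first data row after the header line.
skipped = pd.read_csv(io.StringIO(csv_text), skiprows=[1])
print(skipped)
```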
Write code to delete the rows of employees whose salary is 5000.
(a) df=[Link][salary==5000]
(b) df=df[[Link]!=5000]
35
(c) [Link][[Link]==5000,axis=0]
(d) df=[Link][salary!=5000]
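One common way to delete rows by condition is to keep only the rows that do not match it. The sketch below assumes a hypothetical frame with a salary column; drop() with the matching index labels gives the same result.

```python
import pandas as pd

# Hypothetical employee data (not from the original question).
df = pd.DataFrame({"name": ["Amit", "Rita", "John"],
                   "salary": [5000, 7000, 5000]})

# Boolean filtering: keep the rows whose salary is not 5000.
kept = df[df.salary != 5000]

# Equivalent approach: drop the index labels of the rows that match.
same = df.drop(df[df.salary == 5000].index)

print(kept)
```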
[Link][ ] method is used to ______ # DF1 is a DataFrame
(i) Add new row in a DataFrame ‘DF1’
(ii) To change the data values of a row to a particular value
(a) df=[Link](‘A1’)
(b) df=[Link](index=‘A1’)
38
(c) df=[Link](‘A1,axis=index’)
(d) df=[Link](‘A1’)
(a) skiprows = 11315
(b) skiprows = [1, 3, 5]
(c) skiprows = [1, 5, 1]
(d) Any of these
(a) Row
(b) Column
43 (c) True
(d) False
ANS: b. Index
To get top 5 rows of a dataframe, you may use
(a) head( )
(b) head(5)
45 (c) top( )
(d) top(5)
(a) Row
(b) Column
50
(c) True
(d) False
a. Not a Number
b. None and None
51 c. Null and Null
d. None a Number
ANS: (i) 1
56. To delete a row from dataframe, you may use _______ statement.
i. remove()
ii. del()
iii. drop()
iv. cancel()
a. Row
b. Column
57
c. Row and Column Both
d. None of the above
ANS: a. Row
___________ method in Pandas can be used to change the index of rows and
columns of a Series or Dataframe
(a) rename()
58 (b) reindex()
(c) reframe()
(d) none of these
(a) df=[Link](col=‘marks’)
59 (b) df=[Link](‘marks’,axis=col)
(c) df=[Link](‘marks’,axis=0)
(d) df=[Link](‘marks’,axis=1)
ANS:- a. delete three columns having labels ‘Name’, ‘Class’ and ‘Rollno’
62. Difference between loc and iloc:
a. Both are label based.
b. Both are integer position based.
c. loc is label based and iloc is integer position based.
d. loc is integer position based and iloc is index position based.
ANS: c. loc is label based and iloc is integer position based.
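The difference can be seen on a frame whose row labels are not integers. This is a minimal sketch with hypothetical labels r1 to r3. Note that both are used with square brackets, and that a loc slice includes its end label while an iloc slice excludes its end position.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"marks": [80, 65, 90]}, index=["r1", "r2", "r3"])

by_label = df.loc["r2", "marks"]    # row label "r2", column label "marks"
by_position = df.iloc[1, 0]         # second row, first column

label_slice = df.loc["r1":"r2"]     # end label included: rows r1 and r2
position_slice = df.iloc[0:1]       # end position excluded: row r1 only

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```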
63. Which command will be used to delete the 3rd and 5th rows of the data frame?
Assume the data frame name is DF.
a. DF.drop([2,4],axis=0)
b. DF.drop([2,4],axis=1)
c. DF.drop([3,5],axis=1)
d. DF.drop([3,5])
ANS: a. DF.drop([2,4],axis=0)
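With a default index the 3rd and 5th rows carry the labels 2 and 4, which is why those labels are passed to drop(). The sketch below assumes a hypothetical five-row frame with a single column v.

```python
import pandas as pd

# Hypothetical frame with the default index 0..4.
DF = pd.DataFrame({"v": [10, 20, 30, 40, 50]})

# axis=0 means the labels refer to rows; drop() returns a new frame.
result = DF.drop([2, 4], axis=0)

print(result)
```

DF itself is left unchanged, so the result has to be assigned to a name to be used later.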
Assuming the given structure, which command will give us the given output:
Output Required: (3,4)
64
EmpCode Name Desig
ANS: b. print([Link])
Write the output of the given command: [Link][:0,'Name'] Consider the given
dataframe.
EmpCode Name Desig
0 1405 VINAY Clerk
1 1985 MANISH Works Manager
2 1636 SMINA Sales Manager
3 1689 RINU Clerk
65
ANS : VINAY (the result is a one-element Series holding the row labelled 0)
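The result can be reproduced with the table from the question. The sketch assumes the frame is named df and that the command uses loc, since a column label such as 'Name' is only accepted by label-based indexing.

```python
import pandas as pd

# Data copied from the table in the question; the name df is an assumption.
df = pd.DataFrame({
    "EmpCode": [1405, 1985, 1636, 1689],
    "Name": ["VINAY", "MANISH", "SMINA", "RINU"],
    "Desig": ["Clerk", "Works Manager", "Sales Manager", "Clerk"],
})

# A label slice includes its end label, so :0 selects the row labelled 0.
result = df.loc[:0, "Name"]

print(result)
```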
UNIT I- Data Visualization
Data visualization is the technique to present the data in a pictorial or graphical format. It
enables stakeholders and decision makers to analyze data visually. The data in a
graphical format allows them to identify new trends and patterns easily.
Importing PyPlot
import matplotlib.pyplot
or
import matplotlib.pyplot as plt
After importing matplotlib.pyplot under the alias plt, we can use plt to access any
function of pyplot.
• HISTOGRAM etc.
Line Plot:
A line plot/chart is a graph that shows how one quantity changes with another. The
data points, optionally drawn with markers, are connected by straight line segments.
Line plots are generally used to display trends over time. A line plot or line graph
can be created using the plot() function available in the pyplot library. Besides
plotting the line, we can explicitly define the grid, the x and y axis scale and
labels, the title and other display options.
• We can create a line graph with y values only (x then defaults to 0, 1, 2, ...) or with both x and y coordinates.
• Syntax: plt.plot(x,y)
• Label-
plt.xlabel('Time') – to set the x axis label
plt.ylabel('Temp') – to set the y axis label
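Changing Marker Type, Size and Color
plt.plot(x,y,'blue',marker='*',markersize=10,markeredgecolor='magenta')
General form: plt.plot(x, y, color, linewidth=..., linestyle=..., marker=..., markersize=..., markeredgecolor=...)
The arguments from linewidth onwards are passed as keyword arguments.
plt.show( )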
PROGRAM
import matplotlib.pyplot as plt
X=[1,2,3,4,5]
Y=[2,4,6,8,10]
plt.ylabel('Y Axis')
plt.plot(X,Y,'r')   # 'r' draws the line in red
plt.show()
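The options described above (color, line style, marker, title, labels and grid) can be combined in one call sequence. This is a minimal sketch with made-up values; the Agg backend line is an assumption so that the code runs without a display, and in an interactive session it can be removed and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt

# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.figure()
# Blue dashed line, star markers of size 10 with a magenta edge.
lines = plt.plot(x, y, color="blue", linewidth=2, linestyle="--",
                 marker="*", markersize=10, markeredgecolor="magenta")
plt.title("Temperature over time")
plt.xlabel("Time")
plt.ylabel("Temp")
plt.grid(True)

ax = plt.gca()   # the axes that the calls above have been drawing on
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```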
Bar Graph
A graph drawn using rectangular bars to show how large each value is. The bars can
be horizontal or vertical. A bar graph makes it easy to compare data between
different groups at a glance. A bar graph represents categories on one axis and a
discrete value on the other. The goal of a bar graph is to show the relationship
between the two axes. Bar graphs can also show big changes in data over time.
Syntax : plt.bar(x,y)
To set different widths for different bars
plt.bar(x,y, width=sequence of float values)
• Title
plt.title('Bar Graph') – Change it as per requirement
• Label-
plt.xlabel('Overs') – to set the x axis label
plt.ylabel('Runs') – to set the y axis label
PROGRAM :
import matplotlib.pyplot as plt
overs=['1-10','11-20','21-30','31-40','41-50']
runs=[65,55,70,60,90]
plt.xlabel('Over Range')
plt.ylabel('Runs Scored')
plt.bar(overs,runs)
plt.show( )
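The width argument and the horizontal variant mentioned above can be sketched with the same data. The width values are hypothetical, and the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt

overs = ['1-10', '11-20', '21-30', '31-40', '41-50']
runs = [65, 55, 70, 60, 90]
widths = [0.3, 0.5, 0.8, 0.5, 0.3]   # hypothetical: one width per bar

plt.figure()
bars = plt.bar(overs, runs, width=widths)   # vertical bars of different widths
plt.title('Runs per over range')
plt.xlabel('Over Range')
plt.ylabel('Runs Scored')

plt.figure()
hbars = plt.barh(overs, runs)               # horizontal bars: categories on the y axis
plt.xlabel('Runs Scored')
plt.ylabel('Over Range')

print([bar.get_height() for bar in bars])
```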
HISTOGRAM
Creating a Histogram :
A histogram is a type of bar plot where the X-axis represents the bin ranges while the
Y-axis gives information about frequency.
To create a histogram, first divide the whole range of the values into a series of
intervals (bins), then count the values which fall into each of the intervals.
The hist() function is used to create a histogram.
Syntax:
plt.hist(x, other parameters)
Parameters
x : array or sequence of arrays (the data to be binned)
PROGRAM :
• Title
plt.title('Histogram') – Change it as per requirement
• Label-
plt.xlabel('Data') – to set the x axis label
plt.ylabel('Frequency') – to set the y axis label
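The binning and counting described above can be sketched as follows. The marks list and the bin edges are hypothetical choices made for illustration, and the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt

# Hypothetical marks of ten students.
marks = [30, 10, 55, 70, 50, 25, 75, 49, 28, 81]

plt.figure()
# Four bins: 0-25, 25-50, 50-75, 75-100. hist() returns the count per bin,
# the bin edges, and the drawn rectangles.
counts, edges, patches = plt.hist(marks, bins=[0, 25, 50, 75, 100],
                                  color='green', edgecolor='black')
plt.title('Histogram')
plt.xlabel('Data')
plt.ylabel('Frequency')

print(list(counts))
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.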
• Legend - A legend is an area describing the elements of the graph. The matplotlib
library has a function named legend() which is used to place a legend on the axes.
When we plot multiple data ranges in a single plot, a legend becomes necessary.
Each legend entry is a color or mark linked to a specific data range plotted.
i) In the plotting function like bar() or plot(), give a specific label to the data range
using the label argument.
ii) Add the legend to the plot using legend() as per the syntax given below.
Syntax : plt.legend(loc=position number or string)
The position number can be 1, 2, 3 or 4, corresponding to the position strings 'upper
right', 'upper left', 'lower left' and 'lower right' respectively.
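The two steps above can be sketched as follows: each plot() call gets a label, and legend() then collects those labels. The team names and scores are hypothetical, and the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt

# Hypothetical runs scored by two teams over five overs.
overs = [1, 2, 3, 4, 5]
team_a = [4, 7, 3, 9, 6]
team_b = [5, 2, 8, 4, 7]

plt.figure()
plt.plot(overs, team_a, label='Team A')   # step i: label each data range
plt.plot(overs, team_b, label='Team B')
legend = plt.legend(loc='upper left')     # step ii: same position as loc=2

labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```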
Saving the Plot
To save any plot, the savefig() method is used. Plots can be saved in various formats
like pdf, png, eps etc.
plt.savefig('line_plot.pdf')   # save plot in the current directory
plt.savefig('d:\\plot\\line_plot.pdf')   # save plot in the given path
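A small sketch of saving the same figure in two formats is given below. It writes into a temporary folder, which is an assumption made so that the example does not depend on any particular drive or path; the file format is taken from the file extension.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 1])

# Hypothetical output location: a fresh temporary folder.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'line_plot.png')
pdf_path = os.path.join(out_dir, 'line_plot.pdf')

plt.savefig(png_path)   # saved as PNG because of the .png extension
plt.savefig(pdf_path)   # saved as PDF because of the .pdf extension

print(os.path.exists(png_path), os.path.exists(pdf_path))
```

When a window is also shown, it is safer to call savefig() before show(), because the figure may be cleared once the window is closed.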
SECTION B
a) [Link]
b) [Link]
c) [Link]
d) [Link]
Ans: a) [Link]
3. The command used to give a heading to a graph is _________
(a) [Link]()
(b) [Link]()
(c) [Link]()
(d) [Link]()
Ans: (d) [Link]()
4. Using Python Matplotlib _________ can be used to count how many values fall
into each interval.
(a) line plot
(b) bar graph
(c) histogram
(d) None of these
Ans : (c) histogram
5. Fill in the missing statement
import matplotlib.pyplot as plt
marks=[30,10,55,70,50,25,75,49,28,81]
plt._____(marks, bins='auto', color='green')
plt.show()
(a) plot
(b) bar
(c)hist
(d)draw
Ans : (c)hist
6. Which module of the matplotlib library is required for plotting graphs?
(a) Plot
(b) Matplot
(c) pyplot
(d) graphics
Ans : (c) pyplot
[Link] the output figure. Identify the code for obtaining this output.
a) import [Link] as plt
[Link]([1,2],[4,5])
[Link]()
b) import [Link] as plt
[Link]([2,3],[5,1])
[Link]()
c) import [Link] as plt
[Link]([1,2,3],[4,5,1])
[Link]()
d) import [Link] as plt
[Link]([1,3],[4,1])
[Link]()
Ans: b) [Link](“title”)
To change the width of bars in a bar chart, which of the following arguments
with a float value is used?
a) thick
b) thickness
c) width
d) barwidth
Ans: c) width
Ans: b) savefig( )
Which one of these is not a valid line style in matplotlib?
a) ‘-‘
b) ‘--‘
c) ‘-.’
d) ‘<’
Ans: d) ‘<’
Ans: c) [Link]()
Ans: c) legend( )
Ans : a) Markers
To specify the style of line as dashed, which argument of plot() needs to be set?
a) line
b) width
c) Style
d) linestyle
Ans: d) linestyle
20. Which of the following is not a valid plotting function in pyplot?
a) bar()
b) hist()
c) histh()
d) barh()
Ans: c)histh( )
SECTION B
[Link] the following figure. Identify the coding for obtaining this as output.
b) import [Link] as plt
eng_marks=[10,55,30,80,50]
st_name=["amit","dinesh","abhishek","piyush","rita"]
[Link](st_name,eng_marks)
[Link] the statements given below and identify the right option to draw a histogram.
3. Which graph should be used where each column represents a range of values, and
the height of a column corresponds to how many values are in that range?
a) plot
b) line
c) bar
d) histogram
Ans: d). histogram
marks=[30,10,55,70,50,25,75,49,28,81]
[Link]()
(a) plot
(b) bar
(c) hist
(d) barh
In each of the questions given below, there are two statements marked as
Assertion (A) and Reason (R). Mark your answer as per the codes provided below:
(A) A is true but R is false.
(B) Both A and R are true
(C) A is false but R is true.
(D) Both A and R are false.
1. ASSERTION(A) :A histogram is basically used to represent data provided in the
form of groups spread in non-continuous ranges
Ans: C
REASON(R) : plt.savefig("path") will save the current graph in png or jpeg format
Ans: C
Ans: A
Ans: B
5. ASSERTION(A) : In histogram X-axis is about bin ranges where Y-axis talks about
frequency
REASON(R) : The bins (intervals) must be adjacent, and are often (but are not required
to be) of equal size.
Ans: B
Ans: D
8. ASSERTION(A) : legend of the graph reflects the data displayed on the graph’s Y-
axis
Ans: B
import__________________________ #Statement 1
Games=["Subway Surfer","Temple Run","Candy Crush","Bottle hot","Runner Best"]
Rating=[4.2,4.8,5.0,3.8,4.1]
plt.______________(Games,Rating) #Statement 2
plt.xlabel("Games")
plt.______________("Rating") #Statement 3
plt._______________ #Statement 4
(i) Choose the right code from the following for statement 1.
(a) matplotlib as plt
(b) pyplot as plt
(c) matplotlib.pyplot as plt
(d) [Link] as pyplot
Ans: (c) matplotlib.pyplot as plt
(ii) Identify the name of the function that should be used in statement 2 to plot the
above graph.
(a) line()
(b) bar()
(c) hist()
(d) barh()
Ans: (b) bar()
(a) title(“Rating”)
(b) ytitle(“Rating”)
(c) ylabel(“Rating”)
(d) yaxis(“Rating”)
(iv) Choose the right function/method from the following for the statement 4.
(a) display()
(b) print()
(c) bar()
(d) show()
(v) In case Mr. Sharma wants to change the above plot to any other shape, which
statement, should he change.
(a) Statement 1
(b) Statement 2
(c) Statement 3
(d) Statement 4
2. ABC Enterprises is selling its products through three salesmen and keeping the
records of sales done quarterly of each salesman as shown below:
1 import pandas as pd
(a). read_csv
(b). pd.read_csv
(c). pd.get_csv
(d). get_csv
Ans B
3. Choose the correct option to select the type of graph in line 4
(a). type
(b). kind
(c). style
(d). graph
Ans : (b). kind
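The kind argument belongs to the plot() method of a pandas DataFrame. The sketch below uses hypothetical quarterly sales figures, since the table from the question is not reproduced here; the Agg backend line is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw off-screen; remove this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical quarterly sales of three salesmen.
sales = pd.DataFrame(
    {"Qtr1": [23, 11, 6], "Qtr2": [18, 16, 20],
     "Qtr3": [30, 25, 15], "Qtr4": [35, 27, 18]},
    index=["Salesman 1", "Salesman 2", "Salesman 3"],
)

# kind selects the chart type, e.g. 'line', 'bar', 'barh' or 'hist'.
ax = sales.plot(kind="bar")
ax.set_ylabel("Sales")

print(len(ax.containers))   # one group of bars per column
```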
3. Mr. Sharma is trying to write code to plot the line graph shown in fig-1. Help Mr.
Sharma to fill in the blanks of the code and get the desired output.
import matplotlib.pyplot as plt # statement 1
x = [1,2,3] # statement 2
y = [2,4,1] # statement 3
plt.plot(x, y, color='g') #statement 4
______________ # statement 5
______________ # statement 6
i) Which of the above statement is responsible for plotting the values on canvas.
a) Statement 8
b) Statement 4
c) Statement 1
d) None of the above
Ans: b) Statement 4
ii) Statements 5 & 6 are used to give names to x-axis and y-axis as shown in fig.1.
Which of the following can fill those two gaps
a) [Link]('x - axis') [Link]('y - axis')
b) [Link]('x - axis') [Link]('y - axis')
c) [Link]('x - axis') [Link]('x - axis')
d) [Link]('x axis') [Link]('y axis')
iii) Raman has executed code with first 7 statements. But No output displayed. which
of the following statements will display the graph?
a) [Link]()
b) [Link]()
c) [Link]()
d) Both b & c
Ans: b) three
v) Which of the following methods will result in displaying 'My first graph!' in the
above graph
a) legend()
b) label()
c) title()
d) Both a & c
Ans : c) title()
UNIT 4: SOCIETAL IMPACTS
● Digital footprint, net and communication etiquettes,
● Data protection, intellectual property rights (IPR), plagiarism, licensing and copyright,
● Free and open source software (FOSS),
● Cybercrime and cyber laws, hacking, phishing, cyber bullying, overview of Indian IT
Act.
● E-waste: hazards and management. Awareness about health concerns related to the
usage of technology.
DIGITAL FOOTPRINT
A digital footprint refers to the trail of data you leave while using the internet. It includes
the websites you visit, emails you send, and information you submit online. A digital footprint
can be used to track a person’s online activities and devices.
Internet users create their digital footprint either actively or passively. A passive
footprint is made when information is collected from the user without the person knowing
this is happening. An active digital footprint is created when the user deliberately shares
information about themselves, either by using social media sites or by using websites.
● Online shopping: making purchases from e-commerce websites
● Online banking: using a mobile banking app
● Social media: using social media on your computer or devices; sharing information, data, and photos with your connections
● Reading the news: subscribing to an online news source
● Health and fitness: using fitness trackers; using apps to receive healthcare
NETIQUETTE
Be respectful
Think about who can see what you have shared.
Read first, then ask
Pay attention to grammar and punctuation
Respect the privacy of others
Do not give out personal information
DATA PROTECTION
Data protection is a set of strategies and processes you can use to secure the privacy,
availability, and integrity of your data. It is sometimes also called data security or information
privacy. A data protection strategy is vital for any organization that collects, handles, or
stores sensitive data.
For data privacy, users can often control how much of their data is shared and with whom.
For data protection, it is up to the companies handling data to ensure that it remains private.
Data privacy is focused on defining who has access to data while data protection focuses
on applying those restrictions.
Intellectual Property Right (IPR) is the statutory right granted by the Government to the
owner(s) of the intellectual property or applicant(s) of an intellectual property (IP) to exclude
others from exploiting the IP commercially for a given period of time, in exchange for the
disclosure of his/her IP in an IPR application.
Copyright laws protect intellectual property.
Copyright – It is a legal concept, enacted by most governments, giving the creator of
an original work exclusive rights to it, usually for a limited period.
Copyright infringement – When someone uses copyrighted material without
permission, it is called copyright infringement.
Patent – A patent is a grant of exclusive right to the inventor by the government.
A patent gives the holder a right to exclude others from making, selling, using or importing a
particular product or service, in exchange for full public disclosure of their invention.
Trademark – A Trademark is a word, phrase, symbol, sound, colour and/or design
that identifies and distinguishes the products from those of others.
PLAGIARISM
Plagiarism is stealing someone’s intellectual work and representing it as your own work
without citing the source of information.
Any of the following acts would be termed as Plagiarism:
Using some other author’s work without giving credit to the author.
Using someone else’s work in a form other than the one originally intended by the author or
creator.
Modifying/lifting someone’s production, such as a music composition, without
attributing it to the creator of the work.
Giving an incorrect source of information.
OSS stands for Open Source Software, that is, software whose source code is
available to users and can be modified and redistributed, subject to the terms of its license.
Free and open-source software (FOSS) is software that can be classified as both
free software and open-source software. That is, anyone is freely licensed to use, copy,
study, and change the software in any way, and the source code is openly shared so that
people are encouraged to voluntarily improve the design of the software.
CYBER CRIME:
Any criminal or illegal activity carried out through an electronic channel or any computer
network is considered cyber crime.
Eg: cyber harassment and stalking, distribution of child pornography, various types of
spoofing, credit card fraud, etc.
CYBER LAW:
It is the law governing cyberspace which includes freedom of expression, access to
and usage of internet and online privacy.
The issues addressed by cyber law include cybercrime, e-commerce, IPR and Data
protection.
HACKING:
It is an act of unauthorised access to a computer, computer network or any digital
system.
Hackers usually have technical expertise in hardware and software.
Hacking done with a positive intent is called ethical hacking or
white hat hacking.
Hacking done with a negative intent is called unethical hacking or
black hat hacking.
PHISHING:
It is an unlawful activity where fake websites or emails are made to appear original or
authentic. When the user opens such a site, it collects sensitive and personal details like
usernames, passwords, credit card details etc.
CYBER BULLYING:
It is the use of technology to harass, threaten or humiliate a target.
Example: sharing embarrassing photos or videos, posting false information,
sending mean texts, etc.
One may come across various health issues like eye strain, muscle
problems, sleep issues, etc.
Anti-social behaviour, isolation, emotional issues, etc.
Assertion: (A) Plagiarism is stealing someone else’s intellectual work and representing it
as your own work.
Reason : (R) Using someone else’s work and giving credit to the author or creator.
1. Online posting of rumours, giving threats online, posting the victim’s personal
information, comments aimed to publicly ridicule a victim is termed as __________
a. Cyber bullying
b. Cyber crime
c. Cyber insult
d. All of the above
2. Ankit made a ERP - Enterprise resource planning solution for a renowned university
and registered and Copyrights for the same. Which of the most important option;
Ankit got the copyrights.
a. Facebook
b. Pinterest
c. Google+
d. Social channel
Ans: Social channel
5. A ___________ is some lines of malicious code that can copy itself and can have
detrimental effect on the computers, by destroying data or corrupting the system.
a. Cyber crime
b. Computer virus
c. Program
d. Software
7. You are planning to go for a vacation. You surfed the internet to get answers for
following queries.
a) Places to visit
b) Availability of air tickets and fares
c) Best hotel deals
d) All of these
Which of the above-mentioned actions might have created a digital footprint?
Ans: All of these
8. Legal term to describe the rights of a creator of original creative or artistic work is
called……..
a) Copyright
b) Copyleft
c) GPL
d) BSD
Ans: Copyright
9. Intellectual Property is legally protected through ____
a) copyright
b) patent
c) registered trademark
d) All of the above
Ans: All of the above
10. _____________ includes any visual symbol, word, name, design, slogan, label,
etc., that distinguishes the brand from other brands.
a) Trademark
b) Patent
c) Copyright
d) None of the above
Ans: Trademark
1. Naveen received an email warning him of closure of his bank accounts if he did not
update his banking information as soon as possible. He clicked the link in the email and
entered his banking information. Next he got to know that he was duped.
d) Naveen’s Online personal account, personal website are the examples of?
i. Digital wallet
ii. Digital property
iii. Digital certificate
iv. Digital signature
(iv) ______is a small piece of data sent from a website and stored in a user’s web
browser while a user is browsing a website.
(a) Hyperlinks
(b) Web pages
(c) Browsers
(d) Cookies
(v) The process of getting web pages, images and files from a web server to local
computer is called
(a) FTP
(b) Uploading
(c) Downloading
(d) Remote access
Solution:
I. (d)None of the above
II. (a) plagiarism
III. (c) copyright infringement
IV. (d) Cookies
V. (c) Downloading