In [32]: import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Q1) Given a list of 2-D coordinates, write a function that returns True if
the points lie on a straight line and False otherwise.
Use only the Python standard libraries.
collinear([[1,1], [2,2], [4,4], [-10, -10]]) = True
collinear([[1,0], [2,0], [3,1]]) = False
In [8]: def collinear(arr):
    # Fewer than three points always lie on a single line
    if len(arr) < 3:
        return True
    # The first point and the first point that differs from it define the line
    x0, y0 = arr[0]
    ref = next((p for p in arr if list(p) != list(arr[0])), None)
    if ref is None:
        return True  # every point is the same point
    dx, dy = ref[0] - x0, ref[1] - y0
    # A point is on the line exactly when its cross product with (dx, dy) is zero;
    # this avoids dividing by zero for vertical lines
    return all(dx * (y - y0) - dy * (x - x0) == 0 for x, y in arr)
print(collinear([[1,1], [2,2], [4,4], [-10, -10]]))
print(collinear([[1,0], [2,0], [3,1]]))
True
False
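The two sample calls above do not exercise vertical lines, repeated points, or lines away from the diagonal. As a minimal sketch, assuming integer coordinates, the cross-product test below covers those cases using only the standard library. The name `is_collinear` is illustrative and not part of the original answer.

```python
def is_collinear(points):
    """Return True if all 2-D points lie on one straight line."""
    pts = [tuple(p) for p in points]
    # The first point and any point distinct from it define the candidate line
    origin = pts[0] if pts else None
    other = next((p for p in pts if p != origin), None)
    if other is None:
        # Empty input, a single point, or only repeated points
        return True
    dx, dy = other[0] - origin[0], other[1] - origin[1]
    # The cross product with (dx, dy) is zero exactly when a point is on the line
    return all(dx * (y - origin[1]) - dy * (x - origin[0]) == 0 for x, y in pts)


print(is_collinear([[0, 5], [0, -3], [0, 12]]))  # vertical line -> True
print(is_collinear([[2, 2], [2, 2], [5, 7]]))    # repeated point -> True
print(is_collinear([[1, 3], [2, 5], [3, 7]]))    # y = 2x + 1 -> True
print(is_collinear([[1, 1], [2, 2], [3, 4]]))    # last point is off the line -> False
```

Comparing cross products instead of slopes avoids division, so vertical lines need no special case. With floating-point coordinates the `== 0` comparison would need a tolerance.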
Q2) Define a function that returns the frequencies of the last digits
of a list of nonnegative integers.
Use only the Python standard libraries.
Given the list [49, 10, 20, 5, 30, 785]:
9 is the last digit once (in 49),
0 is the last digit three times (in 10, 20, and 30),
5 is the last digit two times (in 5 and 785)
last_digit_counts([49, 10, 20, 5, 30, 785])
= {9:1, 0:3, 5:2} # or something equivalent
In [ ]: def frequencies(arr):
    # Map each last digit to the number of times it occurs
    counts = {}
    for num in arr:
        # For a nonnegative integer, num % 10 is its last decimal digit
        last_digit = num % 10
        # Start from 0 the first time a digit is seen, then add 1
        counts[last_digit] = counts.get(last_digit, 0) + 1
    return counts
print(frequencies([49, 10, 20, 5, 30, 785]))
{9: 1, 0: 3, 5: 2}
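The same count can also be written with `collections.Counter`, which is part of the standard library and so stays within the question's constraint. This is a minimal sketch, assuming the input holds only nonnegative integers, and uses the function name from the question.

```python
from collections import Counter


def last_digit_counts(numbers):
    # n % 10 is the last decimal digit of a nonnegative integer
    return dict(Counter(n % 10 for n in numbers))


print(last_digit_counts([49, 10, 20, 5, 30, 785]))  # {9: 1, 0: 3, 5: 2}
```

`Counter` keeps digits in the order they are first seen, so the printed dictionary matches the order in the example.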
Q3) Using whatever library you like, make an effective
visualization for the data stored here:
https://github.com/michaelbilow/open-data/raw/main/spotify-2023.xlsx
It consists of the most streamed songs on Spotify in 2023,
along with their artist, key, mode, and year of publication.
Write 1-2 sentences about what you're trying to show.
In [28]: # Read the Excel sheet
df = pd.read_excel("spotify-2023.xlsx")
# Checking data
df.head()
Out[28]:
   Track Name                                     Artist                 Year Released  Streams     BPM  Key  Mode
0  Blinding Lights                                The Weeknd             2019           3703895074  171  C#   Major
1  Shape of You                                   Ed Sheeran             2017           3562543890  96   C#   Minor
2  Someone You Loved                              Lewis Capaldi          2018           2887241814  110  C#   Major
3  Dance Monkey                                   Tones and I            2019           2864791672  98   F#   Minor
4  Sunflower - Spider-Man: Into the Spider-Verse  Post Malone, Swae Lee  2018           2808096550  90   D    Major
In [36]: # Plot the number of streams per release year
plt.figure(figsize=(10, 5))
(df.groupby("Year Released")["Streams"].sum() / 1e9).plot(kind="bar", color="skyblue")  # scale to billions to match the axis label
plt.xlabel("Year Released")
plt.ylabel("Total Streams (Billions)")
plt.title("Total Streams per Release Year")
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
In [37]: # Plot the distribution of BPM values
plt.figure(figsize=(10, 5))
df["BPM"].hist(bins=20, color="lightcoral", edgecolor="black")
plt.xlabel("Beats Per Minute (BPM)")
plt.ylabel("Number of Songs")
plt.title("Distribution of BPM in Popular Tracks")
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
In [38]: # Top 10 artists by total streams, scaled to billions to match the axis label
top_artists = df.groupby("Artist")["Streams"].sum().nlargest(10) / 1e9
# Plot top 10 streamed artists
plt.figure(figsize=(12, 6))
top_artists.sort_values().plot(kind="barh", color="mediumseagreen")
plt.xlabel("Total Streams (Billions)")
plt.ylabel("Artist")
plt.title("Top 10 Artists by Total Streams")
plt.grid(axis='x', linestyle='--', alpha=0.7)
plt.show()
In [39]: # Count occurrences of each musical key
key_counts = df["Key"].value_counts()
# Plot distribution of musical keys
plt.figure(figsize=(10, 5))
key_counts.plot(kind="bar", color="slateblue", edgecolor="black")
plt.xlabel("Musical Key")
plt.ylabel("Number of Songs")
plt.title("Distribution of Musical Keys in Popular Tracks")
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()
Q4) If you flip a fair coin 100 times (independently),
what is the chance that more than 60 flips come up heads?
Do not try to compute this value exactly; instead, use
the simplest "good" approximation you can come up with.
You should not write any code for this problem.
Steps to Find the Probability
1. Normal Approximation:
• Let X be the number of heads, so X ~ Binomial(n = 100, p = 0.5). Because n is large, approximate X with a normal distribution.
2. Key Info:
• Mean (µ) = np = 50 (expected heads)
• Standard deviation (σ) = \sqrt{np(1-p)} = \sqrt{25} = 5
3. Z-Score Calculation (with a continuity correction, since "more than 60" means 61 or more):
\[ Z = \frac{60.5 - 50}{5} = 2.1 \]
4. Look up the Z-Score:
• The cumulative probability P(Z ≤ 2.1) is approximately 0.9821.
5. Find the Probability:
\[ P(X > 60) \approx 1 - 0.9821 = 0.0179 \]
6. Final Answer:
• Probability ≈ 1.79%, or roughly 1.8%
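The question asks for a pencil-and-paper answer, so the following is only an optional sanity check and not part of the solution. As a sketch using the standard library alone, it recomputes the normal approximation from the steps above and compares it with the exact binomial tail. The helper `normal_cdf` is an illustrative name.

```python
from math import comb, erf, sqrt

n, p = 100, 0.5
mu = n * p                       # 50
sigma = sqrt(n * p * (1 - p))    # 5


def normal_cdf(z):
    # Standard normal CDF expressed through the error function
    return 0.5 * (1 + erf(z / sqrt(2)))


# Continuity correction: "more than 60" means 61 or more, so use 60.5
z = (60.5 - mu) / sigma
approx = 1 - normal_cdf(z)

# Exact tail: P(X >= 61) for X ~ Binomial(100, 0.5)
exact = sum(comb(n, k) for k in range(61, n + 1)) / 2 ** n

print(f"z = {z:.2f}")
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial tail:  {exact:.4f}")
```

Both values come out close to 1.8%, which suggests the continuity-corrected approximation is adequate here.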
Q5) What are some Python libraries (or libraries in other
programming languages) you think are fun or interesting to use?
pandas and scikit-learn are both very useful, but I'd argue
they're not so fun or interesting.
Plotly (Visualization)
Why it's fun: Plotly makes interactive visualizations easy to create. It's like bringing your data to life! You
can hover over charts, zoom in, and create dashboards.
Fun Factor: It lets you play around with web-based interactive plots and maps.
Turtle (Graphics & Education)
Why it's fun: It's a great way to learn basic programming and visualize logic in an interactive and creative
way.
Fun Factor: Drawing shapes, patterns, and even little animations with code gets the creative juices
flowing.
Three.js (JavaScript - 3D Graphics)
Why it's fun: If you’re into creating 3D graphics on the web, Three.js makes it easy to experiment with 3D
scenes, lighting, and animations.
Q6) What tools do you use to help you code productively?
For example, what editor/IDE do you prefer?
Testing framework? Linter? Command line utilities?
Other tools you like or recommend to friends?
Is there something you've discovered recently but haven’t
had time to learn yet?
Visual Studio Code (VS Code):
It's lightweight, fast, and highly customizable. It has fantastic plugin support, and with extensions, it turns
into a powerhouse for almost any language, including Python, JavaScript, and more.
Key Features:
IntelliSense (auto-completion)
Integrated Git support
Extensions for linting, formatting, Docker, Jupyter Notebooks, etc.
Split view and custom themes.
Testing Frameworks
pytest:
pytest is simple to use but extremely powerful. It has great assertion introspection, and the fixtures
feature allows for flexible setup and teardown code. It integrates well with other tools and provides clear,
helpful error messages.
Key Features:
Supports fixtures, parameterization, and plugins.
Simple syntax for writing tests
Great output formatting and reports
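To illustrate the fixtures and parameterization mentioned above, here is a minimal sketch of a pytest test module. The function `last_digit` and the fixture `sample_numbers` are hypothetical names chosen for the example, and the file would be run with the `pytest` command.

```python
import pytest


def last_digit(n):
    # Last decimal digit of a nonnegative integer
    return n % 10


@pytest.fixture
def sample_numbers():
    # Shared test data, created fresh for each test that requests it
    return [49, 10, 20, 5, 30, 785]


def test_last_digits_of_sample(sample_numbers):
    # pytest injects the fixture by matching the argument name
    assert [last_digit(n) for n in sample_numbers] == [9, 0, 0, 5, 0, 5]


@pytest.mark.parametrize("number, expected", [(49, 9), (10, 0), (785, 5)])
def test_last_digit(number, expected):
    # Runs once per (number, expected) pair
    assert last_digit(number) == expected
```

The tests use plain `assert` statements. When one fails, pytest reports the values on both sides of the comparison.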
Linters & Formatters
flake8:
flake8 is a Python linter that checks code against style rules (PEP 8, via pycodestyle) and flags likely
bugs such as unused imports and undefined names (via pyflakes). It works well as a pre-commit hook and
integrates with VS Code.
Key Features:
Enforces code style and quality
Can be extended with plugins (for example, complexity or docstring checks)
Version Control & Collaboration
Git:
Git is the de facto standard for version control. It's indispensable for tracking changes, working in teams, and
rolling back when something breaks.
Key Features:
Powerful branching and merging
GitHub/Bitbucket/GitLab for hosting and collaboration