www.acharya.ac.in
PYTHON PROGRAMMING
UNIT-IV
What is Data Visualization?
• Data Visualization is the graphical representation
of data using visual elements like charts, graphs,
and maps. It helps to:
• Understand data trends and patterns
• Communicate insights clearly
• Identify outliers or anomalies
• Make decisions based on visual evidence
Why is it important?
Benefit | How it helps
Quick understanding | Visuals are easier to grasp than raw numbers
Pattern recognition | Spot trends, spikes, dips, and correlations
Better storytelling | Turns complex data into meaningful stories
Faster decision-making | Visual insights can drive action sooner
Types of Data Visualizations
Type | Best For
Bar Chart | Comparing categories
Line Chart | Showing trends over time
Pie Chart | Showing proportions or percentages
Histogram | Showing distribution of a single variable
Scatter Plot | Showing relationships/correlations
Box Plot | Displaying spread and outliers
Heatmap | Visualizing density or matrix data
Map Plot | Displaying geographical data
Tools for Data Visualization
•Matplotlib: Low-level, highly customizable.
•Seaborn: Built on Matplotlib, with beautiful defaults.
•Plotly: Interactive visualizations.
•Pandas (plot): Quick plotting directly from
DataFrames.
•Altair: Declarative and concise syntax for interactive
plots.
•Bokeh: Great for interactive web-based plots.
Example in Python (using
Matplotlib & Seaborn)
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# Sample data
data = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar', 'Apr'],
    'Sales': [2500, 2700, 3000, 2800]
})
# Bar Plot
plt.figure(figsize=(8, 4))
sns.barplot(x='Month', y='Sales', data=data)
plt.title("Monthly Sales")
plt.show()
Generating Data-Installing
matplotlib
•Install matplotlib
•Generate sample data
•Create basic visualizations
• ✅ Step 1: Installing matplotlib
pip install matplotlib
Step 2: Generate Sample Data
You can create data manually, generate it with Python's built-in random module (as in the code that follows), or use libraries like numpy or pandas.
import matplotlib.pyplot as plt
import random
# Generate random data
x = list(range(1, 11))
y = [random.randint(10, 100) for _ in x]
# Plot the data
plt.plot(x, y, marker='o')
plt.title("Random Data Line Chart")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.grid(True)
plt.show()
Step 3: Optional - Using numpy or pandas
import numpy as np
x = np.linspace(0, 10, 50)
y = np.sin(x)
plt.plot(x, y)
plt.title("Sine Wave")
plt.show()
Bar Chart Example
categories = ['A', 'B', 'C', 'D']
values = [random.randint(20, 100) for _ in categories]
plt.bar(categories, values, color='skyblue')
plt.title("Random Bar Chart")
plt.show()
Plotting a simple line graph
import matplotlib.pyplot as plt
# Sample data for the line graph
x = [1, 2, 3, 4, 5]
y = [10, 15, 20, 25, 30]
# Create the line plot
plt.plot(x, y, marker='o', linestyle='-', color='blue')
# Add labels and title
plt.title("Simple Line Graph")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
# Optional: add a grid
plt.grid(True)
# Show the plot
plt.show()
•plt.plot(x, y): Draws the line.
•marker='o': Adds circle markers to each point.
•linestyle='-': Connects points with lines.
•color='blue': Sets the line color.
•plt.show(): Displays the graph.
Random Walks
• In a random walk, you start at an initial point and
take steps in random directions. The "walk" can be
in 1D, 2D, or even 3D, and the steps can be in any
direction, with equal probability.
import matplotlib.pyplot as plt
import random
# Number of steps in the random walk
num_steps = 100
# Start at position 0
position = 0
walk = [position]
# Perform the random walk
for _ in range(num_steps):
    step = random.choice([-1, 1])  # Move left (-1) or right (+1)
    position += step
    walk.append(position)
# Plot the random walk
plt.plot(walk, marker='o', linestyle='-', color='blue')
plt.title("1D Random Walk")
plt.xlabel("Step Number")
plt.ylabel("Position")
plt.grid(True)
plt.show()
How the Code Works:
1.Start at position 0.
2.For each step, randomly choose either -1 (left) or
+1 (right) and update the position.
3.Store the position at each step in the walk list.
4.Use matplotlib to plot the path of the random
walk.
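The same idea extends to two dimensions. The sketch below is one possible version, assuming each step moves exactly one unit up, down, left, or right with equal probability; the function names random_walk_2d and plot_walk and the seed parameter are illustrative choices, not part of the original example.

```python
import random


def random_walk_2d(num_steps, seed=None):
    """Return the x and y coordinates visited by a 2D random walk."""
    rng = random.Random(seed)  # a seed makes the walk reproducible
    x, y = 0, 0
    xs, ys = [x], [y]
    # Four unit moves, each chosen with equal probability
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(num_steps):
        dx, dy = rng.choice(moves)
        x += dx
        y += dy
        xs.append(x)
        ys.append(y)
    return xs, ys


def plot_walk(xs, ys):
    """Draw the path with matplotlib (imported here so generating the walk needs no plotting library)."""
    import matplotlib.pyplot as plt

    plt.plot(xs, ys, linewidth=1)
    # Mark the start (green) and the end (red) of the walk
    plt.scatter([xs[0], xs[-1]], [ys[0], ys[-1]], c=['green', 'red'])
    plt.title("2D Random Walk")
    plt.xlabel("X position")
    plt.ylabel("Y position")
    plt.show()


xs, ys = random_walk_2d(500, seed=42)
print(f"Final position after {len(xs) - 1} steps: ({xs[-1]}, {ys[-1]})")
```

Call plot_walk(xs, ys) to see the path. Keeping the walk generation separate from the plotting makes it easy to reuse the same data with a different charting library.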
Rolling Dice with Plotly
import plotly.graph_objects as go
import random
# Number of dice rolls
num_rolls = 1000
# Simulate rolling the die
rolls = [random.randint(1, 6) for _ in range(num_rolls)]
# Count the frequency of each die face (1 to 6)
frequency = [rolls.count(i) for i in range(1, 7)]
# Create a bar chart using Plotly
fig = go.Figure(data=[go.Bar(x=[1, 2, 3, 4, 5, 6], y=frequency)])
# Add titles and labels
fig.update_layout(
    title="Dice Rolls Distribution",
    xaxis_title="Dice Face",
    yaxis_title="Frequency",
    template="plotly_dark"
)
# Show the interactive plot
fig.show()
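The frequency list above calls rolls.count once per face, which scans the whole list six times. As an alternative sketch (not the method used in the example above), collections.Counter from the standard library tallies all faces in a single pass. The variable names mirror the example; the percentage printout is an added illustration.

```python
import random
from collections import Counter

num_rolls = 1000
rolls = [random.randint(1, 6) for _ in range(num_rolls)]

# Counter tallies every face in a single pass over the rolls
counts = Counter(rolls)

faces = list(range(1, 7))
frequency = [counts[face] for face in faces]  # a face that never appeared counts as 0

for face, count in zip(faces, frequency):
    print(f"Face {face}: {count} rolls ({count / num_rolls:.1%})")
```

The resulting faces and frequency lists can be passed to go.Bar as x and y in the same way as before.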
Downloading data in CSV format
• Downloading data in CSV format and working with
it in Python is a common task, especially for data
analysis. You can download a CSV file from the
web or work with local CSV files that you already
have.
• Let’s walk through a few key steps:
• Downloading a CSV File from the Web
• Reading and Writing CSV Files in Python
1. Downloading a CSV File from the Web
(Using requests)
If you have a link to a CSV file, you can download it
using the third-party requests library. Here's an
example:
Install requests (if not already installed):
pip install requests
Downloading a CSV File –
import requests
# URL of the CSV file you want to download
url = "https://example.com/path/to/data.csv"
# Send GET request to download the file
response = requests.get(url)
response.raise_for_status()  # Stop here if the server returned an error status
# Save the content to a local file
with open("downloaded_data.csv", "wb") as file:
    file.write(response.content)
print("CSV file downloaded successfully!")
2. Reading and Writing CSV Files in Python
Reading a CSV File (Using pandas)
The most common and efficient way to read CSV files
is by using the pandas library. Here’s how:
Install pandas (if not already installed):
pip install pandas
import pandas as pd
# Read the CSV file into a DataFrame
df = pd.read_csv("downloaded_data.csv")
# Show the first few rows of the dataset
print(df.head())
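The heading also mentions writing CSV files. As a minimal sketch, the standard-library csv module can both write and read a file without installing anything. The sample rows, the file name sales.csv, and the use of a temporary folder are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample rows, reusing the monthly sales idea from earlier
rows = [
    {"Month": "Jan", "Sales": 2500},
    {"Month": "Feb", "Sales": 2700},
    {"Month": "Mar", "Sales": 3000},
]

# Write into a temporary folder so no existing file is overwritten
csv_path = os.path.join(tempfile.mkdtemp(), "sales.csv")

# The csv module documentation recommends newline="" when opening files
with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["Month", "Sales"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back; every value comes back as a string
with open(csv_path, newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))

total_sales = sum(int(row["Sales"]) for row in loaded)
print(loaded[0])
print("Total sales:", total_sales)
```

With pandas, the equivalent write is df.to_csv("file.csv", index=False), where index=False leaves out the row-number column.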
Mapping global data sets-JSON
format
Mapping global datasets in JSON format is a
powerful way to visualize and analyze geographical
data, especially when dealing with regions, countries,
or cities. We can use libraries like geopandas or
folium to plot such data on interactive maps.
Here's a general guide to working with JSON-based
global datasets and visualizing them on maps.
• Step 1: Understanding JSON in Geospatial Data
• In a geospatial context, GeoJSON is a common
format for encoding a variety of geographic data
structures. It can contain:
• Point (a single position, written as [longitude, latitude])
• LineString (a connected series of points)
• Polygon (an area enclosed by a closed ring of points)
• An example of GeoJSON for a country might look
like this:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "name": "USA",
        "population": 331002651
      },
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-125.0, 24.0], [-125.0, 49.0], [-66.0, 49.0], [-66.0, 24.0], [-125.0, 24.0]]]
      }
    }
  ]
}
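Because GeoJSON is ordinary JSON, Python's built-in json module can read it. The sketch below parses the example above from a string and walks through its features. Storing the text in a variable named geojson_text is an assumption for illustration; in practice the data would come from a file or a URL.

```python
import json

# The same GeoJSON as in the example, stored as a string for illustration
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "USA", "population": 331002651},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[-125.0, 24.0], [-125.0, 49.0], [-66.0, 49.0], [-66.0, 24.0], [-125.0, 24.0]]]
      }
    }
  ]
}
"""

geo = json.loads(geojson_text)

for feature in geo["features"]:
    props = feature["properties"]
    geometry = feature["geometry"]
    outer_ring = geometry["coordinates"][0]  # the first ring of a Polygon is its outer boundary
    # GeoJSON positions are stored as [longitude, latitude]
    lons = [point[0] for point in outer_ring]
    lats = [point[1] for point in outer_ring]
    print(props["name"], props["population"], geometry["type"])
    print("Longitude range:", min(lons), "to", max(lons))
    print("Latitude range:", min(lats), "to", max(lats))
```

Note the coordinate order: longitude comes first, which is the reverse of the [latitude, longitude] order that folium.Map expects for its location argument.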
• Step 2: Downloading a Global Dataset in
GeoJSON Format
• You can find global datasets in GeoJSON format
from sources like:
• Natural Earth: Provides free vector and raster maps
of countries, cities, regions, and physical features.
• GeoJSON.xyz: A collection of open datasets in
GeoJSON format.
• OpenStreetMap: A collaborative mapping project
that provides geographic data.
Step 3: Visualizing Global Data Using folium
(Interactive Maps)
We’ll use the folium library to visualize the GeoJSON
dataset. If you don’t have folium, install it first:
pip install folium
import folium
import requests
# URL of a GeoJSON file (Global Countries GeoJSON)
url = "https://raw.githubusercontent.com/codeforamerica/click_that_hood/master/public/data/world-countries.geojson"
# Load the GeoJSON data from the URL
response = requests.get(url)
geo_data = response.json()
# Create a base map centered around the world
m = folium.Map(location=[20, 0], zoom_start=2)
# Add the GeoJSON data to the map
folium.GeoJson(geo_data).add_to(m)
# Save the map to an HTML file
m.save("global_map.html")
•Request GeoJSON data: We download a GeoJSON file from a URL containing
country boundaries.
•Create a Map: folium.Map creates an interactive map centered on the world.
•Add GeoJSON data: folium.GeoJson adds the geographic data (countries) to the
map.
•Save the Map: The map is saved as global_map.html, which you can open in a
web browser to interact with the map.
• Step 4: Working with Data in the JSON
(GeoJSON) File
• You can also access specific properties in the
GeoJSON, such as population, region, or name,
and display them interactively on the map.
import folium
import requests
# Load the GeoJSON file
url = "https://raw.githubusercontent.com/codeforamerica/click_that_hood/master/public/data/world-countries.geojson"
response = requests.get(url)
geo_data = response.json()
# Create a map centered on the world
m = folium.Map(location=[20, 0], zoom_start=2)
# Add GeoJSON to the map with tooltips showing country names
folium.GeoJson(
    geo_data,
    tooltip=folium.features.GeoJsonTooltip(fields=["name"], aliases=["Country:"])
).add_to(m)
# Save to HTML file
m.save("global_map_with_tooltips.html")
Example: Choropleth Map (Visualizing Data by Country):-
import folium
import requests
# Download GeoJSON file (World countries)
url = "https://raw.githubusercontent.com/codeforamerica/click_that_hood/master/public/data/world-countries.geojson"
response = requests.get(url)
geo_data = response.json()
# Data for countries, e.g., population
# (each key must match the "name" property of a feature in the GeoJSON)
country_data = {
    "USA": 331002651,
    "India": 1380004385,
    "China": 1393409038,
    "Brazil": 212559417,
    # Add more country data...
}
# Create a map
m = folium.Map(location=[20, 0], zoom_start=2)
# Add choropleth to the map
folium.Choropleth(
    geo_data=geo_data,
    data=country_data,
    columns=["Country", "Population"],
    key_on="feature.properties.name",  # Key to match GeoJSON properties
    fill_color="YlOrRd",  # Color scheme
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="Population"
).add_to(m)
# Save map to HTML
m.save("choropleth_map.html")
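A choropleth only colors a country when its key in the data matches the property named in key_on exactly, so a spelling difference such as "USA" versus "United States" leaves that country unshaded. The sketch below is one way to check for mismatches before drawing the map. The helper find_unmatched_keys and the tiny sample GeoJSON are hypothetical and exist only for illustration; the names used in a real file may differ.

```python
def find_unmatched_keys(geo_data, values_by_name, property_name="name"):
    """Return the keys of values_by_name that match no feature in geo_data."""
    names_in_geojson = {
        feature["properties"].get(property_name)
        for feature in geo_data["features"]
    }
    return sorted(key for key in values_by_name if key not in names_in_geojson)


# Tiny hypothetical GeoJSON with geometry left empty; real files carry full shapes
sample_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "India"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Brazil"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "United States"}, "geometry": None},
    ],
}

sample_data = {"USA": 331002651, "India": 1380004385, "Brazil": 212559417}

unmatched = find_unmatched_keys(sample_geo, sample_data)
print("Keys with no matching feature:", unmatched)  # prints ['USA'] for this sample
```

Running the same check against the downloaded geo_data and country_data shows which keys need renaming before the map is built.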
What is an API?
• An API (Application Programming Interface) lets
programs talk to each other. A web API lets you request
and exchange data with a remote server over HTTP. Most
web APIs return data in JSON format.
• For example, you might use an API to:
• Get weather data from OpenWeather
• Fetch COVID-19 statistics
• Pull stock market prices
• Get country or city info from a Geo API
•Python Libraries Used
•You'll mostly use:
•requests – to make HTTP calls
•json – to parse JSON data
Install requests if not already installed:
pip install requests
import requests
# Example API endpoint (JSON Placeholder)
url = "https://jsonplaceholder.typicode.com/posts/1"
# Send GET request
response = requests.get(url)
# Parse response JSON
data = response.json()
# Print the result
print(data)
Output :-
{
  "userId": 1,
  "id": 1,
  "title": "sunt aut facere repellat provident occaecati",
  "body": "quia et suscipit suscipit recusandae consequuntur"
}
Making Requests with Parameters
Many APIs require query parameters (e.g., ?q=London&units=metric). With requests, pass them as a dictionary through the params argument.
params = {
    'q': 'London',
    'appid': 'YOUR_API_KEY',
    'units': 'metric'
}
url = 'http://api.openweathermap.org/data/2.5/weather'
response = requests.get(url, params=params)
data = response.json()
print(data)
Working with the JSON Response
# Extract specific values
city = data['name']
temperature = data['main']['temp']
description = data['weather'][0]['description']
print(f"{city}: {temperature}°C, {description}")
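Direct indexing such as data['main']['temp'] raises a KeyError when a key is missing, which can happen if the API returns an error message instead of weather data. The sketch below shows a more forgiving way to read the same fields. The sample dictionary is hypothetical and only mimics the keys used above, and summarize_weather is an illustrative helper name.

```python
# Hypothetical sample shaped like the fields used above; a real response has more keys
sample = {
    "name": "London",
    "main": {"temp": 18.5, "humidity": 72},
    "weather": [{"main": "Clouds", "description": "overcast clouds"}],
}


def summarize_weather(data):
    """Build a one-line summary, tolerating missing keys."""
    city = data.get("name", "Unknown city")
    temperature = data.get("main", {}).get("temp")
    weather_list = data.get("weather") or [{}]  # guard against a missing or empty list
    description = weather_list[0].get("description", "no description")
    if temperature is None:
        return f"{city}: temperature unavailable, {description}"
    return f"{city}: {temperature}°C, {description}"


print(summarize_weather(sample))               # London: 18.5°C, overcast clouds
print(summarize_weather({"name": "Nowhere"}))  # Nowhere: temperature unavailable, no description
```

Using dict.get with a default keeps the program running on incomplete data, at the cost of hiding the problem; for debugging, printing the raw response is often the quicker route.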
Using APIs that Need Authentication
headers = {
    "Authorization": "Bearer YOUR_API_KEY"
}
response = requests.get("https://api.example.com/data", headers=headers)
Using a Web API in Python
• Step 1: Install requests (if you haven’t already)
• pip install requests
• Step 2: Choose a Public API
• We'll use this free API:
REST Countries API
📎 URL:
https://restcountries.com/v3.1/name/{country}
• This API returns data about a country when
you provide its name.
Python Code to Use the API:-
import requests
# Ask user for a country name
country = input("Enter a country name: ")
# API endpoint with the country name
url = f"https://restcountries.com/v3.1/name/{country}"
# Send a GET request to the API
response = requests.get(url)
# Check if the request was successful
if response.status_code == 200:
    data = response.json()[0]  # Get the first match
    # Extract relevant information
    name = data["name"]["common"]
    capital = data.get("capital", ["N/A"])[0]
    region = data.get("region", "N/A")
    population = data.get("population", "N/A")
    area = data.get("area", "N/A")
    # Print the data
    print(f"\nCountry: {name}")
    print(f"Capital: {capital}")
    print(f"Region: {region}")
    print(f"Population: {population}")
    print(f"Area: {area} sq km")
else:
    print("Country not found or error retrieving data.")
Sample Output:-
Enter a country name: Japan
Country: Japan
Capital: Tokyo
Region: Asia
Population: 125836021
Area: 377930 sq km
Visualizing GitHub Repository
Data with Plotly
•Use the GitHub API to fetch data
•Extract useful information
•Create visualizations using Plotly
Fetch Repository Data Using GitHub API
import requests
import plotly.graph_objects as go
# Replace with any GitHub username
username = "microsoft"
# GitHub API endpoint
url = f"https://api.github.com/users/{username}/repos"
# Send GET request
response = requests.get(url)
repos = response.json()
# Extract data
repo_names = []
stars = []
forks = []
open_issues = []
for repo in repos:
    repo_names.append(repo['name'])
    stars.append(repo['stargazers_count'])
    forks.append(repo['forks_count'])
    open_issues.append(repo['open_issues_count'])
# Create a bar chart using Plotly
fig = go.Figure()
fig.add_trace(go.Bar(x=repo_names, y=stars, name='Stars'))
fig.add_trace(go.Bar(x=repo_names, y=forks, name='Forks'))
fig.add_trace(go.Bar(x=repo_names, y=open_issues, name='Open Issues'))
# Customize layout
fig.update_layout(
    title=f"GitHub Repo Activity for '{username}'",
    xaxis_title="Repository",
    yaxis_title="Count",
    barmode='group',
    template='plotly_dark',
    xaxis_tickangle=-45
)
# Show the chart
fig.show()
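The GitHub endpoint returns up to 30 repositories per request by default, and plotting all of them can crowd the x-axis. One option is to keep only the most-starred repositories before building the chart. The sketch below assumes that approach; the sample records are made up and the helper top_repos_by_stars is a hypothetical name, but the dictionary keys are the ones used in the code above.

```python
# Hypothetical sample records using the same keys as the GitHub response above
sample_repos = [
    {"name": "alpha", "stargazers_count": 120, "forks_count": 30, "open_issues_count": 4},
    {"name": "beta", "stargazers_count": 950, "forks_count": 210, "open_issues_count": 17},
    {"name": "gamma", "stargazers_count": 15, "forks_count": 2, "open_issues_count": 0},
    {"name": "delta", "stargazers_count": 480, "forks_count": 75, "open_issues_count": 9},
]


def top_repos_by_stars(repos, limit=10):
    """Return up to `limit` repositories, most-starred first."""
    ranked = sorted(repos, key=lambda repo: repo["stargazers_count"], reverse=True)
    return ranked[:limit]


top = top_repos_by_stars(sample_repos, limit=3)
repo_names = [repo["name"] for repo in top]
stars = [repo["stargazers_count"] for repo in top]
print(repo_names)  # ['beta', 'delta', 'alpha']
print(stars)       # [950, 480, 120]
```

With real data, pass the repos list from response.json() to top_repos_by_stars and build the name, star, fork, and issue lists from the result.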