Unit 7: Chapter 2
Using Databases and SQL
What is a Database?
A database is an organized collection of data stored and accessed electronically.
It allows easy management, retrieval, and updating of data.
Examples: Relational databases like MySQL, SQLite; NoSQL databases like MongoDB.
Database Concepts
Tables: Data is stored in rows and columns (like a spreadsheet).
Records: A row in a table, representing a single data entry.
Fields: Columns in a table, representing attributes of the data.
Primary Key: Unique identifier for each record in a table.
Foreign Key: A field in one table that links to the primary key in another table.
Database Browser for SQLite
SQLite: A lightweight, embedded database used in many Python applications.
DB Browser for SQLite: A GUI tool for managing SQLite databases without SQL
commands.
o Features: Create databases, design tables, execute SQL queries, export data.
Creating a Database Table in SQLite:
import sqlite3
# Connect to (or create) a database
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
# Create a table
cursor.execute('''
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY,
name TEXT,
email TEXT UNIQUE
)
''')
conn.commit()
conn.close()
Structured Query Language (SQL) Summary
SELECT: Retrieve data.
INSERT: Add new data.
UPDATE: Modify existing data.
DELETE: Remove data.
JOIN: Combine data from multiple tables.
WHERE: Filter conditions.
ORDER BY: Sort results.
GROUP BY: Aggregate data.
Spidering Twitter Using a Database
Spidering: Collecting and storing data systematically.
Example: Using tweepy to fetch Twitter data and store it in SQLite.
import sqlite3
import tweepy
# Set up Twitter API (use your credentials)
client = tweepy.Client('BEARER_TOKEN')
# Connect to the database
conn = sqlite3.connect('twitter.db')
cursor = conn.cursor()
# Create a tweets table
cursor.execute('''
CREATE TABLE IF NOT EXISTS tweets (
id TEXT PRIMARY KEY,
text TEXT
)
''')
# Fetch tweets and store them
tweets = client.search_recent_tweets(query="Python", max_results=10)
for tweet in tweets.data:
cursor.execute('INSERT OR IGNORE INTO tweets (id, text) VALUES (?, ?)', (tweet.id,
tweet.text))
conn.commit()
conn.close()
Basic Data Modelling
Entities: Represent real-world objects (e.g., users, tweets).
Attributes: Properties of entities (e.g., name, email).
Relationships: Links between entities (e.g., a user can post many tweets).
Programming with Multiple Tables
Example: Users and posts in a social media app.
# Create tables
cursor.execute('''
CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY,
name TEXT
)
''')
cursor.execute('''
CREATE TABLE IF NOT EXISTS posts (
id INTEGER PRIMARY KEY,
user_id INTEGER,
content TEXT,
FOREIGN KEY (user_id) REFERENCES users (id)
)
''')
Constraints in Database Tables
PRIMARY KEY: Ensures unique identification.
FOREIGN KEY: Ensures relationships between tables.
NOT NULL: Prevents null values.
UNIQUE: Ensures unique values in a column.
CHECK: Ensures a condition is met.
Retrieve and/or Insert a Record:
# Insert a record
cursor.execute('INSERT INTO users (name) VALUES (?)', ('Alice',))
# Retrieve a record
cursor.execute('SELECT * FROM users WHERE name = ?', ('Alice',))
print(cursor.fetchone())
Storing the Friend Relationship
Represented by a junction table.
cursor.execute('''
CREATE TABLE IF NOT EXISTS friends (
user_id INTEGER,
friend_id INTEGER,
PRIMARY KEY (user_id, friend_id),
FOREIGN KEY (user_id) REFERENCES users (id),
FOREIGN KEY (friend_id) REFERENCES users (id)
)
''')
Three Kinds of Keys
1. Primary Key: Unique identifier for records.
2. Foreign Key: Links between tables.
3. Composite Key: Combination of fields acting as a unique identifier.
Using JOIN to Retrieve Data
cursor.execute('''
SELECT users.name, posts.content
FROM users
JOIN posts ON users.id = posts.user_id
WHERE users.name = ?
''', ('Alice',))
print(cursor.fetchall())