AI Milestones & Introduction to Generative AI

Learning Outcomes
• Understand Generative AI Fundamentals
• Hands-on Experience with IBM watsonx.ai
• Prompt Engineering Techniques
• Work with Foundation Models
• Develop Retrieval-Augmented Generation (RAG) Pipelines
• Build and Deploy AI-Powered Chatbots
• Industry Use Cases
By the End of the Course, You Will Be Able To:
1. Build GenAI Solutions
2. Engineer Prompts
3. Apply Foundation Models
4. Implement RAG
5. Create Chatbots
What is Artificial
Intelligence (AI)?
• The field of computer science
focused on creating intelligent
machines capable of performing
tasks that typically require human
intelligence.
Key Areas:
• Learning
• Reasoning
• Problem-solving
• Perception
• Language understanding
Current Applications:
• Voice assistants
• Recommendation engines
• Facial recognition
• Autonomous vehicles
• Robotics
What is Generative AI?
A class of AI models that
generate new content
(text, images, code, etc.)
Difference from
Traditional AI
• Traditional AI:
Predictive or
classification-based
• Generative AI:
Creative, synthesizing
outputs
Real-World
Examples
• Text: ChatGPT,
Google Gemini,
Claude
• Images: DALL·E,
Midjourney
• Music: Suno, Aiva
• Code: GitHub
Copilot
1950 – Alan Turing
& The Turing Test
Milestone: Alan Turing’s paper “Computing
Machinery and Intelligence”
Key Question: "Can machines think?"
The Turing Test:
• A human interacts with both a machine and
a human via text.
• If the human can't distinguish which is
which, the machine is said to have passed
the test.
Legacy: Still a foundational concept in AI
ethics and benchmarks.
1956 – Birth of
the Term ‘Artificial
Intelligence’
Milestone: John McCarthy coins
the term “Artificial Intelligence”
Event: Dartmouth Summer
Research Project on Artificial
Intelligence
Significance:
• Marked the official launch
of AI as a field of study.
• Led to decades of
exploration and funding in
AI research.
1958–1980s –
Neural Networks
& Perceptrons
1958: Frank Rosenblatt develops the Mark I Perceptron
• Early neural network capable of learning
through trial and error.
1969: Marvin Minsky & Seymour Papert publish
“Perceptrons”
• Highlighted limitations of single-layer
perceptrons.
• Temporarily led to a decline in neural network
research (AI Winter).
1980s: Revival with multi-layer neural networks
and backpropagation.
1997 – Deep
Blue vs Garry
Kasparov
Achievement: IBM’s Deep Blue
defeats world chess champion
Garry Kasparov.
Importance:
• First time a computer beat a
reigning world champion in
chess.
• Marked AI’s ability to perform
complex strategic planning.
2011 – IBM
Watson Wins
Jeopardy!
Event: IBM Watson defeats
Jeopardy! champions Ken
Jennings and Brad Rutter.
Technologies Used:
• Natural language
processing (NLP)
• Information retrieval
• Machine learning
Impact: Sparked interest in
AI for healthcare, customer
service, etc.
2015 – Baidu’s Minwa
& Image Recognition
Tool: Minwa supercomputer by
Baidu
Technology: Convolutional Neural
Networks (CNNs)
Achievement: Classified images
with higher accuracy than humans.
Use Case: Set new benchmarks in
computer vision.
2016 – AlphaGo
Defeats Lee Sedol
Milestone: DeepMind’s
AlphaGo beats world
champion Go player Lee
Sedol.
Challenge: Go has more possible board configurations than atoms in the observable universe.
Technology: Deep
reinforcement learning +
neural networks
Significance: Demonstrated
AI mastery in intuition-based
strategy games.
2022 – Release of ChatGPT
Developer: OpenAI
Model: GPT-3.5 → upgraded to GPT-4
Breakthrough:
• Human-like conversation
• Creative writing
• Code generation
• Language translation
Widespread Impact: Education,
business, content creation,
customer support.
How ChatGPT
Works
Underlying Model:
Large Language Models
(LLMs)
Training Data Includes:
• Wikipedia
• Books
• Web articles
• Code repositories
• Reddit (links with 3+ karma)
Capabilities:
• Rewriting
• Summarizing
• Reorganizing
• Style/Language adaptation
Understanding AI,
Machine Learning, and
Deep Learning
Overview
Objective: Clarify the
differences and
relationships between
AI, ML, and DL.
Why it matters:
These terms are often
confused, but each
represents a different
scope and level of
complexity.
Artificial Intelligence (AI) – The
Big Picture
AI is the broad science of mimicking human abilities.
Goal: Build machines that simulate human intelligence to perform cognitive functions like:
• Learning
• Reasoning
• Problem-solving
• Decision-making
Examples:
• Chatbots
• Autonomous cars
• Robotics
• Virtual assistants (e.g., Siri, Alexa)
Machine Learning
(ML) – A Subset
of AI
ML enables machines to learn
from data without explicit
programming.
Key Characteristic:
Improves automatically with
experience.
Techniques Used:
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Real-World Examples:
• Fraud detection
• Product recommendations
• Email spam filtering
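
To make the supervised-learning idea concrete, here is a minimal sketch in Python using scikit-learn. The feature values and labels are invented for illustration, not taken from a real spam dataset.

```python
# Minimal supervised-learning sketch: classify emails as spam or not spam.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features per email: [number of links, count of the word "free"]
X = [[0, 0], [1, 0], [8, 5], [7, 3], [0, 1], [9, 6]]
y = [0, 0, 1, 1, 0, 1]  # labels: 0 = not spam, 1 = spam

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)          # learns from labeled data
print(model.predict(X_test))         # predicts labels for unseen emails
print(model.score(X_test, y_test))   # accuracy on held-out data
```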
Deep Learning
(DL) – A
Subset of ML
Deep Learning is an
advanced ML technique using
artificial neural networks with
many layers.
Keyword: “Deep” = Multiple
hidden layers
Best For:
• Image recognition
• Speech-to-text
• Natural language
understanding
Example Applications:
• Face recognition in phones
• Automatic language translation (e.g., Google Translate)
• Self-driving car object detection
Visualizing the Relationship
(Nested circles: DL sits inside ML, which sits inside AI)
How Machine
Learning Works
Step-by-Step Process:
• Data Collection
• Data Preprocessing
• Model Training
• Model Evaluation
• Prediction &
Improvement
Goal: Minimize error or
maximize prediction accuracy
Example: Predicting house
prices based on square
footage, location, etc.
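
A minimal sketch of this workflow in Python with scikit-learn; the square-footage and price figures below are made up for illustration.

```python
# Predict house price from square footage, following the steps above.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# 1. Data Collection (toy data)
sqft = np.array([[800], [1200], [1500], [2000], [2500], [3000]])
price = np.array([150_000, 210_000, 260_000, 330_000, 400_000, 470_000])

# 2./3. Data Preprocessing + Model Training
X_train, X_test, y_train, y_test = train_test_split(sqft, price, random_state=0)
model = LinearRegression().fit(X_train, y_train)

# 4. Model Evaluation (goal: minimize error)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))

# 5. Prediction & Improvement
print("Predicted price for 1800 sqft:", model.predict([[1800]])[0])
```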
Deep Learning
in More Detail
Architecture: Artificial Neural
Networks (ANNs)
• Input Layer: Raw data
(e.g., image pixels)
• Hidden Layers: Feature
extraction through weighted
operations
• Output Layer: Final
prediction/classification
Advantages:
• Automatic feature extraction
• Handles unstructured data
(images, audio, text)
Disadvantages:
• Requires large datasets
• High computational cost
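
The three layer roles above can be shown in a few lines of Python; the sizes (4 inputs, 8 hidden units, 3 outputs) and random weights are arbitrary assumptions for illustration.

```python
# One forward pass through a tiny ANN, using only NumPy.
import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)                            # Input Layer: 4 raw features (e.g., pixel values)
W1, b1 = rng.random((8, 4)), rng.random(8)   # Hidden Layer: weighted operations
W2, b2 = rng.random((3, 8)), rng.random(3)   # Output Layer: 3 classes

h = np.maximum(0, W1 @ x + b1)               # hidden activations (ReLU)
logits = W2 @ h + b2
probs = np.exp(logits) / np.exp(logits).sum()  # softmax -> class probabilities
print(probs)                                 # final prediction/classification
```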
Why Deep
Learning Took Off
After 2010
Earlier Challenges:
• Limited labeled data
• High computational
demands
Breakthrough Factors:
• Big Data availability
• More powerful GPUs
• Algorithmic improvements
(e.g., ReLU, Dropout, Adam
optimizer)
Result: DL became viable for
business, healthcare, security, and
more.
AI vs ML vs DL – Comparison Table

Feature | AI | ML | DL
Definition | Simulates human intelligence | Learns from data to make predictions | Uses deep neural networks for learning
Scope | Broad | Narrower (subset of AI) | Narrowest (subset of ML)
Learning Approach | May use rules or logic | Statistical models | Neural networks with multiple layers
Data Requirements | Moderate | Moderate to high | Very high
Example | Smart assistants | Email filtering | Self-driving cars, facial recognition
Real-World
Examples for
Each
AI:
• Robotics in manufacturing
• Smart thermostats
ML:
• Netflix movie
recommendations
• Stock market predictions
DL:
• Tesla’s autonomous driving
• Deepfake videos
• Voice assistants
understanding natural
language
Common
Misconceptions
• AI = ML = DL ❌
• All AI involves learning ❌ (e.g.,
rule-based systems are AI but
not ML)
• DL is always better than ML ❌
(DL is data-hungry and costly)
Why This Matters
for Businesses
Strategic Planning: Knowing
the right technique saves time
and cost
AI Readiness: Assess
infrastructure and data before
implementing DL
Decision Making: Choose AI
vs ML vs DL based on:
• Data volume
• Problem complexity
• Available compute
resources
Summary
• AI is the broadest field.
• ML allows machines to learn
from data.
• DL uses deep neural
networks to solve highly
complex problems.
• All three enable pattern
recognition and predictive
analytics at scale.
Rise of
Generative AI
• A subset of AI that
generates text, images,
music, or code based on
patterns learned from
training data.
Popular Tools:
• ChatGPT
• IBM watsonx
• DALL·E / Stable
Diffusion
• GitHub Copilot
• MusicLM / Jukebox
Use Cases:
• Content generation
• Design automation
• Personalized education
• Software development
assistance
Benefits of Generative AI
• Accelerates content creation
• Enables personalized experiences
• Supports accessibility (e.g., auto-captioning)
• Assists in research and development
• Enhances productivity across sectors
Challenges & Concerns
• Bias in training data
• Ethical issues around misinformation and plagiarism
• Job displacement in creative fields
• Data privacy concerns
• Security threats (e.g., deepfakes, phishing)
Future of AI
& Society
Trend: Increasing integration into
daily life
Opportunities:
• Healthcare (diagnostics, drug
discovery)
• Education (personal tutors, content
summaries)
• Business (automation, analytics)
Calls for Action:
• Responsible AI development
• Global AI governance frameworks
• Collaboration between tech and
policy leaders
Summary of Key AI Milestones
Year | Milestone
1950 | Turing's "Can machines think?"
1956 | Term "AI" coined by John McCarthy
1958 | Perceptron introduced
1997 | Deep Blue beats Kasparov
2011 | Watson wins Jeopardy!
2015 | Baidu's CNN outperforms humans in image classification
2016 | AlphaGo beats Go world champion
2022 | ChatGPT released by OpenAI
Understanding
the Roots of
Generative AI
• The foundation of
Generative AI—and much of
what is known as
Traditional AI—lies in the
development of Neural
Networks and Deep
Neural Networks. These
architectures enabled
machines to learn patterns,
represent complex data, and
eventually generate new
content, leading to the
powerful systems we see
today.
Neural
Networks
• Mimics the human
brain using layers of
"neurons"
• Processes inputs
through multiple
connected layers
• Learns patterns,
relationships &
performs predictions
Neural
Networks
A Neural Network (NN) is a
computational model inspired by
the way biological neurons in the
brain communicate. Each neuron
(node) processes inputs using
weights and biases, applies an
activation function, and passes the
result forward.
There are 3 main types of layers:
• Input Layer – Receives the raw
data
• Hidden Layers – Perform
computations and
transformations
• Output Layer – Produces the
final prediction
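
A minimal sketch of one such neuron in Python: inputs are weighted, a bias is added, and a sigmoid activation produces the value passed forward. The numbers are illustrative, not learned.

```python
# A single artificial neuron, as described above.
import numpy as np

def neuron(inputs, weights, bias):
    # weighted sum of inputs plus bias, then a sigmoid activation
    z = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative values (assumptions, not trained weights)
print(neuron(np.array([0.5, 0.2]), np.array([0.8, -0.4]), bias=0.1))
```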
Deep Learning
• Subset of Machine Learning
using Neural Networks
• Learns from unstructured
data (images, audio, text)
• Uses multiple hidden layers
for feature extraction
Deep
Learning
Deep Learning is a branch of
Machine Learning that uses
artificial neural networks with
many layers. It enables systems
to learn from large volumes of
unstructured data and extract
meaningful patterns without
manual feature engineering.
Examples include:
• Face Recognition
• Language Translation
• Medical Diagnosis
Recurrent Neural
Networks (RNN)
• Neural networks with
memory
• Designed for sequential
data
• Shares weights across time
steps
Recurrent
Neural
Networks (RNN)
• Recurrent Neural Networks
(RNNs) are specialized for
processing sequential
data such as text, time
series, or speech.
• They maintain a memory
of past inputs using hidden
states, enabling context-
aware predictions.
• However, they suffer from
vanishing gradient
problems when sequences
are long.
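
A minimal sketch of the recurrence in Python: the same weights are reused at every time step, and the hidden state h carries memory of earlier inputs. The sizes and random sequence are assumptions for illustration.

```python
# One vanilla RNN pass over a short sequence, using NumPy.
import numpy as np

rng = np.random.default_rng(1)
W_x = rng.random((5, 3))   # input-to-hidden weights (shared across time steps)
W_h = rng.random((5, 5))   # hidden-to-hidden weights (the "memory" path)
b = np.zeros(5)

h = np.zeros(5)                                # initial hidden state
sequence = [rng.random(3) for _ in range(4)]   # e.g., 4 word embeddings

for x_t in sequence:
    # the hidden state carries context from all previous inputs
    h = np.tanh(W_x @ x_t + W_h @ h + b)

print(h)  # context-aware summary of the whole sequence
```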
Limitations
of RNN
Recurrent Neural Networks
(RNNs)
• Struggle with long-term
dependencies
• Training is computationally
expensive
• Difficult to parallelize due to
sequential nature
Modern Generative AI Architectures
The Evolution
of Generative AI
• Traditional deep neural
networks were used mainly
for classification.
• The introduction of
Autoencoders, VAEs, and
GANs marked the beginning
of modern generative AI.
• These models are capable of
generating new content
— images, text, audio, and
more.
Autoencoder: Basics
• Neural network that
compresses input data
(encoding) and then
reconstructs it (decoding).
• Learns latent features
(essential characteristics) of
the input.
• Uses unsupervised learning.
Architecture:
• Encoder → Compresses
data
• Bottleneck (Code
Layer) → Most compact
representation
• Decoder →
Reconstructs data
Working of Autoencoder
Input passes through shrinking layers in the encoder.
Bottleneck forces the network to learn only key information.
Decoder expands the data back to original dimensions.
The difference between input and output is reconstruction error.
Trained using loss function + backpropagation.
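
A minimal sketch of this in PyTorch (assuming it is installed); the 784 → 64 → 16 layer sizes are an arbitrary choice suited to flattened 28×28 images.

```python
# Encoder compresses, bottleneck keeps only key information, decoder reconstructs.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(),
                                     nn.Linear(64, 16))       # bottleneck (code layer)
        self.decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                                     nn.Linear(64, 784))      # expands back to input size

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(32, 784)                   # e.g., a batch of flattened images
loss = nn.MSELoss()(model(x), x)          # reconstruction error (input vs output)
loss.backward()                           # trained via backpropagation
print(loss.item())
```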
Use Cases of Autoencoders
• Image Compression
• Image Denoising
• Anomaly Detection
• Feature Extraction
• Facial Recognition
• Forms basis for VAEs and AAEs (Adversarial Autoencoders)
Variational
Autoencoder
(VAE)
• Introduced by Kingma &
Welling (2014)
• Learns probabilistic
latent space
• Can generate new data
samples
Uses two vectors:
• μ (mean)
• σ (standard deviation)
• Enables interpolation
and randomness
VAE
Architecture
Encoder: Converts inputs to
latent distributions (μ, σ)
Reparameterization Trick:
• ε = random noise
• z = μ + σ × ε
Decoder: Reconstructs data
from z
Loss function = Reconstruction
Loss + KL Divergence
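
A minimal sketch of the reparameterization trick and the two-part loss in PyTorch. The tensors standing in for the encoder outputs (mu, log_var) and the decoder output (x_hat) are placeholders, not a full VAE.

```python
import torch
import torch.nn.functional as F

# Placeholders standing in for encoder outputs on a batch of 32 inputs
mu = torch.zeros(32, 16)        # μ (mean vector)
log_var = torch.zeros(32, 16)   # log σ² (log-variance vector)

# Reparameterization trick: z = μ + σ × ε, with ε ~ N(0, 1)
eps = torch.randn_like(mu)
z = mu + torch.exp(0.5 * log_var) * eps

# Placeholder input and decoder output, both kept strictly inside (0, 1)
x = torch.rand(32, 784)
x_hat = torch.rand(32, 784).clamp(1e-6, 1 - 1e-6)

recon = F.binary_cross_entropy(x_hat, x, reduction="sum")        # Reconstruction Loss
kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())   # KL Divergence
loss = recon + kl
print(loss.item())
```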
Applications of VAEs
• Data Generation (images, text, audio)
• Data Augmentation
• Anomaly Detection
• Controlled Generation with Conditional VAEs (CVAEs)
Introduction to GANs
Generative Adversarial
Networks (GANs)
Introduced by Ian
Goodfellow (2014)
Generative: Produces
synthetic data
Adversarial: Two networks
compete
• Generator (G): Tries to fool the discriminator
• Discriminator (D):
Tries to detect fakes
GAN
Analogy
• Generator = Counterfeiters
• Discriminator = Police
• Both improve until
counterfeits are
indistinguishable from real
data
• Drives learning in both
models
GAN
Architecture
Input: Random noise vector
Generator: Uses
transposed convolutions
to produce image
Discriminator: Uses CNN
to classify as real or fake
Training:
• Generator learns to
fool the
Discriminator
• Discriminator learns
to detect fakes
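
A minimal sketch of the two networks in PyTorch. For brevity, linear layers stand in for the transposed convolutions and CNN described above; all sizes are illustrative.

```python
import torch
import torch.nn as nn

# Generator: random noise vector -> fake "image" (flattened)
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                  nn.Linear(256, 784), nn.Tanh())

# Discriminator: image -> probability of being real
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

z = torch.randn(16, 100)        # batch of random noise vectors
fake = G(z)                     # Generator produces synthetic samples
print(D(fake).shape)            # Discriminator scores each as real/fake
```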
How GANs
Train
• Both networks use
backpropagation
• Loss functions are adversarial:
• Generator loss: How well
it fools the discriminator
• Discriminator loss:
Accuracy in classifying real
vs fake
• Training is unstable, requires
fine-tuning
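
One adversarial training step, sketched in PyTorch with the same network shapes as the previous sketch. Real GAN training needs the careful tuning noted above; this only shows the two opposing losses.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(16, 784)     # stand-in batch of real images
z = torch.randn(16, 100)       # random noise for the generator

# Discriminator step: push D(real) toward 1 and D(fake) toward 0
d_loss = bce(D(real), torch.ones(16, 1)) + bce(D(G(z).detach()), torch.zeros(16, 1))
opt_D.zero_grad(); d_loss.backward(); opt_D.step()

# Generator step: push D(G(z)) toward 1, i.e. fool the discriminator
g_loss = bce(D(G(z)), torch.ones(16, 1))
opt_G.zero_grad(); g_loss.backward(); opt_G.step()
print(f"D loss: {d_loss.item():.3f}, G loss: {g_loss.item():.3f}")
```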
Applications of GANs
• Image Synthesis
(e.g., DeepFakes,
faces)
• Style Transfer (e.g.,
photo to painting)
• Super Resolution
• Data
Augmentation
• Video Prediction
• 3D Model
Generation
Summary

Architecture | Primary Use | Type | Output | Strength
Autoencoder | Compression/Reconstruction | Unsupervised | Reconstructed Data | Feature Learning
VAE | Generation | Generative | New Samples | Probabilistic Latent Space
GAN | Generation | Generative + Adversarial | Realistic Fakes | High-quality Outputs
Discriminator in GANs
Takes two inputs:
• Real images from the dataset
• Fake images from the generator
• Uses Convolutional Neural Networks
(CNNs) to extract features
• Downsampling layers reduce image
size for classification
• Ends with a fully connected layer +
Sigmoid activation
• Outputs probability of image being
real or fake
• Learns to distinguish between authentic
and synthetic data
GAN
Optimization
• Both Generator and Discriminator
are trained simultaneously
• Each model learns in response to the
other
• Backpropagation used for learning
Objective:
• Generator minimizes chance
of being caught (fooling D)
• Discriminator maximizes
accuracy in classifying
real/fake
GAN
Equilibrium
Equilibrium point:
• Discriminator can’t
distinguish real vs fake (50%
confidence)
• At this stage:
• Generator creates realistic
data
• Discriminator becomes
uncertain
• Training GANs is tricky:
• Too strong D → G can’t learn
• Too strong G → D becomes
useless
• Requires careful learning rate
tuning and model balancing
Practical Applications of GANs
Data Augmentation – Generate new training data
Drug Discovery – Generate molecular structures
Super-Resolution – Upscale low-quality images
Image Inpainting – Fill in missing pixels
Colorization – Turn B/W images to colored
Creative AI – Generate art, music, faces (e.g., DeepFakes)
Seq2Seq Architecture
• Introduced by Google in 2014 for machine translation
• Maps input sequence to output sequence
• Uses an Encoder-Decoder architecture
  • Encoder: Processes input and creates internal states
  • Decoder: Uses internal states to generate output
• Key Feature: Attention Mechanism – Focuses on relevant parts of input when generating output
Seq2Seq
Working
Mechanism
• Encoder reads the
sequence (e.g.,
sentence in English)
• Encoder’s hidden
states are passed to
Decoder
• Decoder uses attention
to generate output
(e.g., in French)
• Decoder ignores
encoder’s outputs
and uses only hidden
states
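
A minimal encoder-decoder sketch in PyTorch matching the description above: only the encoder's hidden state is handed to the decoder, while the encoder's outputs are discarded. The embedding and hidden sizes are assumptions, and attention is omitted for brevity.

```python
import torch
import torch.nn as nn

encoder = nn.GRU(input_size=16, hidden_size=32, batch_first=True)
decoder = nn.GRU(input_size=16, hidden_size=32, batch_first=True)

src = torch.rand(1, 5, 16)       # e.g., 5 English token embeddings
_, hidden = encoder(src)         # encoder outputs are ignored; only the
                                 # hidden state is passed to the decoder

tgt = torch.rand(1, 7, 16)       # e.g., 7 French token embeddings
out, _ = decoder(tgt, hidden)    # decoder generates conditioned on hidden
print(out.shape)                 # (1, 7, 32)
```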
Applications of Seq2Seq Models
• Machine Translation (Google Translate)
• Chatbots
• Text Summarization
• Speech Recognition
• Image Captioning
Transformers – A Revolution
• Proposed in 2017: “Attention Is All
You Need”
• Solves limitations of RNNs &
Seq2Seq
• Uses multi-head self-attention
and feed-forward layers
• Enables parallel processing
(unlike RNNs)
• Learns long-term dependencies
better
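
A minimal NumPy sketch of the scaled dot-product self-attention at the heart of the Transformer (a single head; multi-head attention runs several of these in parallel). The sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
seq_len, d = 4, 8                       # 4 tokens, 8-dim embeddings (assumptions)
X = rng.random((seq_len, d))

W_q, W_k, W_v = (rng.random((d, d)) for _ in range(3))
Q, K, V = X @ W_q, X @ W_k, X @ W_v

scores = Q @ K.T / np.sqrt(d)           # every token attends to every other token
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
output = weights @ V                    # context-mixed representations, computed in parallel

print(output.shape)                     # (4, 8)
```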
Transformer
Architecture
• Composed of:
• Encoder Stack (6 layers)
• Decoder Stack (6 layers)
• Each layer includes:
• Multi-head attention
• Layer normalization
• Feed-forward neural nets
• Each decoder layer attends to the encoder's outputs (encoder-decoder attention)
• Can process full input in parallel
Transformer vs Seq2Seq

Feature | Seq2Seq | Transformer
Architecture | RNN-based | Attention-based
Processing | Sequential | Parallel
Long Dependencies | Poor | Strong
Speed | Slow | Fast
Scalability | Limited | Highly Scalable
Impact of Transformers
• Basis for LLMs (GPT, PaLM, LLaMA, Claude)
• Used in Image Generators (DALL·E, Midjourney)
• Transformers enable:
• Chatbots
• Code generation
• Story and poem writing
• Scientific research assistance
Summary
• Discriminator in GANs acts as
a binary classifier
• GANs require careful
optimization and
equilibrium
• Seq2Seq is powerful for
sequential data
• Attention Mechanism
brought major improvements
• Transformers revolutionized
AI by enabling LLMs
Types of
Generative AI
Main Categories:
• Image Generators
• Generate realistic or stylized
images from text prompts
(e.g., DALL·E, Midjourney).
• Used in design, fashion,
architecture, and medical
imaging.
• Text Generators
• Generate human-like
language for
communication, content
creation, and summarization.
• Examples: ChatGPT, Claude,
Gemini, LLaMA.
Emerging Categories:
• Music Generators – Create melodies, instrumentals, or entire tracks.
• Video Generators – AI-generated animation or realistic-looking videos.
• 3D Model Generators – Used in gaming, product design, AR/VR.
Building Blocks
of an AI Strategy
1. Foundation Models
• Trained on massive, diverse,
unlabeled datasets.
• General-purpose models
adaptable to various tasks.
• Powered by Transformer
architecture.
• Require minimal fine-tuning
for specific use cases.
• Example: GPT, BERT, PaLM.
Large Language Models
(LLMs)
• A type of Foundation Model
specialized for natural language
tasks.
• Understand, generate, translate,
summarize, and answer questions.
• Example: ChatGPT, LLaMA, Claude.
• Foundation Models =
Infrastructure; LLMs = Tools built
on it
Foundation Models vs Traditional Models

Feature | Traditional ML Model | Foundation Model
Training Data | Task-specific | Large, broad, unlabeled
Fine-Tuning | Extensive | Minimal
Transfer Learning | Limited | Strong
Scalability | Narrow scope | Multi-tasking capable
Example | Spam Classifier | GPT, BERT, DALL·E
Applications of Traditional AI
• Predictive Maintenance in Manufacturing
• Fraud Detection in Banking
• Recommendation Systems in E-Commerce
• Disease Diagnosis in Healthcare
• Autonomous Navigation in Robotics and Vehicles
Applications of NLP (Natural Language Processing)
• Sentiment Analysis (e.g., customer reviews)
• Text Summarization (e.g., executive reports)
• Speech Recognition (e.g., voice assistants)
• Language Translation (e.g., Google Translate)
• Text Classification & Extraction
Applications of
Conversational
AI
• Chatbots for Customer
Service
• Available 24x7 to respond to
FAQs
• Voice Assistants
• Alexa, Siri, Google Assistant
• Internal Helpdesk Bots
• HR queries, IT support in
enterprises
• Smart Home Automation
Interfaces
Applications
of Generative
AI
• Text Generation: Blogs, code,
emails, reports.
• Image Generation: Art,
design, branding assets.
• Code Generation: Auto-
suggest functions (e.g., GitHub
Copilot).
• Video Synthesis: Deepfakes,
marketing content.
• Synthetic Data: For training
other AI models.
• Digital Avatars and Virtual
Influencers
Benefits of Generative AI
• Boosts Productivity → Automates repetitive content creation.
• Cost-Efficient → Reduces need for human-only content generation.
• Creativity Enhancement → Assists artists, writers, and musicians with inspiration and ideas.
• Personalization at Scale → Customized emails, ads, and messages for millions.
• Rapid Prototyping → Designers and developers can visualize concepts instantly.
• Data Augmentation → Fills in missing or rare data scenarios for model training.
Challenges in
Generative AI
1. Data Quality & Quantity
• Requires vast amounts of
clean, diverse training data.
• Biased or incomplete data
leads to poor results.
2. Model Alignment
• Ensuring the AI’s outputs
align with human values,
intentions, and context.
• Hard to control creativity
without losing accuracy.
Challenges in Generative AI
3. Computational Cost
• Training large models demands massive GPU/TPU resources.
• High energy consumption and carbon footprint.
4. Lack of True Understanding
• Models mimic language or images statistically.
• They do not "understand" facts or concepts like humans.
Limitations of
Generative AI
1. Hallucinations
• AI may confidently generate incorrect or fictional
information.
e.g., "ChatGPT creates fake quotes or citations."
2. Lack of Contextual Memory
• Short memory span; struggles with long
conversations or context chains.
• LLMs don’t retain memory between sessions
(unless designed to).
3. Domain-Specific Limitations
• May not perform well on niche, scientific, or highly
technical topics without domain fine-tuning.
4. Dependence on Prompt Engineering
• Output quality heavily relies on how well prompts
are crafted.
Risks in
Generative AI
1. Misinformation & Fake Content
• Realistic fake news, deepfakes, and
synthetic media can be misused.
2. Privacy Breaches
• Models can inadvertently memorize
and regurgitate sensitive data from
training datasets.
3. Identity Fraud & Impersonation
• Voice and face generation can lead to
fraud or reputational harm.
Risks in
Generative AI
4. Ethical Concerns &
Bias
• Reinforcement of racial,
gender, or political biases
present in training data.
• Risk of discrimination in
automated decision-
making.
5. Security Threats
• Can be misused for
phishing, social
engineering, and
generating malicious
code or malware.
Mitigation
Strategies
1. Human-in-the-Loop (HITL)
2. Better Training Data Curation
3. Regulatory Compliance
4. Robust Testing & Monitoring
5. Prompt Safety and Guardrails
THANK YOU