Aragog 2025
Spider ML
Intro to ML and CV
Types of AI
Computer Vision?
What is (Computer) Vision?
Human vs Computer Vision
Applications of CV
What do we do in CV?
Techniques in CV
Edge Detection
Shape Detection
Object Detection
Motion Detection
Scene Understanding
Edge detection
Edges are boundaries between different regions
in an image. Detecting them helps computers
understand the shapes, objects, and structure
within an image. It is similar to how we recognize
the outlines of things in a photo: our brains are
wired for it, and edge detection lets machines
do the same.
Sobel Filters
Sobel filters use simple convolution operations to
find edges in two directions:
● Horizontal (X-direction)
● Vertical (Y-direction)
The Sobel operator computes an approximation of
the gradient of the image intensity. The gradient is
large wherever brightness changes sharply, which is
where edges are.
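As a rough illustration of what the Sobel operator does, here is a minimal pure-Python sketch that applies the two standard 3x3 Sobel kernels to a tiny grayscale image stored as a list of rows. The helper names (apply_kernel, sobel_magnitude) and the toy image are made up for this example. In practice you would normally call OpenCV's cv2.Sobel instead of writing the loops yourself.

```python
import math

# Standard 3x3 Sobel kernels.
# SOBEL_X measures intensity change along x (left to right),
# SOBEL_Y measures intensity change along y (top to bottom).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total


def sobel_magnitude(image):
    """Return the gradient magnitude for every interior pixel.

    Border pixels are left at 0 because their 3x3 neighbourhood
    is incomplete.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            output[r][c] = math.hypot(gx, gy)
    return output


# Toy image: dark left half, bright right half (one vertical edge).
toy_image = [[0, 0, 255, 255] for _ in range(4)]
edges = sobel_magnitude(toy_image)
```

In the toy image only the X kernel responds, because brightness changes from left to right and not from top to bottom.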
Canny Edge detection
Shape Detection Using CV
WHAT IS IT?
Identifying and classifying
geometric shapes in an image by
detecting closed contours.
Typically done using edge
detection and contour
approximation.
WHY IS IT NEEDED?
It is widely used in applications
such as object recognition,
industrial inspection, autonomous
navigation, and gesture-based
interfaces.
Basic Steps In Shape Detection
1. Load the image
2. Preprocessing
3. Edge detection
4. Contour approximation
5. Shape classification
6. Classify Shapes (Accurate)
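The steps above can be sketched with OpenCV roughly as follows. This is not the code from the Colab notebook. It is a minimal sketch in which the function names, the Canny thresholds (50, 150), the 0.04 approximation factor, the minimum area, and the rule "more than five vertices means circle" are all assumptions chosen for illustration.

```python
def classify_shape(num_vertices, aspect_ratio=1.0):
    """Name a shape from the vertex count of its approximated contour."""
    if num_vertices == 3:
        return "triangle"
    if num_vertices == 4:
        # Roughly equal width and height is treated as a square.
        return "square" if 0.95 <= aspect_ratio <= 1.05 else "rectangle"
    if num_vertices == 5:
        return "pentagon"
    return "circle"  # assumption: many vertices means a round outline


def detect_shapes(image_path, min_area=100.0):
    """Run the pipeline on an image file and return a list of shape names."""
    import cv2  # imported here so classify_shape works without OpenCV

    image = cv2.imread(image_path)                    # load the image
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)    # preprocessing
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)               # edge detection
    # findContours returns 2 or 3 values depending on the OpenCV version;
    # the contour list is always second from the end.
    contours = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    names = []
    for contour in contours:
        if cv2.contourArea(contour) < min_area:
            continue  # skip tiny contours
        perimeter = cv2.arcLength(contour, True)
        # contour approximation: simplify the outline to a few vertices
        approx = cv2.approxPolyDP(contour, 0.04 * perimeter, True)
        _, _, w, h = cv2.boundingRect(approx)
        # shape classification from vertex count and aspect ratio
        names.append(classify_shape(len(approx), w / float(h)))
    return names
```

Calling detect_shapes("shapes.png") (a hypothetical file name) would return something like a list of names, one per contour that passes the area filter.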
Code Demonstration
Link to access the code: https://colab.research.google.com/drive/1679e1OZjra-bGyGZsno8lXJmbHELoCSp?usp=sharing
Follow the instructions in the Google Colab notebook and run the code.
Challenges In Shape Detection
Noisy Images
Random variations or grainy textures in images can create false edges, leading to incorrect or excessive shape detection.
Similar Shapes (e.g., Square vs. Diamond)
Shapes like squares and diamonds have the same number of sides and angles, but different orientations.
Distorted Shapes
Shapes may appear warped due to perspective angles or lens distortion, making it hard for algorithms to match them with ideal geometric templates.
Poor Lighting
Low or uneven lighting causes shadows and highlights that alter the appearance of edges, resulting in incomplete or inaccurate contour detection.
Detecting Insignificant Shapes
When all contours are considered, even tiny imperfections or noise can be detected as shapes, cluttering the result with irrelevant outlines.
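Two of these challenges can be handled with simple geometry once a contour has been reduced to a list of (x, y) vertices. The sketch below is one possible approach, not the method used in the demo. It drops shapes below a minimum area and tells a square from a diamond by how far its first edge is tilted from the image axes. The function names, the 10 degree tolerance, and the minimum area are assumptions.

```python
import math


def polygon_area(points):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def square_or_diamond(points, tolerance_deg=10.0):
    """Label a four-vertex, equal-sided shape by the tilt of its first edge."""
    (x1, y1), (x2, y2) = points[0], points[1]
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 90.0
    # Distance of the edge angle from the nearest image axis (0 or 90 degrees).
    tilt = min(angle, 90.0 - angle)
    return "square" if tilt <= tolerance_deg else "diamond"


def keep_significant(polygons, min_area=100.0):
    """Drop polygons whose area is below min_area."""
    return [p for p in polygons if polygon_area(p) >= min_area]


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
diamond = [(5, 0), (10, 5), (5, 10), (0, 5)]
speck = [(0, 0), (2, 0), (2, 2), (0, 2)]
```

Here square_or_diamond(square) gives "square", square_or_diamond(diamond) gives "diamond", and keep_significant([square, speck]) keeps only the square.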
Object Detection
According to the image, object detection can recognize:
● Objects: e.g., tree, bench, deck, carousel
● Scenes/Locations: amusement park, Cedar Point,
Lake Erie
● Activities: people waiting in line, people sitting on ride
● Structures/Rides: Ferris wheel, The Wicked Twister,
maxair
● Text & Signs: 12 E
● Motions/Emotions: pedestrians walking, waiting
Object Detection
Approach #1: Feature-based
● This is a more traditional method.
● It works by:
○ Finding specific features in an image (like edges, corners).
○ Modeling objects based on these features.
○ Searching for matching patterns in the image.
Think of it like finding pieces of a puzzle and matching them.
Approach #2: Network-based (Deep Learning)
● This is the modern method using neural networks.
● Steps:
○ Train a deep network with many labeled images (with objects marked).
○ The model learns to recognize and locate objects on its own.
Think of it like teaching a child with lots of examples until they learn to recognize things themselves.
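To make the "searching for matching patterns" step of the traditional approach more concrete, here is a small sliding-window template matching sketch in pure Python. It is a deliberately simplified stand-in: real feature-based detectors match edge or corner descriptors and not raw pixel values. The function names and the toy data are made up for this example.

```python
def match_score(image, template, top, left):
    """Sum of absolute differences between the template and one image window."""
    score = 0
    for r, template_row in enumerate(template):
        for c, value in enumerate(template_row):
            score += abs(image[top + r][left + c] - value)
    return score


def find_best_match(image, template):
    """Slide the template over the image; return ((row, col), score) of the best fit."""
    t_height, t_width = len(template), len(template[0])
    best_position, best_score = None, None
    for top in range(len(image) - t_height + 1):
        for left in range(len(image[0]) - t_width + 1):
            score = match_score(image, template, top, left)
            if best_score is None or score < best_score:
                best_position, best_score = (top, left), score
    return best_position, best_score


# Toy grayscale image containing the pattern at row 1, column 2.
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
pattern = [[9, 1],
           [1, 9]]
position, score = find_best_match(toy_image, pattern)
```

A score of 0 means the window matches the pattern exactly. Lower scores mean better matches.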
Motion Detection with
Computer Vision
Motion detection is a crucial area in computer vision. It involves identifying
movement in a scene over time.
Core CV Models for Motion Detection
Several CV models excel at motion detection.
Background Subtraction
Models static backgrounds to detect deviations.
Optical Flow
Estimates motion vectors between frames.
Deep Learning
Object detection plus tracking across frames.
Frames
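Background subtraction, as described above, can be sketched in a few lines: keep a model of the static background, compare each new frame against it, and flag pixels that deviate. The version below uses a running average as the background model, which is only one of several possible choices. The function names, the threshold of 30, and the learning rate of 0.05 are assumptions. OpenCV ships ready-made models such as cv2.createBackgroundSubtractorMOG2.

```python
def update_background(background, frame, learning_rate=0.05):
    """Blend the new frame into the running-average background model."""
    return [
        [(1 - learning_rate) * b + learning_rate * f
         for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def motion_mask(background, frame, threshold=30):
    """Mark pixels (1) whose value deviates from the background model."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def detect_motion(frames, threshold=30, learning_rate=0.05):
    """Yield a motion mask for every frame after the first."""
    # The first frame is taken as the initial background.
    background = [[float(v) for v in row] for row in frames[0]]
    for frame in frames[1:]:
        yield motion_mask(background, frame, threshold)
        background = update_background(background, frame, learning_rate)


# Two tiny 2x2 grayscale frames; one pixel brightens in the second frame.
toy_frames = [
    [[10, 10], [10, 10]],
    [[10, 200], [10, 10]],
]
masks = list(detect_motion(toy_frames))
```

The slow update of the background is what lets this kind of model absorb gradual illumination changes while still reacting to sudden movement.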
Key Challenges in Motion Detection
Several factors complicate motion detection.
Illumination Changes
Occlusion
Dynamic Backgrounds
Depth Estimation and Motion
Detection
Depth data enhances motion understanding.
Stereo Vision
Uses triangulation and disparity maps.
RGB-D Sensors
Provides motion analysis using depth.
Deep Learning
Monocular depth prediction.
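For stereo vision, the triangulation step comes down to one relation for a rectified camera pair: depth = focal length × baseline / disparity. The sketch below assumes the focal length is given in pixels, the baseline (distance between the two cameras) in metres, and the disparity in pixels. The function name and the example numbers are made up.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Assumed setup: 700 px focal length, cameras 10 cm apart.
near = depth_from_disparity(70, focal_length_px=700, baseline_m=0.1)
far = depth_from_disparity(35, focal_length_px=700, baseline_m=0.1)
```

A larger disparity means the point is closer to the cameras, so near comes out at half the depth of far.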
Intro to MediaPipe:
MediaPipe offers cross-platform,
customizable ML solutions for live
and streaming media.
Face detection
Object detection
Pose detection
MediaPipe Python Framework:
MediaPipe key features:
● Modularity
● Pre-built Solutions
● Cross-Platform
● Real-Time Performance
To install the module:
pip install mediapipe
MediaPipe Syntax:
● mp.solutions.hands: This is the MediaPipe module that contains all hand-related processing
solutions (assuming mediapipe has been imported as mp).
● The Hands class is the main hand detection and tracking model.
MediaPipe Syntax:
● hands.process() is a core method that analyzes images/frames to detect and track hands. It performs:
○ Hand detection in the image
○ Landmark detection (21 points per hand)
○ Hand type classification (left/right)
○ 3D pose estimation
● It returns all this information in a structured format.
MediaPipe Syntax:
To access the landmarks
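The original slide's code is not available here, so the following is only a sketch of how landmarks are typically read with the mp.solutions.hands API. The results of hands.process() expose multi_hand_landmarks, a list with one entry per detected hand, and each entry has a landmark list of 21 points with normalized x, y, z values. The function names and the confidence value are assumptions. The small stand-in object at the end exists only so the helper can be tried without a camera or the library.

```python
from types import SimpleNamespace


def landmarks_to_pixels(hand_landmarks, image_width, image_height):
    """Convert normalized landmark coordinates to pixel coordinates.

    MediaPipe reports x and y in the range [0, 1] relative to the image
    size, so they are scaled by width and height here.
    """
    return [
        (int(point.x * image_width), int(point.y * image_height))
        for point in hand_landmarks.landmark
    ]


def detect_hand_points(image_bgr):
    """Return one list of 21 (x, y) pixel points per detected hand."""
    import cv2
    import mediapipe as mp

    height, width = image_bgr.shape[:2]
    # hands.process() expects RGB, while OpenCV loads images as BGR.
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    with mp.solutions.hands.Hands(static_image_mode=True,
                                  max_num_hands=2,
                                  min_detection_confidence=0.5) as hands:
        results = hands.process(image_rgb)
    if not results.multi_hand_landmarks:  # None when no hand is found
        return []
    return [landmarks_to_pixels(hand, width, height)
            for hand in results.multi_hand_landmarks]


# Stand-in for one MediaPipe hand with a single landmark (illustration only).
fake_hand = SimpleNamespace(landmark=[SimpleNamespace(x=0.5, y=0.25, z=0.0)])
demo_pixels = landmarks_to_pixels(fake_hand, 640, 480)
```

Landmark indices follow a fixed order, for example 0 is the wrist and 8 is the index fingertip.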
Why do we need both MediaPipe and OpenCV:
● MediaPipe handles the core ML-based hand detection and landmark estimation. OpenCV
provides the essential utilities for getting the video input, performing basic image manipulations,
drawing custom annotations (like gesture names), and displaying the final output video stream.
● MediaPipe tells you what the gesture is (or gives you the landmarks to figure it out), but OpenCV
helps you display that information visually on the screen.
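The division of labour can be seen in a short webcam loop: OpenCV reads frames, MediaPipe finds the landmarks, a few lines of our own logic turn landmarks into a label, and OpenCV draws and displays the result. This is a sketch and not the workshop code. The finger-counting rule (a fingertip above its PIP joint counts as raised, thumb ignored, hand assumed upright) and the function names are assumptions.

```python
FINGER_TIPS = (8, 12, 16, 20)   # index, middle, ring, pinky tips
FINGER_PIPS = (6, 10, 14, 18)   # the joint two landmarks below each tip


def count_raised_fingers(points):
    """Count fingers (thumb excluded) whose tip is above its PIP joint.

    `points` is a list of 21 (x, y) pairs. Image y grows downward, so a
    raised fingertip has a smaller y than the joint below it.
    """
    return sum(1 for tip, pip in zip(FINGER_TIPS, FINGER_PIPS)
               if points[tip][1] < points[pip][1])


def run_webcam(camera_index=0):
    """Show the webcam feed with landmarks and a finger count drawn on it."""
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands
    capture = cv2.VideoCapture(camera_index)            # OpenCV: video input
    with mp_hands.Hands(max_num_hands=1) as hands:
        while capture.isOpened():
            ok, frame = capture.read()
            if not ok:
                break
            # MediaPipe: detection and landmarks (needs RGB input)
            results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            for hand in results.multi_hand_landmarks or []:
                mp.solutions.drawing_utils.draw_landmarks(
                    frame, hand, mp_hands.HAND_CONNECTIONS)
                points = [(lm.x, lm.y) for lm in hand.landmark]
                label = f"{count_raised_fingers(points)} fingers"
                cv2.putText(frame, label, (10, 40),      # OpenCV: annotation
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            cv2.imshow("Hands", frame)                   # OpenCV: display
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    capture.release()
    cv2.destroyAllWindows()
```

Calling run_webcam() opens the default camera and closes when q is pressed.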
Intro to YOLO:
YOLO (You Only Look Once), a popular object detection and image
segmentation model, was developed by Joseph Redmon and Ali Farhadi at
the University of Washington.
Object detection
Image classification
Pose detection
MediaPipe vs YOLO:
Model flow:
What to do next?
● Click the link here to access the code, or use the one sent in the WhatsApp group.
● Make sure to complete the following steps: install Python 3.10.11, unzip the code, open the folder in VS Code, install the required packages using pip, and run the code.
● In case of any queries, feel free to contact any of the ML seniors.