
Computer vision

Computer Vision in AI is a field where computers and systems can interpret, analyze, and
understand images or videos in a way similar to human vision. AI-based computer vision
uses algorithms and machine learning models to detect patterns in visual data, recognize
objects, and even make decisions based on what it “sees.”

How Computer Vision Works:

1. Image Capture: The system receives an image or video as input (e.g., through a
camera).
2. Preprocessing: The image is processed to make it more usable. This can include
noise reduction, resizing, or converting it to grayscale.
3. Feature Extraction: The system identifies key elements like shapes, edges, or
colors in the image. For example, it can detect where eyes, noses, or other parts of
a face are located.
4. Classification: Based on features, the AI compares the patterns to a database of
known objects to recognize what it’s looking at. This step often uses machine
learning to improve accuracy.
5. Decision Making: After identifying the object, the AI can take action or provide
insights based on the image. For example, in facial recognition systems, it may
grant access or alert security if an unknown person is detected. A minimal code
sketch of these five steps appears after this list.
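
The sketch below walks through the five steps on a single image using Python and the
OpenCV library. It is only a rough outline, not a prescribed implementation: the file
name "photo.jpg", the blur and edge thresholds, and the choice of a pretrained Haar
cascade face detector are illustrative assumptions, not details taken from these notes.

import cv2

# 1. Image capture: read an image from disk (a camera frame would work the same way)
image = cv2.imread("photo.jpg")

# 2. Preprocessing: convert to grayscale and reduce noise
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)

# 3. Feature extraction: find edges, one of the simple features mentioned above
edges = cv2.Canny(gray, threshold1=50, threshold2=150)

# 4. Classification: match patterns against a trained model; here a pretrained
#    Haar cascade that ships with OpenCV looks for faces
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# 5. Decision making: act on the result, e.g. report whether a face was found
print("Faces detected:", len(faces))

Running the sketch only prints how many faces the cascade found; a real system would
replace that print with the access-or-alert decision described in step 5.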

Real-World Example: Self-Driving Cars

In self-driving cars, computer vision is used to detect objects like pedestrians, traffic
lights, and other vehicles. The car's cameras capture images of the road, and the AI
analyzes the scene to:

• Recognize traffic signs (e.g., stop or speed limit signs).
• Detect lanes and keep the car within them (a rough lane-detection sketch follows this list).
• Spot pedestrians and avoid collisions.
• Read traffic signals and make decisions like stopping at a red light.
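
Production self-driving stacks rely on trained neural networks, but the classical
sketch below illustrates the lane-detection idea with plain image processing: Canny
edge detection followed by a Hough transform to find line segments that could be lane
markings. The file name "road.jpg" and all thresholds are illustrative assumptions.

import cv2
import numpy as np

frame = cv2.imread("road.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Keep only the lower half of the frame, where lane markings usually appear
mask = np.zeros_like(edges)
mask[edges.shape[0] // 2:, :] = 255
edges = cv2.bitwise_and(edges, mask)

# Find straight line segments that could be lane markings
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=20)

# Draw the detected segments back onto the frame and save the result
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
cv2.imwrite("road_lanes.jpg", frame)
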
Where Computer Vision is Useful:

1. Healthcare: AI can analyze medical images such as X-rays or MRIs to detect diseases
like cancer or fractures, in some tasks matching or exceeding human accuracy.
2. Retail: Used in cashier-less stores (like Amazon Go) to automatically track items
customers pick up and add them to a virtual cart.
3. Security: Facial recognition systems use computer vision to identify people in
surveillance footage, granting or denying access in secure areas.
4. Agriculture: Drones equipped with cameras and AI analyze crops, detecting
disease, nutrient deficiencies, or growth patterns to help farmers optimize yields.
5. Manufacturing: Automated quality control systems can visually inspect products
for defects in real time.

Why is it Useful?

• Efficiency: Automates tasks that would otherwise require human vision.
• Accuracy: AI can often detect patterns or abnormalities that may be missed by humans.
• Scalability: Can be applied on a large scale (e.g., inspecting thousands of products or monitoring traffic).

The history of computer vision

Scientists and engineers have been trying to develop ways for machines to see and
understand visual data for about 60 years. Experimentation began in 1959, when
neurophysiologists showed a cat an array of images and attempted to correlate the
responses in its brain. They discovered that the cat's visual neurons responded first
to hard edges or lines; scientifically, this meant that image processing starts with
simple shapes such as straight edges.

At about the same time, the first computer image scanning technology was developed,
enabling computers to digitize and acquire images. Another milestone was reached in
1963 when computers were able to transform two-dimensional images into three-
dimensional forms. In the 1960s, AI emerged as an academic field of study, and that
decade also marked the beginning of the AI quest to solve the human vision problem.
1974 saw the introduction of optical character recognition (OCR) technology, which
could recognize text printed in any font or typeface. Similarly, intelligent character
recognition (ICR) could decipher handwritten text using neural networks. Since then,
OCR and ICR have found their way into document and invoice processing, vehicle plate
recognition, mobile payments, machine translation and other common applications.

In 1982, neuroscientist David Marr established that vision works hierarchically and
introduced algorithms for machines to detect edges, corners, curves and similar basic
shapes. Concurrently, computer scientist Kunihiko Fukushima developed a network of
cells that could recognize patterns. The network, called the Neocognitron, included
convolutional layers in a neural network.

By 2000, the focus of study was on object recognition, and by 2001 the first real-time
face recognition applications appeared. Standards for how visual data sets are tagged
and annotated emerged through the 2000s. In 2010, the ImageNet data set became
available. It contained millions of tagged images across a thousand object classes and
provided a foundation for the CNNs and deep learning models used today. In 2012, a
team from the University of Toronto entered a CNN into an image recognition contest.
The model, called AlexNet, significantly reduced the error rate for image recognition.
Since this breakthrough, error rates have fallen to just a few percent.
