
Project Report:

Object Finding Robot

A robot that detects objects through computer vision and finds their exact location

Submitted to:
Sourav Roy
Assistant Professor
Department of Mechatronics Engineering
Khulna University of Engineering & Technology, Khulna

Submitted by:
Shahadat Hossen
Roll: 2031003
Department: Mechatronics Engineering
Batch: 2k20
Year: 3rd, Semester: 1st
Khulna University of Engineering & Technology, Khulna

Signature of Supervisor
Introduction

The Object Finding Bot is an intelligent robotic system designed to locate and identify
specific objects within a given environment. Whether it is finding lost keys, detecting
misplaced items in a cluttered room, or assisting visually impaired individuals, the bot aims
to enhance efficiency and convenience.
The robot receives its task through a user interface: the user supplies either an image of
the object or a voice command. The robot is designed to improve efficiency and precision in
locating and retrieving specific items across diverse environments, from cluttered warehouses
to complex household settings. Using a combination of sensors such as cameras, LIDAR, and
ultrasonic devices, together with a processing unit (here, a Raspberry Pi) and machine
learning algorithms, such robots can navigate through spaces, identify target objects with
high accuracy, and perform the necessary manipulation tasks. By integrating real-time data
processing, robust navigation strategies, and adaptive control mechanisms, the object-finding
robot aims to improve operations in sectors including logistics, search and rescue, and home
automation, significantly reducing human effort and error while increasing productivity and
operational safety.

The need for an object-finding solution arises from everyday scenarios in which people spend
valuable time searching for items. An automated bot capable of locating objects accurately
can significantly improve productivity and reduce frustration. The robot can also search for
things in hazardous places where humans cannot easily operate, eliminating unnecessary human
effort.

Objectives
1. Object Recognition:

Developing an efficient object recognition algorithm using computer vision techniques.

2. Navigation and Mapping:

Implementing path planning and mapping algorithms to guide the bot toward the target
object. Since the primary goal is to find objects in a small area, navigation and path
planning are outside the current scope.

3. User Interaction:

Designing an intuitive user interface for inputting object descriptions or queries.

4. Hardware Integration:

Integrating sensors (e.g., cameras, ultrasonic sensors) with the bot for real-time
perception.

Literature Review / Existing Projects


There have been several research efforts and projects on object finding that use different
methodologies to search for an object. Two of them are RFusion and RF-Grasp, both robotic
arms with a camera and a radio frequency antenna attached to the gripper. A brief overview
of each follows.

RFusion Robot:
Researchers at MIT created RFusion, a robotic arm with a camera and a radio frequency (RF)
antenna attached to its gripper. It fuses signals from the antenna with visual input from
the camera to locate and retrieve an item, even if the item is buried under a pile and
completely out of view.
The RFusion prototype the researchers developed relies on RFID tags, which are cheap,
battery-less tags that can be stuck to an item and reflect signals sent by an antenna. Because
RF signals can travel through most surfaces (like the mound of dirty laundry that may be
obscuring the keys), RFusion is able to locate a tagged item within a pile.
Using machine learning, the robotic arm automatically zeroes in on the object’s exact
location, moves the items on top of it, grasps the object, and verifies that it picked up the
right thing. The camera, antenna, robotic arm, and AI are fully integrated, so RFusion can
work in any environment without requiring a special setup.

Fig. 1.1: RFusion robot developed by researchers at MIT [2]
RF-Grasp:

The development of RF-Grasp, a robotic system that incorporates radio wave sensing and
computer vision to locate and grasp occluded objects, represents a significant advancement in
robotics technology. By utilizing radio waves, which have the ability to pass through walls, in
conjunction with traditional optical vision, RF-Grasp is capable of detecting and
manipulating items that are obstructed from view. This innovation has the potential to
revolutionize e-commerce fulfillment processes in warehouses and enhance the capabilities of
machines in various applications, such as retrieving tools from cluttered environments.

Traditional optical vision systems are limited in their ability to perceive objects that are
concealed within containers or obscured by other items, as visible light waves cannot
penetrate solid barriers. However, radio frequency (RF) identification technology, which has
long been employed for tracking purposes, offers a solution to this limitation. RF
identification systems consist of a reader and a tag, with the tag containing a computer chip
that responds to RF signals emitted by the reader, thereby providing information about the
location and identity of the tagged item. This technology has been widely used in retail
supply chains and is poised for further adoption in various industries.

RF-Grasp integrates a camera and an RF reader to identify and manipulate tagged objects,
even when they are not visible to the camera. The system includes a robotic arm with a
grasping hand, with the camera positioned on the robot's wrist and the RF reader operating
independently to relay tracking data to the robot's control algorithm. The challenge of
integrating and processing the data from these two distinct sources to inform the robot's
decision-making processes presented a significant hurdle for the researchers. However,
overcoming this challenge has resulted in a groundbreaking robotic system that combines the
strengths of radio wave sensing and computer vision to expand the capabilities of machines in
complex, real-world environments [3].

Fig. 1.2: RF-Grasp robot searching for a hidden object that can't be visualized with the camera [4]
Comparison between RFusion and RF-Grasp:

1. Approach:

RFusion: RFusion combines visual data from a camera with radio frequency (RF) information
obtained from an antenna. The system relies on RFID tags, which reflect signals sent by the
antenna. This allows RFusion to find tagged items even when they are buried under a pile or
out of direct view. RFusion includes a fully integrated robotic arm capable of locating and
retrieving objects.

RF-Grasp: RF-Grasp combines radio waves (which can penetrate obstacles) with traditional
computer vision techniques. By leveraging RF signals, RF-Grasp can locate and grasp items
that would otherwise remain hidden from purely visual approaches. While not a traditional
robot, RF-Grasp showcases the power of sensor networks and cross-disciplinary approaches.

2. Strengths:

RFusion: RFusion excels at detecting objects that are obstructed from direct line-of-sight
vision. It performs well in cluttered environments where traditional visual approaches may
struggle. The use of RFID tags provides flexibility in tagging various types of objects.

RF-Grasp: RF-Grasp can detect occluded objects behind walls or other barriers. The fusion
of RF and visual data provides a unique advantage. The technology could extend beyond
robotics to security systems or environmental monitoring.

3. Limitations:

RFusion: The system relies on RFID tags being attached to objects, which may not be
feasible for all scenarios. RF signals can be affected by obstacles and interference.
Integrating visual and RF data requires careful calibration and synchronization.

RF-Grasp: RF-Grasp requires a complex hardware setup, with the camera on the robot's wrist
and a separate RF reader relaying tracking data to the control algorithm. Its sensing range
is limited, and synchronizing the RF tracking data with the visual stream poses calibration
challenges.

Suggestion:

The applicability of these robots can be broadened by attaching arms or wheels (also
controlled by a central processor) that let them move forward, backward, and left-right.
This feature would make the robots more dynamic and help them search for objects over a
wide region. Several further suggestions follow:

1. RFusion:
● RFID Tag Dependency:
○ Solution: While RFID tags are effective for certain scenarios, consider hybrid
approaches. Combine RFID with other localization methods (e.g., visual
odometry or SLAM) to reduce reliance on tags.
● RF Signal Propagation:
○ Solution: Optimize the antenna placement and power levels. Use multiple
antennas for better coverage and triangulation.
● Complexity:
○ Solution: Develop calibration routines to ensure accurate synchronization
between visual and RF data. Automate this process during system setup.

2. RF-Grasp:
● Complex Hardware Setup:
○ Solution: Explore miniaturized RF sensors or integrate them into existing
robotic platforms. Collaborate with hardware experts to simplify the setup.
● Limited Range:

○ Solution: Investigate higher-frequency RF bands or use relay stations to
extend the range. Alternatively, combine RF with other sensing modalities
(e.g., LiDAR or ultrasonics).
● Calibration Challenges:
○ Solution: Develop self-calibration algorithms that adapt to environmental
changes. Use machine learning techniques to fine-tune alignment parameters.

Potential Enhancements:

1. Sensor Fusion:
○ Combine RF data with other sensors (e.g., cameras, LiDAR, IMUs) for robust
object localization.
2. Machine Learning:
○ Train models to predict RF signal behavior based on environmental factors
(e.g., obstacles, reflections).
○ Explore deep learning approaches for joint RF and visual feature extraction.
3. Dynamic Antenna Arrays:
○ Design adaptive antenna arrays that adjust beamforming based on the
environment.
○ Consider phased-array antennas for better directionality.
4. Energy Efficiency:
○ Investigate low-power RF technologies (e.g., LoRa, Zigbee) for prolonged
robot operation.
○ Optimize duty cycles to conserve energy.

My project differs from these in a few respects. First, it combines computer vision and
machine learning. The robot will be mobile and can move from place to place to search for
an object, whereas those projects are fixed in one place. Although it does not use any RF
(radio frequency) technology to search for objects the way RFusion and RF-Grasp do,
computer vision and machine learning approaches should be sufficient to accomplish the task.

Proposed Methodology

Step 1: Robot Vision

The task of Robot vision can be split into 2 parts

i) Computer Vision:

Computer vision is a field of artificial intelligence that enables systems to derive
meaningful information from digital images, videos, and other visual inputs. In this
project, the object-finding robot captures images of its surroundings and extracts data to
compare against the lost object.

ii) Machine Learning:

Take thousands of images of an object — say, a football — and use this data to compute a
model of what the average football looks like. The more pictures in the data set, the more
accurately the robot can detect the object.

The object-finding robot combines machine learning with computer vision: it passes the
webcam stream through a machine-learned model in order to detect objects in each frame.
For example, using the average-football model, the computer looks at a camera frame and
tries to fit the average football onto the new image. If it fits within a certain degree of
accuracy, the region is labeled as a football and the object is detected.

Because sourcing thousands of images of multiple objects is difficult, and computing what
the average object should look like is extremely computationally intensive, I will use a
pre-trained 'YOLO' or 'MobileNet SSD (COCO)' model with Google's TensorFlow library.
TensorFlow is a machine learning framework that trains the object detection model and
classifies new images to detect objects within them.
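As a minimal sketch of what the detection-matching step might look like in code (the
detection tuple format, the class name "sports ball", and the 0.5 confidence threshold are
illustrative assumptions, not details from this report):

```python
# Given the raw (label, confidence, bounding-box) tuples a model such as
# MobileNet SSD returns for one frame, pick the best detection of the
# target object. The detection format here is an assumption for clarity.

def find_target(detections, target_label, min_confidence=0.5):
    """Return the highest-confidence detection of target_label, or None."""
    best = None
    for label, confidence, box in detections:
        if label != target_label or confidence < min_confidence:
            continue  # wrong class, or the model is too uncertain
        if best is None or confidence > best[1]:
            best = (label, confidence, box)
    return best

# Example output of a hypothetical detector for one frame:
frame_detections = [
    ("person",      0.91, (10, 10, 80, 200)),
    ("sports ball", 0.42, (120, 60, 40, 40)),  # below threshold, ignored
    ("sports ball", 0.78, (200, 90, 45, 45)),
]
print(find_target(frame_detections, "sports ball"))
```

In the full pipeline this function would run once per webcam frame, with the detections
produced by the TensorFlow model.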

Step 2: Movement of Robot

The movement of the robot is driven by four gear motors and wheels. The motors are operated
by a motor driver (here an L293D) under the control of the central processing unit, a
Raspberry Pi. A servo motor attached to the camera pans it to capture images of the
surroundings.
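The steering decision that links detection to movement can be sketched as follows (a
minimal example; the dead-band width and the command names are assumptions — on the real
robot each command would drive the L293D inputs through Raspberry Pi GPIO pins):

```python
# Decide a drive command from where the detected object sits in the
# camera frame: turn toward it until it is roughly centered, then drive
# straight at it. Pin wiring and thresholds are illustrative assumptions.

def steer_toward(box_center_x, frame_width, dead_band=0.15):
    """Map the object's horizontal position to a drive command."""
    offset = (box_center_x / frame_width) - 0.5  # -0.5 (left) .. +0.5 (right)
    if offset < -dead_band:
        return "turn_left"
    if offset > dead_band:
        return "turn_right"
    return "forward"  # object roughly centered: drive straight at it

print(steer_toward(60, 640))   # object far left of a 640-px frame
print(steer_toward(320, 640))  # object centered
```

The dead band keeps the robot from oscillating left-right when the object is already near
the center of the frame.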

Key Components

1. Raspberry Pi :

This is the brain of the robot. The Raspberry Pi computes where the robot should go, how it
should move, and what task it should perform next. It is the hub for all the sensors and
electronics and allows them to work together.

2. Raspberry Pi Camera Module:

This camera module takes pictures of the surroundings and helps the processor detect and
locate the target object.

3. Motor Driver:

The motor driver moves the bot precisely according to the object's location. Here I am
using an L293D motor driver. A RoboClaw 2x7A would be more convenient, but since that
driver is costly and hard to find in our country, I am using the L293D instead.

4. Step-Down converter:

This small electronic piece takes the battery's high voltage and steps it down to a lower 5 volt
signal. This 5v signal is used to power the Raspberry Pi via its USB power port.

5. Ultrasonic Sensor:

This measures the distance to objects and obstacles and helps the bot move smoothly
without colliding with anything.
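The distance calculation behind this sensor can be sketched as follows (a minimal example
assuming an HC-SR04-style sensor that reports round-trip echo time; the timing values and
the 15 cm safety threshold are illustrative assumptions):

```python
# An ultrasonic sensor reports the round-trip time of a sound pulse, so
# the one-way distance is half the echo time multiplied by the speed of
# sound (~343 m/s at room temperature).

SPEED_OF_SOUND_CM_PER_S = 34300  # ~343 m/s, expressed in cm/s

def echo_to_distance_cm(echo_seconds):
    """Convert a round-trip echo time to a one-way distance in cm."""
    return (echo_seconds * SPEED_OF_SOUND_CM_PER_S) / 2

def should_stop(echo_seconds, threshold_cm=15):
    """True if an obstacle is closer than the safety threshold."""
    return echo_to_distance_cm(echo_seconds) < threshold_cm

print(echo_to_distance_cm(0.001))  # 1 ms round trip -> ~17.15 cm
print(should_stop(0.0005))         # ~8.6 cm away -> stop
```

On the robot, the echo time would come from timing the sensor's echo pin with the
Raspberry Pi's GPIO library rather than from fixed values.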

6. Gear Motor and Wheel:

The gear motors and wheels drive the whole bot mechanically.

7. Servo Motor:

The servo motor positions the camera at any specific angle so it can capture everything
around the robot.

8. Battery:

The battery is the robot's sole power supply. Although any kind of battery can be used,
LiPo batteries are a good choice thanks to their small footprint and high power density.

Required Apparatus and Estimated Cost:

| Component Name                | Quantity | Unit price (Tk) | Total price (Tk) |
|-------------------------------|----------|-----------------|------------------|
| Raspberry Pi                  | 1        | 9800            | 9800             |
| Raspberry Pi Camera Module    | 1        | 4900            | 4900             |
| Motor Driver                  | 1        | 495             | 495              |
| Step-down converter           | 1        | 125             | 125              |
| Ultrasonic sensor             | 1        | 90              | 90               |
| Gear Motors, Wheels & Chassis | 1        | 1095            | 1095             |
| Servo motor                   | 1        | 440             | 440              |
| Battery                       | 4        | 90              | 360              |
| Total                         |          |                 | 17305            |

Work Plan

1. Project Definition and Scope


(already completed)
● Duration: 1 week
● Tasks:
○ Defining the project’s goals and objectives.
○ Specifying the scope (e.g., types of objects to detect, environment,
performance requirements).
○ Identifying constraints (e.g., hardware limitations, budget).

2. Research and Literature Review


● Duration: 2 weeks
● Tasks:
○ Studying existing object-finding projects (RFusion and RF-Grasp).
○ Studying existing object detection methods (YOLO, SSD, etc.).
○ Reviewing relevant research papers and articles.
○ Understanding the principles of machine learning and computer vision.

3. Hardware Setup and Assembly
● Duration: 1 week
● Tasks:
○ Assembling the Raspberry Pi, camera module, motor driver, and ultrasonic
sensor.
○ Setting up the necessary connections and power supply.
○ Verifying that all components are functional.

4. Data Collection and Preprocessing


● Duration: 3 weeks
● Tasks:
○ Collecting a diverse dataset of images containing the objects you want to
detect.
○ Annotating the dataset with bounding boxes around the objects.
○ Preprocessing the images (resize, normalize, augment) for training.
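The normalization part of this preprocessing step can be sketched as follows (a minimal
example; the [-1, 1] scaling is a common MobileNet convention but is an assumption here,
not a detail from this report):

```python
# Models such as MobileNet SSD expect inputs with pixel values scaled to
# a standard range. This maps 8-bit pixel values (0..255) to [-1, 1].

def normalize_pixel(value):
    """Scale an 8-bit pixel value (0..255) into the range [-1, 1]."""
    return (value / 127.5) - 1.0

def preprocess(pixels):
    """Normalize a flat list of 8-bit pixel values."""
    return [normalize_pixel(p) for p in pixels]

print(preprocess([0, 127.5, 255]))  # -> [-1.0, 0.0, 1.0]
```

Resizing and augmentation (flips, crops, brightness shifts) would be handled by the image
library used for training, e.g. via TensorFlow's input pipeline.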

5. Model Selection and Training


● Duration: 4 weeks
● Tasks:
○ Choosing between YOLO and MobileNet SSD based on requirements (speed, accuracy).
○ Training the selected model using TensorFlow.

6. Integration and Testing


● Duration: 2 weeks
● Tasks:
○ Integrating the trained model with the Raspberry Pi.
○ Developing code to capture video frames from the camera.
○ Implementing object detection using the model.
○ Testing the system’s accuracy and real-time performance.

7. Motor Control and Navigation


● Duration: 3 weeks
● Tasks:
○ Writing code to control the motors based on detected objects.
○ Implementing obstacle avoidance using the ultrasonic sensor.
○ Testing movement and navigation in a controlled environment.

8. User Interface and Interaction
● Duration: 2 weeks
● Tasks:
○ Designing a simple user interface (web-based or mobile app) for controlling
the robot.
○ Implementing commands for starting, stopping, and adjusting behavior.
○ Ensuring user-friendly interaction.

9. Documentation and Finalization


● Duration: 1 week
● Tasks:
○ Documenting the entire project, including hardware setup, software
architecture, and code.
○ Creating a user manual for future maintenance.
○ Reviewing and finalizing the project.

10. Deployment and Field Testing


● Duration: Ongoing
● Tasks:
○ Deploying the robot in a real-world environment.
○ Continuously monitoring and improving its performance.
○ Gathering feedback and making necessary adjustments.

Current Status:
● Currently in the design phase and gathering components.

Conclusion
The Object Finding Robot has the potential to revolutionize how we interact with our
surroundings. By combining cutting-edge technology with practical applications, we can
create a reliable and user-friendly solution that simplifies our daily lives.

References:
[1] "A robot that finds lost items," MIT News, October 5, 2021.
[2] "RFusion AI Robot Finds Your Misplaced Keys No Matter How Big the Pile of Objects," autoevolution, October 6, 2021.
[3] "A robot that senses hidden objects," MIT News, April 1, 2021.
[4] "RF-Grasp AI Robot Finds Your Misplaced Keys No Matter How Big the Pile of Objects," Silicon Republic, April 2, 2021.
