3D Obstacle Avoidance For UAV Based On RL and RealSense
Abstract: With the increasingly widespread application of unmanned aerial vehicles (UAVs), safety issues such as the effectiveness of obstacle avoidance have received more attention. Classical obstacle avoidance algorithms are mostly suited to mobile ground robots and are not ideal for UAVs operating in three-dimensional space. Most of the more effective three-dimensional obstacle avoidance algorithms use RGB image data as input, so a large amount of image data is involved in a complex computing process. This study proposes an effective obstacle avoidance algorithm for UAVs that needs less input data and fewer sensors, based on RealSense and reinforcement learning. It combines the feature map of the RealSense depth image, used as the input of reinforcement learning, with the current flight direction of the UAV to calculate the direction and angle of avoidance. The proposed algorithm implements real-time obstacle avoidance for UAVs and has been verified in simulation and tested in a three-dimensional scenario.
1 Introduction

With the development of science and technology, the technologies of unmanned aerial vehicles (UAVs) are evolving as well. Nowadays, UAVs are used not only in the military, but also in commercial applications, consumer electronics and even toys. At the same time, safety issues are of increasing concern, and obstacle avoidance is a key point of UAV safety. Whether the UAV is flying autonomously or being controlled by a user, it can damage the UAV or even injure people if obstacles are not avoided in time. Therefore, an effective obstacle avoidance function is very important for a UAV.

Obstacle avoidance algorithms for UAVs can be divided into two classes. One is traditional obstacle avoidance methods, and the other is obstacle avoidance based on machine learning. As traditional methods, the artificial potential field and the vector field histogram have been used extensively in mobile robots [1]. This class of algorithms uses obstacle ranging data as input and calculates the avoidance manoeuvre with a specific model. Their advantages are less input data and low computational complexity. However, once a situation that the model does not anticipate occurs, the obstacle avoidance method is likely to fail.

The other class of algorithms is based on reinforcement learning. These algorithms combine convolutional neural networks with reinforcement learning to extract features from the input data, which are usually images of obstacles. The features are then matched to primitive avoidance actions [2], and a better way to avoid obstacles is finally given. However, the large number of images used as input leads to higher computational complexity.

Considering the advantages of both, this paper proposes an obstacle avoidance algorithm for UAVs based on RealSense and reinforcement learning. The algorithm uses the feature map of the raw RealSense depth data as input. Compared with RGB images, the raw depth data contain the ranging information necessary for avoiding obstacles, which reduces the amount of input and the computational complexity. Furthermore, the algorithm implements the obstacle avoidance function using only RealSense, which is different from common solutions that rely on various types of sensors.

Intel RealSense technologies, formerly known as Intel Perceptual Computing, are well suited to computer vision and depth solutions. For obtaining 3D depth information, technologies commonly used in two-dimensional (2D) environments, such as ultrasonic and infrared ranging, are no longer the best choice. A RealSense device is smaller than a Kinect and easier to use than a dual-vision camera. By taking the depth data of RealSense as the input for calculating avoidance actions, the effectiveness of obstacle avoidance and the safety of the UAV can be improved.

Machine learning is a new paradigm of problem solving [3]. Reinforcement learning is an important machine learning method applied in many fields of intelligent control. The Agent does not need any prior knowledge of the environment and learns the rules gradually from each action and its feedback. Traditional reinforcement learning can be divided into policy-based and value-based methods. The purpose of a policy-based method is to find the optimal policy: it obtains the probabilities of the possible next actions by analysing the current environment and then performs the next action according to those probabilities. The purpose of a value-based method is to find the optimal sum of rewards: it obtains the value of each action and then chooses the action with the highest value. Actor–Critic combines the advantages of the two methods. The Actor performs the next action based on the probability distribution of the policy, and the Critic gives the value of that action. In other words, the Actor selects the action by probability and the Critic gives the Q-value of the action taken by the Actor. The Actor then updates the probability of selecting actions based on the Q-value given by the Critic. This is equivalent to speeding up the learning process on top of the original policy gradients. The structure of Actor–Critic is shown in Fig. 1.

Actor–Critic also has some disadvantages, one of which is an inherently slow rate of convergence. To solve this problem, the deep deterministic policy gradient (DDPG) is adopted in this paper. However, a model-free reinforcement learning algorithm requires the Agent to update its knowledge by trial and error, so training the neural network in a simulation environment avoids damaging the UAV before training is completed. In addition, because the feature map of the raw RealSense depth data is used as input, the difference between the simulation environment and real-world testing has little effect on the algorithm, which is better than using RGB images as input. Finally, real-world testing verifies that the algorithm proposed in this paper implements an effective obstacle avoidance function for the UAV.
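To illustrate the Actor–Critic scheme of Fig. 1 as trained by DDPG, a minimal one-step update is sketched below in Python, assuming PyTorch. The networks, the replay batch, the discount factor and the soft-update rate are placeholders rather than the paper's implementation.

# Minimal sketch of one DDPG update step. Assumes PyTorch and that the
# Actor, Critic, their target copies and a replay batch already exist.
import torch
import torch.nn.functional as F

GAMMA, TAU = 0.99, 0.001   # illustrative values, not taken from the paper

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch):
    state, action, reward, next_state, done = batch

    # Critic: regress Q(s, a) towards r + gamma * Q'(s', mu'(s'))
    with torch.no_grad():
        target_q = reward + GAMMA * (1 - done) * target_critic(
            next_state, target_actor(next_state))
    critic_loss = F.mse_loss(critic(state, action), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: maximise Q(s, mu(s)) by minimising its negative
    actor_loss = -critic(state, actor(state)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Soft update of the target networks
    for net, target in ((actor, target_actor), (critic, target_critic)):
        for p, tp in zip(net.parameters(), target.parameters()):
            tp.data.mul_(1 - TAU).add_(TAU * p.data)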
Fig. 1 Structure of Actor–Critic
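Since both networks described below take the feature map of the RealSense depth data as their state input, the following sketch shows one way such a feature map could be obtained, assuming the pyrealsense2 Python package. The block-wise minimum pooling and the grid size are illustrative assumptions; the paper's exact feature-map construction is not reproduced in this excerpt.

# Illustrative sketch: read one RealSense depth frame and reduce it to a
# small grid of nearest obstacle distances.
import numpy as np
import pyrealsense2 as rs

def depth_feature_map(depth_m, grid=(4, 4)):
    # Reduce an HxW depth image (in metres) to a coarse grid of nearest distances.
    h, w = depth_m.shape
    gh, gw = grid
    fmap = np.zeros(grid, dtype=np.float32)
    for i in range(gh):
        for j in range(gw):
            block = depth_m[i * h // gh:(i + 1) * h // gh,
                            j * w // gw:(j + 1) * w // gw]
            valid = block[block > 0]          # zero depth means no reading
            fmap[i, j] = valid.min() if valid.size else 0.0
    return fmap

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
profile = pipeline.start(config)
scale = profile.get_device().first_depth_sensor().get_depth_scale()
try:
    frames = pipeline.wait_for_frames()
    depth_m = np.asanyarray(frames.get_depth_frame().get_data()) * scale
    state = depth_feature_map(depth_m)        # fed to the RL agent as the state
finally:
    pipeline.stop()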
respectively [6]. One set of the data is connected to an FC layer and mapped to (0, 1) by a Sigmoid activation function; it produces the forward speed. The other two sets of the data are connected to an FC layer, handled by a Swish activation function, and then connected to another FC layer and mapped to (−1, 1) by a Tanh activation function; they produce the horizontal and vertical speeds for obstacle avoidance, respectively. The learning rate of the Actor network is 0.00003.

The Critic is a value-based network that evaluates the Actor network based on the state and the action. The network is updated by the reward and the output of the target network for each action. The structure of the Critic network is shown in Fig. 5.

In the Critic network, the input is the feature map of the depth data (the state) and the action of the Actor network, while the output is the Q-value for evaluating the Actor. The state is connected to an FC layer. After Swish activation and Dropout, these data and the action are connected to another FC layer together. Similarly, this set of data is connected to an FC layer and handled by a Swish activation function with Dropout. A last FC layer is connected before the Q-value is produced. The Q-value is used to evaluate each action and train the Actor network. The learning rate of the Critic network is 0.0003.
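The sketch below mirrors the Actor and Critic heads described above in PyTorch. The layer widths, the dropout rate, the exact number of FC layers and the way the shared feature map is split into branches are not fully specified in this excerpt, so those values are placeholders; only the activations, output ranges and learning rates follow the text.

# Approximate Actor/Critic heads per the description above (placeholder sizes).
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.forward_head = nn.Sequential(        # forward speed in (0, 1)
            nn.Linear(feat_dim, 1), nn.Sigmoid())
        self.avoid_head = nn.Sequential(          # horizontal/vertical speeds in (-1, 1)
            nn.Linear(feat_dim, hidden), nn.SiLU(),  # SiLU is the Swish activation
            nn.Linear(hidden, 2), nn.Tanh())

    def forward(self, feat):
        return torch.cat([self.forward_head(feat), self.avoid_head(feat)], dim=-1)

class Critic(nn.Module):
    def __init__(self, feat_dim=128, act_dim=3, hidden=64, p=0.2):
        super().__init__()
        self.state_fc = nn.Sequential(nn.Linear(feat_dim, hidden),
                                      nn.SiLU(), nn.Dropout(p))
        self.joint_fc = nn.Sequential(nn.Linear(hidden + act_dim, hidden),
                                      nn.SiLU(), nn.Dropout(p),
                                      nn.Linear(hidden, 1))   # Q-value

    def forward(self, feat, action):
        return self.joint_fc(torch.cat([self.state_fc(feat), action], dim=-1))

actor, critic = Actor(), Critic()
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-5)    # 0.00003
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4)  # 0.0003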
2.5 Design of reward

When evaluating the effect of the Actor, the method of calculating the reward needs to be considered. The design of the reward has a great influence on the results; if it is unreasonable, the neural network will easily fall into a local optimum. The reward calculation in this paper consists of two parts: the change of the feature map and the flight state of the UAV.

As mentioned earlier, the flight state of the UAV is divided into two states, and the reward corresponding to the change of state is shown in Table 1.

Table 1 Correspondence between reward and status
Status of the UAV    Reward
S-D                  −3
D-S                  +3
D-D, L↓              −2
D-D, L↑              +2

In Table 1, S means the Safe state, D means the Dangerous state, and L means the minimum non-zero distance from obstacles while in Dangerous. The initial value of the reward is 0. The reward is reduced by 3 when the state changes from Safe to Dangerous. After the obstacle is avoided, the state changes from Dangerous back to Safe and the reward is increased by 3. Moreover, two situations need to be considered when the state remains Dangerous: an increasing L indicates that the UAV is moving away from the obstacles, so the reward is increased by 2; conversely, L decreases if the UAV is approaching the obstacles, and the reward is reduced by 2. As expected, the reward does not change while the state remains Safe.

Collision, reaching the destination and flying forward are the three flight states to consider. When a collision occurs, the reward is reduced by 5. If the UAV arrives at the destination, the reward is increased by 5. The reward is also increased by 1 if the current forward speed exceeds 0.9 m/s. Either a collision or reaching the destination marks the end of a round of training. The reward corresponding to the flight states is shown in Table 2.

Table 2 Correspondence between reward and flight states
Flight state                 Reward
collision                    −5
reaching the destination     +5
v_forward > 0.9 m/s          +1

It should be noted that a collision covers two cases: one is an actual collision between the UAV and an obstacle, and the other is an obstacle in front being closer than the minimum detectable range of RealSense. The significance of the latter is to avoid problems caused by the detectable range of RealSense, such as the state still being Safe when the UAV is about to collide.

After each action is executed, the reward corresponding to that action is obtained by summing the two parts above, as in the sketch below.
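The following sketch combines the state-change term of Table 1 and the flight-state term of Table 2 into one reward. The function and variable names are illustrative, and the behaviour when L is unchanged in the Dangerous state is not specified in the text.

# Two-part reward of Section 2.5: Table 1 term plus Table 2 term.
def state_change_reward(prev_safe, now_safe, prev_L, now_L):
    if prev_safe and not now_safe:
        return -3                        # S-D
    if not prev_safe and now_safe:
        return +3                        # D-S
    if not prev_safe and not now_safe:   # still Dangerous
        return +2 if now_L > prev_L else -2
    return 0                             # still Safe

def flight_state_reward(collided, reached_goal, v_forward):
    r = 0
    if collided:
        r -= 5
    if reached_goal:
        r += 5
    if v_forward > 0.9:                  # m/s
        r += 1
    return r

def total_reward(prev_safe, now_safe, prev_L, now_L,
                 collided, reached_goal, v_forward):
    return (state_change_reward(prev_safe, now_safe, prev_L, now_L)
            + flight_state_reward(collided, reached_goal, v_forward))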
2.6 Additional design of action

In order to explore enough continuous behaviour in the simulation environment, an Ornstein–Uhlenbeck (OU) process is used to generate noise, which increases the randomness of the action. Combining the noise with the action as the output of the Actor improves adaptability to unknown environments during training. It is better, however, to remove the OU process in practical application.
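A minimal sketch of OU exploration noise added to the Actor output during training is given below; the parameters theta, sigma and dt are common illustrative choices, not values from the paper.

# Ornstein-Uhlenbeck exploration noise (training only).
import numpy as np

class OUNoise:
    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = np.full(size, mu, dtype=np.float32)

    def sample(self):
        dx = self.theta * (self.mu - self.x) * self.dt \
             + self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape)
        self.x = self.x + dx
        return self.x

noise = OUNoise(size=3)
# During training: action = actor_output + noise.sample()
# In practical application the noise term is removed, as noted above.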
3 Experiments

3.1 Simulation experiments

The simulation environment in this paper is Gazebo with the robot operating system (ROS). Gazebo can help users to rapidly test
Fig. 6 The UAV model used in the design
(a) The model of the UAV, (b) The structure of the UAV
v_x = action.x m/s
v_y = action.y m/s    (2)
v_z = action.z m/s
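The velocity command of (2) can be issued through ROS as in the sketch below, assuming the action components are already scaled to m/s and that the flight stack listens on a Twist topic; the topic name /cmd_vel is an assumption, not taken from the paper.

# Publish the velocity command of (2) as a ROS Twist message.
import rospy
from geometry_msgs.msg import Twist

rospy.init_node('rl_obstacle_avoidance')
cmd_pub = rospy.Publisher('/cmd_vel', Twist, queue_size=1)

def publish_velocity(action_x, action_y, action_z):
    cmd = Twist()
    cmd.linear.x = action_x   # v_x in m/s
    cmd.linear.y = action_y   # v_y in m/s
    cmd.linear.z = action_z   # v_z in m/s
    cmd_pub.publish(cmd)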