2019 International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON)
The Deep Learning Based Power Line Defect Detection System Built on Data Collected by the Cablewalker Drone
Evgenii Titov, FutureLab LLC, Yekaterinburg, Russia, [email protected]
Oksana Limanovskaya, Ural Federal University, Yekaterinburg, Russia, [email protected]
Alexandr Lemekh, FutureLab LLC, Yekaterinburg, Russia, [email protected]
Daria Volkova, FutureLab LLC, Yekaterinburg, Russia, [email protected]
Abstract—Traditional technologies used for servicing and inspecting power line condition are expensive, time consuming and put people's lives at risk. Employing UAVs equipped with payloads for power line inspection, combined with machine vision and machine learning techniques to analyze the data collected, can offer great improvements in terms of defect detection efficiency, cost and safety. The article describes the neural network created to detect and classify power line poles on images or videos. The neural network in question is the first part of the larger end-to-end power line defect detection software being developed.

Keywords—deep learning, object detection, darknet, defect detection, power line maintenance, metal wire maintenance, fault detection, image processing, ROV, YOLO, transmission line, suspension insulators, UAV, cablewalker, drone.
I. INTRODUCTION

The accumulated experience of power transmission line inspection has shown that defects on power lines and their components arise regularly. These defects do not occur suddenly, though, but evolve over time. Therefore, periodic inspections of lines and components are an effective means of preventing unplanned downtime and accidents on transmission lines. The greatest concern is the metal condition of power wires and the lightning protection cable, as their destruction can lead to a system failure.

Traditional power line inspection technologies have not changed in recent decades [1]. The traditional inspection of power lines is carried out by a team of specialists who take photos and videos from the ground or a helicopter. The resulting photos and materials are then reviewed by experts to identify defects. The main advantage of this approach is the ability to identify the entire range of power line failures. However, such maintenance measures are expensive, time consuming and often associated with risk to the lives of the personnel involved.

Many attempts are being made to automate the diagnosis of the status of power lines [1-3]. One of the most promising methods for the automatic diagnosis of power lines is the use of UAV (Unmanned Aerial Vehicle) technology. In UAV inspection, the inspection is conducted by UAVs equipped with sensors and cameras for navigating along power lines, performing online inspection to detect obvious faults, and collecting data for later offline inspections. To detect defects of power line elements discovered by UAVs, different machine learning and deep learning techniques can be applied.

In terms of payloads used for data acquisition purposes, drones often get equipped with various photo/video cameras and other sensors such as laser scanners. However, in the vast majority of cases the data collected is processed by personnel, because detecting defects is not a straightforward task even for humans, mainly because defects can take numerous shapes and forms. On top of that, certain defects, such as cracks in the internal structure of power line wires, cannot be found directly, which complicates the task even more.

Indeed, there have been some research papers dedicated to the topic of defect detection in the field of transmission power lines. To name a few, Zahid et al. [4] described a system able to recognize defects on power line insulators of different types, employing convolutional neural networks together with the rotation normalization and ellipse detection methods introduced by the authors. However, despite performing well, this approach is limited to insulators only, which leaves the rest of the structure unattended. Jalil et al. [5] suggested a deep learning based solution to detect defects in power line wires and insulators. However, again, the approach is limited to a couple of components only. On top of that, this solution will not work without an IR camera, which further complicates an already complicated system.

Having said that, all the solutions proposed are either incomplete, in the sense that they are limited to a specific power line component, or heavily rely on certain conditions to produce adequate results, which puts their effectiveness in the day-to-day power line inspection routine into question. The purpose of this paper is to develop an algorithm based on deep learning techniques (block one in Fig. 1) for the end-to-end defect detection system.
II. SYSTEM ARCHITECTURE

Block one, considered in the current paper, will be tasked with detecting and classifying power line poles on images or videos. The coordinates of the poles detected will then be used by the follow-up component detecting neural networks (block 2). The main task of the second block will be, based on the image sections containing the utility poles obtained from block one, to detect the pole's components and then transfer both the coordinates of the components and of the poles detected to the defect detecting algorithms (block 3). The last block of the system will comprise a number of different machine learning, deep learning and machine vision techniques responsible for detecting defects associated with each component or with the utility poles themselves: for instance, cracks on concrete poles, missing or dirty insulator plates etc.

Any defects detected will be saved to a database alongside other crucial information: the object on which the defect was found, the image/video name, the time etc.
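The paper does not specify the storage schema for these defect records. Purely for illustration, a record of the kind described above could be persisted roughly as follows; the table and field names here are hypothetical assumptions, not taken from the authors' implementation.

# Hypothetical sketch of persisting one detected defect; table and field
# names are illustrative only and are not taken from the paper.
import sqlite3
from dataclasses import dataclass
from datetime import datetime

@dataclass
class DefectRecord:
    object_class: str   # e.g. "concrete_pole" or "metal_pole"
    component: str      # component on which the defect was found
    defect_type: str    # e.g. "crack", "dirty_insulator_plate"
    source_file: str    # image or video the defect was detected in
    detected_at: str    # timestamp of the detection

def save_defect(db_path: str, rec: DefectRecord) -> None:
    """Append one defect record to a SQLite database."""
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS defects
           (object_class TEXT, component TEXT, defect_type TEXT,
            source_file TEXT, detected_at TEXT)"""
    )
    con.execute("INSERT INTO defects VALUES (?, ?, ?, ?, ?)",
                (rec.object_class, rec.component, rec.defect_type,
                 rec.source_file, rec.detected_at))
    con.commit()
    con.close()

# Example usage
save_defect("defects.db", DefectRecord(
    "concrete_pole", "insulator", "dirty_insulator_plate",
    "flight_042.mp4", datetime.now().isoformat()))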
Fig. 1 Generalized defect detection system's workflow

III. DATA USED FOR TRAINING

The problem being solved is not a trivial classification problem because, besides having to classify potential defects, we first need to detect the bodies on which those defects can be found. Power line poles consist of numerous components such as wires, insulators, vibration dampers, cross arms etc. All these components can be subject to certain defects. Before those elements can be analysed, we need to identify the main object that features all of them – the utility pole itself.

Several state-of-the-art neural network systems that can do both regression and classification tasks almost in real time have been considered, among them SSD, the whole R-CNN family and YOLO.

We decided to go with the 3rd version of the well-known You Only Look Once (YOLO) system. This system has several advantages compared to other approaches, such as the ability to detect objects at different scales (due to the fact that YOLO outputs results from three different so-called yolo layers), its speed and its relatively high prediction accuracy.

In order to train the poles detecting and classifying neural network, a dataset of approximately 4500 images has been collected. The dataset consists of both images downloaded from the Internet (around 15%) and images taken by the team members and the Cablewalker drone. No images from 3D models of utility poles were used. Images in the dataset are mainly RGB of varying resolutions, with a minimum resolution threshold of 1920 x 1080 pixels.

In order to train the neural network, the images comprising the dataset had to be manually labelled. Labelling was done by means of the open-source image and video labelling software OpenLabelling [7]. For each labelled image there is a text document reflecting the number of objects, the coordinates of the bounding boxes drawn and the classes of the objects highlighted (Fig. 2).

Fig. 2 Dataset labelling
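The paper does not spell out the annotation file layout. Assuming the common darknet/YOLO text format that OpenLabelling can export (one object per line, "class_id x_center y_center width height", coordinates normalised to the image size), such a label file could be read roughly like this; the format assumption and helper names are ours, not the authors'.

# Sketch of reading one darknet/YOLO-style label file (assumed format:
# "<class_id> <x_center> <y_center> <width> <height>", normalised to [0, 1]).
from typing import List, Tuple

def read_label_file(path: str) -> List[Tuple[int, float, float, float, float]]:
    boxes = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 5:
                continue  # skip malformed lines
            class_id = int(parts[0])
            x_c, y_c, w, h = map(float, parts[1:])
            boxes.append((class_id, x_c, y_c, w, h))
    return boxes

def to_pixel_box(box, img_w: int, img_h: int):
    """Convert a normalised (x_center, y_center, w, h) box to pixel corners."""
    _, x_c, y_c, w, h = box
    x1 = int((x_c - w / 2) * img_w)
    y1 = int((y_c - h / 2) * img_h)
    x2 = int((x_c + w / 2) * img_w)
    y2 = int((y_c + h / 2) * img_h)
    return x1, y1, x2, y2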
Since YOLOv3 is invariant to the size of input images, no image preprocessing except for finding and removing low resolution images has been performed. In the future we also want to bring all images to one size and retrain the model in order to see whether this can improve the model's performance.
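A minimal sketch of the kind of resolution filter mentioned above, assuming Pillow is used to read image sizes and using the 1920 x 1080 threshold from the dataset description; the directory layout and file pattern are illustrative.

# Sketch: keep only images that meet the minimum resolution threshold (1920 x 1080).
from pathlib import Path
from PIL import Image

MIN_W, MIN_H = 1920, 1080

def filter_low_resolution(dataset_dir: str) -> list:
    """Return the paths of images that meet the minimum resolution."""
    kept = []
    for path in Path(dataset_dir).glob("*.jpg"):
        with Image.open(path) as img:
            w, h = img.size
        if w >= MIN_W and h >= MIN_H:
            kept.append(path)
    return kept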
IV. INSTRUMENTS

A. YOLOv3 system

Unlike traditional machine vision approaches where systems use a sliding window to locate objects across an image, YOLO approaches the task from another angle and forwards the whole image through the network layers only once.

The latest, 3rd version of the network made the performance even faster and more accurate than its predecessors. Another notable innovation is the ability to output results from three different layers, which allows the network to better locate and classify objects at smaller scales [6].

YOLO breaks down an input image into an N x M grid of cells. The values of N and M depend on where in the network downsampling takes place and vary within the 13 – 52 cell range; for the default 416 x 416 input this corresponds to 13 x 13, 26 x 26 and 52 x 52 grids.
For each cell the system outputs a 1-D tensor of the form shown in Fig. 3:

Fig. 3 Output vector (pc, bx, by, bh, bw, C1, ..., Cn)

where pc is the object detection confidence, bx, by, bh, bw are responsible for object location and bounding box detection, and C1, C2, ..., Cn are class scores (up to 80 classes can be detected simultaneously) [9].
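As a rough illustration of how one such per-cell vector can be unpacked (the layout follows the description above; the function and index choices are ours):

# Sketch: unpack one per-cell prediction vector laid out as
# [pc, bx, by, bh, bw, C1, ..., Cn], as described above.
import numpy as np

def unpack_cell_prediction(vec: np.ndarray):
    pc = vec[0]                 # objectness / detection confidence
    bx, by, bh, bw = vec[1:5]   # bounding box parameters
    class_scores = vec[5:]      # one score per class (up to 80 in YOLOv3)
    best_class = int(np.argmax(class_scores))
    return pc, (bx, by, bh, bw), best_class, float(class_scores[best_class])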
Often we might encounter situations where different instances of the classes being detected neighbour each other within one grid cell. In order to tackle this detection difficulty, YOLO uses anchor boxes that "add" an extra dimension to the output labels by pre-defining a number of anchor boxes for each grid cell. In other words, anchor boxes extend YOLO's output for each pre-defined anchor box, which in turn allows detection of multiple objects of different shapes and forms within one grid cell. In total the YOLOv3 system uses 9 anchor boxes, 3 for each scale [7].

In order to get rid of multiple bounding boxes predicted for the same object, YOLO employs the concepts of non-maximum suppression and intersection over union (IoU) [7].
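For clarity, here is a compact sketch of the IoU computation and greedy non-maximum suppression mentioned above; boxes are given as (x1, y1, x2, y2) corners and the threshold value is illustrative.

# Sketch of IoU and greedy non-maximum suppression over (x1, y1, x2, y2) boxes.
def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def non_max_suppression(boxes, scores, iou_thr=0.45):
    """Keep the highest-scoring box and drop overlapping lower-scoring ones."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thr]
    return keep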
B. Training the YOLOv3 model

The poles detecting neural network has been built in two steps. First, in order to get the custom weights trained for utility pole detection and classification, the darknet build by AlexeyAB was used [10]. Then, the code for the block one neural network was written in Python using the Open Source Computer Vision Library (OpenCV).

Darknet, despite being an easy and convenient way to get results relatively fast and allowing GPU capabilities to be used for training and running the network, turned out to be painful to integrate into a larger Python-based defect detection software. However, Darknet turned out to be a perfect tool for transfer learning tasks. In order to get the custom weights for power line pole detection and classification, we took the weights pretrained on the COCO dataset. At first the build was copied using Git and assembled using the Visual Studio 2017 software. All necessary steps were taken following the original AlexeyAB guide [10]. The network training was performed on a GeForce GTX 1070 8 Gb GPU.
In terms of the neural network topology, for each yolo layer outputting results the number of classes being detected was changed to two. Batch size was set to 32, and the width and height of the input images were set to the standard 416 x 416. The number of filters in the layers preceding the yolo layers was recalculated based on the number of classes being detected. Furthermore, the anchors have been recalculated and updated in the system configuration file to increase the learning abilities of the neural network. Apart from that, no other major changes were introduced to the system topology.
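For reference, in the AlexeyAB darknet configuration the convolutional layer immediately preceding each [yolo] layer needs filters = (classes + 5) * 3, which for two classes gives 21. A configuration fragment in the spirit of the changes described above might look like the excerpt below; the anchor values shown are the COCO defaults, not the recalculated anchors actually used by the authors, and the omitted backbone layers are left unchanged.

[net]
# training resolution and batch size used in the paper
batch=32
width=416
height=416

# ... backbone layers unchanged ...

# last convolutional layer before each [yolo] layer:
# filters = (classes + 5) * 3 = (2 + 5) * 3 = 21
[convolutional]
size=1
stride=1
pad=1
filters=21
activation=linear

[yolo]
mask = 6,7,8
# COCO default anchors shown here; the authors recalculated them for their dataset
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=2
num=9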
After all preliminary steps had been taken, the build assembled, and the necessary files created and put into the appropriate places, the system was trained.

The system was trained for 4500 epochs and the training time constituted approximately 16 hours. The system was trained on approximately 5200 images, where 700 images contained objects of similar shapes such as antennas, cranes, ships, signs, poles etc. as a means of preventing model overfitting. Approximately ten percent of all images (~500) have been used for testing purposes.

Fig. 4 Mean average precision and error rate during the training process

As we can see in Fig. 4, closer to the end of training (after the 4000th epoch) the model began to overfit: system performance started to saturate and mean average precision started to go down, while the error rate was slowly decreasing. The best precision (92%) was reached around the 3950th epoch.

Both objects (concrete and metal poles) have unique shapes that can be broken down into a number of distinct features. The precision values of metal and concrete pole classification constituted 91.11% and 91.08% respectively, with a relatively small number of misclassifications.

Fig. 5 Confusion matrix and performance metrics

V. RESULTS

After the weights had been obtained, the YOLOv3 based poles detecting neural network was implemented using the OpenCV library.
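The paper does not include the implementation itself. A minimal sketch of running darknet-trained YOLOv3 weights through OpenCV's DNN module is shown below; the file names, input size and thresholds are illustrative assumptions, not the authors' actual values.

# Sketch: load custom darknet weights into OpenCV's DNN module and run a
# single forward pass (paths and thresholds are illustrative).
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3-poles.cfg", "yolov3-poles.weights")
out_names = net.getUnconnectedOutLayersNames()  # the three yolo output layers

def detect_poles(image, conf_thr=0.5, nms_thr=0.4):
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    boxes, confidences, class_ids = [], [], []
    for output in net.forward(out_names):
        for det in output:                      # [cx, cy, bw, bh, pc, scores...]
            scores = det[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence < conf_thr:
                continue
            cx, cy, bw, bh = det[:4] * np.array([w, h, w, h])
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(confidence)
            class_ids.append(class_id)
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_thr, nms_thr)
    return [(class_ids[i], confidences[i], boxes[i])
            for i in np.array(keep).flatten()]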
According to Fig. 1, the defect detection system can accept input data in many different forms: a single image, a folder of images or a video.

Each object detected on an image or a video frame by the neural network is represented as a class object with a number of attributes: the coordinates of the object found, which will be used to draw bounding boxes around the objects detected and will be sent to the subsequent neural networks in block 2 for the component detection task; the class of the object detected; the size of the object detected relative to the whole frame or image being processed; etc.

The first block of the defect detection system also accommodates a number of data pre-processing functions. For instance, if an image is used as the input to the system, depending upon what device this image was taken on it might have some metadata associated with it. The presence of metadata, and especially of the orientation flag, can deteriorate the system's performance by rotating images before they get processed by the poles detection neural network. Therefore, all images to be processed by the neural networks get "cleaned" first.
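The exact cleaning step is not described in the paper. One common way to neutralise the EXIF orientation flag before inference is to apply the stored rotation once and re-save the image without metadata, for instance with Pillow; the sketch below is an assumption about how such cleaning could be done, not the authors' code.

# Sketch: bake the EXIF orientation into the pixels and strip the metadata,
# so the detector always sees the image in its displayed orientation.
from PIL import Image, ImageOps

def clean_image(src_path: str, dst_path: str) -> None:
    with Image.open(src_path) as img:
        upright = ImageOps.exif_transpose(img)  # apply the orientation flag, if any
        upright.save(dst_path)                  # re-saving without an exif argument drops the metadata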
The system was tested on a separate dataset created for testing purposes only. We found out that the model's performance highly depends upon the size of the objects being detected relative to the whole image, the image quality and some other minor factors such as background, noise, obstructions etc.

Fig. 6 Example of intermediate towers detection

Fig. 7 Example of anchor towers detection

Fig. 8 Example of close-up intermediate tower detection

Fig. 9 Example of close-up anchor tower detection

Further improvements to the system performance can be achieved by obtaining better weights. In order to do so, the dataset needs to be increased in size with quality high resolution images of poles, and more images without any poles on them need to be added (in this case we used only 700). On top of that, a more accurate and careful labelling job needs to be performed. Furthermore, the darknet network resolution can be increased up to 608 or even 832 pixels, which will increase precision and make the new weights more sensitive to smaller objects. However, the network resolution will not be increased for the pole detecting case, since we do not want to focus on small pole details.
Future developments
As we've mentioned above, the first block only detects and classifies power line poles. These detected objects get sent to block 2, which specializes in detecting power line pole components.

To organize a smooth exchange of information between the blocks of the system, it was decided to use a simple Python dictionary. Keys are represented by class objects reflecting a section of the input frame or image in which an object (concrete or metal pole) was detected. For instance, let's consider Fig. 10. Block one outputs a dictionary of the form {whole_image : obj_detected_1 – obj_detected_4}, where whole_image is the class object that has, as an attribute, the coordinates of the section of the image in which those objects were detected. For block one this is the whole image (Fig. 10). The dictionary values – the detected objects – on the other hand, also have their coordinates relative to the whole image stored.
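A minimal sketch of the kind of hand-off structure described above is shown below; the class and attribute names are illustrative assumptions rather than the authors' actual implementation.

# Sketch of the inter-block hand-off structure described above; names are
# illustrative. Keys are image sections, values are the objects found in them.
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass(frozen=True)
class ImageSection:
    source: str                      # image / video frame identifier
    box: Tuple[int, int, int, int]   # (x, y, w, h) of the section in the frame

@dataclass
class DetectedObject:
    label: str                       # "concrete_pole" or "metal_pole"
    confidence: float
    box: Tuple[int, int, int, int]   # coordinates relative to the whole image
    status: str = "ok"               # later set to "defected" by block 3

# Block 1 output: the whole image maps to the poles detected in it.
whole_image = ImageSection(source="frame_0001.jpg", box=(0, 0, 1920, 1080))
block1_output: Dict[ImageSection, List[DetectedObject]] = {
    whole_image: [
        DetectedObject("metal_pole", 0.93, (410, 120, 260, 800)),
        DetectedObject("concrete_pole", 0.88, (1250, 200, 180, 700)),
    ]
}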
Fig. 10 Example of an image processed by the block 1 neural network

Fig. 11 Example of the output produced by the block 1 neural network to be sent to the second block (panels a–d)

Then, in block two, each object (pole) gets sent to the component detecting neural network, which uses only the image section of this object to detect components, instead of the whole image. To visualize it better, have a look at Fig. 11: each object represents what gets sent to the neural networks in block two. All components detected get stored in the same dictionary following the same logic: keys – the image section associated with the power pole being processed, values – the objects (pole components) detected.

We encountered a couple of issues that helped us catch some system logic flaws. For instance, if an image contained a close-up of a power line component, the system in one of its early versions would not be able to detect it, because first it was searching for poles and, having failed to find any, it would not even try searching for pole components. In addition, an object's bounding box is often not perfectly accurate and might not cover the whole pole. So, it was decided to widen the bounding boxes to make sure all of an object's components are included (especially insulators and vibration dampers sticking out to the sides).
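A sketch of the kind of box widening described above, with an assumed relative margin and clipping to the image borders; the margin value is illustrative, as the paper does not state one.

# Sketch: widen a detected pole's bounding box by a relative margin so that
# components sticking out to the sides stay inside the crop sent to block 2.
def widen_box(box, img_w, img_h, margin=0.15):
    x, y, w, h = box
    dx, dy = int(w * margin), int(h * margin)
    x1 = max(0, x - dx)
    y1 = max(0, y - dy)
    x2 = min(img_w, x + w + dx)
    y2 = min(img_h, y + h + dy)
    return x1, y1, x2 - x1, y2 - y1   # widened (x, y, w, h), clipped to the image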
Once a frame has been processed, if any objects were detected on it, they will be sent to block 3, which consists of a number of defect detecting algorithms. If a defect is found on any of those objects, the object's attribute reflecting its status gets changed to defected and stored to a database.

VI. CONCLUSION

The deep learning-based software to detect and classify utility poles on images and video has been developed and considered in the current paper. The place and tasks of this block 1 neural network in the larger system were explained, and the general operational workflow of the system was described.

Once complete, the system will be used to analyse data collected by UAVs to detect different types of defects associated with different components of power lines. This, in turn, will ensure less expensive, safer and easier maintenance of power lines.

REFERENCES
[1] V. Nh. Nguyen, R. Jenssen and D. Roverso, "Automatic autonomous vision-based power line inspection: A review of current status and the potential role of deep learning," Electrical Power and Energy Systems, vol. 99, pp. 107-120, January 2018.
[2] B. Chen and X. Miao, "Distribution Line Pole Detection and Counting Based on YOLO Using UAV Inspection Line Video," Journal of Electrical Engineering & Technology, in press, 2019.
[3] B. Jalil, G. R. Leone, M. Martinelli, D. Moroni, M. A. Pascali and A. Berton, "Fault Detection in Power Equipment via an Unmanned Aerial System Using Multi Modal Data," Sensors, vol. 19, pp. 1-15, July 2019.
[4] Z. Siddiqui et al., "Robust Powerline Equipment Inspection System Based on a Convolutional Neural Network," Sensors, vol. 18, no. 11, p. 3837, 2018.
[5] B. Jalil et al., "Fault Detection in Power Equipment via an Unmanned Aerial System Using Multi Modal Data," Sensors, vol. 19, no. 13, p. 3014, 2019.
[6] K. D. Foote, "A brief history of deep learning," Feb. 2017. Accessed: Sept. 1, 2019. [Online]. Available: https://www.dataversity.net/brief-history-deep-learning/
[7] J. Cartucho, "OpenLabelling: open-source image and video labeler." Accessed: August 27, 2019. [Online]. Available: https://github.com/Cartucho/OpenLabeling/blob/master/README.md
[8] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[9] A. Kathuria, "What's new in YOLO v3." Accessed: Sept. 3, 2019. [Online]. Available: https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b
[10] AlexeyAB, "Darknet." Accessed: August 29, 2019. [Online]. Available: https://github.com/AlexeyAB/darknet