A Decision Tree Based Method for
Fault Classification in Transmission Lines
S. M. Shahrtash, A. Jamehbozorg
Center of Excellence for Power System Automation and Operation
Iran University of Science & Technology
shahrtash@[Link] ajamebozorg@[Link]
978-1-4244-1904-3/08/$25.00 ©2008 IEEE

Abstract -- In this paper, a novel and accurate method is proposed for fault classification in transmission lines. The method is based on a decision tree and takes the 50 Hz to 950 Hz phasors of the voltages and currents measured at one end of the line as its inputs. The method is applied to a 400 kV transmission line, and the results show the highest possible accuracy (100% on the simulated test cases) within less than a quarter of a cycle after fault inception.

Index Terms -- Decision Tree, Fault Classification, Transmission Line.

I. INTRODUCTION
With the increasing dependency of industry on electrical power, this vital energy must be delivered to customers with high quality and short interruption times. To achieve the second goal, the detection, classification, and location of faults must be done in the shortest possible time.
Many techniques have been presented for the classification of faults in transmission lines, and each of them makes a trade-off between accuracy and speed.
Fault classification methods can be divided into two categories: conventional methods and pattern recognition methods.
In conventional methods, the detected faults are classified by performing mathematical calculations on the voltages and currents of the transmission line. In [1], internal variables are defined based on the three-phase voltages and currents at both ends of the line, and fault classification is performed by using these variables. The main disadvantage of this method is that it needs data from both ends of the protected line.
In pattern recognition methods, a number of soft computing techniques such as neural networks [2]-[5], wavelets [6]-[8], and fuzzy logic [9], [10] have been used. In these techniques, the classifier must first be trained on different fault cases, after which it can be used to classify a new case.
In [2], [3], and [5], a single neural network is used for classification, while several neural networks are used for fault classification in [4]. In [6]-[8] a wavelet approach, and in [9] a fuzzy-neuro approach, is presented for classification.
In this paper, a decision tree method is presented for fault classification in a single-circuit power transmission line. A fault detector based on the traveling waves initiated by the fault determines the exact fault inception time. Then the amplitudes and phases of the voltages and currents at the relaying point, calculated over a data window with 2 msec of post-fault samples, are used as the input features of a trained decision tree, which performs the classification. The simulations have been carried out with the EMTDC/PSCAD [11] software, and the results show the highest accuracy for the presented method.

II. DECISION TREE
In recent years, pattern recognition methods have been widely used in all fields of science. Because of the nonlinear nature of new problems, the increasing size of the input data, and the uncertainty in it, conventional solutions do not satisfy basic requirements such as short calculation time and high accuracy, whereas pattern recognition methods can inherently satisfy them. In particular, soft computing approaches have shown more robustness, speed, and accuracy than other techniques.
Pattern recognition has three stages: feature extraction, feature selection, and classification [12]. In most power system problems, the voltages and currents of the three phases (or a linear combination of them) are the first candidates for input features, so the first two stages are less important than the third.
One of the attractive approaches to pattern recognition is the decision tree (DT). A decision tree, irrespective of the
procedure used for creating the tree, has two stages: training and testing. In the first stage, after exact simulation of the entire system under different faults, the input features, with their known classes, are fed to the decision tree algorithm. In the tree-building process, different criteria can be used for evaluating the effect of the input features in determining the output classes. Generally, the most effective feature is selected as the first node of the tree, and its border value is used to create two branches. Then, by the same criterion, the next most effective feature is found in each branch. This process continues until the final nodes (the leaves of the tree) obtained in all of the branches contain only output classes. After the tree is created, a pruning process is performed to remove unnecessary nodes and decrease the size of the resulting tree. A typical decision tree is shown in Fig 1, where elliptical nodes are input features and rectangular nodes are output classes. The border value of each branch is written on it.
Finally, in the test stage, to determine the accuracy of the created tree, different sets of input features, obtained from various simulations, are fed to the tree and the outputs are compared with the known classes of each set.
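The selection of the "most effective feature" and its border value can be illustrated with a short Python sketch. The text does not name the criterion that was used, so Gini impurity is assumed here as one common choice; the function names and the toy data are hypothetical and are not taken from the simulations.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, border_value, weighted_impurity) of the best binary split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        # Candidate border values: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            border = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= border]
            right = [y for row, y in zip(rows, labels) if row[j] > border]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, border, score)
    return best

# Toy data (hypothetical): two features per case, two fault classes.
rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 4.0], [0.8, 2.0]]
labels = ["AG", "AG", "AB", "AB"]
feature, border, impurity = best_split(rows, labels)
```

On this toy data the first feature separates the two classes completely, so it becomes the root node; repeating the same search inside each branch grows the rest of the tree.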
FIG 1 - A TYPICAL DECISION TREE
It is worth mentioning that there are different procedures for creating a decision tree. In general, to find the best tree for a given problem, several procedures may be applied, and the best one is then selected by comparing the accuracy of the results and the time required to create the tree.

III. REPRESENTING PROPOSED METHOD
The proposed method uses the voltage of one phase and the currents of the other two phases from one end of the line. Then, by performing an HCDFT (Half Cycle Discrete Fourier Transform), the odd harmonics up to the 19th harmonic are calculated. These values (amplitudes and phases) are applied to the tree as input features for performing fault classification (as shown in Fig. 2).

FIG 2 - SCHEMATIC DIAGRAM OF PROPOSED ALGORITHM

Thus, to perform the classification in the proposed method, the exact time of fault inception must first be determined. The proposed fault detector is based on detecting the backward traveling wave measured at the relaying point. To reveal this wave, the high-frequency components of the sampled voltage and current are extracted and f is calculated by the following relation:
$$f = v - Z_C \, i \qquad (1)$$
where $Z_C$ is the surge impedance of the line. If no fault has occurred, the magnitude of f is low; after the fault occurrence, this value increases considerably.
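A minimal Python sketch of a detector built on relation (1) is given below. The filter used to extract the high-frequency components and the detection threshold are not specified in the text, so the sketch makes two assumptions: it subtracts the sample one cycle earlier to remove the steady-state 50 Hz component, and it uses a threshold of 5% of the peak voltage. The function name, the surge impedance value, and the synthetic signals are hypothetical.

```python
import numpy as np

def detect_inception(v, i, z_c, fs=10_000.0, f0=50.0, threshold=None):
    """Return the first sample index where |f| = |dv - z_c * di| exceeds the threshold, else None."""
    n_cycle = int(round(fs / f0))
    # Assumed filter: subtract the sample one cycle earlier to cancel the steady state.
    dv = v[n_cycle:] - v[:-n_cycle]
    di = i[n_cycle:] - i[:-n_cycle]
    f = dv - z_c * di                      # relation (1) on the filtered signals
    if threshold is None:
        threshold = 0.05 * np.max(np.abs(v))   # assumed setting, not from the text
    hits = np.flatnonzero(np.abs(f) > threshold)
    return None if hits.size == 0 else int(hits[0]) + n_cycle

# Synthetic per-unit signals (hypothetical): a step disturbance at sample 450.
fs, f0 = 10_000.0, 50.0
t = np.arange(800) / fs
v = np.sin(2 * np.pi * f0 * t)
i = 0.002 * np.sin(2 * np.pi * f0 * t - 0.3)
k_fault = 450
v[k_fault:] -= 0.4
i[k_fault:] += 0.001
k_detected = detect_inception(v, i, z_c=300.0)
```

Before the disturbance the filtered quantities are essentially zero, so f stays far below the threshold; at the disturbance the voltage drop and the current rise add up in f, which is what makes the sample of inception stand out.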
The accuracy of the method is considerably affected by the data window selected for extracting the phasor values. Based on various simulations, accurate classification can be performed with a data window that contains only 2 msec of post-fault samples (and 8 msec of pre-fault ones), with a sampling frequency of 10 kHz.
After the exact time of fault inception is determined, a data window of 100 samples at a 10 kHz sampling frequency (80 pre-fault samples and 20 post-fault ones) is taken, and an HCDFT extracts the odd harmonics of the signals up to the nineteenth (950 Hz in a 50 Hz system). These phasors are used as the input features of the DT, which performs the classification.
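The phasor extraction step can be sketched in Python as follows. This is a textbook half-cycle DFT written under the window length and sampling frequency stated above; it is not necessarily the exact implementation used for the simulations, and the function name and the synthetic check signal are hypothetical.

```python
import numpy as np

FS = 10_000.0   # sampling frequency [Hz], from the text
F0 = 50.0       # system frequency [Hz], from the text

def half_cycle_dft(window, harmonics=range(1, 20, 2)):
    """Return {h: (amplitude, phase_rad)} for odd harmonics of a half-cycle window."""
    x = np.asarray(window, dtype=float)
    n = x.size                      # samples per half cycle (100 at 10 kHz and 50 Hz)
    k = np.arange(n)
    result = {}
    for h in harmonics:
        # For odd h, half-wave symmetry x(t + T/2) = -x(t) lets half a cycle
        # give the same phasor as a full-cycle DFT.
        phasor = (2.0 / n) * np.sum(x * np.exp(-1j * np.pi * h * k / n))
        result[h] = (abs(phasor), float(np.angle(phasor)))
    return result

# Synthetic check signal (hypothetical): fundamental plus third harmonic.
n_half = int(FS / (2 * F0))
t = np.arange(n_half) / FS
signal = 2.0 * np.cos(2 * np.pi * F0 * t + 0.5) + 0.3 * np.cos(2 * np.pi * 3 * F0 * t - 1.0)
phasors = half_cycle_dft(signal)
```

Applied to one voltage and two currents, ten odd harmonics with amplitude and phase each would give 3 x 10 x 2 = 60 values per case. The half-cycle transform is exact only for signals with half-wave symmetry, so a decaying DC offset or even harmonics in a real post-fault window would leak into the extracted values.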
IV. SIMULATION RESULTS
To apply the proposed method, a single-circuit, 400 kV, 100 km transmission line, as shown in Fig 3 and Table 1, has been considered.
Various faults under different conditions (fault resistance, power flow angle, fault inception time, and fault location) have been simulated and divided into a training set (Table 2) and a test set (Table 3). As an example, the voltage waveform of phase "A" and its harmonics during an "ABC" fault are shown in Fig 4 and Fig 5, respectively.
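The sizes of the two sets follow from taking every combination of the parameter values listed in Tables 2 and 3. The short Python sketch below enumerates them; the dictionary keys and the labels for the ten fault types are illustrative names chosen here, not identifiers from the simulation files.

```python
from itertools import product

# Illustrative labels for the ten fault types (assumed naming).
FAULT_TYPES = ["AG", "BG", "CG", "AB", "BC", "AC", "ABG", "BCG", "ACG", "ABC"]

training_grid = {                      # values from Table 2
    "fault_resistance_ohm": [0, 30, 50],
    "power_angle_deg": [-20, -10, 10, 20],
    "inception_time_ms": [20, 22, 24, 26, 28],
    "location_km": [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95],
    "fault_type": FAULT_TYPES,
}
test_grid = {                          # values from Table 3
    "fault_resistance_ohm": [0, 20, 40],
    "power_angle_deg": [-15, -7, 7, 15],
    "inception_time_ms": [21, 23, 25, 27],
    "location_km": [15, 35, 65, 85],
    "fault_type": FAULT_TYPES,
}

def enumerate_cases(grid):
    """Return one dict per simulated case: the Cartesian product of all parameter values."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

training_cases = enumerate_cases(training_grid)   # 3 * 4 * 5 * 11 * 10 cases
test_cases = enumerate_cases(test_grid)           # 3 * 4 * 4 * 4 * 10 cases
```

The products give 6600 training cases and 1920 test cases, which agree with the counts quoted in this section. No parameter value is shared between the two grids except the fault resistance of 0 Ohm and the fault types, so the test cases are not copies of training cases.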
FIG 4 - VOLTAGE OF PHASE "A" BEFORE AND AFTER AN "ABC" FAULT
FIG 5 - AMPLITUDE AND PHASE OF THE WAVEFORM SHOWN IN FIG 4
Finally, the calculated phasors have been fed to the WEKA software [13] for fault classification. Among the different DT algorithms, the Random Forest algorithm was used for creating the trees. This algorithm creates a definable number of separate trees by using seven random features and trains each of them with the input data. Then, for each case, the result given by the largest number of trees is taken as the final result. Using this algorithm, the training time for 6600 cases is about five minutes, which is much lower than that of other methods such as neural networks (the final trees are not shown because of their large size).
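The way a forest combines its trees, by taking the most repeated result, can be sketched in a few lines of Python. The three "trees" below are hypothetical stand-ins written as simple threshold rules; they only illustrate the voting step and are unrelated to the trees produced by WEKA.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label among the individual tree predictions."""
    return Counter(predictions).most_common(1)[0][0]

def forest_predict(trees, features):
    """Classify one case by letting every tree vote on the feature vector."""
    return majority_vote([tree(features) for tree in trees])

# Hypothetical stand-in trees: each maps a feature vector to a fault label.
trees = [
    lambda x: "ABG" if x[0] > 0.5 else "AB",
    lambda x: "ABG" if x[1] > 0.2 else "AB",
    lambda x: "AB",
]
result = forest_predict(trees, [0.9, 0.4])   # votes: ABG, ABG, AB
```

Because each tree is trained on a different random selection, an error made by one tree tends to be outvoted by the others, which is the usual motivation for this kind of ensemble.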
FIG 3 - TRANSMISSION LINE MODEL

TABLE 1 - TRANSMISSION LINE PARAMETERS
Specification                        Value
Conductor Radius [m]                 0.015
DC Resistance [Ohm/km]               0.05
Sag for All Conductors [m]           5
No. of Sub-Cond. in a Bundle         1

TABLE 2 - SYSTEM PARAMETERS FOR TRAINING SET
Parameter                            Values
Fault Resistance [Ohm]               0, 30, 50
Power Transmission Angle [Degree]    -20, -10, 10, 20
Fault Inception Time [msec]          20, 22, 24, 26, 28
Location of Fault [km]               5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95
Type of Faults                       All 10 Types

TABLE 3 - SYSTEM PARAMETERS FOR TEST SET
Parameter                            Values
Fault Resistance [Ohm]               0, 20, 40
Power Transmission Angle [Degree]    -15, -7, 7, 15
Fault Inception Time [msec]          21, 23, 25, 27
Location of Fault [km]               15, 35, 65, 85
Type of Faults                       All 10 Types

With the proposed method, all of the 1920 test cases have been classified correctly. Also, because the selected data window contains only 2 msec of post-fault samples, the time required for the classification is limited to 2 msec. Compared with the published methods [1]-[9], this time is very short. Table 4 compares the accuracy and speed of those methods, as reported by their authors, with the presented method.

TABLE 4 - ACCURACY AND SPEED OF CLASSIFICATION METHODS
Classification Method        Accuracy    Classification Time
Presented Method in [1]      100%        5 msec
Presented Method in [2]      100%        6 msec
Presented Method in [3]      -           5 msec
Presented Method in [4]      99.5%       7-8 msec
Presented Method in [5]      100%        -
Presented Method in [6]      -           40 msec
Presented Method in [7]      99%         40 msec
Presented Method in [8]      -           10 msec
Presented Method in [9]      -           10 msec
Proposed Method              100%        2 msec

V. DISCUSSION
In this section, the effects of the sampling frequency, the number of harmonics used for training, and the use of the harmonic phases in the training process are discussed. Also, the accuracy of the proposed method in classifying faults for cases beyond the training set is shown.
Finally, the contribution of the fault detector error is described.

A. Effect of the sampling frequency
According to Table 5, the accuracy of the proposed method increases from 98.85 to 100 percent when the sampling frequency is increased from 5 kHz to 10 kHz. The main reason for this change is that a higher sampling frequency results in higher accuracy in obtaining the phasors by HCDFT.
It is remarkable that the only misclassifications were some "LLG" faults that had been classified as "LL" faults. Half of these wrong classifications occurred for "ABG" faults and the other half for "BCG" faults, which were classified as "AB" and "BC" faults, respectively. Also, 75 percent of these misclassifications occurred for faults located 65 km from the relaying point.

TABLE 5 - SAMPLING FREQUENCY AND THE ACCURACY
Sampling Frequency    Accuracy
5 kHz                 98.85%
10 kHz                100%

B. Effect of number of applied harmonics
Generally, increasing the number of input features increases the accuracy of classification by a DT. According to Table 6, the accuracy of the method increases from 99.48 to 100 percent when 10 harmonics are used instead of the first 5. Again, the misclassifications are "ABG" faults that have been classified as "AB" faults. It is worth noting that a misclassification between an "LLG" fault and the similar "LL" fault (e.g. "ABG" and "AB" faults) is of minor consequence for many practical applications, such as auto-reclosing.

TABLE 6 - NUMBER OF APPLIED HARMONICS AND THE ACCURACY
Number of Applied Harmonics    Accuracy
5 (50-450 Hz)                  99.48%
10 (50-950 Hz)                 100%

C. Effect of applying phases of the harmonics
The effect of using the phases (along with the amplitudes) on the accuracy of the method has also been studied by creating new training and test sets that contain only the amplitudes of the phasors. According to Table 7, the accuracy of the proposed method decreases when the phases are discarded as input features of the DT; therefore, using the phasor angles is necessary for accurate classification.

TABLE 7 - ANGLES IN INPUT FEATURES AND THE ACCURACY
Accuracy with using phasor angles       100%
Accuracy without using phasor angles    95.27%

D. Effect of exterior cases
Obviously, for faults whose conditions are beyond the conditions that were considered in the training stage, the classification accuracy will decrease considerably. To investigate the accuracy of the proposed method in these cases, 960 faults located 2.5 km and 97.5 km from the relaying point (the training cases covered 5 to 95 km), with the same conditions as those considered for the test stage (Table 3), have been simulated and classified. The results are shown in Table 8.

TABLE 8 - EXTERIOR CASES AND THE ACCURACY
Fault Location    Accuracy
2.5 km            83.95%
97.5 km           95.83%

According to these results, the accuracy of the method decreases considerably for faults that are very close to the relaying point. Table 9 shows the details of the misclassified results and their percentages for the cases shown in Table 8. As can be inferred, for faults at 97.5 km, 70% of the misclassifications were "BC" faults classified as "BCG". For close-in faults, in addition to these misclassifications, a number of "LL" and "LLG" faults were classified as "LLL" faults because of the large amplitude of the short-circuit currents.
Obviously, including the data of close-in and remote faults in the training set will result in correct classification of the abovementioned faults as well.

TABLE 9 - MISCLASSIFIED TYPES FOR THE CASES IN TABLE 8
Fault Location    Fault Type    Predicted Fault    Percentage between Misclassified Cases
2.5 km            BC            BCG                21.3%
2.5 km            BG            ABC                13.3%
2.5 km            CG            ABC                13.3%
2.5 km            ACG           ABC                12%
2.5 km            BCG           ABC                10.6%
2.5 km            Others                           29.5%
97.5 km           BC            BCG                70%
97.5 km           Others                           30%

E. Contribution of fault detector error
Because the surge impedance is taken as a constant value (equal to its magnitude at very high frequencies) when detecting the backward traveling waves through the function f, and because of the noise in the system (the power network and the measuring devices), the fault detector may make an error of at most one sample in determining the fault inception time. This case (training with a data window with 2 msec of post-fault samples while testing with a data window with 2.1 msec of post-fault samples) was examined, and the created DT showed only 2.5% error in classification.
However, to maintain the highest possible accuracy, the cases shown in Table 2 have been considered again for training, but with a data window containing 2.1 msec of post-fault samples (one sample more, at the 10 kHz sampling frequency) for calculating the phasors, i.e. the input features of the DT. Thus, the training set has been formed from two subsets (one resulting from the 2 msec data window and the other as described above). The number of
training cases has been increased to 13200 and a new DT has been created. The results have shown 100% accuracy and insensitivity to the fault detector error.
The training time has increased to 9 minutes, which is still acceptable.

VI. CONCLUSION
In this paper, a novel decision tree based method has been introduced for fault classification in single-circuit power transmission lines. The proposed method requires data from only one end of the protected line, and decision making is performed in just 2 msec, which is the shortest time among the previously published methods. The input features of the DT are the phasor amplitudes and phases of the voltage of one phase and the currents of the other two, from the fundamental frequency up to the nineteenth harmonic. According to the simulations, and given correct detection of the fault inception time, the proposed method has shown 100 percent accuracy.
The influence of the sampling frequency, the number of harmonics, and the phases of the phasors has been shown, and the suitability of the values chosen for these parameters has been demonstrated. It was also shown that the accuracy can be maintained, even if the detector makes an error in determining the fault inception time, by completing the training set for this condition.
Based on these results, the authors are going to extend the presented method to fault classification in double-circuit lines.

VII. REFERENCES
[1] J. A. Jiang, C. S. Chen, C. W. Liu, "A New Protection Scheme for Fault Detection, Direction, Discrimination, Classification, and Location in Transmission Lines", IEEE Trans. on Power Delivery, Vol. 18, No. 1, January 2003.
[2] T. Dalstein, B. Kulicke, "Neural Network Approach To Fault Classification for High Speed Protective Relaying", IEEE Trans. on Power Delivery, Vol. 10, No. 2, April 1995.
[3] W. M. Lin, C. D. Yang, J. H. Lin, M. T. Tsay, "A Fault Classification Method by RBF Neural Network with OLS Learning Procedure", IEEE Trans. on Power Delivery, Vol. 16, No. 4, October 2001.
[4] M. Oleskovicz, D. V. Coury, R. K. Aggarwal, "A Complete Scheme For Fault Detection, Classification And Location In Transmission Lines Using Neural Networks", Developments in Power System Protection, Conference Publication No. 479, IEE 2001.
[5] B. H. Chowdhury, K. Wang, "Fault Classification Using Kohonen Feature Mapping", International Conference on Intelligent Systems Applications to Power Systems, 1996.
[6] D. Das, N. K. Singh, A. K. Sinha, "A Comparison of Fourier Transform and Wavelet Transform Methods for Detection and Classification of Faults on Transmission Lines", IEEE Power India Conference, 10-12 April.
[7] K. M. Silva, B. A. Souza, N. S. D. Brito, "Fault Detection and Classification in Transmission Lines Based on Wavelet Transform and ANN", IEEE Trans. on Power Delivery, Vol. 21, No. 4, October 2006.
[8] O. A. S. Youssef, "Combined Fuzzy-Logic Wavelet-Based Fault Classification Technique for Power System Relaying", IEEE Trans. on Power Delivery, Vol. 19, No. 2, October 2004.
[9] H. Wang, W. W. L. Keerthipala, "Fuzzy-Neuro Approach to Fault Classification for Transmission Line Protection", IEEE Trans. on Power Delivery, Vol. 13, No. 4, October 1998.
[10] B. Das, J. V. Reddy, "Fuzzy-Logic-Based Fault Classification Scheme for Digital Distance Protection", IEEE Trans. on Power Delivery, Vol. 20, No. 2, October 2005.
[11] Electromagnetic Transient Program (EMTDC/PSCAD), and Real Time Digital Simulator (RTDS) manual, Manitoba HVDC Research Center, Winnipeg, Canada, 1994 release.
[12] Y. Sheng, S. M. Rovnyak, "Decision Tree-Based Methodology for High Impedance Fault Detection", IEEE Trans. on Power Delivery, Vol. 19, No. 2, April 2004.
[13] [Link]

VIII. BIBLIOGRAPHY
S. Mohammad Shahrtash was born in Tehran, Iran, in 1960. He received the B.S. degree in Electrical Engineering from the AIT, Abadan, Iran in 1980, the M.S. degree in Electrical Engineering from UMIST, England in 1985, and the PhD from Sharif University of Technology, Iran in 1995. Since 1985, he has been a member of the academic staff in the Electrical Engineering Department of Iran University of Science & Technology (IUST). His main research areas are Protection, Electromagnetic Transient Analysis and Power System Studies.

Arash Jamehbozorg was born in Hamedan, Iran, in 1982. He received the B.S. degree in Electrical Engineering from the Iran University of Science and Technology (IUST), Iran in 2005, and is currently an M.S. student in Power System Engineering at IUST.