Machine Learning for 802.11 Troubleshooting
Abstract: The rapidly increasing popularity of 802.11 WLANs, along with the co-existence of multiple heterogeneous devices in the unlicensed frequency bands, has created unprecedented levels of congestion, especially in densely populated urban areas. Under such complex setups, WLAN under-performance issues experienced by end-users are hard to interpret, even for experts. In this paper, we develop an intelligent, easy-to-deploy mechanism that takes advantage of MAC-layer exported data and employs machine learning techniques to accurately diagnose the five most common WiFi pathologies (contention, low SNR, non-802.11 interference, etc.). The collected data are fed to four different classification algorithms, whose hyper-parameters we fine-tune to optimize precision and accuracy. The resulting solution provides two different mechanisms, the first targeting low-overhead passive detection and the second offering more accurate detection by relying on active probing. Detection performance is evaluated through extensive testbed experiments, which show that the K-Nearest Neighbors classifier achieves almost 100% accuracy and precision for active probing and 95% accuracy and precision for passive detection across the five considered pathologies.

I. INTRODUCTION

To this day, the IEEE 802.11 standard, most commonly known as WiFi, has grown enormously and has found its way into every single home and enterprise. Private and public WiFi networks are responsible for transferring a significant percentage, 45%, of the total IP traffic and are expected to grow even more, turning this percentage to 49% by 2020 [1].

In addition, the growing adoption of IoT through the emergence of smart TVs, wearable devices (smartwatches, activity trackers), security cameras, energy monitoring devices and even smart light bulbs has greatly expanded common home WLANs. All these devices, although having different characteristics and often operating under a different protocol (WiFi, ZigBee, Bluetooth), are bound to coexist in dense urban environments and in the limited unlicensed wireless spectrum. However, this coexistence can often become problematic for WiFi users, due to the underlying contention for access to the wireless medium or due to the interference caused by non-WiFi devices (even microwave ovens). Furthermore, inexperienced users are prone to mistakes when it comes to the deployment of Access Points (APs), leading to poor channel conditions for their end-devices. The flawless operation of a home WiFi network becomes even more complex when we also consider inherent impairments of the 802.11 protocol, such as the Hidden Terminal and Capture Effect phenomena. All these factors contribute to the frequently observed performance degradation of end-user applications, which is difficult for users lacking sufficient expertise, as most users do, to attribute to a specific cause.

Detection and troubleshooting of performance problems in WiFi networks is a particularly difficult and frustrating task. This is due to the complex and dynamic nature of the wireless medium, which demands the collection of specific information from the lower levels of the 802.11 protocol that is hard to interpret even for skilled experts. In enterprise networks, troubleshooting is the responsibility of network administrators, who have a clear picture of the network topology, sufficient knowledge of the 802.11 protocol operation and specialized equipment. Thus, they are in a position to draw safe conclusions. Home users, on the other hand, are asked to solve a much more complex problem, given the random deployment of home WLANs along with the lack of expertise and equipment. For these reasons, and due to the ever-growing spread of WLANs, it becomes increasingly necessary to develop automated techniques and mechanisms that can detect the causes of performance degradation in today's dense and complex home wireless networks.

In this paper, we extend our previous work [2] on turning low-cost commercial APs into intelligent devices able to detect performance impairments, diagnose the underlying pathologies and potentially dynamically adjust operation in order to improve or even restore maximum performance. We achieve that by following a data-driven analysis [3] over the plentiful and freely available data that the drivers of wireless cards export to user level. This MAC-layer data, exposed by AP devices, is collected as part of the physical rate control mechanism and is exported to user level for debugging purposes. From the data available to us, we choose the part that best characterizes the root causes of performance degradation, specifically the transmission attempts and the percentage of successful transmissions. We then evaluate the employment and fine-tuning of popular machine learning classification techniques for diagnosing WiFi network pathologies.

As in our previous work, the five most important and well-known pathologies are replicated. This takes place in a controlled environment in our testbed, by injecting traffic in the network covering a wide range of scenarios, in order to monitor and gather data of a significant size. This data is fed to four different models, each featuring one of the following notable classification algorithms: a) Decision Trees, b) Random Forests,
horized licensed use limited to: AMRITA VISHWA VIDYAPEETHAM AMRITA SCHOOL OF ENGINEERING. Downloaded on September 23,2024 at [Link] UTC from IEEE Xplore. Restrictions app
978-1-5386-5553-5/19/$31.00 ©2019 IEEE
2019 16th IEEE Annual Consumer Communications & Networking Conference (CCNC)
c) Support Vector Machines and d) K-Nearest Neighbors. The models are then cross-validated and their parameters are fine-tuned, in order to maximize accuracy and, most importantly, precision. Data is modeled in two modes, one suitable for passive monitoring and another suitable for active probing. Thus, we overcome our former work's limitation of having to inject traffic in the network to perform detection, by offering the option of a passive mechanism, without excluding a more accurate, active probing option. All models in both data modes are trained on the gathered data and evaluated with regard to their accuracy and precision.

Our work aims at providing an intelligent framework that can be easily deployed on commercial AP devices and is able to diagnose the underlying pathologies with high accuracy and precision. It differs from related approaches by offering passive detection and by covering the whole range of WiFi pathologies.

The rest of the paper is organized as follows. Section II discusses related work. In Section III, a brief overview of 802.11 background information is presented. In Section IV, our methodology for obtaining data is given, followed by the evaluation of the classification models in Section V. Finally, we conclude in Section VI, commenting on our findings and proposing possible extensions.

II. RELATED WORK

A few similar approaches have been followed in the literature for detecting causes of poor performance in WiFi networks. As an example, the work by Kanuparthy et al. [4] takes advantage of user-level information for distinguishing between the different 802.11 pathologies, by employing active probing and estimating simple metrics such as transmission rate. WiSlow [5] also exploits MAC-layer information, gathered after injecting traffic into the network, for discriminating between 802.11 and non-802.11 interference with high accuracy when interfering devices are close to the suffering node. Our work differs by providing a mode of passive pathology detection that does not incur any additional overhead and, moreover, does not impose any significant accuracy penalty.

Machine learning techniques have been heavily employed in recent years wherever an abundance of data exists. WiFi is no exception, with several frameworks being developed for either estimating key performance indicators of WLANs or classifying sources of underperformance. In [6], the authors apply the classification algorithms also considered in our work for characterizing and estimating WiFi latency. Artificial Neural Networks are used in [7] for estimating packet delivery rate, while in [8] a hidden Markov model estimates the probability of interference between 802.11 nodes from traces collected from multiple sniffers.

In addition, several works have investigated the detection and classification of the various causes of poor performance in WiFi networks with the application of machine learning algorithms. However, most of them have focused mainly on the identification of interference, especially cross-technology interference, using statistics gathered from commodity hardware or traces from monitoring devices. A notable example of such work is [9], where the authors feed the results of the spectral scan function performed by Atheros wireless cards to Decision Tree and SVM classifiers, in order to distinguish between non-WiFi devices sharing the unlicensed spectrum. The work in [10], based on information provided by commercial cards (sequences of receiver errors), aims at detecting sources of non-WiFi interference by employing Artificial Neural Networks and hidden Markov chains. Cross-technology interference detection has also been considered in 802.15.4 (ZigBee) networks [11], [12].

In contrast to the aforementioned body of work, our detection methodology goes a step further and considers all the potential pathologies. It covers 802.11 Contention, non-802.11 interference, low Signal-to-Noise ratio and 802.11-specific anomalies, such as the Hidden Terminal and Capture Effect phenomena, thus providing more insight into the underlying causes of underperformance.

III. 802.11 BACKGROUND INFORMATION

A. 802.11 Related Pathologies

The performance of an 802.11 device is mainly impacted by two factors. The first one is the availability of channel access opportunities, while the second one is the efficiency of frame delivery whenever the device is given the chance to transmit. Based on this key observation, we can categorize pathologies into two classes: Medium Contention and Frame Loss.

1) Medium Contention: In this category, we consider pathologies that occur when a WiFi device senses the medium as busy and thus defers its transmission. This type of pathology is frequently encountered in dense urban environments, as multiple WiFi networks are concentrated in small areas, while also coexisting with non-802.11 devices (ZigBee, Bluetooth, microwave ovens) that share the same limited unlicensed spectrum. Consequently, we consider two pathologies in this category, 802.11 Contention and Non-802.11 Contention.

2) Frame Loss: This category includes the pathologies that occur when 802.11 devices identify the medium as idle and attempt a transmission that ultimately fails due to the link conditions experienced at the side of the receiving device. As a consequence, the next transmission attempt is delayed, due to the subsequent doubling of the Contention Window (CW). A primary reason for this failure can be the Low SNR conditions experienced on the receiving end. It can be attributed either to low signal power due to fading, path loss etc., or to high noise caused by interfering devices operating outside the sensing range of the WiFi device.

In addition, frame delivery failures may also be attributed to inherent 802.11 impairments, such as the Hidden Terminal and Capture Effect phenomena, which occur when concurrent transmissions lead to frame collisions. More specifically, the Hidden Terminal phenomenon occurs when the receiving device lies within the transmission range of two active 802.11 devices that are mutually hidden and cannot sense each other, resulting in frame collisions. In cases that no remarkable difference is observed in the received signal
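The Contention Window doubling mentioned under Frame Loss in Section III-A can be illustrated with a minimal sketch. The bounds `CW_MIN = 15` and `CW_MAX = 1023` are assumptions (values typical of OFDM-based 802.11 PHYs, not stated in this paper), and the helper names `contention_window` and `mean_backoff_slots` are hypothetical.

```python
# Minimal sketch of binary exponential backoff after failed transmissions.
# CW_MIN and CW_MAX are assumed values, not taken from the paper.
CW_MIN = 15
CW_MAX = 1023


def contention_window(failed_attempts: int) -> int:
    """Return the contention window (in slots) after consecutive failures."""
    if failed_attempts < 0:
        raise ValueError("failed_attempts must be non-negative")
    # Each failure doubles (CW + 1) until the upper bound is reached.
    return min((CW_MIN + 1) * 2 ** failed_attempts - 1, CW_MAX)


def mean_backoff_slots(failed_attempts: int) -> float:
    """Expected backoff when the counter is drawn uniformly from [0, CW]."""
    return contention_window(failed_attempts) / 2


if __name__ == "__main__":
    for n in range(8):
        print(n, contention_window(n), mean_backoff_slots(n))
```

Under these assumptions, every lost frame roughly doubles the expected waiting time before the next attempt, which is why Frame Loss pathologies also reduce the number of channel accesses.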
Table I
DECISION TREES: SETS OF HYPER-PARAMETER VALUES

Table II
RANDOM FORESTS: SETS OF HYPER-PARAMETER VALUES

a classification or decision. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. Decision trees can handle both categorical and numerical data.

2) Random Forests: Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks, which operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) of the individual trees. Random decision forests correct for decision trees' habit of overfitting to their training set.

3) Support Vector Machine Classification: A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples. In two-dimensional space this hyperplane is a line dividing a plane in two parts, with each class lying on either side.

4) K-Nearest Neighbors: The K-Nearest Neighbors algorithm is a supervised classification algorithm: it takes a set of labeled points and uses them to learn how to label other points. To label a new point, it looks at the labeled points closest to that new point (its nearest neighbors) and has those neighbors vote, so whichever label most of the neighbors have is the label for the new point (the "k" is the number of neighbors it checks).

B. Data Modeling and Feature Extraction

The experimental data gathered during the sessions described in Section IV was fed to the classification algorithms, modeled in two distinct ways. The first model fed classifiers with samples as 3-dimensional vectors of features, in the following format:

{NCA, FDR, MODULATION}

along with their labels denoting the pathology, while the second model aggregated the results of Varying Bitrate Probing and fed them as 16-dimensional vectors of features, in the following format:

{NCA_MCS0, FDR_MCS0, ..., NCA_MCS7, FDR_MCS7}

along with their corresponding labels. The rationale behind this differentiation was that although the second model, called Active Model onwards, can certainly offer better accuracy, it requires that the active sampling test (Varying Bitrate Probing) is run every time we need to detect an underlying pathology, inducing saturated traffic of no value for the user in the network. In the first model, called Passive Model onwards, samples can instead be obtained with no need for extra probing traffic, as described in Subsection IV-C.

C. Classifiers Implementation

After processing the data for feature extraction, we shuffled it and split it into a training and a test set. For the first model we selected 80% of the samples for training and the remaining 20% for testing, while for the second model, where feature aggregation reduced the number of instances, the percentages were 85% and 15% respectively. All classifiers were implemented in Python using the scikit-learn library [13]. We then performed hyper-parameter tuning for both models and all classifiers, in order to optimize classification precision. We chose precision over accuracy in an effort to avoid the accuracy paradox that can often occur in unbalanced training sets. In order to avoid overfitting, we performed 5-fold cross-validation. The hyper-parameters for each classifier, along with the sets of values that were tested, are given in Tables I, II, III and IV respectively.

More specifically, for the Decision Trees classifier, we found that for both models "entropy", which uses the information gain, was the best function to measure the quality of a split. The best random split was found to be the best strategy for the splitter, while the minimum number of samples required to split an internal node was 4. The difference between the two models, Active and Passive, was the number of features to consider when looking for the best split: for the former the best value was "auto", considering the total number of features, while for the latter it was the default value 3.

Regarding the Random Forests classifier, "entropy" was again the best function to measure the quality of a split. For the Active model, the optimized value for the number of features to consider when looking for the best split was 8 and the number of estimators was the default 10. In contrast, for the Passive model, the corresponding values were "auto", representing the square root of the total number of 3 features, and 30 for the number of trees in the forest.

Concerning the SVM classifier, the default "rbf" kernel yielded the best results for both models, while the kernel coefficient "gamma" was set to 0.001 for the Active model and to "auto" for the Passive model, which equals 1 / total number of features and in our case equals 0.33. Finally,
the penalty parameter "C" of the error term was set to 1 for the Active model and 10 for the Passive.

Lastly, the hyper-parameters of the K-Nearest Neighbors classifier were also fine-tuned, in order to optimize precision. For the Active model, the number of neighbors used was set to 3, while the "brute force" search was the algorithm used to compute the nearest neighbors, with a "uniform" weighting function. For the Passive model, we set the number of neighbors to the default value of 5. The algorithm was set to "auto", in order to decide the most appropriate among KDTree, BallTree and brute force according to the data, and we used the "distance" weighting function that weights points by the inverse of their distance. In both models, we set the power parameter "p" of the Minkowski metric equal to 1, which corresponds to the Manhattan distance.

D. Pathology Detection Performance

Having determined the best hyper-parameters for each of the considered classifiers and for both the Active and Passive Model, we compared them in terms of accuracy, precision and recall. The overall results are shown in Table VII, where it is evident that the K-Nearest Neighbors classifier is superior to the rest for all metrics in both models. More specifically, it exhibits an outstanding accuracy of 99.2% and 95.1% for the Active and Passive models respectively. This can be attributed to the fact that the K-Nearest Neighbors algorithm tends to perform very well when there are many data points and few dimensions in the feature vector. It is also known that, as a non-parametric algorithm, it performs better than parametric ones (SVM) when the considered classes are overlapping. In Tables V and VI, the confusion matrices of the K-Nearest Neighbors classifier for each model are presented. It is also worth noting that there is some misclassification between the Hidden Terminal and Capture Effect pathologies, a fact that is quite expected, as these pathologies differ only in their symmetry.

Table V
CONFUSION MATRIX, K-NEAREST NEIGHBORS, ACTIVE MODEL (%)

          Low SNR   WiFi    Capture   Hidden   MW
Low SNR   100       0       0         0        0
WiFi      0         100     0         0        0
Capture   0         0       97.6      2.4      0
Hidden    0         0       0         100      0
MW        0         0       0         0        100

Table VI
CONFUSION MATRIX, K-NEAREST NEIGHBORS, PASSIVE MODEL (%)

          Low SNR   WiFi    Capture   Hidden   MW
Low SNR   99.3      0       0.4       0        0.3
WiFi      0         100     0         0        0
Capture   1.4       0       90.9      7.4      0.3
Hidden    0         0       6.2       93.8     0
MW        0         3.1     3.1       0.9      92.9

Table VII
EVALUATION RESULTS (%)

       Active                           Passive
       Accuracy   Precision   Recall    Accuracy   Precision   Recall
DT     97.8       97.8        97.8      92.8       92.8        92.8
RF     98.1       98.1        98.1      94.6       94.6        94.6
SVM    98.7       98.7        98.7      94.1       94.3        94.1
KNN    99.2       99.2        99.2      95.1       95.1        95.1

VI. CONCLUSION

In this paper we presented the employment, fine-tuning and evaluation of multiple classification algorithms for detecting the causes of WiFi underperformance. The prevalent classifier was chosen as part of a software framework we built for detection based on two modes, a highly accurate Active one and a no-overhead Passive one, by taking advantage of data exported by the driver of the wireless card. These modes achieved 99.2% and 95.1% accuracy respectively. As future work, we plan to employ multi-label classification in an attempt to accurately detect multiple coexisting pathologies.

ACKNOWLEDGMENT

The research leading to these results has received funding by GSRT, under the act of HELIX-National Infrastructures for Research, MIS no 5002781.

REFERENCES

[1] V. Cisco, “Cisco visual networking index: Forecast and methodology 2016–2021.(2017),” 2017.
[2] I. Syrigos, S. Keranidis, T. Korakis, and C. Dovrolis, “Enabling wireless lan troubleshooting,” in International Conference on Passive and Active Network Measurement. Springer, 2015, pp. 318–331.
[3] C. Fortuna, E. De Poorter, P. Škraba, and I. Moerman, “Data driven wireless network design: A multi-level modeling approach,” Wireless Personal Communications, vol. 88, no. 1, pp. 63–77, 2016.
[4] P. Kanuparthy, C. Dovrolis, K. Papagiannaki, S. Seshan, and P. Steenkiste, “Can user-level probing detect and diagnose common home-wlan pathologies,” ACM SIGCOMM Computer Communication Review, vol. 42, no. 1, pp. 7–15, 2012.
[5] K.-H. Kim, H. Nam, and H. Schulzrinne, “Wislow: A wi-fi network performance troubleshooting tool for end users,” in INFOCOM, 2014 Proceedings IEEE. IEEE, 2014, pp. 862–870.
[6] K. Sui, M. Zhou, D. Liu, M. Ma, D. Pei, Y. Zhao, Z. Li, and T. Moscibroda, “Characterizing and improving wifi latency in large-scale operational networks,” in Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2016, pp. 347–360.
[7] M. O. Khan and L. Qiu, “Accurate wifi packet delivery rate estimation and applications,” in Computer Communications, IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on. IEEE, 2016, pp. 1–9.
[8] U. Paul, A. Kashyap, R. Maheshwari, and S. R. Das, “Passive measurement of interference in wifi networks with application in misbehavior detection,” IEEE transactions on mobile computing, vol. 12, no. 3, pp. 434–446, 2013.
[9] S. Rayanchu, A. Patro, and S. Banerjee, “Airshark: detecting non-wifi rf devices using commodity wifi hardware,” in Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference. ACM, 2011, pp. 137–154.
[10] N. Inzerillo, D. Croce, D. Garlisi, F. Giuliano, and I. Tinnirello, “Error-based interference detection in wifi networks,” in GLOBECOM 2017-2017 IEEE Global Communications Conference. IEEE, 2017, pp. 1–6.
[11] A. Hithnawi, H. Shafagh, and S. Duquennoy, “Tiim: technology-independent interference mitigation for low-power wireless networks,” in Proceedings of the 14th International Conference on Information Processing in Sensor Networks. ACM, 2015, pp. 1–12.
[12] F. Hermans, L.-Å. Larzon, O. Rensfelt, and P. Gunningberg, “A lightweight approach to online detection and classification of interference in 802.15.4-based sensor networks,” ACM SIGBED Review, vol. 9, no. 3, pp. 11–20, 2012.
[13] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg et al., “Scikit-learn: Machine learning in python,” Journal of machine learning research, vol. 12, no. Oct, pp. 2825–2830, 2011.
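The K-Nearest Neighbors vote and the hyper-parameters reported for it in the Classifiers Implementation subsection (number of neighbors, "uniform" versus "distance" weighting, Minkowski power parameter p) can be illustrated with a small pure-Python sketch. This is not the scikit-learn implementation used in the paper: it is a simplified brute-force search, the helper names `minkowski` and `knn_predict` are hypothetical, and the training points are invented toy values rather than measured samples.

```python
from collections import defaultdict


def minkowski(a, b, p=1):
    """Minkowski distance; p=1 gives the Manhattan distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_x, train_y, query, k=5, p=1, weights="distance"):
    """Label `query` by a vote among its k nearest training points."""
    # Brute-force search: compute every distance, then sort ascending.
    dists = sorted(
        (minkowski(x, query, p), label) for x, label in zip(train_x, train_y)
    )
    votes = defaultdict(float)
    for dist, label in dists[:k]:
        if weights == "uniform":
            votes[label] += 1.0          # every neighbor counts equally
        elif dist == 0:
            return label                 # simplification: exact match wins
        else:
            votes[label] += 1.0 / dist   # closer neighbors count more
    return max(votes, key=votes.get)


# Toy 3-dimensional samples (invented values, for illustration only).
TRAIN_X = [
    (0.20, 0.90, 0.50), (0.30, 0.80, 0.50),
    (0.90, 0.20, 0.10), (0.80, 0.30, 0.20), (0.85, 0.25, 0.15),
]
TRAIN_Y = ["WiFi", "WiFi", "Low SNR", "Low SNR", "Low SNR"]
QUERY = (0.25, 0.85, 0.50)

if __name__ == "__main__":
    print(knn_predict(TRAIN_X, TRAIN_Y, QUERY, k=3, weights="uniform"))
    print(knn_predict(TRAIN_X, TRAIN_Y, QUERY, k=5, weights="uniform"))
    print(knn_predict(TRAIN_X, TRAIN_Y, QUERY, k=5, weights="distance"))
```

With these toy points, a uniform vote over five neighbors is outvoted by the larger but more distant class, whereas inverse-distance weighting keeps the label of the two closest points. This is one way to see why the weighting function is worth tuning alongside the number of neighbors.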
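The metrics compared in the Pathology Detection Performance subsection, and the accuracy paradox that motivated tuning for precision, can be sketched as follows. Two points are assumptions, since the paper does not state them: rows of the confusion matrix are taken as true classes and columns as predicted classes, and precision and recall are macro-averaged over classes. All function names are hypothetical and the labels below are synthetic.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, labels):
    """Count matrix with true classes as rows, predictions as columns."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def row_percentages(matrix):
    """Normalise each row to percentages of that row's total."""
    return [
        [100.0 * v / sum(row) if sum(row) else 0.0 for v in row]
        for row in matrix
    ]


def accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def macro_precision(matrix):
    """Mean over classes of correct predictions / all predictions of it."""
    n = len(matrix)
    per_class = []
    for j in range(n):
        predicted = sum(matrix[i][j] for i in range(n))
        per_class.append(matrix[j][j] / predicted if predicted else 0.0)
    return sum(per_class) / n


def macro_recall(matrix):
    """Mean over classes of correct predictions / all true samples of it."""
    per_class = [
        row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)
    ]
    return sum(per_class) / len(matrix)


# Unbalanced synthetic labels: a classifier that always answers "WiFi".
LABELS = ["WiFi", "MW"]
Y_TRUE = ["WiFi"] * 95 + ["MW"] * 5
Y_PRED = ["WiFi"] * 100
MATRIX = confusion_counts(Y_TRUE, Y_PRED, LABELS)

if __name__ == "__main__":
    print(MATRIX, accuracy(MATRIX), macro_precision(MATRIX), macro_recall(MATRIX))
```

In this synthetic case the trivial classifier still scores high accuracy while its macro-averaged precision and recall drop to roughly half, which illustrates why accuracy alone can mislead on unbalanced training sets.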