0% found this document useful (0 votes)
34 views

C1a - Anomaly Detection

Anomaly detection is the identification of rare or abnormal events or observations within data that differ significantly from normal patterns or behaviors. There are three main types of anomaly detection techniques: unsupervised, semi-supervised, and supervised. Anomaly detection is important for network administrators to identify changing operational conditions that could signal risks to businesses, such as security threats, changes in traffic or resource usage, or opportunities for growth. It helps extract insights and maintain core operations.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
34 views

C1a - Anomaly Detection

Anomaly detection is the identification of rare or abnormal events or observations within data that differ significantly from normal patterns or behaviors. There are three main types of anomaly detection techniques: unsupervised, semi-supervised, and supervised. Anomaly detection is important for network administrators to identify changing operational conditions that could signal risks to businesses, such as security threats, changes in traffic or resource usage, or opportunities for growth. It helps extract insights and maintain core operations.
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.

com/glossary/anomaly-detection/

avinetworks.com

What is Anomaly Detection? Definition


& FAQs | Avi Networks
15-19 minutes

< << Back to Technical Glossary

Anomaly Detection

Anomaly detection, also called outlier detection, is the identification


of unexpected events, observations, or items that differ significantly
from the norm. Often applied to unlabeled data by data scientists in
a process called unsupervised anomaly detection, any type of
anomaly detection rests upon two basic assumptions:

Anomalies in data occur only very rarely

The features of data anomalies are significantly different from those


of normal instances

Typically, anomalous data is linked to some sort of problem or rare


event such as hacking, bank fraud, malfunctioning equipment,
structural defects / infrastructure failures, or textual errors. For this
reason, identifying actual anomalies rather than false positives or
data noise is essential from a business perspective.

1 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

Anomaly Detection FAQs

What is Anomaly Detection?

Anomaly detection is the identification of rare events, items, or


observations which are suspicious because they differ significantly
from standard behaviors or patterns. Anomalies in data are also
called standard deviations, outliers, noise, novelties, and
exceptions.

In the network anomaly detection/network intrusion and abuse


detection context, interesting events are often not rare—just
unusual. For example, unexpected jumps in activity are typically
notable, although such a spurt in activity may fall outside many
traditional statistical anomaly detection techniques.

Many outlier detection methods, especially unsupervised


techniques, do not detect this kind of sudden jump in activity as an
outlier or rare object. However, these types of micro clusters can
often be identified more readily by a cluster analysis algorithm.

There are three main classes of anomaly detection techniques:


unsupervised, semi-supervised, and supervised. Essentially, the
correct anomaly detection method depends on the available labels
in the dataset.

Supervised anomaly detection techniques demand a data set with a


complete set of “normal” and “abnormal” labels for a classification

2 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

algorithm to work with. This kind of technique also involves training


the classifier. This is similar to traditional pattern recognition, except
that with outlier detection there is a naturally strong imbalance
between the classes. Not all statistical classification algorithms are
well-suited for the inherently unbalanced nature of anomaly
detection.

Semi-supervised anomaly detection techniques use a normal,


labeled training data set to construct a model representing normal
behavior. They then use that model to detect anomalies by testing
how likely the model is to generate any one instance encountered.

Unsupervised methods of anomaly detection detect anomalies in


an unlabeled test set of data based solely on the intrinsic properties
of that data. The working assumption is that, as in most cases, the
large majority of the instances in the data set will be normal. The
anomaly detection algorithm will then detect instances that appear
to fit with the rest of the data set least congruently.

What Are Anomalies?

Anomalies can classified generally in several ways:

Network anomalies: Anomalies in network behavior deviate from


what is normal, standard, or expected. To detect network
anomalies, network owners must have a concept of expected or
normal behavior. Detection of anomalies in network behavior
demands the continuous monitoring of a network for unexpected
trends or events.

Application performance anomalies: These are simply


anomalies detected by end-to-end application performance
monitoring. These systems observe application function, collecting

3 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

data on all problems, including supporting infrastructure and app


dependencies. When anomalies are detected, rate limiting is
triggered and admins are notified about the source of the issue with
the problematic data.

Web application security anomalies: These include any other


anomalous or suspicious web application behavior that might
impact security such as CSS attacks or DDOS attacks.

Detection of each type of anomaly relies on ongoing, automated


monitoring to create a picture of normal network or application
behavior. This type of monitoring might focus on point
anomalies/global outliers, contextual anomalies, and/or collective
anomalies; the context of the network, the performance of the
application, or the web application security is more important to the
goal of the anomaly detection system.

Anomaly detection and novelty detection or noise removal are


similar, but distinct. Novelty detection identifies patterns in data that
were previously unobserved so users can determine whether they
are anomalous. Noise removal is the process of removing noise or
unneeded observations from a signal that is otherwise meaningful.

To track monitoring KPIs such as bounce rate and churn rate, time
series data anomaly detection systems must first develop a
baseline for normal behavior. This enables the system to track
seasonality and cyclical behavior patterns within key datasets.

Why Anomaly Detection Is Important

It is critical for network admins to be able to identify and react to


changing operational conditions. Any nuances in the operational
conditions of data centers or cloud applications can signal

4 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

unacceptable levels of business risk. On the other hand, some


divergences may point to positive growth.

Therefore, anomaly detection is central to extracting essential


business insights and maintaining core operations. Consider these
patterns—all of which demand the ability to discern between normal
and abnormal behavior precisely and correctly:

An online retail business must predict which discounts, events, or


new products may trigger boosts in sales which will increase
demand on their web servers.

An IT security team must prevent hacking and needs to detect


abnormal login patterns and user behaviors.

A cloud provider has to allot traffic and services and has to assess
changes to infrastructure in light of existing patterns in traffic and
past resource failures.

A evidence-based, well-constructed behavioral model can not only


represent data behavior, but also help users identify outliers and
engage in meaningful predictive analysis. Static alerts and
thresholds are not enough, because of the overwhelming scale of
the operational parameters, and because it’s too easy to miss
anomalies in false positives or negatives.
To address these kinds of operational constraints, newer systems
use smart algorithms for identifying outliers in seasonal time series
data and accurately forecasting periodic data patterns.

Anomaly Detection Techniques

In searching data for anomalies that are relatively rare, it is


inevitable that the user will encounter relatively high levels of noise
that could be similar to abnormal behavior. This is because the line

5 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

between abnormal and normal behavior is typically imprecise, and


may change often as malicious attackers adapt their strategies.

Furthermore, because many data patterns are based on time and


seasonality, there is additional baked-in complexity to anomaly
detection techniques. The need to break down multiple trends over
time, for example, demands more sophisticated methods to identify
actual changes in seasonality versus noise or anomalous data.

For all of these reasons, there are various anomaly detection


techniques. Depending on the circumstances, one might be better
than others for a particular user or data set. A generative approach
creates a model based solely on examples of normal data from
training and then evaluates each test case to see how well it fits the
model. In contrast, a discriminative approach attempts to
distinguish between normal and abnormal data classes. Both kinds
of data are used to train systems in discriminative approaches.

Clustering-Based Anomaly Detection

Clustering-based anomaly detection remains popular in


unsupervised learning. It rests upon the assumption that similar
data points tend to cluster together in groups, as determined by
their proximity to local centroids.

K-means, a commonly-used clustering algorithm, creates ‘k’ similar


clusters of data points. Users can then set systems to mark data
instances that fall outside of these groups as data anomalies. As an
unsupervised technique, clustering does not require any data
labeling.

Clustering algorithms might be deployed to capture an anomalous


class of data. The algorithm has already created many data
clusters on the training set in order to calculate the threshold for an

6 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

anomalous event. It can then use this rule to create new clusters,
presumably capturing new anomalous data.

However, clustering does not always work for time series data. This
is because the data depicts evolution over time, yet the technique
produces a fixed set of clusters.

Density-Based Anomaly Detection

Density-based anomaly detection techniques demand labeled data.


These anomaly detection methods rest upon the assumption that
normal data points tend to occur in a dense neighborhood, while
anomalies pop up far away and sparsely.

There are two types of algorithms for this type of data anomaly
evaluation:

K-nearest neighbor (k-NN) is a basic, non-parametric, supervised


machine learning technique that can be used to either regress or
classify data based on distance metrics such as Euclidean,
Hamming, Manhattan, or Minkowski distance.

Local outlier factor (LOF), also called the relative density of data, is
based on reachability distance.

Support Vector Machine-Based Anomaly Detection

A support vector machine (SVM) is typically used in supervised


settings, but SVM extensions can also be used to identify
anomalies for some unlabeled data. A SVM is a neural network that
is well-suited for classifying linearly separable binary patterns
—obviously the better the separation is, the clearer the results.

Such anomaly detection algorithms may learn a softer boundary


depending on the goals to cluster the data instances and identify
the abnormalities properly. Depending on the situation, an anomaly

7 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

detector like this might output numeric scalar values for various
uses.

Anomaly Detection Machine Learning

As mentioned briefly above, supervised, semi-supervised, or


unsupervised machine learning techniques provide the foundation
for anomaly detection algorithms.

Supervised Machine Learning for Anomaly Detection

Supervised machine learning builds a predictive model using a


labeled training set with normal and anomalous samples. The most
common supervised methods include Bayesian networks, k-nearest
neighbors, decision trees, supervised neural networks, and SVMs.

The advantage of supervised models is that they may offer a higher


rate of detection than unsupervised techniques. This is because
they can return a confidence score with model output, incorporate
both data and prior knowledge, and encode interdependencies
between variables.

Unsupervised Machine Learning for Anomaly Detection

Unsupervised methods do not demand manual labeling of training


data. Instead, they operate based on the presumption that only a
small, statistically different percentage of network traffic is malicious
and abnormal. These techniques thus assume collections of
frequent, similar instances are normal and flag infrequent data
groups as malicious.

The most popular unsupervised anomaly detection algorithms


include Autoencoders, K-means, GMMs, hypothesis tests-based
analysis, and PCAs.

8 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

Semi-Supervised Anomaly Detection

The term semi-supervised anomaly detection may have different


meanings. Semi-supervised anomaly detection may refer to an
approach to creating a model for normal data based on a data set
that contains both normal and anomalous data, but is unlabelled.
This train-as-you-go method might be called semi-supervised.

A semi-supervised anomaly detection algorithm might also work


with a data set that is partially flagged. It will then build a
classification algorithm on just that flagged subset of data, and use
that model to predict the status of the remaining data.

Anomaly Detection Use Cases

Some of the primary anomaly detection use cases include anomaly


based intrusion detection, fraud detection, data loss prevention
(DLP), anomaly based malware detection, medical anomaly
detection, anomaly detection on social platforms, log anomaly
detection, internet of things (IoT) big data system anomaly
detection, industrial/monitoring anomalies, and anomalies in video
surveillance.

An anomaly based intrusion detection system (IDS) is any system


designed to identify and prevent malicious activity in a computer
network. A single computer may have its own IDS, called a Host
Intrusion Detection System (HIDS), and such a system can also be
scaled up to cover large networks. At that scale it is called Network
Intrusion Detection (NIDS).

This is also sometimes called network behavior anomaly detection,


and this is the kind of ongoing monitoring network behavior
anomaly detection tools are designed to provide. Most IDS depend

9 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

on signature-based or anomaly-based detection methods, but since


signature-based IDS are ill-equipped to detect unique attacks,
anomaly-based detection techniques remain more popular.

Fraud in banking (credit card transactions, tax return claims, etc.),


insurance claims (automobile, health, etc.), telecommunications,
and other areas is a significant issue for both private business and
governments. Fraud detection demands adaptation, detection, and
prevention, all with data in real-time.

Data loss prevention (DLP) is similar to prevention of fraud, but


focuses exclusively on loss of sensitive information at an early
stage. In practice, this means logging and analyzing accesses to
file servers, databases, and other sources of information in near-
real-time to detect uncommon access patterns.

Malware detection is another important area, typically divided into


feature extraction and clustering/classification stages. Sheer scale
of data is a tremendous challenge here, along with the adaptive
nature of the malicious behavior.

Detecting anomalies in medical images and records enables


experts to diagnose and treat patients more effectively. Massive
amounts of imbalanced data means reduced ability to detect and
interpret patterns without these techniques. This is an area that is
ideal for artificial intelligence given the tremendous amount of data
processing involved.

Detecting anomalies in a social network enables administrators to


identify fake users, online fraudsters, predators, rumor-mongers,
and spammers that can have serious business and social impact.

Log anomaly detection enables businesses to determine why


systems fail by reconstructing faults from patterns and past

10 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

experiences.

Monitoring data generated in the field of the Internet of things (IoT)


ensures that data generated by IT infrastructure components, radio-
frequency identification (RFID) tags, weather stations, and other
sensors are accurate and identifies faulty and fraudulent behavior
before disaster strikes. The same is true of monitoring industrial
systems such as high-temperature energy systems, power plants,
wind turbines, and storage devices that are exposed to massive
daily stress.

How Does Avi Networks Help With Anomaly


Detection?

The Avi Vantage Platform’s distributed load balancers collect


telemetry in the path of application traffic in real-time. This enables
the platform to leverage its position, monitoring traffic spikes,
detecting any anomalies—and, ultimately, the distributed denial of
service (DDoS) attacks that often underlie these spikes in
anomalous traffic.

As an intrinsic feature of the Application Delivery Controller (ADC),


Avi’s approach is inline, and does not demand any additional
anomaly detection software integration. Avi’s intelligent approach
allows for better decisions on detection of outliers, elastic capacity
planning, prediction of load, and more.

The system is actionable, and with anomaly detection happening in


real-time and mitigation actions built into operations, it’s also fast.
Admins do not need to manually start a new server, block offending
clients, or increase capacity based on predictions about resource
utilization. The result is real-time, automatic anomaly detection.

11 of 12 07-Apr-22, 8:29 PM
What is Anomaly Detection? Definition & FAQs | Avi Networks about:reader?url=https://avinetworks.com/glossary/anomaly-detection/

For more information on Avi’s anomaly detection platform see:


“Supercharging Anomaly Detection with the holt-winters Algorithm”

12 of 12 07-Apr-22, 8:29 PM

You might also like