Time-Efficient and Cost-Effective Network Hardening Using Attack Graphs
3 authors:
Steven Noel
George Mason University
All content following this page was uploaded by Massimiliano Albanese on 21 December 2015.
Abstract: Attack graph analysis has been established as a powerful tool for analyzing network vulnerability. However, previous approaches to network hardening look for exact solutions and thus do not scale. Further, hardening elements have been treated independently, which is inappropriate for real environments. For example, the cost for patching many systems may be nearly the same as for patching a single one. Or patching a vulnerability may have the same effect as blocking traffic with a firewall, while blocking a port may deny legitimate service. By failing to account for such hardening interdependencies, the resulting recommendations can be unrealistic and far from optimal. Instead, we formalize the notion of hardening strategy in terms of allowable actions, and define a cost model that takes into account the impact of interdependent hardening actions. We also introduce a near-optimal approximation algorithm that scales linearly with the size of the graphs, which we validate experimentally.

Keywords: network hardening, vulnerability analysis, attack graphs, intrusion prevention, reliability.

I. INTRODUCTION

Attackers can leverage the complex interdependencies of network configurations and vulnerabilities to penetrate seemingly well-guarded networks. In-depth analysis of network vulnerabilities must consider attacker exploits not merely in isolation, but in combination. Attack graphs reveal such threats by enumerating potential paths that attackers can take to penetrate networks. This helps determine whether a given set of network hardening measures provides safety of given critical resources.

Attack graph analysis can be extended to automatically generate recommendations for hardening networks. One must consider combinations of network conditions to harden, which has corresponding impact on removing paths in the attack graph. Further, one can generate hardening solutions that are optimal with respect to some notion of cost. Such hardening solutions prevent the attack from succeeding, while minimizing the associated costs.

However, as we show, the general solution to optimal network hardening scales exponentially as the number of hardening options itself scales exponentially with the size of the attack graph. In applying network hardening to realistic network environments, it is crucial that the algorithms are able to scale. Progress has been made in reducing the complexity of attack graph manipulation so that it scales quadratically (linearly within defined security zones) [1]. However, previous approaches for generating hardening recommendations search for exact solutions [2], which is an intractable problem.

Another limitation of previous work is the assumption that network conditions are hardened independently. This assumption does not hold true in real network environments. Realistically, network administrators can take actions that affect vulnerabilities across the network, such as pushing patches out to many systems at once. Further, the same hardening result may be obtained through more than one action. Overall, to provide realistic recommendations, our hardening strategy must take such factors into account.

We remove the assumption of independent hardening actions. Instead, we define a network hardening strategy as a set of allowable atomic actions that involve hardening multiple network conditions. We introduce a formal cost model that accounts for the impact of these hardening actions. This allows the definition of hardening costs that accurately reflect realistic network environments. Because computing the minimum-cost hardening solution is intractable, we introduce an approximation algorithm for optimal hardening. This algorithm finds near-optimal solutions while scaling almost linearly (for certain values of the parameters) with the size of the attack graph, which we validate experimentally. Finally, we determine the theoretical upper bound for the worst-case approximation ratio, and show that, in practice, the approximation ratio is much lower than such bound.

The paper is organized as follows. Section II discusses related work. Section III recalls some preliminary definitions, whereas Section IV provides a motivating example. Then Section V introduces the proposed cost model, and Section VI describes our approach to time-efficient and cost-effective network hardening. Finally, Section VII reports experimental results, and Section VIII gives some concluding remarks and indicates further research directions.

The work presented in this paper is supported in part by the Army Research Office under MURI award number W911NF-09-1-0525, and by the Office of Naval Research under award number N00014-12-1-0461.
cost(∅) = 0    (1)

(∀S1, S2 ∈ 𝒮) (C(S1) ⊆ C(S2) ⇒ cost(S1) ≤ cost(S2))    (2)

Figure 3. Possible hardening actions (orange rectangles) for the attack graph of Figure 2
approximation algorithm to find reasonably good hardening strategies in a time-efficient manner.

A. Limitations of Previous Approach

The algorithm presented in [2] starts from a set Ct of target conditions and traverses the attack graph backwards, making logical inferences. At the end of the graph traversal, a logic proposition of the initial conditions is derived as the necessary and sufficient condition for hardening the network with respect to Ct. This proposition then needs to be converted to its disjunctive normal form (DNF), with each disjunct in the DNF representing a particular sufficient option to harden the network. Although the logic proposition can be derived efficiently, converting it to its DNF may incur an exponential blowup.

Algorithm BackwardSearch (Algorithm 1) is functionally equivalent to the one described in [2], in that it generates all possible hardening solutions (see footnote 4), under the simplifying hypothesis that initial conditions can be individually disabled, i.e., (∀ci ∈ Ci)(∃A ∈ 𝒜)(A = {ci}). However, our rewriting of the algorithm has several advantages over its original version. First, it is more general, as it does not assume that initial conditions can be individually disabled, and incorporates the notions of allowable action and hardening strategy defined in Section V. Second, it directly computes a set of possible hardening strategies, rather than a logic proposition that requires additional processing in order to provide actionable intelligence. Last, in a time-constrained or real-time scenario where one may be interested in the first available hardening solution, the rewritten algorithm can be easily modified to terminate as soon as a solution is found. To this aim, it is sufficient to change the condition of the main while loop (Line 3) to (∄S ∈ 𝒮)(S ⊆ Ci). Such a variant of the algorithm will generate hardening strategies that disable initial conditions closer to the target conditions.

However, when used to find the minimum-cost hardening solution, Algorithm BackwardSearch still faces the combinatorial explosion described below. Instead, the algorithm introduced in Section VI-B provides a balance between the optimality of the solution and the time to compute it.

Under the simplifying hypothesis that initial conditions can be individually disabled, i.e., (∀ci ∈ Ci)(∃A ∈ 𝒜)(A = {ci}), and allowable actions are pairwise disjoint, i.e., (∀Ai, Aj ∈ 𝒜)(Ai ∩ Aj = ∅), it can be proved that, in the worst case, the number of possible hardening strategies is

|\mathcal{S}| = |C_t| \cdot n^{\sum_{k=1}^{d/2} n^k}    (6)

where d is the depth of the attack graph (see footnote 5) and n is the maximum in-degree of nodes in the attack graph. Worst case complexity is then O(n^{n^d}). The proof is omitted for reasons of space.

The authors of [2] rely on the assumption that the attack graph of a small and well-protected network is usually small and sparse (the in-degree of each node is small); thus, even if the complexity is exponential, running time should be acceptable in practice. However, the result above shows that computing an optimal solution may be impractical even for relatively small attack graphs. For instance, consider the attack graph of Figure 4, where n = 2, Ct = {c21}, and d = 4. According to Equation 6, there are 64 possible hardening strategies in the worst case, each of size 4. The strategy that disables the set of initial conditions {c1, c3, c9, c11} is one such strategy. When d = 6, the number of initial conditions is 64, and the number of possible strategies becomes 16,384. For d = 8, |Ci| = 256 and the number of possible strategies is over a billion.

B. Approximation Algorithm

To address the limitations of the previous network hardening algorithm, we now propose an approximation algorithm that computes reasonably good solutions in a time-efficient manner. We will show that, under certain conditions, the solutions computed by the proposed algorithm have a cost that is guaranteed to be within a constant factor of the optimal cost.

Algorithm ForwardSearch (Algorithm 2) traverses the attack graph forward, starting from initial conditions. A key advantage of traversing the attack graph forward is that intermediate solutions are themselves network hardening strategies with respect to intermediate conditions. In fact, in a single pass, Algorithm ForwardSearch can compute hardening strategies with respect to any condition in C. To limit the exponential explosion of the search space, intermediate solutions can be pruned according to some pruning strategy, whereas pruning is not possible for the algorithm that traverses the graph backwards. In that case, intermediate solutions may contain exploits and intermediate conditions, and we cannot say anything about their cost until all the exploits and intermediate conditions have been replaced with sets of initial conditions.

In this section, for ease of presentation, we consider hardening problems with a single target condition. The generalization to the case where multiple target conditions need to be hardened at the same time is straightforward and is discussed below.

Given a set Ct of target conditions, we add a dummy exploit ei for each condition ci ∈ Ct, such that ei has ci as its only precondition, as shown in Figure 5. Then, we add a

4 For ease of presentation, the pseudocode of Algorithm 1 does not show how cycles are broken. This is done as in the original algorithm.
5 Note that d is always an even number.
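The worst-case counts quoted for Equation 6 can be checked with a few lines of Python. This is only a sketch: it assumes Equation 6 reads |S| = |Ct| · n^(n^1 + n^2 + ... + n^(d/2)), which is the reading that reproduces the three figures given in the text, and the function name `worst_case_strategies` is a hypothetical helper, not something defined in the paper.

```python
def worst_case_strategies(n: int, d: int, num_targets: int = 1) -> int:
    """Worst-case number of hardening strategies under the assumed reading of Equation 6.

    n is the maximum in-degree, d the (even) depth, num_targets stands for |Ct|.
    """
    exponent = sum(n**k for k in range(1, d // 2 + 1))
    return num_targets * n**exponent


# n = 2 and a single target condition, as in the example of Figure 4
for d in (4, 6, 8):
    initial_conditions = 2**d  # leaves of a complete tree with in-degree 2
    print(d, initial_conditions, worst_case_strategies(2, d))
```

For d = 8 the count is 2^30 = 1,073,741,824, which is the "over a billion" mentioned above.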
Algorithm 1 BackwardSearch(G, Ct)
Input: Attack graph G = (E ∪ C, Rr ∪ Ri), and set of target conditions Ct.
Output: Optimal hardening strategy.
1: // Initialize the set of all solutions and iterate until solutions contain initial conditions only
2: 𝒮 ← {Ct}
3: while (∃S ∈ 𝒮)(S ⊈ Ci) do
4:     // Replace each non-initial condition with the set of exploits that imply it
5:     for all S ∈ 𝒮 do
6:         for all c ∈ S s.t. c ∉ Ci do
7:             S ← S \ {c} ∪ {e ∈ E | (e, c) ∈ Ri}
8:         end for
9:     end for
10:     // Replace exploits with required conditions and generate all possible combinations
11:     for all S = {e1, ..., em} ∈ 𝒮 do
12:         𝒮 ← 𝒮 \ {S} ∪ {{c1, ..., cm} | (∀i ∈ [1, m]) (ci, ei) ∈ Rr}
13:     end for
14: end while
15: // Replace initial conditions with allowable actions and generate all possible combinations
16: for all S = {c1, ..., cn} ∈ 𝒮 do
17:     𝒮 ← 𝒮 \ {S} ∪ {{A1, ..., An} | (∀i ∈ [1, n]) Ai ∈ 𝒜 ∧ ci ∈ Ai}
18: end for
19: return argmin_{S ∈ 𝒮} cost(S)
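A minimal Python sketch of the backward search may help make the enumeration concrete. It is not the authors' implementation: it assumes an acyclic (tree-shaped) graph, an additive cost function, and a small graph in the style of Figure 6 whose action coverage (A1 = {c1}, A2 = {c2, c3}, A3 = {c4}) and costs are hypothetical values chosen for illustration. All identifiers are likewise illustrative.

```python
from itertools import product


def backward_search(targets, initial, requires, implies, actions, cost):
    """Enumerate every hardening strategy and return (cheapest, all strategies).

    requires: exploit -> list of its preconditions (relation Rr)
    implies:  non-initial condition -> list of exploits implying it (relation Ri)
    actions:  action name -> set of initial conditions it disables (assumed data)
    cost:     action name -> cost; strategy cost is assumed additive
    """
    solutions = {frozenset(targets)}
    # Iterate until every solution contains initial conditions only
    while any(not s <= initial for s in solutions):
        expanded = set()
        for s in solutions:
            # Replace each non-initial condition with the exploits that imply it
            nodes = frozenset(
                x for c in s for x in ([c] if c in initial else implies[c])
            )
            keep = nodes & initial
            exploits = sorted(nodes - initial)
            # Replace each exploit with one of its preconditions, in every possible way
            for combo in product(*(requires[e] for e in exploits)):
                expanded.add(keep | frozenset(combo))
        solutions = expanded
    # Replace initial conditions with allowable actions, in every possible way
    strategies = set()
    for s in solutions:
        options = [[a for a, covered in actions.items() if c in covered]
                   for c in sorted(s)]
        for combo in product(*options):
            strategies.add(frozenset(combo))
    best = min(strategies, key=lambda st: sum(cost[a] for a in st))
    return best, strategies


# Hypothetical instance shaped like Figure 6 (d = 2, n = 2)
initial = {"c1", "c2", "c3", "c4"}
requires = {"e1": ["c1", "c2"], "e2": ["c3", "c4"]}
implies = {"c5": ["e1", "e2"]}
actions = {"A1": {"c1"}, "A2": {"c2", "c3"}, "A3": {"c4"}}
cost = {"A1": 99, "A2": 100, "A3": 99}

best, strategies = backward_search({"c5"}, initial, requires, implies, actions, cost)
print(sorted(best), len(strategies))
```

Even on this tiny graph the search materializes every candidate strategy before choosing the cheapest one, which is the source of the combinatorial growth discussed in the text.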
[Figure 4 residue: attack graph with exploits e1 to e10 and target condition c21]
dummy target condition ct, such that all the dummy exploits ei have ct as their only postcondition. It is clear that any strategy that hardens the network with respect to ct implicitly hardens the network with respect to each ci ∈ Ct. In fact, as ct is reachable from any dummy exploit ei, all such exploits need to be prevented, and the only way to achieve this is by disabling the corresponding preconditions, that is, hardening the network with respect to all target conditions in Ct.

Additionally, we assume that, given a target condition ct, the attack graph is a tree rooted at ct and having initial conditions as leaf nodes. In Section IV, we showed an example of how this can be achieved using the mechanism to break cycles adopted by the algorithm in [2]. If the attack graph is not a tree, it can be converted to this form by using such mechanism. Looking at the attack graph from the point of view of a given target condition has the additional advantage of ignoring exploits and conditions that do not contribute to reaching that target condition.

On Line 1, the algorithm performs a topological sort of the nodes in the attack graph (exploits and security conditions), and pushes them into a queue, with initial conditions at the front of the queue. While the queue is not empty, an element
Algorithm 2 ForwardSearch(G, k)
Input: Attack graph G = (E ∪ C, Rr ∪ Ri), and optimization parameter k.
Output: Mapping σ : C ∪ E → 2^𝒮, and mapping minCost : C ∪ E → R+.
1: Q ← TopologicalSort(C ∪ E)
2: while Q ≠ ∅ do
3:     q ← Q.pop()
4:     if q ∈ Ci then
5:         σ(q) ← {{A} | q ∈ A}
6:     else if q ∈ E then
7:         σ(q) ← ⋃_{c ∈ C | (c, q) ∈ Rr} σ(c)
8:     else if q ∈ C \ Ci then
9:         {e1, ..., em} ← {e ∈ E | (e, q) ∈ Ri}
10:         σ(q) ← {S1 ∪ ... ∪ Sm | Si ∈ σ(ei)}
11:     end if
12:     σ(q) ← topK(σ(q), k)
13:     minCost(q) ← min_{S ∈ σ(q)} cost(S)
14: end while
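The forward traversal with top-k pruning can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the cost function is taken to be additive, ties in topK are broken by action name, and the example graph mirrors the shape of Figure 6 with hypothetical action coverage (A1 = {c1}, A2 = {c2, c3}, A3 = {c4}) and hypothetical costs x = 100 and ε = 1. The function and variable names are illustrative.

```python
from graphlib import TopologicalSorter
from itertools import product


def forward_search(requires, implies, actions, cost, k):
    """Forward traversal keeping only the k cheapest strategies per node.

    requires: exploit -> list of its preconditions (relation Rr)
    implies:  non-initial condition -> list of exploits implying it (relation Ri)
    actions:  action name -> set of initial conditions it disables (assumed data)
    cost:     action name -> cost; strategy cost is assumed additive
    """
    def total(strategy):
        return sum(cost[a] for a in strategy)

    predecessors = {**requires, **implies}
    sigma, min_cost = {}, {}
    # static_order() yields every node after all of its predecessors
    for q in TopologicalSorter(predecessors).static_order():
        if q in requires:
            # Exploit: disabling any single precondition prevents it
            candidates = {s for c in requires[q] for s in sigma[c]}
        elif q in implies:
            # Non-initial condition: every exploit implying it must be prevented
            candidates = {
                frozenset().union(*combo)
                for combo in product(*(sigma[e] for e in implies[q]))
            }
        else:
            # Initial condition: one singleton strategy per action that covers it
            candidates = {frozenset([a]) for a, covered in actions.items() if q in covered}
        # topK pruning (ties broken by action name, an arbitrary choice)
        sigma[q] = sorted(candidates, key=lambda s: (total(s), sorted(s)))[:k]
        min_cost[q] = total(sigma[q][0]) if sigma[q] else float("inf")
    return sigma, min_cost


# Hypothetical instance shaped like Figure 6 (d = 2, n = 2)
requires = {"e1": ["c1", "c2"], "e2": ["c3", "c4"]}
implies = {"c5": ["e1", "e2"]}
actions = {"A1": {"c1"}, "A2": {"c2", "c3"}, "A3": {"c4"}}
cost = {"A1": 99, "A2": 100, "A3": 99}  # x = 100, epsilon = 1

_, cost_k1 = forward_search(requires, implies, actions, cost, k=1)
_, cost_k2 = forward_search(requires, implies, actions, cost, k=2)
print(cost_k1["c5"], cost_k2["c5"])
```

With k = 1 the locally cheapest strategies {A1} and {A3} are kept at e1 and e2, so the shared action A2 is pruned before the merge at c5 and the result is close to twice the optimum; with k = 2 the strategy {A2} survives and the optimum is recovered.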
that, for k = 2, the algorithm returns minCost(c5) = 18, i.e., the optimal solution. This confirms that larger values of k make solutions closer to the optimal one.

[Figure 6 residue: actions A1, A2, A3 over initial conditions c1 to c4; exploits e1 and e2; target condition c5]
Figure 6. Example of attack graph with d = 2 and n = 2

We now show that, in the worst case (when k = 1), the approximation ratio is upper-bounded by n^{d/2}. However, experimental results indicate that, in practice, the approximation ratio is much smaller than its theoretical bound.

First, let us consider the type of scenario in which solutions may not be optimal. To this aim, consider again the attack graph configuration of Figure 6. When computing solutions for e1 and e2 respectively, we make local decisions without considering the whole graph, i.e., we independently compute the optimal solution for e1 and the optimal solution for e2, given hardening strategies for their preconditions. However, at a later stage, we need to merge solutions for both e1 and e2 in order to obtain solutions for c5. At this point, there exists an allowable action (i.e., A2) that would have disabled preconditions of both e1 and e2, with a cost lower than the combined cost of their locally optimal solutions. Since the strategy including A2 has already been discarded for k = 1, the solution is not optimal. This suggests that both k and the maximum in-degree n of nodes in the graph play a role in determining the optimality of the solution. Additionally, as the algorithm traverses the graph towards target conditions, there may be a multiplicative effect in the approximation error. In fact, the depth d of the tree also plays a role in determining the outcome of the approximation, but this effect can be compensated by increasing the value of k. We can prove the following theorem.

Theorem 1: Given an attack graph G with depth d and maximum in-degree n, the upper bound of the approximation [...]

[...] initial conditions), where m = n^d is the number of initial conditions; (ii) for each exploit e_{d−1,i}, all the preconditions not disabled by A* are disabled by an action A_i such that cost({A_i}) = x − ε, where ε is an arbitrarily small positive real number; and (iii) actions A_i are pairwise disjoint.

Base case. When choosing a strategy for e_{d−1,i}, the algorithm picks the one with the lowest cost, that is, strategy {A_i} with cost x − ε. Then, when choosing a strategy for c_{d−2,i}, the algorithm combines strategies for its n predecessors, which all cost x − ε. Since such strategies are disjoint and cost is additive, the cost to harden any condition at level d − 2 of the attack tree is n · (x − ε).

Inductive step. If hardening strategies for conditions at level d − j of the attack tree cost n^{j/2} · (x − ε), then hardening strategies for exploits at level d − j − 1 of the attack tree also cost n^{j/2} · (x − ε). When choosing a strategy for conditions at level d − j − 2, the algorithm combines strategies for its n predecessors, which all cost n^{j/2} · (x − ε). Since such strategies are disjoint and cost is additive, the cost to harden any condition at level d − j − 2 of the attack tree is n · n^{j/2} · (x − ε) = n^{(j+2)/2} · (x − ε).

Although this result indicates that the bound may increase exponentially with the depth of the attack tree, the approximation ratio observed in practice, as confirmed by experimental results, is much lower than the theoretical bound. In fact, the worst case scenario depicted in Figure 7 is quite unrealistic. Additionally, the bound can be reduced by increasing the value of k. For instance, by setting k = n, the bound becomes n^{(d−2)/2}, that is, the bound for a graph with depth d − 2 and in-degree n.

Example 3: Consider the attack graph configuration of Figure 6 (with n = 2 and d = 2), and assume that cost({A2}) = x, cost({A1}) = x − ε, and cost({A3}) = x − ε. For k = 1, if the cost function is additive, we obtain minCost(c5) = 2 · (x − ε) ≈ 2 · x, which means that in the worst case the cost is twice the optimal cost.

VII. EXPERIMENTAL RESULTS

In this section, we report the experiments we conducted to validate our approach. Specifically, our objective is to evaluate the performance of algorithm ForwardSearch in terms of processing time and approximation ratio for different values of the depth d of the attack graph and the maximum in-degree n of nodes in the graph. In order to obtain graphs
[Figure 7 residue: worst-case attack tree with actions A1, A*, A2, ..., A_{m/2} over initial conditions c_{d,1}, ..., c_{d,m}; exploits e_{d−1,1}, ..., e_{d−1,m/2}; condition c_{d−2,1}; exploits e_{1,1}, e_{1,2}; target condition ct]

Figure 8. Processing time vs. n for d = 4 and different values of k (series: exact solution, k = 1, k = 2, k = 5, k = 10)

First, we show that, as expected, computing the optimal solution is feasible only for very small graphs. Figure 8 shows how processing time increases when n increases and for d = 4, and compares processing times of the exact algorithm with processing times of algorithm ForwardSearch for different values of k. It is clear that the time to compute the exact solution starts to diverge at n = 4, whereas processing time of algorithm ForwardSearch is still well under 0.5 seconds for k = 10 and n = 5. Similarly, Figure 9 shows how processing time increases when d increases and for n = 2, and compares processing times of the exact algorithm with processing times of algorithm ForwardSearch for different values of k. The time to compute the exact solution starts to diverge at d = 5, whereas processing time of algorithm ForwardSearch is still under 20 milliseconds for k = 10 and d = 10.

[...] of k.

Figure 10. Processing time vs. k for n = 4 and different values of d (series: d = 2, d = 4, d = 6, d = 8, d = 10; annotation: d = 10, n = 4, 1,398,101 nodes)

Figure 11. Processing time vs. k for d = 8 and different values of n (series: n = 2, n = 3, n = 4, n = 5)

We also observed the relationship between processing time and size of the graphs (in terms of number of nodes). Figure 12 shows a scatter plot of average processing times for given pairs of d and n vs. the corresponding graph size. This chart suggests that, in practice, processing time is linear in the size of the graph for small values of k.

[Figure 12 residue: processing time (s) for k = 1 and k = 2, each with a linear regression line; reported fits R² = 0.999 and R² = 0.9924]

Finally, we evaluated the approximation ratio achieved by the algorithm. Figure 13 shows how the ratio changes when k increases and for a fixed value of n (n = 2) and different values of d. It is clear that the approximation ratio improves when k increases, and, in all cases, the ratio is clearly below the theoretical bound. Additionally, relatively [...], for a fixed value of d (d = 4) and different values of n, improves as k increases. Similar conclusions can be drawn from this chart. In particular, the approximation ratio is always below the theoretical bound.

Figure 14. Approximation ratio vs. k for d = 4 and different values of n (series: n = 2, n = 3, n = 4)
[5] M. Dacier, “Towards quantitative evaluation of computer security,” Ph.D. dissertation, Institut National Polytechnique de Toulouse, 1994.

[6] S. Jajodia, S. Noel, and B. O’Berry, Managing Cyber Threats: Issues, Approaches, and Challenges, ser. Massive Computing. Springer, 2005, vol. 5, ch. Topological Analysis of Network Attack Vulnerability, pp. 247–266.

[18] S. Jajodia, S. Noel, P. Kalapa, M. Albanese, and J. Williams, “Cauldron: Mission-centric cyber situational awareness with defense in depth,” in Proceedings of the Military Communications Conference (MILCOM 2011), Baltimore, MD, USA, November 2011.