PCRE JIT
Zoltán Herczeg
University of Szeged
Department of Software Engineering
13 Dugonics Square Szeged, Hungary
[email protected]
ABSTRACT

High matching performance of regular expressions is a critical requirement for many widely used software tools today, including web servers, firewalls, and intrusion detection systems. Backtracking regular expression engines have been considerably improved in the last decade as a result of this requirement. Today, state of the art engines use just-in-time (JIT) compilation support to generate machine code from regular expressions, and they use new, innovative techniques to further improve the speed of the generated code.

In the present paper, we introduce a new technique called static backtracking, which allows simultaneous optimization of both matching and backtracking. Based on this technique, we developed a JIT compiler for the widely used PCRE regular expression library. Our compiler supports all valid PCRE patterns, which shows that static backtracking is a viable choice for Perl compatible engines.

We also show that our balanced, Abstract Syntax Tree based code generator efficiently improves the performance of long-running, backtracking heavy regular expressions. Compared to another JIT accelerated regular expression engine, PCRE-JIT was able to run these patterns 1.95 times faster. Since these long-running patterns dominate the total runtime, PCRE-JIT achieved 1.63 times faster matching speed overall. We also observed 6.36 times average speedup compared to the PCRE interpreter on 5 different CPU architectures.

Categories and Subject Descriptors

D.3.4 [Programming Languages]: Processors -- Compilers, Code generation, Optimization

General Terms

Algorithms, Performance

Keywords

Regular expressions, JIT compiling, Static backtracking

CGO '14, February 15-19, 2014, Orlando, FL, USA

1. INTRODUCTION

Regular expressions are among the most popular tools for advanced text processing today, providing a rich and flexible solution for custom pattern matching. Kleene defined regular sets [16] in the 1950s and Thompson implemented the first regular expression library [28] in the 1960s, since when regular expressions have evolved considerably to the point where they cannot be called regular anymore. So far regular expression engines have belonged to one of the following two major groups:

NFA based: a directed graph, a non-deterministic finite-state automaton, is constructed by these engines. Vertices of the graph represent states, and edges represent state transitions. Starting from the start state, the engine traverses these states through the available state transitions with a backtracking depth-first search algorithm until the first accept state is reached, which means a successful match. When all transition options are exhausted, the engine returns with a failed match. Unlike their definition in computation theory, state transitions of real world engines are not limited to single character matches. They can represent any actions such as matching a backreference or evaluating conditional expressions. The runtime of the depth-first search algorithm is exponential in the worst case: when /(x|x){n}y/ is matched to a string of n + 1 x literals, the engine performs 2^n backtracks before it returns with a failed match.

DFA based: a deterministic finite-state automaton is constructed by these engines. Similar to the directed graph of the NFA based engines, the state machine has states and state transitions. However, this state machine has exactly one state transition for each state and input character pair, which eliminates the need for backtracking. These engines are closer to the DFA engines described in computation theory, since their state transitions can only depend on fixed character sets. The runtime of a DFA based engine depends only on the input length. The trade-off is state explosion, e.g. /a[^b]{n}a/ has 3 * 2^n different states. Due to this exponential number of states, the runtime cost of this pattern is exponential for those engines that generate their state machines in advance, such as RE2 [8] from Google.

In the present paper, we introduce a new code generator technique called static backtracking, which is a radically new approach for backtracking engines. Unlike the prior art, it generates machine code from the Abstract Syntax Tree (AST) [2] representation which provides more information about the structure of a regular expression than NFA or DFA. Using this extra information, it can optimize both matching and backtracking simultaneously rather than focusing only on matching. Because our state machine is a bi-directional tree, which is traversed by a backtracking algorithm different from NFA, our algorithm belongs to a third group.

The first freely available NFA based engine was created by Henry Spencer in 1987. It was chosen as the regular expression library of the Perl language [29]. The engine inside Perl was extended with new, innovative features, which made it increasingly popular among developers. Eventually it became the standard of regular expressions, adopted by many languages including Python, JavaScript and the .NET Framework. The Perl Compatible Regular Expression library (PCRE) [14] made by Philip Hazel also uses the Perl syntax with minor differences.

[Figure 1: Overview of the PCRE-JIT engine -- the PCRE byte code is translated by the static backtracking based PCRE-JIT compiler, whose output is turned into machine code by the SLJIT compiler; the two compilers together form the PCRE-JIT engine.]
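The exponential worst case described in the introduction can be reproduced with any backtracking matcher. The sketch below uses Python's re module (itself a backtracking engine) purely for illustration; the timing helper and the chosen values of n are our own, not part of the paper's benchmark.

```python
import re
import time

def time_failed_match(n):
    """Match the pathological /(x|x){n}y/ pattern against a string of
    n + 1 'x' literals; a backtracking engine explores up to 2^n
    alternative paths before it can report the failure."""
    pattern = re.compile("(x|x){%d}y" % n)
    text = "x" * (n + 1)
    start = time.perf_counter()
    result = pattern.match(text)
    elapsed = time.perf_counter() - start
    return result, elapsed

result_small, t_small = time_failed_match(5)    # 2^5 = 32 paths
result_large, t_large = time_failed_match(18)   # 2^18 = 262144 paths
assert result_small is None and result_large is None
# t_large is typically orders of magnitude larger than t_small
```

Growing n by one roughly doubles the matching time, while a DFA based engine such as RE2 would reject the same input in time linear in the input length.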
The aim of just-in-time compilation for PCRE (PCRE-JIT) is to speed up pattern matching of the PCRE library by transforming the PCRE byte code into the low-level intermediate representation (LIR) language of the Stackless Just-In-Time Compiler (SLJIT), which then translates it into machine executable code. Both PCRE-JIT and SLJIT projects were developed by us, and they have been part of the PCRE library since version 8.20. After the initial release, we have continued this work, and since version 8.32, the whole feature set of PCRE has been supported. At the time of writing this paper, PCRE-JIT supports more Perl compatible syntax rules than any other regular expression engine with JIT compiler support. To make the paper self-contained, we outline both the PCRE-JIT and SLJIT compilers. The rest of the paper is organized as follows:

In Section 2, we review related work. In Section 3, we outline the SLJIT compiler. In Section 4, we introduce the PCRE-JIT compiler, and describe its AST based code generator, including the new static backtracking method. In Section 5, we compare the performance of PCRE-JIT to other JIT compiler accelerated engines, and we compare the performance of PCRE-JIT and the PCRE interpreter on various CPU architectures. Finally, in Section 6, we summarize our paper and present some plans for future work.

2. RELATED WORK

Compiling regular expressions to machine code has a long history: the first engine [28] for IBM 7094 CPUs was introduced in 1968 by Ken Thompson. His technique has never become popular however, due to the complexity of on-the-fly machine code generation.

The machine code generators for regular expressions can be divided into three major groups:

The first group generates source code for a given language, mostly C/C++, but JAVA is getting more popular as well. To compile this source code, a fully operating compiler toolchain is required, which limits the use cases of these engines. Typical examples are lexical analyzers and/or scanners, such as lex [21], flex [20], JFlex [17], re2c [6] and the lexer part of the ANTLR [26] tool.

The second group generates byte code for a given virtual machine (VM), and the VM is responsible for dynamically loading and executing the compiled regular expression. These engines can depend on the availability of a VM, since the very same VM is required to run them. The JAVA based FIRE/J [15] and the C# based regular expression engine of the .NET [23] and Mono [9] Frameworks can be mentioned here. The LLVM-RE [22] branch of the Unladen-Swallow Python engine generates LLVM [19] byte code, but this project has not been finished and seems to be abandoned now.

The third group generates executable machine code on the fly. Yet Another Regex Runtime [4] (YARR) in the open source WebKit browser engine, Irregexp [7] in the V8 JavaScript engine and our PCRE-JIT belong to this group. Since YARR and Irregexp are the closest counterparts of our engine, we compare their backtracking performance in Section 5.

The lightweight JIT compilers can be grouped according to their low-level intermediate representation.

NanoJIT [1] and LibJIT [30] accept their LIR in Static Single Assignment (SSA) [24] form. They use lightweight register allocation mechanisms, e.g. linear scan [31].

Virtual machine accelerators compile the byte code representation of a VM to machine code. Many of them compile hotspots only. They usually target the Java VM, and some of them are lightweight enough for embedded systems such as Swift [32] and HotpathVM [12].

Dynamic assemblers, namely AsmJit [18], DynASM [25] in LuaJIT, and the macro assemblers in the Irregexp engine target a single architecture. They provide the fastest code generation speed due to their simplicity.

Dynamic assemblers can be extended to become platform independent by defining generic architectures, whose instructions are translated to the current CPU. Such projects are GNU lightning [5], VCODE [10], SICStus Abstract Machine (SAM) [13] and our SLJIT compiler.

3. OVERVIEW OF THE STACKLESS JUST-IN-TIME COMPILER (SLJIT)

The SLJIT compiler is designed to fit the commonly used code generation techniques in lightweight compilation environments. These techniques are different from static compiler optimizations, since they are balanced between compilation and execution speed. Because of their dynamic nature, some of them are unique to JIT compilers, such as inline caching. These dynamic code modifications allow for fine tuning the generated code after the compilation.

The primary difference of SLJIT from existing projects is the way that the instruction set was designed: instead of creating a simplified RISC like architecture, the common features of existing, widely used CPU architectures were merged together. The resulting instruction set can be efficiently translated to the supported CPUs, but this efficiency
may not be achieved on other platforms, such as Very Long Instruction Word architectures. This reduces the portability of the SLJIT compiler.

SLJIT is one of the two major components of the PCRE-JIT engine as shown in Figure 1. It translates the output of the PCRE-JIT compiler to executable code. We should note here that the term "stackless" in the name of the SLJIT compiler does not relate to static backtracking. SLJIT is an independent work, developed a few years before the PCRE-JIT compiler. Its name refers to its inability to manage local variables, which are usually temporarily stored on the stack. Instead, it provides direct access to real machine registers.

SLJIT provides a low-level platform independent assembly like language. The LIR instructions of this language are emitted through a simple API. This approach offers better performance than compiling source code, since there is no need for parsing the input. As we mentioned before, the language is not tied to any existing architecture, it is rather a combination of their features.

Function call and return are implemented as single instructions on all CPUs; however the Application Binary Interface (ABI) of these architectures specifies various stack and register handling rules, which can make these function calls computationally heavy, especially for small, utility functions. To reduce this overhead, SLJIT supports a fast calling mechanism across JIT generated functions, which does not save or restore any registers, or manipulate the stack. By using this technique, the branch predictor of most CPUs is capable of predicting the return address, so these lightweight functions are fast. The PCRE-JIT compiler uses these calls to decode UTF byte sequences and detect character types among other things.

    sub.e.s unused, s2, s3    % Set signed result (.s) and equal (.e)
                              % status flag bits after s3 is
                              % subtracted from s2.
    flags.mov s1, sig greater % Copy the value of a status flag bit.
    add.k s1, s1, s1          % Multiply s1 by 2, and keep current
                              % status flags (.k).
    flags.or s1, s1, equal    % Set the lowest bit of s1 if s2 and
                              % s3 were equal.
    sub s1, s1, #1            % Decrease s1 by 1. Status flags are
                              % undefined.

Figure 2: Compare s2 and s3, and set s1 to -1, if s2 is lower, 0, if they are equal, and 1 otherwise.

[Figure 3: AST representation of the /(?:aa|a)+ab/ pattern -- the root node concatenates a greedy plus (?:)+ group with the literals a and b; the group has two children, Alternative 1 (the literals a a) and Alternative 2 (the literal a).]
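The flag based sequence of Figure 2 can be mirrored in ordinary code. The Python sketch below is only an illustration of the idea: the booleans stand in for the CPU status flag bits, and the function name is ours, not part of SLJIT.

```python
def three_way_compare(s2, s3):
    # The two status flag bits set by the sub.e.s instruction:
    sig_greater = 1 if s2 > s3 else 0   # "sig greater" flag bit
    equal = 1 if s2 == s3 else 0        # "equal" flag bit
    s1 = sig_greater   # flags.mov: copy a status flag bit
    s1 = s1 + s1       # add.k: multiply s1 by 2
    s1 = s1 | equal    # flags.or: set the lowest bit on equality
    s1 = s1 - 1        # sub: decrease s1 by 1
    return s1          # -1 if s2 < s3, 0 if equal, 1 if s2 > s3

assert three_way_compare(3, 7) == -1
assert three_way_compare(5, 5) == 0
assert three_way_compare(9, 2) == 1
```

Just as in the LIR listing, the result is assembled from the flag bits without a single conditional branch.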
SLJIT covers most arithmetic, shift, and bitwise operators known from higher level languages, such as addition, arithmetic right shift, bitwise exclusive-or, etc. In addition to the common operators, we found some other instructions that are similar in these CPUs. Signed or unsigned long multiply, where the size of the result is twice the size of the source operands, count leading zero, swap endianness, prefetch and user breakpoint instructions are widely supported. CPU status flags are also supported by all CPUs except MIPS. These status flags can be set by certain arithmetic or logic instructions, and their values can be used later.

An SLJIT LIR dump that uses these status flags can be seen in Figure 2. This code fragment compares two machine registers, s2 and s3, and sets the value of s1 according to the result, without using conditional branches. Typically, compare operators perform such tasks. The first operand of most LIR instructions is the destination (unless it has no destination at all), followed by the source operands. The unused keyword can only be used as a destination operand, and it indicates that the result of the computation should be discarded. The s prefix in a register name means that it is a scratch register, so its value is not preserved across function calls. We should mention here that SLJIT emulates all missing instruction forms and features if they are not available on the current CPU, so the LIR code above works even on MIPS.

4. PCRE-JIT COMPILER OVERVIEW

In this section we outline the code generator of the PCRE-JIT compiler and the static backtracking algorithm. Before we go into details, we show the matching algorithm of other backtracking engines. This overview will allow us to compare our approach to the prior art.

NFA based engines construct a directed graph from a regular expression. The matching algorithm of these engines is quite similar to try-catch based exception handling: each state sets up a generic catch handler, and in the event of backtrack, control is transferred here by indirect jumps. The performance of a try-catch based approach heavily depends on the frequency of the exceptional event, so these engines expect that backtracking happens rarely. However, we show later that this assumption is not proved in practice.

To reduce the number of catches, the only states that are kept by these engines are those that provide at least one more state transition. For example, the state which is assigned to the non-capturing bracket of (?:a|b|cc) provides three alternatives. When the last (cc) alternative is selected, there are no more possible choices, so the catch handler of this state can be discarded. This optimization is called tail recursion. We can always apply this optimization to those states that only have one possible state transition, such as character sets or backreferences. The minimum case of iterators is another typical candidate, e.g. when x+ is matched to a single x.

[Figure 4: Matching process of the /AB/ pattern -- Start match leads to matching A, then B; a successful B means Match found. When B does not match, the engine backtracks to A, and when A provides no more matches, the result is Match failed.]

[Figure 5: Execution interface of a generic M AST node with one child node (N) -- each node has two entry points (Try match and Backtrack) and two exit points; the pre phase runs before the child node is executed and the post phase after it.]

The PCRE-JIT engine generates machine code from the AST representation, since the AST provides more information about the structure of a pattern than the NFA. For example, each node has exactly one parent, so the previous node, where the engine backtracks, is unambiguous and always known at compile time as shown in Figure 3. Therefore, the engine can generate context sensitive backtracking code paths, which can be connected by direct jumps. Unlike indirect jumps on many architectures, these jumps can be conditional, which simplifies the control transfer to the backtracking handler when a check fails. Furthermore, certain jump instructions can be totally eliminated by efficient ordering of code paths.

The code generator of PCRE-JIT is not NFA based. The state transitions of an NFA are uni-directional, and NFA based engines use recovery points to fully restore the previous state when the engine backtracks. In contrast, our state machine is a tree with bi-directional state transitions, which can be used for going back and forth. We should also note that the AST can always be converted to NFA, but the reverse of this statement is not true. However, all Perl compatible patterns have an AST representation, so PCRE-JIT does not need to support those extra NFA cases.

An important difference between NFA and AST based engines is their optimization strategies. NFA based engines focus on efficient tail recursion, while our approach focuses on efficient code path ordering. Another key difference is that NFA based engines prefer complex backtracking code paths, which perform as many tasks as possible to reduce the number of indirect jumps. Instead, we prefer simple, often empty code paths, where control can be transferred to the next code path without a jump instruction.

In the following, many regular expression examples represent a group of patterns instead of a single one. To define these groups, we introduce a simple notation. The general style of the patterns is that of Perl-style regular expressions [11]. All pattern groups are enclosed in slashes. All characters between these slashes represent themselves, except capital letters, which can represent any valid subpattern where the subpattern does not have any side-effect. E.g. /Ax+/ means the concatenation of any valid subpattern and x+ such as /a(b|c)x+/, but A cannot be a|b, since /a|bx+/ can match a plain a without an x at the end. This case can be made valid by enclosing the subpattern in a non-capturing group: /(?:a|b)x+/. If a pattern does not contain any capital letters, it represents a single pattern.

This section is divided into multiple subsections. First, in subsection 4.1 we overview the static backtracking algorithm. In subsection 4.2 we show an example of the generated code, and in subsection 4.3 we extend this example to cover an important corner case.

4.1 Overview of the Static Backtracking Code Generator Algorithm

In the following paragraph we outline the code generator of PCRE-JIT. Figure 4 shows the matching process of the /AB/ regular expression which represents the concatenation of two subpatterns. Perl compatible engines are free to use any matching algorithm as long as they produce the same result.

Existing JIT accelerated engines use two techniques for backtracking. The .NET Framework and Irregexp generate code from the NFA representation and they follow the traditional NFA backtracking mechanism described in the beginning of this section. YARR uses another approach. It allocates global variables to store the matching progress, so it supports only those constructs which can be represented by a fixed number of global variables. An example of such a construct is a character literal with a greedy plus quantifier, because the number of matched characters and the position of the last matched character are enough to determine the next possible state on backtrack. The simplicity of the generated code provides high performance, but many constructs such as /(?:ab)+/ are not supported by YARR, and it must fall back to interpreted mode.

PCRE-JIT uses a third approach. Instead of the NFA representation, we generate code from the AST. More precisely a code path pair is generated from each AST node. The code generator is recursive, code paths for child nodes are generated by their parent, so the machine code representation of a node includes all code paths of the subtree rooted from this particular node.

The backtracking, depth-first search algorithm of PCRE-JIT is based on a pre-defined traversing order of the syntax tree.
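The depth-first search over the alternatives of a group such as (?:a|b|cc), described at the beginning of this section, can be sketched in a few lines. This is a toy matcher over literal alternatives for illustration only, not PCRE's implementation.

```python
def match_alternatives(alternatives, text, pos):
    """Depth-first search over the alternatives of a group such as
    (?:a|b|cc): alternatives are tried in order, and the failure of
    one alternative backtracks to the next. Once the last alternative
    is selected there is no further choice left, so no handler needs
    to be kept for this state (the tail recursion optimization)."""
    for literal in alternatives:
        if text.startswith(literal, pos):
            return pos + len(literal)  # position after the match
    return None  # all state transitions exhausted: failed match

assert match_alternatives(["a", "b", "cc"], "ccx", 0) == 2
assert match_alternatives(["a", "b", "cc"], "bx", 0) == 1
assert match_alternatives(["a", "b", "cc"], "x", 0) is None
```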
Each node type defines the traversing order of its child nodes, and the traversing order of the syntax tree is the combination of them. The traversing path is bi-directional, both the successor and the predecessor nodes are known at compile time. The only exception is the execution of child nodes: a parent node can transfer control to any of its child nodes any number of times, and the order can be decided at runtime.

[Figure 6: Code path types and the role of their entry and exit points -- the matching path is entered to try a new match (the input position contains a valid starting position), and the backtracking path is entered to find another match (a valid context is provided on the top of the stack). The matching path exits when a match is found (a valid context is stored on the top of the stack and the input position contains the end of the match); the backtracking path exits when no match is found (it removes the topmost context if appropriate, and the input position is undefined).]

[Figure 7: Simplified control flow graph of the /AB/ pattern -- Start leads to the matching path of A, then the matching path of B, then Match; when B does not match, control falls to the backtracking path of B, then the backtracking path of A, then No match. The matching and backtracking paths of the concatenation connect these paths without extra instructions.]
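The fixed-variable backtracking used by YARR, described in subsection 4.1, can be illustrated for a pattern of the form /x+y/: the greedy repetition is fully described by a match count and the position after the last matched character, so giving back one character at a time needs no stack. The helper below is a hypothetical sketch, not YARR code.

```python
def match_greedy_plus_then(char, suffix, text):
    """Fixed-variable backtracking for a pattern like /x+y/: the
    greedy x+ is tracked by two variables only (match count and the
    position after the last matched character), so backtracking just
    decrements them instead of popping a stack."""
    count = 0
    pos = 0
    while pos < len(text) and text[pos] == char:  # greedy phase
        pos += 1
        count += 1
    # Backtrack: give back one character at a time until the rest
    # of the pattern (here a literal suffix) matches.
    while count >= 1:
        if text.startswith(suffix, pos):
            return pos + len(suffix)  # end of the whole match
        pos -= 1
        count -= 1
    return None  # x+ needs at least one x, or the suffix never fits

assert match_greedy_plus_then("x", "y", "xxxy") == 4
assert match_greedy_plus_then("x", "xy", "xxxy") == 4
assert match_greedy_plus_then("x", "y", "yyy") is None
```

A construct like /(?:ab)+/ cannot be described by such a fixed set of variables, which is why YARR falls back to interpreted mode there.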
The various code paths are connected together using the interface shown in Figure 5. The most important aspect of the Figure is that each M node has exactly two entry and two exit points, which are represented by the four outer arrows attached to the box of the M node. The reason for generating two code paths from each node is that a code path pair has two entry and exit points. Other than this aspect, these two code paths form a single function. To emphasize this connection, a single box represents both M and N nodes in the Figure.

The Figure also shows the conditional execution of a child (N) node. As we mentioned before, code paths for child nodes are generated by their parent node, and their execution is also controlled by their parent. For example the greedy plus (?:)+ node matches its child node as many times as possible. When a child node finishes its execution, it must transfer the control back to its parent. Due to the code layout, these transfers do not require any jump instructions. The pre-M and post-M phases represent the instructions executed before and after the child node is executed, respectively.

Normal functions use arguments and return values to pass data. In PCRE-JIT, the two entry and exit points of a single function can be used to represent passing and returning a boolean value without putting these booleans into machine registers. Assigning roles to these points makes a lot of argument and return value checks unnecessary, which can be used for optimizing the code. Figure 6 shows the names of these code paths (inside the boxes) and the roles of the entry and exit points. Both code paths are named after the role of their entry point.

In Figure 6, new terms, namely input position and context, are introduced as well. The input position is a global variable, and it contains the position of the next input character. The context is the atomic unit of stack management. Backtracking engines require a stack to store the status of a given node, since this data is required to determine the next possible alternative in the event of a backtrack. NFA based engines push the context and the entry address of the backtracking handler onto the stack, so the context is always processed by the appropriate handler. A static backtracking based compiler cannot rely on this mechanism. Instead, each parent node must know or record the calling order of its child nodes, so it can choose the appropriate backtracking handler. The recording is usually a cheap operation, because the number of sub-nodes is usually less than or equal to one and it can be combined with other tasks. We will see an example for such optimization in subsection 4.2.

To summarize this subsection, we show the simplified control flow graph of a frequently used AST node, namely the concatenation, which is shown in Figure 7. Both arrow types represent control transfers in the Figure, the difference is that the thick arrows do not require a jump instruction. The concatenation is a free operation in static backtracking, because the appropriate code path ordering is enough to satisfy all requirements, no extra instructions are needed for anything. These code layout optimizations are essential for an efficient depth-first search.

4.2 An Example

The PCRE-JIT compiler is based on templates. All 150 byte code types of PCRE have two templates, one for the matching and another for the backtracking path. Each template is a list of LIR instructions and references to other templates. Hence templates for smaller subtasks can be shared which improves maintainability.

Figure 8 shows the structure of the machine code generated from the /(?:A)+B/ pattern. To improve readability, certain templates are not expanded in the Figure. Instead, they are kept as pseudo function calls such as push() or backtrack().
    // Matching path of (?:A)+
    // NULL = Beginning of the context list mark
    push(NULL)
    L1: if (match(A) = FAILED)
          goto L6
    L2: push(input position)
        // Greedy match: try again
        goto L1
    // Matching path of B
    L3: if (match(B) = FAILED)
          goto L5
    L4: return SUCCESS
    // Backtracking path of B
    L5: if (backtrack(B) = MATCHED)
          goto L4
    // Backtracking path of (?:A)+
    L6: if (backtrack(A) = MATCHED)
          goto L2
    L7: input position ← pop()
        if (input position != NULL)
          goto L3
        return FAIL

Figure 8: Template structure of /(?:A)+B/ where A never matches an empty string

    // Matching path of (?:A)+
    // NULL = Beginning of the context list mark
    push(NULL)
    L1: push(private slot)
        private slot ← input position
        if (match(A) = FAILED)
          goto L6
    L2: push(input position)
        if (private slot != input position)
          goto L1
        // Discard input position
        pop()
    // Matching path of B
    L3: if (match(B) = FAILED)
          goto L5
    L4: return SUCCESS
    // Backtracking path of B
    L5: if (backtrack(B) = MATCHED)
          goto L4
    // Backtracking path of (?:A)+
    L6: if (backtrack(A) = MATCHED)
          goto L2
        private slot ← pop()
        input position ← pop()
        if (input position != NULL)
          goto L3
        return FAIL

Figure 9: Template structure of /(?:A)+B/ where A might match an empty string
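The control flow of Figure 8 can be transcribed into a runnable sketch for the concrete pattern /(?:a)+ab/ (A = a, B = ab). Since a literal A has no internal backtracking, the backtrack(A) call always fails and is omitted here; everything else -- the NULL marker, the pushed input positions, and the pop-and-retry loop -- follows the template. This is an illustration of the control flow, not the generated machine code.

```python
def match_template(text, start):
    """Sketch of the Figure 8 template for /(?:a)+ab/ with an explicit
    stack: one input position is pushed per repetition, and NULL marks
    the beginning of the context list."""
    NULL = None
    stack = [NULL]      # push(NULL)
    pos = start
    # Matching path of (?:a)+ -- greedy: match 'a' as often as possible.
    while pos < len(text) and text[pos] == "a":
        pos += 1
        stack.append(pos)        # L2: push(input position)
    # Backtracking path of (?:a)+ merged with the matching path of 'ab'.
    while True:
        pos = stack.pop()        # L7: input position <- pop()
        if pos is NULL:
            return None          # return FAIL
        if text.startswith("ab", pos):
            return pos + 2       # L4: return SUCCESS (end of match)

assert match_template("aaab", 0) == 4   # (?:a)+ gives back one 'a' for B
assert match_template("aab", 0) == 3
assert match_template("aaa", 0) is None  # B never matches
assert match_template("b", 0) is None    # A must match at least once
```

Note how the greedy repetition and the backtracking search share the same stack, exactly as in the template: no separate recovery points are recorded.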
The general stack layout of the /(?:A)+B/ pattern is the following after a successful match:

    NULL, [context A, input position]*, context A, context B

The star metacharacter means that the content inside the square brackets can be repeated from zero to any number of times. When a match fails, all context data is removed from the top of the stack, which satisfies the "remove the topmost context if appropriate" condition in Figure 6.

4.3 A Slightly Extended Example

The example in Figure 8 has a limitation: the A subpattern cannot match an empty string, because the loop would run forever in this case. Perl stops a repetition after an empty match, and PCRE-JIT must follow this behaviour to remain compatible. Since PCRE defines a different byte code for might be empty matches, this case can be easily recognized at compile time, and the code generator uses a different template shown in Figure 9.

An empty match can be detected by checking whether the input position has been changed after a successful match. If the input position keeps its value, the loop must be aborted. To do this comparison, we need to save the input position before the subpattern is matched. The stack cannot be used for saving this value however, since the subpattern can push any context data onto the stack, so the position of the saved value will be unknown later. Instead, PCRE-JIT uses global variables called slots. These slots are allocated separately for each byte code that requires them, and can only be modified by that particular byte code. The value of a slot cannot be changed by any child byte codes, since the matching process of a byte code cannot leave its own boundaries (there is no goto like operation in Perl). Therefore no byte code can restart the match of its parent byte code, and implicitly modify the slot. The only exception is recursions (a call like operation), which must preserve the appropriate slots.

There is another issue we need to solve: a single slot is not enough to store every input position for a repetition. Instead only the last input position is kept in the slot, and its previous values are preserved on the stack as seen in Figure 9.

The context data of a might be empty greedy plus repetition is quite similar to the non-empty repetition, except it contains the previous value of the private slot. Saving this slot before the first repetition might seem unnecessary. However, if the repetition itself is inside another repetition, e.g. /(a(aa)+a){3}/, we must preserve the value of the slot. The following line shows the stack layout of the extended example in the format we used for the original:

    NULL, [private slot, context A, input position]*, private slot, context A, context B
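The empty-match guard of this subsection -- save the input position in a private slot before each iteration, and stop the repetition when a successful match leaves the position unchanged -- can be sketched as follows. This is a toy model: match_subpattern stands for any subpattern matcher and returns the new input position, or None on failure.

```python
def greedy_plus(match_subpattern, text, pos):
    """Greedy + repetition with the empty-match guard: without the
    'new_pos == saved' check, a subpattern that can match an empty
    string (e.g. a?) would make the loop run forever."""
    end_positions = []
    while True:
        saved = pos                          # private slot <- input position
        new_pos = match_subpattern(text, pos)
        if new_pos is None:
            break                            # subpattern failed
        end_positions.append(new_pos)
        if new_pos == saved:
            break                            # empty match: stop repeating
        pos = new_pos
    return end_positions                     # candidate backtrack positions

# A = 'a?' can match an empty string, yet the loop terminates:
optional_a = lambda text, pos: pos + 1 if text.startswith("a", pos) else pos
assert greedy_plus(optional_a, "aab", 0) == [1, 2, 2]
```

The final empty match (position 2 repeated) mirrors Perl's behaviour of stopping a repetition after one empty match.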
5. TEST RESULTS

Our first measurement compares the number of backtracks performed by the PCRE interpreter and the JIT compiler. The interpreter is a traditional NFA based engine with aggressive tail recursion. We disabled certain optimizations in the JIT compiler, which would affect the following results, and are not supported by the interpreter. To perform this measurement under real-world conditions we decided to disable only the extra optimizations. A brief overview of these so called backtracking elimination techniques will be provided later in this section.

During the measurement, all 1020 HTTP content filtering patterns of the open source Snort [27] Intrusion Detection System (IDS) were matched against the HTTP stream of the top 10 web sites listed by Alexa [3]. Both the PCRE interpreter and JIT compiler searched for all occurrences of these Perl compatible regular expressions. The results are the following, where B means billion (10^9):

The interpreter attempted 30.57B byte code matches and performed 14.06B backtracks. The engine also set up 6.42B backtracking handler sites.

The code generated by PCRE-JIT attempted the same number of matches and performed 27.24B backtracks, of which 12.81B (44%) were empty. The execution entered the backtracking handler by a jump instruction 8.72B times (32%).

The characteristics of these engine types can be clearly seen: our static backtracking based engine performed far more (89%) backtracks compared to the NFA based one. However, if we exclude the empty handlers, which do nothing, the number is roughly the same (only 2.6% bigger). The key feature of static backtracking is also visible: even if the empty handlers are excluded, 40% of the backtracking handlers are still entered without a jump, and the remaining ones only need direct jumps.

We should also note that backtracking is not a rare event. On the contrary, the number of backtracks is close to half

We emphasize here that the purpose of the following measurement is comparing the raw backtracking performance. To exclude all other optimizations, we either need to heavily modify these engines or use custom tailored regular expressions. Since any modification affects the performance of an engine, we choose the latter, which allows precise measurement without any side effects. Finding the appropriate patterns, where all engines perform the same number of backtracks, is still a challenge as we see later.

Table 2 shows the runtimes and runtime ratios of our backtracking heavy, and elimination unfriendly patterns, where the ratio is calculated by dividing the actual runtime with the lowest runtime in each line. The ratio is always 1.00 for the fastest engine. These patterns are usually called pathological cases, because the engine is forced to do a lot of backtracks, and they are not part of the Snort benchmark. All measurements were performed on an Intel Xeon based 64 bit x86 system using CPU cycle counters. The PCRE interpreter is also included in this measurement to show the general performance progression of the JIT accelerated engines compared to the interpreted ones.

To improve readability of both pattern and input strings, a simple notation is introduced for their repeating parts: the strings are divided into fragments by numbers in superscript, and each number denotes the repetition count of the previous fragment. Delimiters are excluded. Thus, /(a)?²b+c³de/ represents the /(a)?(a)?b+cb+cb+cde/ pattern. In the following we explain the rationale behind each pattern:

Our first example in line 1 shows the effect of a backtracking elimination, which explains the lack of certain constructs in the following examples. Unlike Perl, JavaScript compatible engines must backtrack when a repetition matches an empty string. For example, when /(w?)+/ is matched to a single character long w string, the capturing group returns with a w in JavaScript and an empty string in Perl. This optimization substantially improves the matching performance of the pattern in line 1 for JavaScript, but also makes it unsuitable for our comparisons.

Patterns 2 and 3 focus on single character repetitions.
(46%) of the matching attempts, even for an NFA based
YARR is the fastest on these two patterns, closely followed
engine with aggressive tail recursion. Therefore the cost of
by PCRE-JIT. The matching algorithm of YARR makes it
backtracking is not negligible, which was the primary moti-
particularly efficient for single character repetitions as we
vation for inventing the static backtracking algorithm.
discussed in Section 4. The trade-off is lower efficiency in all
Before the next result is presented, we discuss the dif-
other cases.
ference between backtracking optimization and backtrack-
Patterns from 4 to 9 cover repetition inside repetition
ing elimination. Static backtracking is a backtracking op-
cases. YARR switches to interpreted execution for these
timization technique, because it accelerates the speed of
cases, which explains its decreased performance, so the fol-
backtracking. In contrast, backtracking elimination tries to
lowing discussion is focused on Irregexp and PCRE-JIT. In
reduce the number of backtracks in various ways. There
general, PCRE-JIT is the fastest on these patterns due to its
are dozens of such techniques, e.g. minimum match length
improved backtracking method, but some results show an in-
checks, expected character checks, or auto-possessive rewrit-
teresting behaviour. All participating engines optimize sin-
ing: /a+b/ is automatically replaced by /a++b/. Although
gle character repetitions, so pattern 4 is matched faster than
backtracking elimination can dramatically improve the per-
pattern 6. However, only Irregexp recognizes that pattern
formance, it is outside the scope of this paper, and we need
4 and 5 are essentially the same, because the inner, non-
to avoid its side effects during the following comparison.
capturing bracket can be ignored. We noticed that Irregexp
The purpose of the next measurement is comparing the
is particularly efficient in recognizing such cases. Patterns
raw backtracking performance of three JIT accelerated en-
7 and 9 show the matching overhead of capturing brackets.
gines. Since PCRE-JIT is not an improvement of an exist-
This overhead ratio is worse for JIT accelerated engines than
ing approach, and no other engine uses an AST based code
interpreted ones, so it is recommended to avoid unnecessary
generator at the moment, we choose Irregexp and YARR
capturing brackets when JIT acceleration is used.
for this comparison. All of these engines aim for very high
Patterns 10 to 12 show the matching performance of al-
performance, so they can represent the efficiency of their
ternatives. Again, Irregexp is the only engine that notices
corresponding matching algorithm described in Section 4.1.
Average < 1.5 x as fast 1.5 - 4.0 x as fast 4.0 - 8.0 x as fast 8.0 - ∞ x as fast
Target speedup % of % of total % of % of total % of % of total % of % of total
CPU (x as fast) patterns runtime patterns runtime patterns runtime patterns runtime
x86/32 6.84 24.53% 0.79% 63.49% 3.37% 7.46% 40.65% 4.51% 55.18%
x86/64 5.55 42.89% 1.36% 46.03% 5.67% 11.09% 92.97% 0.00% 0.00%
ARM-V7/32 6.76 65.85% 2.14% 21.00% 2.27% 6.87% 19.25% 6.28% 76.34%
ARM-THUMB2/32 7.24 65.85% 2.01% 22.87% 2.66% 4.42% 13.14% 6.87% 82.19%
PowerPC/32 5.48 72.82% 2.22% 17.08% 6.47% 10.11% 91.30% 0.00% 0.00%
PowerPC/64 5.55 73.01% 2.20% 16.49% 5.34% 10.50% 92.46% 0.00% 0.00%
SPARC/32 5.85 66.44% 2.04% 21.98% 3.17% 11.29% 92.08% 0.29% 2.71%
MIPS/32 7.59 65.95% 1.77% 18.55% 1.91% 8.44% 12.88% 7.07% 83.44%
Average 6.36 59.67% 1.82% 28.43% 3.86% 8.77% 56.84% 3.13% 37.48%
Std. dev. 0.79 15.92% 0.47% 15.95% 1.61% 2.26% 36.26% 3.14% 37.68%
Table 3: Matching performance improvement provided by the JIT compiler on all CPU architectures sup-
ported by SLJIT (runtime refers to interpreted runtime).
Table 4: Matching performance improvement provided by the JIT compiler on all CPU architectures sup-
ported by SLJIT (runtime refers to interpreted runtime).
that (?:a|a) is the same as a, which greatly improves its of the patterns are not accelerated at all; but these patterns
matching performance in case of pattern 10. However, if only take 2% of the total interpreted runtime! On the other
we tweak the pattern a little by adding another a to the hand, around 10% of the patterns take about 90% of the
first alternative, this optimization cannot be used anymore, total runtime, and the PCRE-JIT generated machine code
and PCRE-JIT becomes faster. YARR JIT supports these runs 4+ times faster for these patterns. In other words, the
patterns, and has similar speed to Irregexp. JIT compiler helps where help is most needed, and efficiently
The conclusion of this measurement is that each engine accelerates the matching performance of long running pat-
has its own strengths: YARR has better optimizations for terns.
character repetitions, Irregexp can generate an optimized The aim of our last measurement is to compare the match-
code path for several special cases, and PCRE-JIT has the ing performance of PCRE-JIT and Irregexp. Only those pat-
most efficient backtracking algorithm. Since these aproaches terns which produced the same matches are included in this
are independent, they can be used to further improve PCRE- measurement. The overall speedup was 1.63 times faster,
JIT in the future. but we observed large runtime differences again. Hence we
The following two measurements show the general perfor- created three groups: patterns with short (S) medium (M)
mance progression of PCRE-JIT on the Snort benchmark and long (L) runtime. A pattern belongs to a given group if
set, i.e. the results were affected by all optimizations, not both engines match it faster than the upper limits in the box
just static backtracking. The first measurement, which com- at the bottom of Figure 10. The pie chart in the left shows
pares the PCRE interpreter and PCRE-JIT on various CPU the proportion of each group. Not surprisingly, the group S
architectures, is shown in Table 4. To provide a fair com- is the largest, nearly 80% of the patterns belong here. How-
parison, the interpreter is optimized for speed by passing ever, the majority of the runtime is spent on matching the
-O3 to the GCC compiler. The second column contains the group L, as we can see in the middle chart. Similar to the
average speedup, which is about 5 to 7-fold. We noticed previous measurement, the strength of PCRE-JIT is match-
however that the gain varies greatly for different patterns, ing long running patterns: our engine was almost twice as
so we organized them into speedup classes. Each class con- fast as Irregexp in case of the group L. On the other hand,
tains those patterns whose speedup is between the minimum group S was matched 1.5 times slower by PCRE-JIT so there
and the maximum of the class. As we can see, nearly 60-70% is still room for improvement.
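The auto-possessive rewriting mentioned in this section (/a+b/ automatically replaced by /a++b/) is easiest to see in a toy matcher. The sketch below is illustrative only; the function name and backtrack accounting are our own, not PCRE's implementation. A greedy a+ gives back one consumed character per failed attempt at b, while a possessive a++ never gives anything back, so those backtracks are eliminated.

```python
def match_a_plus_b(subject, possessive=False):
    """Toy matcher for /a+b/ (or /a++b/) anchored at position 0.

    Returns (matched, backtracks). The greedy a+ first consumes as
    many 'a' characters as possible, then gives them back one by one
    until b matches; a possessive a++ never gives anything back.
    """
    i = 0
    while i < len(subject) and subject[i] == "a":
        i += 1                       # greedily consume the a's
    if i == 0:
        return (False, 0)            # a+ requires at least one 'a'
    backtracks = 0
    while True:
        if i < len(subject) and subject[i] == "b":
            return (True, backtracks)
        if possessive or i == 1:     # a++ keeps everything it consumed;
            return (False, backtracks)  # a+ cannot shrink below one 'a'
        i -= 1                       # give back one 'a' and retry b
        backtracks += 1
```

For a subject such as "aaaa" the greedy form performs three backtracks before failing, while the possessive form fails immediately; this is exactly the class of work that backtracking elimination removes and static backtracking accelerates.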
[Figure 10 charts, data recovered from the image: left pie chart, group proportions: S: 799 patterns (78.56%), M: 123 (12.09%), L: 95 (9.34%); middle chart, number of patterns and runtime share per group; right chart, relative speed of PCRE-JIT against Irregexp: 1.56x as slow for group S, 1.19x as fast for group M, 1.95x as fast for group L. Group limits: S: patterns with short runtime (0-20 ms), M: medium runtime (20-200 ms), L: long runtime (200-2000 ms).]

Figure 10: Comparison of PCRE-JIT and Irregexp on an x86-64 machine using the Snort pattern set.
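The pathological cases used in this section are pathological because the number of explored paths grows exponentially with the pattern and subject length. The toy counter below is a hypothetical sketch, not code from PCRE-JIT, Irregexp, or YARR: it matches n copies of (a|a) followed by b against the subject 'a' * n, which cannot match since the final b is missing, and counts the alternative branches a naive backtracking engine tries before giving up.

```python
def count_attempts(n):
    """Count alternative attempts of the toy pattern (a|a){n}b
    against 'a' * n with a naive backtracking matcher."""
    subject = "a" * n
    tries = 0

    def match(pos, groups_left):
        nonlocal tries
        if groups_left == 0:
            # all (a|a) groups matched; the pattern now requires b
            return pos < len(subject) and subject[pos] == "b"
        for alternative in ("a", "a"):   # two identical alternatives
            tries += 1
            if pos < len(subject) and subject[pos] == alternative:
                if match(pos + 1, groups_left - 1):
                    return True
        return False                     # backtrack to the caller

    match(0, n)
    return tries
```

Each additional (a|a) group doubles the work: count_attempts(n) returns 2**(n + 1) - 2, which is why even short pathological patterns can dominate a runtime comparison.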