Kernel HTTPS/TCP/IP stack
for HTTP DDoS mitigation
Alexander Krizhanovsky
Tempesta Technologies, Inc.
ak@tempesta-tech.com
Who am I?
CEO at Tempesta Technologies, Inc.
Developing Tempesta FW – open source Linux
Application Delivery Controller (ADC)
Custom software development in:
● high performance network traffic processing
e.g. WAF mentioned in Gartner magic quadrant
https://www.ptsecurity.com/ww-en/products/af/
● Databases
e.g. MariaDB SQL System-Versioned Tables
https://mariadb.com/kb/en/library/system-versioned-tables/
https://mariadb.com/conference/session/querying-data-previous-point-time
Problem: HTTP filtration
2013: WAF development by request of Positive Technologies
● Web attacks
● L7 HTTP/HTTPS DDoS attacks
Nginx, HAProxy, etc. - perfect HTTP accelerators, not HTTP filters
Netfilter works in TCP/IP stack (softirq) => HTTP(S)/TCP/IP stack
Tempesta FW: a hybrid of HTTP accelerator & firewall
Tempesta FW:
Application Delivery Controller (ADC)
Web Application Firewall (WAF) acceleration
WAFs are slow (Machine learning, DOM, regexps etc.)
Advanced load balancing among more powerful and slow WAFs
Simple & fast web attacks filtering
Some DDoS attacks can be normally serviced from fast web cache
Web-accelerators are slow
Slow & non-scalable network I/O (queues are bad for CPU caches)
Data copies & syscalls
Dummy HTTP FSMs
HTTP strings are special: LIBC functions don’t work well
Don’t care about the corner cases (good DDoS targets)
TLS data copies (even with kTLS & QUIC), no TCP awareness
Filesystem-based Web-cache (except ATS)
Sometimes request blocking is slower than serving it :)
Application layer DDoS
         Service from cache   Rate limit
Nginx    22us                 23us
(Additional logic in limiting module)
Fail2Ban: write to the log, parse the log, write to the log, parse the
log… - really in the 21st century?!
A tight integration of a Web accelerator and a firewall is needed
Other DDoS filters: firewall & NIDS
IPtables strings, BPF, XDP, NIC filters
● HTTP headers can cross packet bounds
● Scan large URI or Cookie for Host value?
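The first point can be made concrete with a small sketch. It contrasts a stateless per-packet search with a matcher that keeps its position across chunks of the same TCP stream. All names here (`packet_contains`, `stream_matcher`, `sm_init`, `sm_feed`) are hypothetical, and the Knuth-Morris-Pratt matcher is a generic technique chosen for illustration, not something taken from iptables or Tempesta FW.

```c
#include <stddef.h>
#include <string.h>

#define PAT_MAX 64

/* Stateless search inside one packet payload only. */
static int packet_contains(const char *pkt, size_t len, const char *pat)
{
    size_t plen = strlen(pat);

    if (plen == 0)
        return 1;
    for (size_t i = 0; i + plen <= len; i++)
        if (memcmp(pkt + i, pat, plen) == 0)
            return 1;
    return 0;
}

/* Stream matcher: remembers how much of the pattern has matched so far. */
struct stream_matcher {
    char   pat[PAT_MAX];
    size_t fail[PAT_MAX];   /* KMP failure function */
    size_t plen;
    size_t matched;         /* pattern bytes matched at the current position */
};

static int sm_init(struct stream_matcher *m, const char *pat)
{
    size_t n = strlen(pat);

    if (n == 0 || n > PAT_MAX)
        return -1;
    memcpy(m->pat, pat, n);
    m->plen = n;
    m->matched = 0;
    m->fail[0] = 0;
    for (size_t i = 1, k = 0; i < n; i++) {
        while (k > 0 && pat[i] != pat[k])
            k = m->fail[k - 1];
        if (pat[i] == pat[k])
            k++;
        m->fail[i] = k;
    }
    return 0;
}

/* Feed the next chunk of the stream; returns 1 once the pattern was seen. */
static int sm_feed(struct stream_matcher *m, const char *data, size_t len)
{
    if (m->matched == m->plen)
        return 1;           /* already matched in an earlier chunk */
    for (size_t i = 0; i < len; i++) {
        while (m->matched > 0 && data[i] != m->pat[m->matched])
            m->matched = m->fail[m->matched - 1];
        if (data[i] == m->pat[m->matched])
            m->matched++;
        if (m->matched == m->plen)
            return 1;
    }
    return 0;
}
```

If a request arrives as `"GET / HTTP/1.1\r\nHo"` followed by `"st: evil.example\r\n\r\n"`, `packet_contains()` finds `"Host: evil"` in neither packet, while `sm_feed()` called on both chunks in order reports the match. Keeping such per-connection state is only possible for a component that follows the TCP stream.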
NIPS (e.g. Suricata)
● powerful rules syntax at L3-L7
● Not a TCP end point => evasions are possible
● TLS terminator is required (data copies & context switches)
or double TLS processing
● Double HTTP parsing
● Doesn’t improve Web server performance
(mitigation != prevention)
Web-accelerators are slow: SSL/TLS copying
User-kernel space copying
● Copy network data to user space
● Encrypt/decrypt it
● Copy the data to kernel for transmission
Kernel-mode TLS (Linux kTLS)
● Modern kTLS eliminates ingress & egress data copies
● Unaware of TCP transmission state (cwnd & rwnd)
● Doesn’t use SIMD for memcpy() & memset()
● TLS 1.3 is good, but it’s profitable for DDoS bots to be legacy clients
● TLS handshake is still an issue
Linux kernel TLS & DDoS
Most Facebook users have established sessions
TLS handshake is still an issue
● TLS 1.3 has 1-RTT handshake
● TLS 1.2 must live for a long time for legacy clients
https://www.netdevconf.org/0x12/session.html?kernel-tls-handshakes-for-https-ddos-mitigation
9.11% libcrypto.so.1.1 [.] __ecp_nistz256_mul_montx
7.80% libc-2.24.so [.] _int_malloc
7.03% libcrypto.so.1.1 [.] __ecp_nistz256_sqr_montx
3.54% libcrypto.so.1.1 [.] sha512_block_data_order_avx2
3.05% libcrypto.so.1.1 [.] BN_div
2.43% libc-2.24.so [.] _int_free
1.89% libcrypto.so.1.1 [.] OPENSSL_cleanse
1.61% libc-2.24.so [.] malloc_consolidate
1.49% libcrypto.so.1.1 [.] ecp_nistz256_avx2_gather_w7
1.41% libc-2.24.so [.] malloc
1.24% libcrypto.so.1.1 [.] ecp_nistz256_point_doublex
1.20% libcrypto.so.1.1 [.] ecp_nistz256_ord_sqr_montx
1.01% libcrypto.so.1.1 [.] __ecp_nistz256_sub_fromx
1.00% libcrypto.so.1.1 [.] BN_lshift
0.87% libcrypto.so.1.1 [.] BN_num_bits_word
0.86% libcrypto.so.1.1 [.] bn_correct_top
0.84% libcrypto.so.1.1 [.] BN_CTX_get
0.81% libc-2.24.so [.] __memset_avx2_unaligned_erms
0.77% libc-2.24.so [.] free
0.74% libcrypto.so.1.1 [.] __ecp_nistz256_mul_by_2x
0.71% libcrypto.so.1.1 [.] BN_rshift
0.59% libcrypto.so.1.1 [.] BN_uadd
0.59% libcrypto.so.1.1 [.] int_bn_mod_inverse
0.54% libc-2.24.so [.] __memmove_avx_unaligned_erms
0.53% libcrypto.so.1.1 [.] aesni_ecb_encrypt
Web-accelerators are slow: profile
% symbol name
1.5719 ngx_http_parse_header_line
1.0303 ngx_vslprintf
0.6401 memcpy
0.5807 recv
0.5156 ngx_linux_sendfile_chain
0.4990 ngx_http_limit_req_handler
=> flat profile
Web-accelerators are slow: syscalls
epoll_wait(.., {{EPOLLIN, ....}},...)
recvfrom(3, "GET / HTTP/1.1\r\nHost:...", ...)
write(1, "...limiting requests, excess...", ...)
writev(3, "HTTP/1.1 503 Service...", ...)
sendfile(3,..., 383)
recvfrom(3, ...) = -1 EAGAIN
epoll_wait(.., {{EPOLLIN, ....}}, ...)
recvfrom(3, "", 1024, 0, NULL, NULL) = 0
close(3)
Web-accelerators are slow: HTTP parser
Start: state = 1, *str_ptr = 'b'
while (++str_ptr) {
    switch (state) {            <= check state
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
        case 'b':
            ...
            state = 2;
        }
    case 2:
        ...
    }
    ...
}
Web-accelerators are slow: HTTP parser
Start: state = 1, *str_ptr = 'b'
while (++str_ptr) {
    switch (state) {
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
        case 'b':
            ...
            state = 2;          <= set state
        }
    case 2:
        ...
    }
    ...
}
Web-accelerators are slow: HTTP parser
Start: state = 1, *str_ptr = 'b'
while (++str_ptr) {
    switch (state) {
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
        case 'b':
            ...
            state = 2;
        }
    case 2:
        ...
    }
    ...                         <= jump to while
}
Web-accelerators are slow: HTTP parser
Start: state = 1, *str_ptr = 'b'
while (++str_ptr) {
    switch (state) {            <= check state
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
        case 'b':
            ...
            state = 2;
        }
    case 2:
        ...
    }
    ...
}
Web-accelerators are slow: HTTP parser
Start: state = 1, *str_ptr = 'b'
while (++str_ptr) {
    switch (state) {
    case 1:
        switch (*str_ptr) {
        case 'a':
            ...
            state = 1;
        case 'b':
            ...
            state = 2;
        }
    case 2:
        ...             <= do something
    }
    ...
}
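The frames above show the per-byte cost of a switch-driven FSM: the state is stored, the loop restarts, and the state is tested again before any work is done. A common alternative is to make every state a label and every transition a direct `goto`, so the state variable is never re-tested. The sketch below applies that idea to a toy `Name: value\r\n` header line. The function name, the `struct hdr` layout and the simplified grammar are assumptions made for illustration, not the Tempesta FW parser.

```c
#include <stddef.h>

struct hdr {
    size_t name_len;    /* length of the field name */
    size_t val_off;     /* offset of the first value byte */
    size_t val_len;     /* length of the value, without CRLF */
};

/* Returns 0 on a complete well-formed line, -1 otherwise. */
static int parse_header_line(const char *p, size_t len, struct hdr *h)
{
    size_t i = 0;
    unsigned char c;

name:                               /* state: inside the field name */
    if (i == len)
        return -1;
    c = (unsigned char)p[i];
    if (c == ':') {
        if (i == 0)
            return -1;              /* empty name */
        h->name_len = i++;
        goto ows;                   /* direct transition, no state variable */
    }
    if (c <= ' ' || c >= 127)
        return -1;                  /* controls, space, non-ASCII */
    i++;
    goto name;

ows:                                /* state: optional whitespace */
    if (i == len)
        return -1;
    if (p[i] == ' ' || p[i] == '\t') {
        i++;
        goto ows;
    }
    h->val_off = i;

value:                              /* state: inside the field value */
    if (i == len)
        return -1;
    if (p[i] == '\r') {
        h->val_len = i - h->val_off;
        i++;
        goto lf;
    }
    if (p[i] == '\n')
        return -1;                  /* bare LF is rejected */
    i++;
    goto value;

lf:                                 /* state: expect LF after CR */
    if (i == len || p[i] != '\n')
        return -1;
    return 0;
}
```

For the input `"Host:  example.com\r\n"` the function reports a 4-byte name and an 11-byte value starting at offset 7. A production parser would also have to save its state when the data ends in the middle of a line, which this sketch simply treats as an error.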
Web-accelerators are slow: HTTP parser
Web-accelerators are slow: strings
We have AVX2, but GLIBC still doesn’t use it
HTTP strings are special:
● No ‘\0’-termination (if you’re zero-copy)
● Special delimiters (‘:’ or CRLF)
● strcasecmp(): no need for case conversion of one of the strings
● strspn(): limited number of accepted alphabets
switch()-driven FSM is even worse
Fast & secure HTTP parser
http://natsys-lab.blogspot.ru/2014/11/the-fast-finite-state-machine-for-http.html
● 1.6-1.8 times faster than Nginx’s
HTTP optimized AVX2 strings processing:
http://natsys-lab.blogspot.ru/2016/10/http-strings-processing-using-c-sse42.html
● injection attack prevention: strict sets of allowed characters
● strncasecmp() ~x3 faster than GLIBC’s
● URI matching ~x6 faster than GLIBC’s strspn()
● kernel_fpu_begin()/kernel_fpu_end() for whole softirq shot
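The two string observations can be illustrated in scalar C. The sketch assumes length-delimited input (no terminating NUL), a precomputed 256-bit set of allowed bytes in place of the `strspn()` accept string, and a comparison where the constant side is already lowercase so only the input is case-folded. The names `charset`, `span_allowed` and `eq_nocase_lower` are hypothetical, and this is a plain byte-at-a-time stand-in, not the AVX2 code from the linked article.

```c
#include <stddef.h>
#include <stdint.h>

/* 256-bit set: bit c is set if byte value c is allowed. */
struct charset {
    uint64_t bits[4];
};

static void charset_add(struct charset *cs, const char *chars)
{
    for (; *chars; chars++) {
        unsigned char c = (unsigned char)*chars;
        cs->bits[c >> 6] |= (uint64_t)1 << (c & 63);
    }
}

/* Length of the leading run of allowed bytes in s[0..len). */
static size_t span_allowed(const struct charset *cs, const char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = (unsigned char)s[i];
        if (!((cs->bits[c >> 6] >> (c & 63)) & 1))
            break;
        i++;
    }
    return i;
}

/*
 * Case-insensitive equality against a constant that is known to be
 * lowercase: only the input byte is folded, the constant is used as is.
 */
static int eq_nocase_lower(const char *in, size_t in_len,
                           const char *lower, size_t lower_len)
{
    if (in_len != lower_len)
        return 0;
    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c >= 'A' && c <= 'Z')
            c |= 0x20;
        if (c != (unsigned char)lower[i])
            return 0;
    }
    return 1;
}
```

With the set `a-z0-9/._-`, `span_allowed()` stops at the `?` in `/index.html?id=1'--` and returns 11, so anything after the allowed prefix can be rejected or inspected further. The table is built once, which is the point of having a limited number of accepted alphabets.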
Web-accelerators are slow: async I/O
Web cache also resides in CPU caches and evicts requests
Web cache: TempestaDB
In-memory database for Web-cache and firewall rules
Cache conscious Burst Hash Trie
● short offsets instead of pointers
● (almost) lock-free
Lock-free block allocator on huge pages for virtually contiguous memory
https://www.percona.com/live/data-performance-conference-2016/sessions/linux-kernel-extension-databases
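"Short offsets instead of pointers" can be sketched as follows: records live in one contiguous region and link to each other with 32-bit offsets from the region base. On a 64-bit machine this halves the link size, and the links stay valid wherever the region is mapped. The arena, the bump allocator and the list below are hypothetical and single-threaded. They only illustrate the addressing scheme, not TempestaDB's trie layout or its lock-free allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct arena {
    unsigned char *base;    /* start of the contiguous region */
    uint32_t size;
    uint32_t used;
};

struct node {
    uint32_t next;          /* offset of the next node, 0 means "none" */
    uint32_t key;
};

static int arena_init(struct arena *a, uint32_t size)
{
    a->base = malloc(size);
    if (!a->base)
        return -1;
    a->size = size;
    a->used = 8;            /* offset 0 is reserved as the null link */
    return 0;
}

/* Bump allocation in 8-byte units; returns 0 when the region is full. */
static uint32_t arena_alloc(struct arena *a, uint32_t size)
{
    uint32_t off = a->used;

    size = (size + 7u) & ~7u;
    if (size > a->size - off)
        return 0;
    a->used = off + size;
    return off;
}

static struct node *node_at(const struct arena *a, uint32_t off)
{
    return off ? (struct node *)(a->base + off) : NULL;
}

/* Push a key at the list head; returns the new head offset. */
static uint32_t list_push(struct arena *a, uint32_t head, uint32_t key)
{
    uint32_t off = arena_alloc(a, sizeof(struct node));
    struct node *n = node_at(a, off);

    if (!n)
        return head;        /* out of space: list is left unchanged */
    n->key = key;
    n->next = head;
    return off;
}

static int list_contains(const struct arena *a, uint32_t head, uint32_t key)
{
    for (struct node *n = node_at(a, head); n; n = node_at(a, n->next))
        if (n->key == key)
            return 1;
    return 0;
}
```

Copying the used part of one arena into another with `memcpy()` leaves every list intact, because no absolute address is stored anywhere in the data. Each node takes 8 bytes instead of the 16 that a pointer-linked version would need on a 64-bit platform.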
The HTTPS/TCP/IP stack
(A crossbreed of an HTTP accelerator and a firewall)
Alternative to user space TCP/IP stacks
HTTPS is built into TCP/IP stack
● HTTP pipelining even for HTTP/1.1
Kernel TLS handshakes (fork from mbedTLS)
HTTP/L7 firewall in addition to nftables and BPF
● TCP & TLS end point (vs. NIPS such as Suricata)
Very fast HTTP parser and strings processing using AVX2
Cache-conscious in-memory Web-cache for DDoS mitigation
TODO: HTTP QoS for asymmetric DDoS mitigation
L7 DDoS mitigation: sticky cookie
User/session identification
● Cookie challenge for dumb DDoS bots
● Persistent/sessions scheduling (no rescheduling on a server failure)
timestamp | HMAC(Secret, User-Agent, timestamp, Client IP)
enforce: HTTP 302 redirect
sticky name=__tfw_user_id__ enforce;
L7 DDoS mitigation: JavaScript challenge
Effectively slows bots down
L7 DDoS mitigation: limits
Rate limits
● request_rate, request_burst
● connection_rate, connection_burst
● concurrent_connections
● http_resp_code_block – blocks password crackers
Slow HTTP
● client_header_timeout, client_body_timeout
● http_header_cnt
● http_header_chunk_cnt, http_body_chunk_cnt
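How a rate and a burst parameter interact can be sketched with a generic token bucket. This is an illustration only: the exact semantics of `request_rate` and `request_burst` are defined by the Tempesta FW documentation, and the structure and function names below (`token_bucket`, `tb_init`, `tb_allow`) are hypothetical. The sketch assumes a monotonic millisecond clock supplied by the caller.

```c
#include <stdint.h>

struct token_bucket {
    uint64_t rate;          /* tokens added per second */
    uint64_t burst;         /* bucket capacity in tokens */
    uint64_t tokens_milli;  /* current fill, in 1/1000 of a token */
    uint64_t last_ms;       /* time of the previous call */
};

static void tb_init(struct token_bucket *tb, uint64_t rate, uint64_t burst,
                    uint64_t now_ms)
{
    tb->rate = rate;
    tb->burst = burst;
    tb->tokens_milli = burst * 1000;    /* start with a full bucket */
    tb->last_ms = now_ms;
}

/* Returns 1 if the request is within the limit, 0 if it must be rejected. */
static int tb_allow(struct token_bucket *tb, uint64_t now_ms)
{
    uint64_t cap = tb->burst * 1000;
    uint64_t elapsed = now_ms - tb->last_ms;
    uint64_t refill;

    /* rate tokens/s over elapsed ms equals rate * elapsed milli-tokens. */
    if (tb->rate && elapsed > cap / tb->rate)
        refill = cap;                   /* also avoids overflow */
    else
        refill = elapsed * tb->rate;

    tb->last_ms = now_ms;
    if (refill > cap - tb->tokens_milli)
        tb->tokens_milli = cap;
    else
        tb->tokens_milli += refill;

    if (tb->tokens_milli < 1000)
        return 0;
    tb->tokens_milli -= 1000;           /* spend one token per request */
    return 1;
}
```

With `rate = 10` and `burst = 2`, two back-to-back requests pass, a third at the same instant is rejected, and the next request is accepted once 100 ms have elapsed. One such bucket would be kept per client, which is why the limiter needs cheap access to per-client state.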
Web Application Security (WAF acceleration)
Length limits: http_uri_len, http_field_len, http_body_len
Content validation: http_host_required, http_ct_required,
http_ct_vals, http_methods
HTTP Response Splitting: count and match requests and responses
Injections: verify allowed (by an administrator) character sets
● Resistant to large HTTP fields (AVX2)
https://natsys-lab.blogspot.ru/2016/10/http-strings-processing-using-c-sse42.html
TODO: decoding before character sets validation
HTTP tables
HTTP load balancer and a firewall (~nftables)
mark-integration with nftables
# nft add rule inet filter input ip saddr 192.168.100.1 mark set 1
# cat etc/tempesta_fw.conf
srv_group backend { server 127.0.0.1:8080; }
vhost protected_host { proxy_pass backend; }
http_chain multi_layer_rules {
    hdr "Referer" == "badhost.com/*" -> block;
    -> protected_host; # all checks are passed
}
http_chain {
    mark == 1 -> multi_layer_rules;
    -> protected_host; # pass all by default
}
Performance
https://github.com/tempesta-tech/tempesta/wiki/HTTP-cache-performance
Most HTTP floods can be mitigated w/o any special filtering!
Performance analysis: comparison w/ Nginx
[Figure: requests per second (rps, 0 to 2.5x10^6) vs. number of connections (1 to 10000)]
Tempesta FW vs Nginx; E5-1650v3; HTTP/1.1, 8B response, keep-alive
Series: Nginx 1.11.5, Tempesta FW 0.5.0-pre5
Performance analysis: kernel bypass
Similar to DPDK/user-space TCP/IP stacks
http://www.seastar-project.org/http-performance/
...bypassing Linux TCP/IP isn’t the only way to get a fast Web server
...lives in Linux infrastructure:
LVS, tc, IPtables, eBPF, tcpdump etc.
User space HTTP proxying
1. Receive request at CPU1
2. Copy request to user space
3. Update headers
4. Copy request to kernel space
5. Send the request from CPU2
3 data copies
Access TCP control blocks and data buffers from different CPUs
Synchronous sockets: HTTPS/TCP/IP stack
Socket callbacks call TLS and HTTP processing
Everything is processed in softirq (while the data is hot)
No receive & accept queues
No file descriptors
Less locking
Lock-free inter-CPU transport
=> faster socket reading
=> lower latency
skb page allocator:
zero-copy HTTP messages adjustment
Add/remove/update HTTP headers w/o copies
skb and its head are allocated in the same page fragment or
a compound page
Beta (exp. early 2019)
We’re in alpha (0.5.x)
Beta (1.0, exp. early 2019)
● Tempesta TLS (GPU offload - TBD)
https://www.netdevconf.org/0x12/session.html?kernel-tls-handshakes-for-https-ddos-mitigation
● TLS 1.3
● HTTP/2
● Tunable HTTP proxy buffering & streaming (like Tengine)
● HTTP QoS for asymmetric DDoS mitigation (some ML)
● HTTP URI/Cookie/POST normalization
(protection against injection attacks)
Thanks!
Web-site: http://tempesta-tech.com
Availability: https://github.com/tempesta-tech/tempesta
Blog: http://natsys-lab.blogspot.com
E-mail: ak@tempesta-tech.com

Linux HTTPS/TCP/IP Stack for the Fast and Secure Web
