ClickHouse at Cloudflare
Marek Vavrusa
blog.cloudflare.com/how-cloudflare-analyzes-1m-dns-queries-per-second
Check it
- 100+ data centers globally
- 2.5B monthly unique visitors
- 10% of Internet requests every day
- 1.3M+ DNS queries/second
- 6M+ websites, apps & APIs in 150 countries
- 5.5M+ HTTP requests/second
What did we want?
- Multidimensional query analytics
- Complex ad-hoc queries
- Capable of current and expected future scale
- Gracefully handle late arriving log data
- Roll-ups/aggregations for long term storage
- Highly available and replicated architecture
- O(1M) inserted rows/second
- 100+ edge points of presence
- 20+ query dimensions
- 5+ years of stored aggregation
We tried a few things...
- Kafka + Go + Citus
- Kafka + Spark Streaming
- Kafka + Flink
- Kafka + Druid
- Kafka + ClickHouse
ClickHouse
- Tabular, column-oriented data store
- Single binary, clustered architecture
- Familiar SQL query interface, with lots of very useful built-in aggregation functions
- Raw log data stored for 3 months (~7 trillion rows)
- Aggregated data for ∞ (1m, 1h aggregations across 3 dimensions)
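As a rough sanity check, the retention figure can be compared with the insert rate quoted earlier in the deck. The sketch below assumes "3 months" means 90 days and treats "~7 trillion" as exactly 7e12; both are simplifications, not numbers from the slides.

```python
# Back-of-the-envelope check: does ~7 trillion rows over 3 months
# line up with an insert rate on the order of 1M rows/second?
SECONDS_PER_DAY = 24 * 60 * 60
RETENTION_DAYS = 90   # assumption: "3 months" taken as 90 days
STORED_ROWS = 7e12    # "~7 trillion rows" from the slide, taken at face value


def average_rows_per_second(rows: float, days: int) -> float:
    """Average insert rate needed to accumulate `rows` over `days`."""
    return rows / (days * SECONDS_PER_DAY)


rate = average_rows_per_second(STORED_ROWS, RETENTION_DAYS)
print(f"{rate:,.0f} rows/second")  # about 0.9 million rows/second
```

Under these assumptions the average comes out just under one million rows per second, which is consistent with the O(1M) inserted rows/second target.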
[Diagram: attackers, visitors, crawlers & bots; your website; Cloudflare DNS server → DNS query log forwarder → Kafka topic → Go ClickHouse inserter → ClickHouse cluster]
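The slides do not show how the inserter between Kafka and ClickHouse works internally. ClickHouse generally favours fewer, larger inserts over many tiny ones, so one plausible shape is a buffer that flushes rows in batches. The sketch below is in Python for brevity (the real inserter is written in Go); the class name, `batch_size`, and the `flush` callback are all hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class BatchInserter:
    """Buffers rows and hands them to `flush` in batches (hypothetical design)."""

    def __init__(self, flush: Callable[[List[Row]], None], batch_size: int) -> None:
        self._flush_fn = flush          # e.g. a function issuing one bulk INSERT
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row; flush automatically once the batch is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, if anything, and start a new batch."""
        if self._buffer:
            self._flush_fn(self._buffer)
            self._buffer = []


# Stand-in for the database: just collect the batches that would be inserted.
batches: List[List[Row]] = []
inserter = BatchInserter(flush=batches.append, batch_size=3)
for i in range(7):
    inserter.add({"queryName": f"host{i}.example.com", "queryType": 1})
inserter.flush()  # flush the final partial batch

print([len(b) for b in batches])  # [3, 3, 1]
```

A production inserter would also flush on a timer so that rows are not held back during quiet periods; that is left out here to keep the sketch short.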
Speeding up typical queries
- SUM() / COUNT() over a few low-cardinality dimensions
- Global overview (trends, monitoring)
- Storing intermediate state for non-additive functions
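The last bullet is easiest to see with a small example. Counts are additive, so per-minute counts can simply be summed into an hourly figure. Unique counts are not: the same client may appear in several minutes. Keeping the intermediate state (here an exact set; ClickHouse's `uniq` keeps an approximate sketch instead) lets the roll-up be merged correctly. The sample IPs below are made up.

```python
# Source IPs seen in two consecutive one-minute buckets (made-up sample data).
minute_1 = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]
minute_2 = ["10.0.0.2", "10.0.0.3"]

# COUNT() is additive: per-minute results can be summed directly.
total_queries = len(minute_1) + len(minute_2)

# A unique count is not additive: summing per-minute results double-counts
# 10.0.0.2, which appears in both minutes.
state_1, state_2 = set(minute_1), set(minute_2)
summed_uniques = len(state_1) + len(state_2)

# Storing the intermediate state and merging it gives the right answer.
merged_uniques = len(state_1 | state_2)

print(total_queries, summed_uniques, merged_uniques)  # 5 4 3
```

In ClickHouse terms this is the idea behind storing aggregate function states and merging them at query time, rather than storing only the finished numbers.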
Anatomy of a DNS query
$ dig www.cloudflare.com
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36582
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;www.cloudflare.com. IN A
;; ANSWER SECTION:
www.cloudflare.com. 5 IN A 198.41.215.162
www.cloudflare.com. 5 IN A 198.41.214.162
;; Query time: 34 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Sat Sep 2 10:48:30 2017
;; MSG SIZE rcvd: 68
Fields: 30+
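To make the idea of "fields" concrete, here is a minimal sketch that turns part of the dig output above into a flat record. The slides do not list the actual log schema: `queryName` and `queryType` are borrowed from the example queries later in the deck, and every other key name is hypothetical.

```python
import re

# Subset of the dig output shown on the slide.
DIG_OUTPUT = """\
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36582
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;www.cloudflare.com. IN A
www.cloudflare.com. 5 IN A 198.41.215.162
www.cloudflare.com. 5 IN A 198.41.214.162
;; MSG SIZE rcvd: 68
"""


def parse_dig(text: str) -> dict:
    """Extract a few fields from dig output into one flat record."""
    record = {"answers": []}
    for line in text.splitlines():
        if m := re.match(r";; ->>HEADER<<- opcode: (\w+), status: (\w+), id: (\d+)", line):
            record["opcode"] = m.group(1)
            record["responseCode"] = m.group(2)   # hypothetical field name
            record["id"] = int(m.group(3))
        elif m := re.match(r";; flags: ([\w ]+);", line):
            record["flags"] = m.group(1).split()
        elif m := re.match(r";; MSG SIZE rcvd: (\d+)", line):
            record["responseSize"] = int(m.group(1))  # hypothetical field name
        elif line.startswith(";") and not line.startswith(";;"):
            # Question section entry: ";name class type"
            name, _qclass, qtype = line[1:].split()
            record["queryName"], record["queryType"] = name, qtype
        elif line and not line.startswith(";"):
            # Answer section entry: "name ttl class type rdata"
            _name, _ttl, _rclass, _rtype, rdata = line.split()
            record["answers"].append(rdata)
    return record


record = parse_dig(DIG_OUTPUT)
print(record["queryName"], record["responseCode"], record["answers"])
```

Even this small subset yields eight fields from a single query, so a log schema with 30+ fields is easy to imagine once timestamps, source addresses, location, and similar metadata are added.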
Choosing the primary key
- Timestamp, zone
- Zone, timestamp
- Zone, timestamp, location
- Zone, toStartOfHour(timestamp), name, location
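The candidates above differ in which column leads the sort order. A MergeTree table keeps rows sorted by the primary key, so a query that filters on the leading columns only has to read one contiguous range. The sketch below imitates that with a sorted list and binary search; it is a simplified illustration, not how ClickHouse's sparse index is implemented, and the rows and the integer `hour` column are made up.

```python
from bisect import bisect_left, bisect_right

# (zone, hour, name, location): hypothetical rows in arrival order.
rows = [
    ("example.com", 21, "www.example.com", "SJC"),
    ("example.org", 21, "example.org", "AMS"),
    ("example.com", 20, "api.example.com", "AMS"),
    ("example.net", 21, "example.net", "SJC"),
    ("example.com", 21, "api.example.com", "AMS"),
]

# Storing the data sorted by the key clusters each zone's rows together,
# and within a zone, each hour's rows together.
rows.sort()


def scan(zone: str, hour: int):
    """Return the contiguous slice of rows matching the (zone, hour) prefix."""
    prefixes = [(r[0], r[1]) for r in rows]
    lo = bisect_left(prefixes, (zone, hour))
    hi = bisect_right(prefixes, (zone, hour))
    return lo, hi, rows[lo:hi]


lo, hi, matched = scan("example.com", 21)
print(lo, hi)   # 1 3
for row in matched:
    print(row)
```

With timestamp as the leading column instead, one zone's rows would be scattered across the whole time range, and a per-zone customer query would have to read data belonging to every other zone in that period.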
October 2016
Began evaluating technologies and architecture, 1 instance in Docker
Finalized schema, deployed a production ClickHouse cluster of 6 nodes
November 2016
Prototype ClickHouse cluster with 3 nodes, inserting a sample of data
August 2017
Migrated to a new cluster with multi-tenancy
Growing interest among other Cloudflare engineering teams, worked on standard tooling
December 2016
ClickHouse visualisations with Superset and Grafana
Spring 2017
TopN, IP prefix matching, Go native driver, Analytics library, pkey in monotonic functions
Multi-tenant ClickHouse cluster
- 8M+ row insertions/s
- 2PB+ on RAID-0 spinning disks
- 4GB+ insertion throughput/s
- 33 nodes
Example
SELECT toStartOfMinute(datetime) as t,
count() / 60 AS qps
FROM open.dnslogs
WHERE
date = '2017-08-01'
AND toHour(datetime) = 21
AND ...
GROUP BY t
ORDER BY t
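For readers less familiar with ClickHouse SQL, the grouping in this query can be mirrored in a few lines of Python. This is only an illustration of the logic on a made-up in-memory sample, not a way to query ClickHouse; `to_start_of_minute` is a hypothetical stand-in for `toStartOfMinute`.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def to_start_of_minute(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its minute."""
    return ts.replace(second=0, microsecond=0)


def qps_per_minute(timestamps: Iterable[datetime]) -> List[Tuple[datetime, float]]:
    """GROUP BY minute, count() / 60 AS qps, ORDER BY minute."""
    counts = Counter(to_start_of_minute(ts) for ts in timestamps)
    return [(minute, n / 60) for minute, n in sorted(counts.items())]


# Made-up sample: two queries in 21:00, one in 21:01.
sample = [
    datetime(2017, 8, 1, 21, 0, 5),
    datetime(2017, 8, 1, 21, 1, 10),
    datetime(2017, 8, 1, 21, 0, 40),
]
result = qps_per_minute(sample)
for minute, qps in result:
    print(minute, round(qps, 4))
```

Dividing the per-minute count by 60 gives the average queries per second within that minute, which is why the SQL uses `count() / 60`.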
Example
SELECT toStartOfMinute(datetime) as t,
count() / 60 AS qps,
uniq(srcIPv4) AS ip4,
uniq(srcIPv6) AS ip6,
uniq(queryName) AS qn,
countIf(queryType = 1) AS aCount,
countIf(queryType = 28) AS aaaaCount
FROM open.dnslogs
WHERE
date = '2017-08-01'
AND ...
GROUP BY t
ORDER BY t
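The extra columns in this query combine distinct counts with conditional counts: `queryType = 1` is the DNS record type A and `queryType = 28` is AAAA. The sketch below reproduces that logic on made-up rows. It uses exact sets, whereas ClickHouse's `uniq` is an approximate distinct count, and the string `minute` column is a simplification.

```python
from collections import defaultdict

# Made-up log rows; queryType 1 = A, 28 = AAAA.
rows = [
    {"minute": "21:00", "queryName": "www.example.com", "queryType": 1},
    {"minute": "21:00", "queryName": "www.example.com", "queryType": 28},
    {"minute": "21:00", "queryName": "api.example.com", "queryType": 1},
    {"minute": "21:01", "queryName": "www.example.com", "queryType": 1},
]


def aggregate(log_rows):
    """Per-minute distinct query names plus A and AAAA counts."""
    groups = defaultdict(lambda: {"names": set(), "aCount": 0, "aaaaCount": 0})
    for row in log_rows:
        group = groups[row["minute"]]
        group["names"].add(row["queryName"])           # uniq(queryName)
        group["aCount"] += row["queryType"] == 1       # countIf(queryType = 1)
        group["aaaaCount"] += row["queryType"] == 28   # countIf(queryType = 28)
    return {
        minute: {
            "qn": len(group["names"]),
            "aCount": group["aCount"],
            "aaaaCount": group["aaaaCount"],
        }
        for minute, group in sorted(groups.items())
    }


summary = aggregate(rows)
print(summary)
```

All aggregates are computed in a single pass over the same rows, which is also how the SQL version gets several metrics out of one scan.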
Powerful customer analytics
FILTER BY DNS QUERY NAME
What we’re working on
- Go native driver (github.com/kshvakov/clickhouse)
- Grafana plugin (though the Vertamedia one looks nice)
- Kafka → ClickHouse inserter
- ClickHouse → API scaffolding
- ClickHouse: top K, IP trie dictionary, pkey optimisations, “pipelines”
[Diagram: attackers, visitors, crawlers & bots; your website; Cloudflare DNS server → DNS query log forwarder → Kafka topic → ClickHouse cluster]
Thanks!
https://blog.cloudflare.com/how-cloudflare-analyzes-1m-dns-queries-per-second

