Monitoring with Prometheus
By Kasper Nissen
@phennex
Hi!
My name is Kasper
What am I going to cover?
- Monitoring - why and what?
- Prometheus - an introduction
- Short demo
DEMO Part 1
https://github.com/kaspernissen/automation_night_demo
Why monitor?
- Analyzing long-term trends
- Comparing over time or experiment groups
- Alerting
- Building dashboards
- Conducting ad hoc retrospective analysis
Purpose: What is broken, and why?
What to monitor?
Hosts
CPU, Memory, I/O, Network, Filesystem
Containers
CPU, Memory, I/O, Restarts, Throttling
Applications
Throughput, Latency
The Four Golden Signals
Source: Site Reliability Engineering - How Google Runs Production Systems
Latency
The time it takes to service a request.
Important to distinguish between the latency of
successful and failed requests.
Traffic
A measure of how much demand is being placed on your system,
measured in a high-level system-specific metric.
Errors
The rate of requests that fail, either explicitly (e.g. HTTP 500s)
or implicitly (e.g. an HTTP 200 success response with the wrong content).
Saturation
How “full” your service is. A measure of your system fraction,
emphasizing the resources that are most constrained
(e.g. in a memory-constrained system, show memory)
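As a rough illustration of the four signals, the sketch below derives them from a batch of request records collected over a time window. The `Request` record, the `golden_signals` function and the memory figures are hypothetical names chosen for this example. Only explicit failures (HTTP 5xx) are counted as errors here; detecting implicit failures would need application-specific checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    duration_s: float  # time taken to service the request, in seconds
    status: int        # HTTP status code returned to the client


def mean(values: list) -> Optional[float]:
    """Average of the values, or None when there are none."""
    return sum(values) / len(values) if values else None


def golden_signals(requests, window_s, used_bytes, limit_bytes):
    """Summarise latency, traffic, errors and saturation for one window."""
    ok = [r.duration_s for r in requests if r.status < 500]
    failed = [r.duration_s for r in requests if r.status >= 500]
    return {
        # Latency is reported separately for successful and failed requests.
        "latency_ok_s": mean(ok),
        "latency_failed_s": mean(failed),
        # Traffic: requests per second over the window.
        "traffic_rps": len(requests) / window_s,
        # Errors: fraction of requests that failed explicitly.
        "error_ratio": len(failed) / len(requests) if requests else 0.0,
        # Saturation: how full the most constrained resource is (memory here).
        "saturation": used_bytes / limit_bytes,
    }
```

Keeping the latency of failed requests separate matters because a fast error can otherwise make the overall latency look better than it is.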
Prometheus
Prometheus was presented to be the protector and benefactor of mankind.
Prometheus
- Heavily inspired by Borgmon
- Built by ex-Googlers at SoundCloud
- Pull-based (scrapes at regular intervals)
- Many integration possibilities
- The 2nd project in CNCF
What is Prometheus?
- Monitoring system and Timeseries Database
- Instrumentation
- Metrics collection and storage
- Querying
- Alerting
- Dashboard / Graphing / Trending
Source: https://promcon.io/2016-berlin/talks/prometheus-design-and-philosophy/
Prometheus focuses on
- Operational systems monitoring
- Dynamic cloud environments
Source: https://promcon.io/2016-berlin/talks/prometheus-design-and-philosophy/
Prometheus does not do
- Raw log / event collection (use ELK stack)
- Request tracing (use opentracing.io)
- “Magic” anomaly detection
- Durable long-term storage
- Automatic horizontal scaling
- User / auth management
Prometheus Architecture
[Architecture diagram: long-lived jobs, short-lived jobs, Pushgateway, Alertmanager, Grafana]
The Data model
Every time series is uniquely identified by its metric name and a set of key-value pairs, also known as labels.
Notation:
<metric name>{<label name>=<label value>, …}
Example:
api_http_requests_total{method="POST", handler="/messages"}
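To make the identity rule concrete, here is a minimal sketch that treats the metric name plus the label set as the key of a series. The helpers `series_key`, `series_id` and `inc` are hypothetical names for this illustration, not part of any Prometheus client library.

```python
def series_key(name, labels):
    """Identity of a series: metric name plus an unordered set of label pairs."""
    return (name, frozenset(labels.items()))


def series_id(name, labels):
    """Render the identifier in the <metric name>{<label>=<value>, …} notation."""
    pairs = ", ".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{pairs}}}"


counters = {}


def inc(name, labels, amount=1.0):
    """Increment the counter for exactly one series."""
    key = series_key(name, labels)
    counters[key] = counters.get(key, 0.0) + amount


# Same name and labels (in any order) hit the same series;
# a different label value creates a new one.
inc("api_http_requests_total", {"method": "POST", "handler": "/messages"})
inc("api_http_requests_total", {"handler": "/messages", "method": "POST"})
inc("api_http_requests_total", {"method": "GET", "handler": "/messages"})
```

After these three calls `counters` holds two series: the POST series with value 2.0 and the GET series with value 1.0.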
How to get metrics?
- Directly instrumented software
- Not directly instrumented software, via an exporter
Source: https://promcon.io/2016-berlin/talks/so-you-want-to-write-an-exporter/
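The core job of an exporter can be sketched as a translation step: read statistics from a system that is not instrumented for Prometheus and render them in the text exposition format. The sketch below assumes a hypothetical queue system whose stats arrive as a plain dictionary; `render_exposition` and the `myqueue` prefix are made-up names, and every value is treated as a gauge for simplicity.

```python
def render_exposition(stats, labels, prefix="myqueue"):
    """Render a dict of numeric stats as Prometheus text exposition lines."""
    label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    lines = []
    for key, value in sorted(stats.items()):
        metric = f"{prefix}_{key}"
        # Assumption: every stat is a point-in-time value, so it is typed as a gauge.
        lines.append(f"# TYPE {metric} gauge")
        lines.append(f"{metric}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical stats, as they might be read from the third-party system.
text = render_exposition({"queue_depth": 12, "consumers": 3}, {"queue": "orders"})
print(text)
```

A real exporter would serve this text over HTTP so that Prometheus can scrape it at regular intervals.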
Directly instrumented software
cAdvisor
Doorman
Etcd
Kubernetes-Mesos
Kubernetes
RobustIRC
SkyDNS
Weave Flux
Official Prometheus Exporters
Node/system metrics exporter
AWS CloudWatch exporter
Blackbox exporter
Collectd exporter
Consul exporter
Graphite exporter
HAProxy exporter
InfluxDB exporter
JMX exporter
Memcached exporter
Mesos task exporter
MySQL server exporter
SNMP exporter
StatsD exporter
3rd party exporters
Databases
Aerospike exporter
ClickHouse exporter
CouchDB exporter
MongoDB exporter
PgBouncer exporter
PostgreSQL exporter
ProxySQL exporter
Redis exporter
RethinkDB exporter
SQL query result set metrics exporter
Hardware related
apcupsd exporter
IoT Edison exporter
IPMI exporter
knxd exporter
Ubiquiti UniFi exporter
Messaging systems
NATS exporter
NSQ exporter
RabbitMQ exporter
RabbitMQ Management Plugin exporter
Mirth Connect exporter
Storage
Ceph exporter
ScaleIO exporter
HTTP
Apache exporter
Nginx metric library
Passenger exporter
Varnish exporter
WebDriver exporter
APIs
Docker Hub exporter
GitHub exporter
OpenWeatherMap exporter
Rancher exporter
Speedtest.net exporter
Logging
Google's mtail log data extractor
Grok exporter
Other monitoring systems
Cloud Foundry Firehose exporter
scollector exporter
Heka dashboard exporter
Heka exporter
Munin exporter
New Relic exporter
Miscellaneous
BIG-IP exporter
BIND exporter
BOSH exporter
Jenkins exporter
Meteor JS web framework exporter
Minecraft exporter module
PowerDNS exporter
rTorrent exporter
SMTP/Maildir MDA blackbox prober
Xen exporter
PromQL
- Non-SQL Query Language
- Better for metrics computation
- Only does reads
Source: https://promcon.io/2016-berlin/talks/prometheus-design-and-philosophy/
PromQL - Operators
Arithmetic: + (addition), - (subtraction), * (multiplication), / (division), % (modulo), ^ (exponentiation)
Comparison: == (equal), != (not-equal), > (greater-than), < (less-than), >= (greater-or-equal), <= (less-or-equal)
Logical/set: and (intersection), or (union), unless (complement)
… and vector matching
Source: https://prometheus.io
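Vector matching can be pictured as pairing up samples whose label sets are identical. The sketch below models an instant vector as a dictionary from label set to value and implements division plus the three set operators on top of it. It is a simplified model under stated assumptions: `labels`, `divide`, `vec_and`, `vec_or` and `vec_unless` are hypothetical helpers, metric names are left out of the label sets, and divisors are assumed to be non-zero.

```python
def labels(**pairs):
    """Build a hashable label set, e.g. labels(job="foo")."""
    return frozenset(pairs.items())


def divide(lhs, rhs):
    """One-to-one matching: divide samples that share an identical label set."""
    return {ls: lhs[ls] / rhs[ls] for ls in lhs if ls in rhs}


def vec_and(lhs, rhs):
    """Intersection: left-hand samples that have a match on the right."""
    return {ls: v for ls, v in lhs.items() if ls in rhs}


def vec_or(lhs, rhs):
    """Union: all left-hand samples, plus right-hand samples without a match."""
    return {**rhs, **lhs}


def vec_unless(lhs, rhs):
    """Complement: left-hand samples that have no match on the right."""
    return {ls: v for ls, v in lhs.items() if ls not in rhs}


# Hypothetical sample data: an error count and a request count per job.
errors = {labels(job="foo"): 50.0, labels(job="bar"): 1.0}
total = {labels(job="foo"): 100.0}

ratio = divide(errors, total)
```

Here `ratio` contains only the `job="foo"` entry, because `job="bar"` has no partner on the right-hand side.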
PromQL - Aggregation Operators
sum, min, max, avg, stddev, stdvar, count, count_values, bottomk, topk, quantile
Source: https://prometheus.io
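As a sketch of what an aggregation operator does, the code below implements simplified versions of `sum` grouped by a set of labels and of `topk` over a dictionary-based instant vector. The function names `sum_by` and `topk` and the sample data are assumptions for illustration; the real operators work on Prometheus's own data structures.

```python
from collections import defaultdict


def labels(**pairs):
    """Build a hashable label set, e.g. labels(method="POST")."""
    return frozenset(pairs.items())


def sum_by(vector, by):
    """Sum sample values, keeping only the labels listed in `by`."""
    out = defaultdict(float)
    for label_set, value in vector.items():
        as_dict = dict(label_set)
        group = frozenset((k, as_dict[k]) for k in by if k in as_dict)
        out[group] += value
    return dict(out)


def topk(vector, k):
    """Keep the k samples with the largest values."""
    ranked = sorted(vector.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical per-handler request counts.
requests = {
    labels(method="POST", handler="/messages"): 3.0,
    labels(method="POST", handler="/users"): 2.0,
    labels(method="GET", handler="/users"): 7.0,
}

per_method = sum_by(requests, ["method"])
busiest = topk(requests, 1)
```

Grouping by `method` collapses the two POST series into one, which is the usual way to move from per-handler detail to a per-method total.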
PromQL - Examples
rate(api_http_requests_total[5m])
errors{job="foo"} / total{job="foo"}
Source: https://promcon.io/2016-berlin/talks/prometheus-design-and-philosophy/
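The idea behind `rate()` on a counter can be approximated as the per-second increase over the samples in a window, treating any drop in value as a counter reset. The sketch below is deliberately simplified. `simple_rate` is a hypothetical name, the window is anchored at the latest sample, and the extrapolation to the window boundaries that Prometheus performs is left out.

```python
def simple_rate(samples, window_s):
    """Per-second increase of a counter over the trailing window.

    samples: list of (timestamp_s, counter_value) tuples, sorted by time.
    Returns None when fewer than two samples fall inside the window.
    """
    latest = samples[-1][0]
    in_window = [(t, v) for t, v in samples if t >= latest - window_s]
    if len(in_window) < 2:
        return None

    increase = 0.0
    previous = in_window[0][1]
    for _, value in in_window[1:]:
        if value >= previous:
            increase += value - previous
        else:
            # The counter went down: assume it was reset to zero and
            # count everything accumulated since the reset.
            increase += value
        previous = value

    elapsed = in_window[-1][0] - in_window[0][0]
    return increase / elapsed


# Hypothetical scrapes, 60 seconds apart, with a reset before the last one.
scrapes = [(0, 100.0), (60, 130.0), (120, 20.0)]
per_second = simple_rate(scrapes, window_s=300)
```

With these samples the increase is 30 before the reset plus 20 after it, so `per_second` is 50 divided by the 120 seconds between the first and last sample.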
DEMO Part 2
https://github.com/kaspernissen/automation_night_demo
Alerting
Symptom-based alerting
Be proactive
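One way to read symptom-based alerting is: alert on what users experience, such as the error ratio, and only when the condition persists across several evaluations. The sketch below shows that decision in isolation. The function `should_fire`, the threshold and the evaluation counts are hypothetical values for illustration, not Prometheus configuration.

```python
def should_fire(error_ratios, threshold, for_evaluations):
    """Fire only if the symptom held for the last `for_evaluations` checks.

    error_ratios: observed error ratios, oldest first.
    """
    recent = error_ratios[-for_evaluations:]
    # Not enough history yet: stay quiet rather than alert on a single spike.
    if len(recent) < for_evaluations:
        return False
    return all(ratio > threshold for ratio in recent)


# A brief spike does not fire; a sustained problem does.
spike = should_fire([0.01, 0.20, 0.01], threshold=0.05, for_evaluations=3)
sustained = should_fire([0.01, 0.20, 0.30, 0.40], threshold=0.05, for_evaluations=3)
```

Requiring the condition to hold for several evaluations trades a little detection delay for fewer noisy alerts.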
Prevent alert fatigue
- Use ticketing systems (Avoid email spam)
- Warnings are tasks, like new features
Provide runbooks
- Keep them concise
- Explanation, hints, links
- Dynamic - include recent observations
Practice outages
“Firedrills”, “Gamedays” - repeat regularly
Monitoring with Prometheus
Start being proactive.
Don't be firefighters.
… and remember …
Hope is NOT a strategy
Source: Site Reliability Engineering, How Google Runs Production Systems (2016), B. Beyer et al.
If you wanna know more…
- prometheus.io
- promcon.io
- The Site Reliability Engineering book
- Podcasts:
  - https://dev.to/sedaily/prometheus-monitoring-with-brian-brazil
  - https://dev.to/sedaily/the-art-of-monitoring-with-james-turnbull (prefers a push-based approach, the opposite of Prometheus)
  - https://dev.to/sedaily/prometheus-with-julius-volz
- opentracing.io (the 3rd project in CNCF)
Thank you!
kaspernissen@gmail.com
@phennex
