Cloud Native Observability Insights

The document discusses Cloud Native Observability (CNO) and its components, including metrics, events, logs, and traces, collectively referred to as M.E.L.T. It emphasizes the transition from traditional monitoring to full-stack observability, enabling organizations to manage and understand complex cloud-native applications. The document also outlines Cisco's solutions and tools for achieving effective observability in modern cloud environments.

Cloud Native Observability

Shannon McFarland, CCIE #5245
Distinguished Engineer, Emerging Technologies & Incubation
@eyepv6

BRKCLD-2158
Cisco Webex App

Questions?
Use Cisco Webex App to chat with the speaker after the session

How
1 Find this session in the Cisco Live Mobile App
2 Click “Join the Discussion”
3 Install the Webex App or go directly to the Webex space
4 Enter messages/questions in the Webex space

Webex spaces will be moderated until February 24, 2023.

BRKCLD-2158 © 2023 Cisco and/or its affiliates. All rights reserved. Cisco Public 2
Agenda
• What is Cloud Native Observability (CNO)?
• What is M.E.L.T?
• Metrics
• Events (and Alerts)
• Logs
• Traces
• Service Meshes – Built-in CNO
• Cisco Solutions for Observability
• Conclusion

What is Cloud Native Observability?
What is Cloud Native?
• “Cloud native technologies empower organizations to build and run scalable
applications in modern, dynamic environments such as public, private, and hybrid
clouds. Containers, service meshes, microservices, immutable infrastructure, and
declarative APIs exemplify this approach.
• These techniques enable loosely coupled systems that are resilient, manageable,
and observable. Combined with robust automation, they allow engineers to make
high-impact changes frequently and predictably with minimal toil.” - CNCF
• [Link]
• Other Cloud Native criteria include:
• Elasticity/Horizontal Scaling of Live Services
• Leveraging Common Frameworks (Application service leverages a Service Mesh)

Persona/Role – Moving from Monitoring to Observability
• Platform Operator/Developer
  • They may want to see the application through a Layer 7 (HTTP/gRPC) lens
  • What is the latency/RPS/memory/CPU for each service component?
  • Where is the bottleneck?
  • Does each component adhere to an SLO?
• Data Scientist/Data Engineer
  • They may want to see very specific parts of the streaming data pipeline that is a sub-component of the overall application
• CISO/Security Architect/DevSecOps
  • They may want to see the same application view as the developer, but with a specific focus on CI/CD-centric security (image scanning, code scanning) and internal/external API security

[Figure: Cloud Native Observability stack. Layers: Apps/Services; Data; Serverless; Pods, Containers; Kubernetes; L4-7 Networking; L2-3 Networking; Operating System; Virtualization; Compute; Storage. M.E.L.T.* and Security span the layers.]

*Metrics, Events, Logs, Traces

An Example: Cloud Native Stack
[Figure: layered stack, top to bottom. Application: Business Metrics/KPIs; User Metrics; Transaction Metrics; Application. Platform: Apps; Pods, Containers / Events; Kubernetes / Functions; Operating System. Cloud: Virtualization; Compute, Storage, Network; Internet Access, Power/Cooling. Side labels: Cloud Native Observability; Full Stack Observability.]

Let’s look at a topology
[Figure: two K8s clusters connected through a Networking Service (VPC peering, Hybrid Cloud, etc.), with DNS and CA. Each cluster sits behind a Load Balancer and an Ingress and runs a Service and a Deployment; one hosts db-leader (db-leader-pod), the other db-follower (db-follower-pod).]

Stuff:
• Node logs
• Container/Pod logs
• Service logs
• Cluster logs
• Application/DB logs
• Load-Balancer logs
• Network service logs (VPC flow, VPN, NAT-GW, Routing protocols)
• Metrics for everything
• Events for some
• Alerts for even less
• Tracing for even less

“Full-Stack Observability” adds to traditional monitoring to support seamless digital experiences for modern architectures and teams

Monitoring
Passive detection of (sub)system “health” issues

Full-Stack Observability
Seamless digital experiences for modern application architectures and teams
▪ Encompasses full spectrum of Visibility – Insights – Action capabilities to actively understand issues and drive remediation
▪ Broad coverage across application, infrastructure, networking, and security stack with rich, real-time correlation across domains
  ‒ Includes all systems impacting the digital experience for users
▪ Focused on actively understanding and remediating issues, enhanced with ML/AI to ultimately predict issues before they occur
▪ Facilitates collaboration across modern teams (e.g., DevOps / SRE) to achieve common objectives (e.g., SLOs)

Monitoring and Observability
[Figure: "Traditional Monitoring Chain" (Detect, Diagnose, Fix) next to "New Observability Dominant Design". Labels on the new design: Cloud + DevOps; AI/ML Analysis; Full-Stack Observability, Multi-domain (Now); Automated Actions (Future). Caption: 3 connected trends forming new dominant design around observability.]

What is M.E.L.T.?
Metrics
Metrics
Collect/Measure Data at Regular Intervals
Expose Collect Store Query

• Expose
• Infrastructure:
• AWS Elastic Compute Cloud (EC2) VM hosting an Elastic Kubernetes Service (EKS) worker node - CPU, Memory, Storage, Network
• Application:
• NGINX, DB

• Collect
• Scrape from exposed sources

• Store
• Time-Series Database (TSDB)

• Query
• PromQL, MQL (monitoring query language), MetricsQL (VictoriaMetrics)
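
The Expose and Collect stages above can be sketched in a few lines. This minimal example renders samples in the Prometheus text exposition format, which a scraper would then collect over HTTP. The metric names, label values and the `render_metrics` helper are hypothetical and chosen only for illustration; label-value escaping is ignored, and a real service would more likely use a Prometheus client library.

```python
def render_metrics(samples):
    """Render (name, labels, value) samples in Prometheus text exposition format.

    `samples` is a list of tuples: (metric_name, {label: value}, number).
    Label values are assumed to need no escaping (a simplification).
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            # Labels are rendered as key="value" pairs inside braces.
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical samples for an NGINX application and its worker node.
samples = [
    ("http_requests_total", {"service": "nginx", "code": "200"}, 1027),
    ("node_memory_free_bytes", {}, 536870912),
]
print(render_metrics(samples), end="")
```

A collector would scrape this text at a regular interval, store each sample with a timestamp in a TSDB, and expose it to a query language such as PromQL.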

Metrics
Collect/Measure Data at Regular Intervals
Expose Collect Store Query

• Prometheus ([Link]) – Open-source event monitoring and alerting tool
• Thanos ([Link]) – Adds high availability, long-term storage and global query capabilities for Prometheus
• Cortex metrics ([Link]) – Adds high availability, multi-tenancy, horizontal scalability and long-term storage capabilities for Prometheus
• Grafana ([Link]) – Visualization of metrics, logs and events from MANY data sources, including Prometheus
• AWS CloudWatch Metrics
• Google Cloud Metrics
• Microsoft Azure Monitor Metrics

Metrics
Common Architectural Components
[Figure: component diagram. Service Discovery, Collector, Server/Engine, Storage, Alerts, API, UI, CLI, 3rd Party UI.]

Metrics
[Link]

Prometheus Example

Events and Alerts
Events
• Metrics and Events are two different data types:
• Metrics = regular/predictable data
• Events = irregular/unpredictable data
• Scheduled or unscheduled state changes

• Kubernetes event example:


# kubectl get events --field-selector reason=NodeHasSufficientMemory
LAST SEEN TYPE REASON OBJECT MESSAGE
57m Normal NodeHasSufficientMemory node/<omitted> Node <omitted> status is now: NodeHasSufficientMemory

Events
• Events can be paired with other toolsets to provide a robust
Event<>Action framework
• Metrics, Pub/Sub, AI/ML, DevOps, etc.
• Event-Driven Architecture (EDA):
• KEDA – Kubernetes-Based Event-Driven Autoscaler: [Link]
• AWS EventBridge
• Many, many more

[Figure: Metrics → Event → Rule → Target/Destination/Action]
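
The Event, Rule and Target flow above can be illustrated with a small sketch. It is not the API of KEDA or AWS EventBridge; the `Rule` class, the `dispatch` function and the event fields are hypothetical, and the target simply appends to a list instead of calling a real notification service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Matches events on exact field values and forwards them to a target."""
    pattern: Dict[str, str]
    target: Callable[[dict], None]

    def matches(self, event: dict) -> bool:
        return all(event.get(k) == v for k, v in self.pattern.items())


def dispatch(event: dict, rules: List[Rule]) -> int:
    """Send the event to every matching rule's target; return the match count."""
    matched = 0
    for rule in rules:
        if rule.matches(event):
            rule.target(event)
            matched += 1
    return matched


# Hypothetical target: collect notifications in a list instead of calling an API.
notifications = []
rules = [Rule({"type": "Warning", "reason": "NodeNotReady"}, notifications.append)]

dispatch({"type": "Normal", "reason": "NodeHasSufficientMemory"}, rules)  # no match
dispatch({"type": "Warning", "reason": "NodeNotReady", "object": "node/a"}, rules)
print(notifications)
```

Keeping the rule (what to match) separate from the target (what to do) is what lets the same event stream drive autoscaling, paging, or a pipeline trigger.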

Events – AWS EventBridge

Alerts
• Alerts – A predefined trigger based on a threshold or event
• Static alert example:
• HTTP Requests Per Second (RPS) reaching 90% of a set limit triggers an alert to Slack
• Kubernetes Node isn’t ready for 1 minute (Prometheus Example)
- alert: KubernetesNodeReady
  expr: kube_node_status_condition{condition="Ready",status="true"} == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Kubernetes Node ready (instance {{ $[Link] }})
    description: "Node {{ $[Link] }} unready \n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
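
The `for: 1m` clause in the rule above means the expression must stay true for the whole duration before the alert fires. The sketch below is a simplified model of that behavior, not Prometheus code; the `alert_states` function and the sample timings are assumptions made for illustration.

```python
def alert_states(samples, for_seconds):
    """Return one state per sample: "inactive", "pending", or "firing".

    `samples` is a list of (timestamp_seconds, condition_is_true) pairs in
    time order. An alert fires only once the condition has been true
    continuously for at least `for_seconds`.
    """
    states = []
    active_since = None  # timestamp when the condition first became true
    for ts, condition in samples:
        if not condition:
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = ts
        held = ts - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# Hypothetical evaluations every 30 s; True means the node reports not ready.
samples = [(0, False), (30, True), (60, True), (90, True), (120, False)]
print(alert_states(samples, for_seconds=60))
```

The pending window is what keeps a single failed check from paging anyone; a longer `for` value trades detection speed for fewer false alarms.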

Alerts – AWS CloudWatch Alerts

Metric: AWS EC2
  - Per-Instance
  - CPUUtilization
Conditions: Threshold: Static
  - CPU
  - Greater than 90%
Action: Notification: In Alarm
  - Publish to AWS SNS Topic
  - SNS Topic > Email
  - (Auto scaling)
  - (EC2 action)
  - (Systems Manager action)

Alerts – AWS CloudWatch Example

Logs
Logs
• Nearly everything in a Cloud Native (or other) environment produces logs in some form

• Logging has tremendous potential, but it is very complex to manage all the sources
and then derive value out of what the logs say
• Collection and data formatting should be simple, but it isn’t:
• Currently, K8s doesn’t enforce a uniform structure for log messages*
• You can’t safely assume all logs are in JSON format
• You may need to transform logs
• There are MANY gotchas on storage, forwarding, rotation – We don’t have time for
that today
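
As an illustration of the transform step mentioned above, the sketch below normalizes a mix of JSON and plain-text log lines into dictionaries. The plain-text layout (`<timestamp> <LEVEL> <message>`) and the `normalize` helper are assumptions for the example; real sources each need their own parser, which is the job tools such as Fluentd or Fluent Bit take on.

```python
import json
import re

# Assumed plain-text layout: "<timestamp> <LEVEL> <message>". Real formats vary.
PLAIN_LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$")


def normalize(line):
    """Turn one raw log line into a dict, whether or not it is JSON."""
    line = line.strip()
    try:
        record = json.loads(line)
        if isinstance(record, dict):
            return record
    except json.JSONDecodeError:
        pass
    match = PLAIN_LINE.match(line)
    if match:
        return match.groupdict()
    # Keep unparseable lines rather than dropping them.
    return {"msg": line}


print(normalize('{"ts": "2023-02-07T10:00:00Z", "level": "INFO", "msg": "started"}'))
print(normalize("2023-02-07T10:00:01Z ERROR upstream timed out"))
print(normalize("???"))
```

Falling back to a catch-all record means nothing is lost at ingestion, at the cost of leaving some lines unstructured for later processing.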

*[Link]

Logs
tl;dr: Logs suck

Common Components
• Fluentd/Fluent Bit – Open-Source log collection and processing (Bit is for highly resource-constrained environments)
• Logstash – Open-Source log collection and processing
• Elasticsearch, OpenSearch, Grafana Loki, etc – Aggregation, search and analytics
• Kibana, OpenSearch Dashboard, Grafana Loki, etc – Dashboard
• AWS CloudWatch/CloudTrail, GCP Cloud Logging, Microsoft Azure Cloud Monitoring
• Many more…

Common Stacks
• EFK – Elasticsearch, Fluentd/bit, Kibana
• ELK – Elasticsearch, Logstash, Kibana
• OFO – OpenSearch, Fluentd/bit, OpenSearch Dashboard
• ENDLESS combination of tools

Logs – Common Architecture Components
Sources → Collection/Ingestion → Aggregation and Processing → Indexing and Storage → Visualization
Input Plugins: stdout > fluentd, http > fluentd
Output Plugins: fluentd > elasticsearch, fluentd > S3

Building this by hand and as independent components is

Logs
• Do things the easier way:
• Kubernetes:
• Use Operators – Cisco (Banzai Cloud)
Open-Source Logging Operator
• [Link]
operator
• Fluentd/FluentBit and source
configuration
• Security (TLS, RBAC, etc)
• Output configuration
• AWS CloudWatch, S3, Azure Storage,
GCP Storage, Elasticsearch, Grafana Loki,
Kafka, etc.

Logging Operator – Example – One source, multi-outputs
– NGINX to Elasticsearch/Kibana & Grafana Loki

Traces
Traces
• Distributed tracing helps with:
• Service mapping (topology)
• Bottlenecks/Latency/Drops in a distributed architecture (network, microservice, etc.)
• Example projects/solutions:
• OpenTelemetry - Combo of OpenCensus + OpenTracing – Library-based collection
• Service Meshes – Istio, Linkerd, etc. – Sidecar-based collection
• Jaeger – Visualize traces
• W3C TraceContext/B3 TraceContext – Bringing some sanity to the format of a trace ID
• AWS X-Ray, GCP Cloud Trace, Azure Monitor (Application Insights) – Tracing libraries and visualization service
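
To make the W3C TraceContext item concrete, here is a minimal sketch that builds and parses a version 00 `traceparent` header (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags). The helper names are hypothetical and the IDs are example values; production code would normally rely on an OpenTelemetry propagator rather than hand-rolled parsing.

```python
import re

# W3C Trace Context "traceparent" header, version 00:
#   version - trace-id (32 hex) - parent-id (16 hex) - trace-flags (2 hex)
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id, span_id, sampled=True):
    """Format a traceparent header value from hex IDs and a sampled flag."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header):
    """Return (trace_id, parent_span_id, sampled) or None if the header is invalid."""
    match = TRACEPARENT.match(header.strip())
    if not match:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid per the specification.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)


header = build_traceparent("ac6ee4da7079ddc7ff9a7eb95d042655", "ff9a7eb95d042655")
print(header)
print(parse_traceparent(header))
```

Each service that receives this header reuses the trace ID and records the parent span ID, which is how spans from different services end up in the same trace.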

Traces
• Primer:
• A “span” is the foundational element of a distributed trace – it represents an individual unit of
work
• A “span” can reference another span and when assembled, you have a “trace”
• “Context propagation” – Correlate trace metadata across service boundaries
• Not using a standard method for trace context propagation can lead to VERY painful
deployments and VERY expensive workarounds
[Figure: trace ac6ee4da7079ddc7ff9a7eb95d042655 contains svc_1 span ff9a7eb95d042655 and svc_2 span 799a0a38bbd339d4; the svc_2 span is CHILD_OF the svc_1 span.]

"traceID": "ac6ee4da7079ddc7ff9a7eb95d042655",
"spans": [
  {
    "traceID": "ac6ee4da7079ddc7ff9a7eb95d042655",
    "spanID": "ff9a7eb95d042655",
    "operationName": "[Link]/*",
    "references": [],
    "startTime": 1638308084933526,
    "duration": 803461,
    . . .
  {
    "traceID": "ac6ee4da7079ddc7ff9a7eb95d042655",
    "spanID": "799a0a38bbd339d4",
    "operationName": "[Link]/*",
    "references": [
      {
        "refType": "CHILD_OF",
        "traceID": "ac6ee4da7079ddc7ff9a7eb95d042655",
        "spanID": "ff9a7eb95d042655"
    . . .

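The way span references assemble into a trace can be sketched as follows. The span IDs come from the excerpt above; the `service` field and the `build_tree` and `render` helpers are simplifications introduced for this example and are not part of the trace JSON shown on the slide.

```python
from collections import defaultdict

# Two spans from the excerpt, reduced to the fields needed to link them.
spans = [
    {"spanID": "ff9a7eb95d042655", "service": "svc_1", "references": []},
    {"spanID": "799a0a38bbd339d4", "service": "svc_2",
     "references": [{"refType": "CHILD_OF", "spanID": "ff9a7eb95d042655"}]},
]


def build_tree(spans):
    """Return (root_span_ids, children) where children maps parent ID -> child IDs."""
    children = defaultdict(list)
    roots = []
    for span in spans:
        parents = [r["spanID"] for r in span["references"] if r["refType"] == "CHILD_OF"]
        if parents:
            children[parents[0]].append(span["spanID"])
        else:
            roots.append(span["spanID"])
    return roots, dict(children)


def render(span_id, children, names, depth=0):
    """Return indented lines showing the span hierarchy under span_id."""
    lines = ["  " * depth + f"{names[span_id]} ({span_id})"]
    for child in children.get(span_id, []):
        lines.extend(render(child, children, names, depth + 1))
    return lines


roots, children = build_tree(spans)
names = {s["spanID"]: s["service"] for s in spans}
for root in roots:
    print("\n".join(render(root, children, names)))
```

A span with no CHILD_OF reference is treated as the root; if context propagation breaks between services, the downstream span shows up as a second root instead of a child.
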
OpenTelemetry (OTel)
[Link]

• Language-specific libraries
• Supports: Traces, Metrics, Logs
• The Collector recognizes multiple Trace Context formats
• Different form factors for the Collector

Example: OpenTelemetry Components in Action
[Figure: OpenTelemetry deployment models. Each application stack shows Application Code, OTel Instrumentation (auto/manual), OTel API and OTel SDK (language-specific). Other labels: OTel collector (agent) as sidecar/K8s daemonset; OTel collector (gateway); Optional path; Visualization, Storage, etc.; Container; Gateway Model; L4-7; Hyperscalers.]

OpenTelemetry + Jaeger
[Figure: one trace across svc_1, svc_2 and svc_3, made up of a svc_1 span, a svc_2 span and a svc_3 span.]

Tracing Deployment Example

Example OpenTelemetry Deploy - 1
Deploy a test KinD cluster
# kind create cluster

Deploy Cert Manager
# kubectl apply -f [Link]

Deploy the Jaeger Operator
# kubectl create namespace observability
# kubectl create -f [Link] -n observability

Deploy the Jaeger All-in-One Strategy
# kubectl apply -f - <<EOF
apiVersion: [Link]/v1
kind: Jaeger
metadata:
  name: simplest
EOF

Deploy the OpenTelemetry Operator
# kubectl apply -f [Link]

Example OpenTelemetry Deploy - 2
Deploy the OTel Collector
# kubectl apply -f - <<EOF
apiVersion: [Link]/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
    exporters:
      logging:
      jaeger:
        endpoint: "simplest-collector:14250"
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [jaeger]
EOF

Deploy the OTel Java Auto-instrumentation CRD
# kubectl apply -f - <<EOF
apiVersion: [Link]/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: [Link]
  propagators:
    - tracecontext
    - baggage
    - b3
  sampler:
    type: parentbased_traceidratio
    argument: "0.25"
  java:
    image: [Link]/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  nodejs:
    image: [Link]/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  python:
    image: [Link]/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
EOF
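
The `parentbased_traceidratio` sampler with argument "0.25" keeps about a quarter of new traces and otherwise follows the caller's decision. The sketch below shows one common way to implement that idea, comparing the low 64 bits of the trace ID to a threshold; it is an illustration under that assumption, not the code of any particular OpenTelemetry SDK, and the function names are hypothetical.

```python
def ratio_sample(trace_id_hex, ratio):
    """Deterministic decision from the trace ID: keep roughly `ratio` of traces."""
    low64 = int(trace_id_hex, 16) & 0xFFFFFFFFFFFFFFFF
    return low64 < int(ratio * 2**64)


def parent_based_sample(trace_id_hex, ratio, parent_sampled=None):
    """Follow the parent's decision if there is one, else fall back to the ratio."""
    if parent_sampled is not None:
        return parent_sampled
    return ratio_sample(trace_id_hex, ratio)


trace_id = "ac6ee4da7079ddc7ff9a7eb95d042655"
print(parent_based_sample(trace_id, 0.25))                       # root span: ratio decides
print(parent_based_sample(trace_id, 0.25, parent_sampled=True))  # child span: parent decides
```

Because the decision depends only on the trace ID, every service that applies the same ratio reaches the same verdict for a given trace, so traces are kept or dropped as a whole.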

Example OpenTelemetry Deploy - 3
Deploy the Spring Pet Clinic service
# kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spring-petclinic
spec:
  selector:
    matchLabels:
      app: spring-petclinic
  replicas: 1
  template:
    metadata:
      labels:
        app: spring-petclinic
      annotations:
        [Link]/inject: "true"
        [Link]/inject-java: "true"
    spec:
      containers:
      - name: app
        image: [Link]/pavolloffay/spring-petclinic:latest
EOF

# kubectl port-forward [Link]/spring-petclinic 8080:8080

# kubectl port-forward svc/simplest-query 16686:16686

Example OpenTelemetry - Validation
# kubectl logs [Link]/otel-collector
. . . <output summarized>
builder/receivers_builder.go:73 Receiver started. {"kind": "receiver", "name": "otlp"}
. . .
jaegerexporter@v0.41.0/[Link] State of the connection with the Jaeger Collector backend {"kind": "exporter", "name": "jaeger", "state": "READY"}

Browser-to-Jaeger Query Service (16686) and OTel Collector-to-Jaeger (14250)
# kubectl exec -it simplest-797dd8fc67-6l9q4 -- netstat -at
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address                    Foreign Address  State
tcp   0      0      localhost:47280                  localhost:16686  ESTABLISHED
tcp   0      0      simplest-797dd8fc67-6l9q4:14250  [Link]           ESTABLISHED

Browser-to-Pet Clinic UI (8080) and Java OTel library-to-OTel Collector (4317)
root@spring-petclinic-6d569df946-9f26m:/# netstat -at
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address    Foreign Address  State
tcp   0      0      localhost:55028  localhost:8080   ESTABLISHED
tcp   0      0      [Link]:34208     [Link]           ESTABLISHED

Hybrid MELT Support
[Figure: two instrumented applications, each with Application Code, OTel Instrumentation (auto/manual), OTel API and OTel SDK (language-specific), exporting through an OTel Exporter to an OTel collector (agent). A non-OTel-svc also sends tracing/metrics. Logs are shipped with Fluent, filebeat.]

Java Agent – MLT over OTLP
java -javaagent:path/to/opentelemetry-[Link] \
     -jar [Link]

Python Agent – Tracing over OTLP
pip install opentelemetry-api
pip install opentelemetry-sdk
pip install opentelemetry-exporter-{exporter}
pip install -U aws-xray-sdk

Application Instrumentation Options
• Library/SDK-based:
• OpenTelemetry
• Cisco/AppD
• Cisco/Epsagon
• AWS X-Ray, GCP Cloud Trace, Azure Monitor (Application Insights)
• Many others
• Sidecar-based:
• Service Meshes – Istio, Linkerd, Consul Connect, KongHQ, etc.

Service Mesh
What is a Service Mesh?
• Infrastructure layer for service-to-service communication
• Can use a mesh of sidecar proxies:
  • Can inspect API transactions at Layer 7 and 4 (TCP)
  • Intelligent routing rules can be applied between endpoints
• Allow for tracing and some application instrumentation without the need to add code/libraries/SDK to the application

[Figure: a Service Mesh Control Plane above three pods, each with a sidecar proxy: podA (UI Service), podB (svcB) and podC (svcC). A User/Tool/Service enters through the Ingress/Gateway.]

Service Mesh Observability with Proxies
• Service Meshes provide observability via their sidecar proxies
• Observability info:
• Mesh-specific metrics
• Application-specific metrics
• Distributed traces: Layer 4-7: TCP, HTTP, gRPC
• Access logs (mesh and apps)

• In-mesh dashboards: Istio/Kiali, Linkerd/Viz, Cisco Calisti


• Out-of-mesh dashboards: Prometheus, Grafana, Jaeger, EFK, etc.

Cisco Solutions
[Link]
us/solutions/full-stack-
[Link]
Cisco Calisti – Operationalize the Service Mesh
[Link]
• Multi-cloud, multi-cluster connectivity and observability: Connect any on-prem and public cloud together
• Simplifies service mesh management: Single pane of glass, in depth metrics
• Policy-based app networking & security: Policy management for DevOps teams
• Apache Kafka on Kubernetes & Service Mesh: Lifecycle management of Apache Kafka and components

Traffic management ensures smooth app updates
Complete application and health observability
Security at all layers between clusters and clouds

Demo
Cisco Full-Stack Observability
• Full-Stack Visibility: Observable and optimizable technology stack
• Full-Stack Insights: Application and business insights across stack
• Full-Stack Actions: Prioritized remediations and optimizations across stack

Differentiated Solution with Business Context
Full-Stack actions for the business

Visibility: Observable and optimizable technology stack
• Application performance, services and development
• Network and internet performance
• DC infrastructure and clouds
• Security (dev and runtime)
• Business outcomes and transactions
• End user experiences

Insights: Application and business insights across stack
• End-to-end correlation and dependencies
• Shared common context of issues
• Prioritize issues with business and experience impact

Actions: Prioritized remediations and optimizations across stack
• Application to infrastructure performance
• Cost and workload optimizations
• Application security detection and policy enforcement

Visibility feeds Insights with MELT and open telemetry; Insights feeds Actions with Transactions, incidents, KPIs.
Prioritize issues with business context

Full-Stack Observability
Builds on monitoring and visibility, and adds business context
Full-Stack Observability – Multiple domains and cross-functional teams
• Business context and Impact; Real-time, distributed and hybrid apps; Issues and Incident remediation
• Cross-domain full MELT and security; Cloud and Edge native
• DevOps and SRE; KPI: SLO with business context, insights driven actions/automation

Visibility/Observability – Per domain/team
• Active and modern apps; Root Cause Identification
• Telemetry based (MEL) subset; Tools sprawl, some integrations
• KPI: performance, experience

Monitoring – Per domain/team
• Passive and traditional apps; Health and reporting
• Events sampling; Dashboards/views
• KPI: Availability, capacity

Summary
• There are a lot of things to keep track of and a lot of tools to help you do so:
• Metrics, Events, Logs, Traces
• Proprietary solutions, open-source solutions
• Most solutions (vendor and OSS) do a handful of things well; most of the time it is up to you to ‘integrate’ them

• Next-gen solutions such as Cisco Full Stack Observability will reduce/remove the burden of having to stitch together various tools to gain visibility into, derive value from, and take action on your data
• Check out Cisco FSO: [Link]
[Link]
• Start working with Cisco Calisti!
• [Link]

Related Sessions
Session Code | Title | Speaker
BRKCLD-2759 | Full-Stack Observability: The HOW! | Carlos Pereira
BRKETI-2005 | Simplifying Cloud Native Application Connectivity and Observability with Calisti | Ivan Padilla Ojeda
LTRAPP-2682 | Building Observability Solutions on the FSO Platform (Instructor-led Labs, labs 3) | Renato Quedas
BRKAPP-4042 | What is Full-Stack Observability (FSO) and How It Can Help You, Featuring customer EasyJet | Joe Byrne, Wei Wang
BRKAPP-2098 | Observe and Troubleshoot Cloud Native Applications for IT Ops and DevOps with AppDynamics Cloud | Vipul Shah
BRKAPP-2624 | Full-stack Observability (FSO) for App Security in the Cloud or Wherever | Randy Birdsall
PSOAPP-1775 | New AppDynamics Innovation in Cloud and Security | Randy Birdsall, Eugene Kim
BRKAPP-1154 | Do Tell About OTel: An Introduction to OpenTelemetry and How AppDynamics is Embracing It | Wayne Brown
BRKAPP-2322 | Observability Starts Here: Enhance and Add Value to your Cloud Native Capabilities | Pranav Kumar
BRKAPP-3503 | Custom Correlation on AppDynamics: The Secret Weapon for True Business Transaction Visibility | Ivo Santos

Complete your Session Survey
• Please complete your session survey after each session. Your feedback is very important.
• Complete a minimum of 4 session surveys and the Overall Conference survey (open from Thursday) to receive your Cisco Live t-shirt.
• All surveys can be taken in the Cisco Events Mobile App or by logging in to the Session Catalog and clicking the “Attendee Dashboard” at
[Link]
[Link]

Continue Your Education

Visit the Cisco Showcase for related demos.

Book your one-on-one Meet the Engineer meeting.

Attend any of the related sessions at the DevNet, Capture the Flag, and Walk-in Labs zones.

Visit the On-Demand Library for more sessions at [Link]/on-demand.

Thank you
