Unlocking Enhanced Observability: The Power of Thanos in Multi-Cluster Kubernetes Environments
Introduction
In this article we are going to look at the limitations of a Prometheus-only monitoring stack and see how moving to a Thanos-based stack can improve metrics retention while also reducing overall infrastructure cost.
Config files and chart used for this demo are available here.
Kubernetes Prometheus Stack
This stack is designed to collect, store, query, and visualize metrics from Kubernetes clusters, providing insights into the health and performance of the applications and infrastructure.
- Prometheus: collects metrics
- Alertmanager: sends alerts to various providers based on metric queries
- Grafana: fancy dashboards
While this is a popular and effective monitoring solution, it also has some caveats and limitations that one should be aware of:
- It does not scale out well as the number of clusters you collect metrics from increases.
- Each cluster has its own Grafana with its own set of dashboards which can be a pain to maintain.
- Prometheus is designed for short-term data retention by default: it stores metrics for a limited period, typically a few weeks. Historical data beyond the retention period, which may be crucial for certain use cases, is simply not available for analysis, and block storage becomes expensive if you keep terabytes of data on it.
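For reference, local retention in the kube-prometheus-stack chart is controlled by a values entry like the sketch below (the `retention` key is the chart's real setting; the 15d value is illustrative and should be checked against your chart version):

```yaml
# values.yaml fragment (sketch) for kube-prometheus-stack
prometheus:
  prometheusSpec:
    # samples older than this are dropped from the local TSDB
    retention: 15d
```

Raising this value only delays the problem: local disks keep growing, which is exactly what offloading to object storage avoids.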
Thanos
Thanos works as an extension to Prometheus, enhancing its capabilities to address the challenges of long-term data retention, scalability, and global querying. It introduces several components and architectural changes to achieve these goals. Here’s an overview of how Thanos works:
Prometheus Data Collection:
- Just like in a regular Prometheus setup, each Prometheus instance scrapes metrics from various targets (applications, services, Kubernetes components, etc.) using the pull-based model.
- Prometheus stores the collected metrics locally in its time-series database.
Thanos Sidecar:
- The Thanos Sidecar runs alongside each Prometheus server and acts as a connector between Prometheus and Thanos.
- It continuously uploads the local Prometheus data to a remote object storage system, like Amazon S3, Google Cloud Storage, or any other compatible storage backend.
- By pushing the data to remote storage, the Sidecar offloads the long-term data storage responsibility from Prometheus.
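As a sketch, the Sidecar container that runs next to Prometheus looks roughly like this (the flag names are real Thanos flags; the image tag and mount paths are assumptions):

```yaml
# Thanos Sidecar container (sketch), deployed in the Prometheus pod
- name: thanos-sidecar
  image: quay.io/thanos/thanos:v0.31.0   # pin to your tested version
  args:
    - sidecar
    - --tsdb.path=/prometheus                        # Prometheus local TSDB dir
    - --prometheus.url=http://localhost:9090         # co-located Prometheus
    - --objstore.config-file=/etc/thanos/thanos.yaml # the secret created below
  ports:
    - name: grpc
      containerPort: 10901   # Store API, queried later by the Thanos Querier
```

Port 10901 is the gRPC Store API endpoint that the Querier connects to in the deployment steps further down.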
Thanos Store Gateway:
- The Thanos Store Gateway is a read-only component that exposes the metrics data stored in the remote object storage.
- It acts as a gateway, allowing Prometheus and other Thanos components to query the historical metrics data from the object storage, even though the data is physically stored in a different location.
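A minimal Store Gateway container spec resembles the following (flag names are real `thanos store` flags; the cache sizes and paths are illustrative):

```yaml
# Thanos Store Gateway args (sketch)
args:
  - store
  - --data-dir=/data                               # local cache of index data
  - --objstore.config-file=/etc/thanos/thanos.yaml # same bucket config as the sidecar
  - --index-cache-size=500MB                       # in-memory index cache
  - --chunk-pool-size=500MB                        # buffer pool for chunk reads
```

This is the component deployed later via store-service.yaml.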
Global Querying with Thanos Querier:
- The Thanos Querier component allows for cross-cluster querying and operates as a single entry point for querying metrics data from multiple Prometheus instances and Thanos Store gateways.
- When a query is issued, the Querier aggregates data from various sources (Prometheus servers and Thanos Store gateways) and presents a unified view of metrics from different clusters.
Compaction and Deduplication:
- Thanos periodically runs the Compactor component, which compacts and downsamples blocks in the object storage, reducing the space needed and speeding up queries over long time ranges.
- Thanos also deduplicates data to optimize storage efficiency and reduce redundancy, further reducing storage costs.
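A Compactor deployment is typically configured with per-resolution retention flags, for example (the flag names are real; the retention values are illustrative):

```yaml
# Thanos Compactor args (sketch)
args:
  - compact
  - --data-dir=/data
  - --objstore.config-file=/etc/thanos/thanos.yaml
  - --retention.resolution-raw=30d   # keep raw samples for 30 days
  - --retention.resolution-5m=90d    # keep 5m downsampled data for 90 days
  - --retention.resolution-1h=1y     # keep 1h downsampled data for 1 year
  - --wait                           # run continuously instead of one-shot
```

Note that only one Compactor should run against a given bucket at a time.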
High Availability:
- Thanos achieves high availability by replicating data across multiple object storage instances.
- In case a Thanos Store or object storage instance becomes unavailable, the data remains accessible through other replicas, ensuring fault tolerance and avoiding data loss.
Ruler:
- Thanos Ruler evaluates alerting rules against the global query view, covering all the Prometheus servers and Thanos Store gateways in the Thanos-based monitoring stack. This allows for a consistent and unified approach to alerting across all clusters.
- Alerting rules are defined centrally for the Ruler; since it evaluates them through the Querier, each rule runs against data from every cluster.
- Thanos Ruler works in conjunction with Prometheus Alertmanager to handle alert notifications. When an alert is triggered, it is sent to Alertmanager, which then takes care of grouping, deduplicating, and routing alerts to the appropriate receivers.
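A rule file handed to Thanos Ruler is ordinary Prometheus rule YAML; for example (the rule name, threshold, and labels below are illustrative, not from the demo chart):

```yaml
# Thanos Ruler rule file (sketch)
groups:
  - name: cluster-health
    rules:
      - alert: HighPodRestartRate
        # evaluated through the Querier, so it spans every cluster
        expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Pod {{ $labels.pod }} in {{ $labels.cluster }} is restarting frequently"
```

Because every cluster carries its `cluster` external label, the firing alert tells you which cluster it came from.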
Multi Cluster Architecture
This example runs on Azure with a 3-cluster setup on private AKS (Azure Kubernetes Service):
Ops-Cluster: centralized cluster for monitoring and operations, exposed with the external label ops-cluster.
Cluster N: application clusters, exposed with the external label cluster-n.
Our deployment uses the official kube-prometheus-stack chart.
Cluster Setup:
Create the monitoring namespace:
$: kubectl create ns monitoring
Create a secret to mount the blob container that stores Prometheus metrics; update the storage account key in thanos.yaml:
type: AZURE
config:
  storage_account: 'centralmetricstore' # azure blob storage account name
  storage_account_key: 'xxxxxxxxxxxxxxxxxxxx' # azure blob storage key
  container: 'metricsthanos' # azure blob container name
# Create Secret using
$: kubectl -n monitoring create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos.yaml
Deploy the Prometheus stack. Make the changes below in the values file cluster-n.yaml:
# update on line 2750
externalLabels:
  cluster: cluster-n
Install the Helm chart:
$: cd multi-cluster-thanos/monitoring-setup
$: helm dependency build
$: cd ../
$: helm install prometheus-stack monitoring-setup/ --values ./cluster-n.yaml -n monitoring
# check pods
$: kubectl get po -n monitoring
# check services
$: kubectl get svc -n monitoring
The following components are deployed:
- Prometheus
- Thanos sidecar
- Kube-metrics
NOTE: repeat these steps for the other application clusters.
Ops-cluster Setup:
Create a namespace for the monitoring stack:
$: kubectl create ns monitoring
Create a secret to mount the blob container that stores Prometheus metrics; update the storage account key in thanos.yaml:
type: AZURE
config:
  storage_account: 'centralmetricstore' # azure blob storage account name
  storage_account_key: 'xxxxxxxxxxxxxxxxxxxx' # azure blob storage key
  container: 'metricsthanos' # azure blob container name
# Create Secret using
$: kubectl -n monitoring create secret generic thanos-objstore-config --from-file=thanos.yaml=thanos.yaml
Deploy the Prometheus stack. Make the changes below in the values file ops-cluster.yaml:
# update slack config for alerts
slack_configs:
  - channel: 'channel-name'
    api_url: 'https://hooks.slack.com/services/XXXXXXXXX/XXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXX'
# update external label in value file
externalLabels:
  cluster: ops-cluster # line 2749
# on line 3840, update the Alertmanager config; this is used by Thanos Ruler to fire alerts
extraSecret:
  name: "thanos-alertmanager-config"
  data:
    alertmanager-configs.yaml: |
      alertmanagers:
        - static_configs: ["prometheus-stack-kube-prom-alertmanager.monitoring.svc.cluster.local:9093"]
          scheme: http
          timeout: 30s
$: cd multi-cluster-thanos/
$: helm install prometheus-stack monitoring-setup/ --values ./ops-cluster.yaml -n monitoring
# check pods
$: kubectl get po -n monitoring
# check services
$: kubectl get svc -n monitoring
The following components are deployed:
- Prometheus
- Grafana
- Alertmanager
- Thanos sidecar
- Thanos ruler
- Kube-metrics
Now we need to deploy the Store Gateway and the Querier.
# update XX.XX.XX.XX with node IP, we are using nodePort to expose service
# Find node IP using kubectl get nodes && kubectl get svc -n monitoring
# querier-deployment.yaml
args:
  - 'query'
  - '--log.level=debug'
  - '--query.replica-label=prometheus_replica'
  - '--store=XX.XX.XX.XX:30901' # cluster-a
  - '--store=XX.XX.XX.XX:30901' # cluster-n
  - '--store=prometheus-stack-kube-prom-thanos-external.monitoring.svc.cluster.local:10901' # ops-cluster
  - '--store=thanos-store-svc.monitoring.svc.cluster.local:10901' # store svc
$: kubectl create -f querier-deployment.yaml
# expose service
$: kubectl create -f querier-service-servicemonitor.yaml
# deploy store service which connects to blob storage and fetches historical data
$: kubectl create -f store-service.yaml
Connecting using Grafana:
Add Prometheus as a data source in Grafana and point it at the querier service endpoint, which is thanos-query.monitoring.svc.cluster.local:9090.
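If you provision data sources from files instead of the UI, the equivalent Grafana provisioning entry would look like this (a sketch; the data source name is arbitrary):

```yaml
# grafana datasource provisioning file (sketch)
apiVersion: 1
datasources:
  - name: Thanos-Querier
    type: prometheus          # the Querier speaks the Prometheus HTTP API
    access: proxy
    url: http://thanos-query.monitoring.svc.cluster.local:9090
    isDefault: true
```

With this single data source, every dashboard can filter by the `cluster` label instead of needing a Grafana per cluster.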
Conclusion:
Thanos is a powerful open-source project that extends Prometheus and addresses some of the key challenges faced in monitoring and observability, particularly in multi-cluster Kubernetes environments. By adding Thanos to a monitoring stack, organizations can achieve long-term data retention, improved scalability, and efficient global querying capabilities.