After deploying the Grafana Agent in a Kubernetes cluster, you’ll most likely want to monitor it to ensure that no observability data gets lost. Grafana provides a comprehensive guide on configuring alerts for the agent, but I found that it doesn’t work in all cases. Namely, enabling the agent integration didn’t enable scraping the agent’s own metrics. This could be because I run the Grafana Agent in Kubernetes, which the guide may not be targeted at, or because my agent configuration deviates from Grafana’s recommended setup.

Available Grafana Agent Metrics

To check what metrics are available from the Grafana Agent, you first need to have the metrics exposed from the agent pods as a Service. This is also a prerequisite for the rest of this post. I will use the Service configuration from the Grafana Agent v0.25.1 example deployment as an example.
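For reference, a minimal Service along those lines might look like the following. This is a sketch, not the exact v0.25.1 manifest: the port name, port numbers, and the name label selector are assumptions that must match your actual agent deployment.

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    name: grafana-agent
  name: grafana-agent
  namespace: NAMESPACE
spec:
  ports:
    # The port name http-metrics is assumed; it must match what your
    # ServiceMonitor or scrape config selects on.
    - name: http-metrics
      port: 80
      targetPort: 80
  selector:
    # Assumes the agent pods carry the label name: grafana-agent.
    name: grafana-agent
```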

Once you have the Service running in your cluster, get the name of one of your Grafana agent pods:

kubectl get pods --namespace <GRAFANA AGENT NAMESPACE>

Take one of the pod names and port forward its port 80 to some available port on your local machine, e.g. 12345:

kubectl port-forward <GRAFANA AGENT POD NAME> --namespace <GRAFANA AGENT NAMESPACE> 12345:80

In another terminal instance, get the list of exposed metrics:

curl localhost:12345/metrics

Alongside the listed metrics, the agent will also report an up metric for the Grafana Agent service.
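For illustration, the output looks something like this. It is abbreviated and the values are made up; the exact metric set depends on your agent version and enabled integrations:

```
# HELP agent_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which the agent was built.
# TYPE agent_build_info gauge
agent_build_info{branch="HEAD",goversion="go1.18",version="v0.25.1"} 1
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 58
...
```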

Scraping Grafana Agent Metrics

To scrape the metrics of the Grafana Agent, a ServiceMonitor or a scrape config for the agent metrics is required. The ServiceMonitor will work if you’re using the Grafana Agent operator for your Grafana Agent deployment, and the scrape config is required if you’re running the agent without using the operator.


ServiceMonitor

Modify the namespace in the ServiceMonitor definition below to the namespace that your Grafana agent runs in. Then, deploy the ServiceMonitor onto your cluster with kubectl apply:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    name: grafana-agent
  name: grafana-agent
  namespace: NAMESPACE
spec:
  endpoints:
    - bearerTokenFile: /var/run/secrets/
      port: http-metrics
      path: /metrics
      tlsConfig:
        insecureSkipVerify: true
  selector:
    matchLabels:
      name: grafana-agent

Scrape config

Modify the namespace in the scrape config definition below to the namespace that your Grafana agent runs in. Then, append the scrape config to the end of the scrape_configs block in your Grafana Agent ConfigMap. Finally, apply the updated ConfigMap using kubectl apply and restart your Grafana Agent using kubectl rollout restart sts/grafana-agent --namespace NAMESPACE:

- bearer_token_file: /var/run/secrets/
  honor_labels: false
  job_name: serviceMonitor/grafana-agent/grafana-agent/0
  kubernetes_sd_configs:
    - namespaces:
        names:
          - NAMESPACE
      role: endpoints
  metrics_path: /metrics
  relabel_configs:
    - source_labels:
        - job
      target_label: __tmp_prometheus_job_name
    - action: keep
      regex: grafana-agent
      source_labels:
        - __meta_kubernetes_service_label_app
    - action: keep
      regex: http-metrics
      source_labels:
        - __meta_kubernetes_endpoint_port_name
    - regex: Node;(.*)
      replacement: $1
      separator: ;
      source_labels:
        - __meta_kubernetes_endpoint_address_target_kind
        - __meta_kubernetes_endpoint_address_target_name
      target_label: node
    - regex: Pod;(.*)
      replacement: $1
      separator: ;
      source_labels:
        - __meta_kubernetes_endpoint_address_target_kind
        - __meta_kubernetes_endpoint_address_target_name
      target_label: pod
    - source_labels:
        - __meta_kubernetes_namespace
      target_label: namespace
    - source_labels:
        - __meta_kubernetes_service_name
      target_label: service
    - source_labels:
        - __meta_kubernetes_pod_name
      target_label: pod
    - source_labels:
        - __meta_kubernetes_pod_container_name
      target_label: container
    - replacement: $1
      source_labels:
        - __meta_kubernetes_service_name
      target_label: job
    - replacement: http-metrics
      target_label: endpoint
  tls_config:
    insecure_skip_verify: true

Final Word on Cardinality

Your metrics should now be available in Grafana Cloud. Note that some metrics (e.g. prometheus_target_sync_length_seconds and promtail_log_entries_bytes_bucket) have high cardinality, so you may want to filter some metric labels to reduce your metrics usage. Grafana’s post on reducing Prometheus metrics usage can come in handy for this.
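As a sketch, one way to do this is to drop whole metric families at scrape time with a metric_relabel_configs block appended to the scrape config above. The regex here is an assumption; adjust it to the metrics you actually want to drop:

```yaml
# Appended inside a scrape config entry: drops the two high-cardinality
# metric families mentioned above before samples are sent to Grafana Cloud.
metric_relabel_configs:
  - action: drop
    regex: prometheus_target_sync_length_seconds.*|promtail_log_entries_bytes_bucket
    source_labels:
      - __name__
```

Dropping by __name__ removes every series of the matched families; if you only need to trim labels rather than whole metrics, use a labeldrop action instead.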