# Cluster Monitoring
Cadence emits metrics from both the server and the client:
Follow this example to emit client-side metrics for the Golang client.
Follow this example to emit client-side metrics for the Java client. Make sure you upgrade to at least 3.0.0.
For production, follow this example of a Helm chart to emit server-side metrics, or follow the local environment example to emit them to Prometheus. All services need to expose an HTTP port to provide metrics, as in the configuration below:

    metrics:
      prometheus:
        timerType: "histogram"
        listenAddress: "0.0.0.0:8001"

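To check that a service is actually exposing metrics on its configured port, one option is to fetch the endpoint and list the metric names it returns. The following is a minimal sketch using only the Go standard library. The `/metrics` path in the usage note and the helper names `fetchMetrics` and `metricNames` are assumptions for illustration, not part of Cadence.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"sort"
	"strings"
)

// fetchMetrics downloads the raw Prometheus text payload from url.
func fetchMetrics(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// metricNames returns the sorted, de-duplicated metric names found in a
// Prometheus text-format payload. Comment lines (# HELP, # TYPE) are skipped.
func metricNames(exposition string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the label block or the first whitespace.
		end := strings.IndexAny(line, "{ \t")
		if end < 0 {
			end = len(line)
		}
		seen[line[:end]] = true
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}
```

For the configuration above, a call such as `fetchMetrics("http://localhost:8001/metrics")` followed by `metricNames` on the result should return a non-empty list once the service is running, assuming the endpoint is served under `/metrics`.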
The rest of the instructions use the local environment as an example.
To have a local server emit metrics to Prometheus, the easiest way is to use docker-compose to start a local Cadence.
Make sure to update prometheus_config.yml to add "host.docker.internal:9098" to the scrape list before starting docker-compose:

    global:
      scrape_interval: 5s
      external_labels:
        monitor: 'cadence-monitor'
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: # addresses to scrape
              - 'cadence:9090'
              - 'cadence:8000'
              - 'cadence:8001'
              - 'cadence:8002'
              - 'cadence:8003'
              - 'host.docker.internal:9098'

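A quick sanity check of the target entries before starting docker-compose can catch typos such as a missing port. The sketch below validates `host:port` strings with the Go standard library. The function name `invalidTargets` is hypothetical, and the check only covers syntax, not whether the address is reachable.

```go
package main

import (
	"net"
	"strconv"
)

// invalidTargets returns the entries that are not well-formed host:port
// pairs with a non-empty host and a port in the range 1-65535.
func invalidTargets(targets []string) []string {
	var bad []string
	for _, t := range targets {
		host, port, err := net.SplitHostPort(t)
		if err != nil || host == "" {
			bad = append(bad, t)
			continue
		}
		p, err := strconv.Atoi(port)
		if err != nil || p < 1 || p > 65535 {
			bad = append(bad, t)
		}
	}
	return bad
}
```

Passing the six targets from the scrape list above should return an empty result.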
Note that host.docker.internal may not work for some Docker versions.
After updating prometheus_config.yml as above, run `docker-compose up` to start the local Cadence.
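Once the stack is up, Prometheus reports the state of each scrape target through its HTTP API at `/api/v1/targets`. The sketch below parses that response and lists targets whose health is not `up`. It assumes Prometheus is reachable from the host (for example at `http://localhost:9090`), and the function name `unhealthyTargets` is hypothetical. Only the response fields used here are decoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// targetsResponse covers the subset of the /api/v1/targets response
// that this sketch needs.
type targetsResponse struct {
	Status string `json:"status"`
	Data   struct {
		ActiveTargets []struct {
			ScrapeURL string `json:"scrapeUrl"`
			Health    string `json:"health"`
			LastError string `json:"lastError"`
		} `json:"activeTargets"`
	} `json:"data"`
}

// unhealthyTargets returns the scrape URLs of all active targets whose
// health is anything other than "up".
func unhealthyTargets(body []byte) ([]string, error) {
	var resp targetsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode targets response: %w", err)
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("prometheus returned status %q", resp.Status)
	}
	var bad []string
	for _, t := range resp.Data.ActiveTargets {
		if t.Health != "up" {
			bad = append(bad, t.ScrapeURL)
		}
	}
	return bad, nil
}
```

A `host.docker.internal:9098` target is expected to appear as unhealthy until a worker is running and emitting client metrics on that port.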
Go to the sample repo, build the helloworld sample with `make helloworld`, run the worker with `./bin/helloworld -m worker`, and then start a workflow in another shell.
Go to the local Grafana and log in.
Configure Prometheus as a data source: use `http://host.docker.internal:9090` as the URL of Prometheus.
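The data source can also be created through Grafana's HTTP API (`POST /api/datasources`) instead of the UI. The sketch below only builds the JSON request body, so authentication and the Grafana address are left out. The type and function names are hypothetical, and `isDefault` is an optional choice made for this example.

```go
package main

import "encoding/json"

// datasource holds the subset of fields this sketch sends in the body of
// Grafana's "create data source" request.
type datasource struct {
	Name      string `json:"name"`
	Type      string `json:"type"`
	URL       string `json:"url"`
	Access    string `json:"access"`
	IsDefault bool   `json:"isDefault"`
}

// prometheusDatasource returns the JSON body for a Prometheus data source.
func prometheusDatasource(name, url string) ([]byte, error) {
	return json.Marshal(datasource{
		Name: name,
		Type: "prometheus",
		URL:  url,
		// "proxy" means the Grafana backend, not the browser, calls Prometheus,
		// which is why the URL must be resolvable from the Grafana container.
		Access:    "proxy",
		IsDefault: true,
	})
}
```

Calling `prometheusDatasource("Prometheus", "http://host.docker.internal:9090")` yields a body matching the manual configuration described above.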
Import the Grafana dashboard templates as JSON files.
Client side dashboard looks like this:
And server basic dashboard:
# Grafana dashboard templates
This package contains examples of Cadence dashboards with Prometheus.
`Cadence-Client` is the dashboard of client metrics, plus a few server-side metrics that belong to the client side but have to be emitted by the server (for example, workflow timeout).
`Cadence-Server-Basic` is the basic server dashboard to monitor and navigate the health and status of a Cadence cluster.
Apart from the basic server dashboard, it's recommended to set up dashboards for the different components of the Cadence server: Frontend, History, Matching, Worker, Persistence, Archival, etc. Contributions that enrich the existing templates or add new ones are always welcome!
# Periodic tests (Canary) for health check
It's recommended to run periodic tests every hour on your cluster, following this package, to make sure the cluster is healthy.
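To illustrate the idea of a periodic health check, here is a generic scheduling loop in Go. It is not the canary package itself: `runEvery`, `errUnhealthy`, and the `check` callback are hypothetical names, and what a check does (for example, starting a test workflow and waiting for it to complete) is left to the caller.

```go
package main

import (
	"errors"
	"time"
)

// errUnhealthy is a sentinel a check can return to signal a failed run.
var errUnhealthy = errors.New("canary check failed")

// runEvery calls check immediately and then once per interval until stop is
// closed. It returns how many checks ran and how many returned an error.
func runEvery(stop <-chan struct{}, interval time.Duration, check func() error) (runs, failures int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		runs++
		if err := check(); err != nil {
			failures++
		}
		select {
		case <-stop:
			return runs, failures
		case <-ticker.C:
			// If stop was closed at the same moment as the tick, honor stop first.
			select {
			case <-stop:
				return runs, failures
			default:
			}
		}
	}
}
```

With `interval` set to `time.Hour`, this matches the hourly cadence recommended above. In practice the failure count would feed an alert rather than just being returned.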