Monitoring Stack Deployment
Overview
This guide explains how to deploy a monitoring stack on Docker Swarm from a Docker Compose file. The stack includes Prometheus, Grafana, cAdvisor, and Node Exporter.
Docker Compose File Breakdown
version: "3.9"
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.37.5
    deploy:
      mode: global
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    ports:
      - "8080:8080"
    networks:
      - ovencrypt
  node_exporter:
    image: quay.io/prometheus/node-exporter:v1.5.0
    command: "--path.rootfs=/host"
    pid: host
    restart: unless-stopped
    networks:
      - ovencrypt
    volumes:
      - /:/host:ro,rslave
  grafana:
    image: grafana/grafana-oss:latest
    networks:
      - ovencrypt
    volumes:
      - grafana-data:/var/lib/grafana
    ports:
      - 4000:3000
    deploy:
      replicas: 1
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.grafana.rule=Host(`grafana.domain.com`)"
        - "traefik.http.services.grafana.loadbalancer.server.port=3000"
        - "traefik.docker.network=ovencrypt"
        - "traefik.http.routers.grafana.middlewares=traefikae-auth"
  prometheus:
    image: prom/prometheus:latest
    networks:
      - ovencrypt
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command: "--config.file=/etc/prometheus/prometheus.yml"
    ports:
      - 9090:9090
    deploy:
      placement:
        constraints:
          - node.role == manager
      replicas: 1
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.prometheus.rule=Host(`prometheus.domain.com`)"
        - "traefik.http.services.prometheus.loadbalancer.server.port=9090"
        - "traefik.docker.network=ovencrypt"
        - "traefik.http.routers.prometheus.middlewares=traefikae-auth"
volumes:
  grafana-data:
  prometheus-data:
networks:
  host:
    external: true
  ovencrypt:
    external: true
    attachable: true
Prometheus Configuration File
global:
  scrape_interval: 15s # By default, scrape targets every 15 seconds.
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'traefik_api'
    static_configs:
      - targets: ['traefikmetrics.domain.com']
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['node_exporter:9100']
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'kvm4' # Database server
    static_configs:
      - targets: ['kvm4_ip:9100']
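Before deploying, it can help to list every target the configuration will scrape, so that placeholders such as kvm4_ip are not left in by accident. The following is a minimal sketch, assuming the configuration is saved as prometheus.yml and uses the inline targets: [...] form shown above. list_targets is a hypothetical helper written for this guide, not part of Prometheus.

```sh
#!/bin/sh
# list_targets FILE: print one scrape target per line.
# Only handles the inline form "- targets: ['a', 'b']" used in this guide.
list_targets() {
  grep -E '^[[:space:]]*-[[:space:]]*targets:' "$1" \
    | sed -E 's/.*\[(.*)\].*/\1/' \
    | tr ',' '\n' \
    | sed -E "s/^[[:space:]]*['\"]?//; s/['\"]?[[:space:]]*\$//"
}

# Usage (after sourcing this file):
#   list_targets prometheus.yml
```

Any line of output that still contains a placeholder host should be replaced with a real address before the stack is deployed.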
Deployment Instructions
Prerequisites
- Ensure Docker and Docker Compose are installed.
- Initialize Docker Swarm:
  docker swarm init
- Ensure the ovencrypt network is created and attachable:
  docker network create --driver=overlay --attachable ovencrypt
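Running the prerequisite commands a second time fails if the node is already in a swarm or the network already exists. The sketch below wraps them so they can be re-run safely. ensure_swarm and ensure_network are hypothetical helper names introduced here, and the sketch assumes the docker CLI is on the PATH and the current user may talk to the Docker daemon.

```sh
#!/bin/sh
# Initialize Swarm only if this node is not already an active swarm member.
ensure_swarm() {
  state=$(docker info --format '{{.Swarm.LocalNodeState}}')
  if [ "$state" != "active" ]; then
    docker swarm init
  fi
}

# Create the attachable overlay network only if it does not exist yet.
ensure_network() {
  name=$1
  if ! docker network inspect "$name" >/dev/null 2>&1; then
    docker network create --driver=overlay --attachable "$name"
  fi
}

# Usage (after sourcing this file):
#   ensure_swarm && ensure_network ovencrypt
```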
Set Up Environment Variables
- Ensure the domain names in the Traefik labels match your actual domain names.
DNS Configuration
- Ensure the domains (domain.com, grafana.domain.com, prometheus.domain.com, traefikmetrics.domain.com) have A records pointing to the IP address of the server where the services will be deployed.
Install Node Exporter on Database Server (kvm4)
- SSH into the kvm4 server.
- Download and run the Node Exporter:
  wget https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-amd64.tar.gz
  tar xvfz node_exporter-1.2.2.linux-amd64.tar.gz
  cd node_exporter-1.2.2.linux-amd64
  ./node_exporter
- Create a systemd service for the Node Exporter to ensure it runs on startup:
  sudo vim /etc/systemd/system/node_exporter.service
  Add the following content:
  [Unit]
  Description=Node Exporter
  Wants=network-online.target
  After=network-online.target

  [Service]
  User=node_exporter
  Group=node_exporter
  Type=simple
  ExecStart=/usr/local/bin/node_exporter

  [Install]
  WantedBy=multi-user.target
  Reload systemd and start the service:
  sudo systemctl daemon-reload
  sudo systemctl start node_exporter
  sudo systemctl enable node_exporter
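The unit file refers to a node_exporter user and group and to a binary at /usr/local/bin/node_exporter, but the steps above neither create the account nor copy the binary there. The following sketch shows one way to prepare both before starting the service. It assumes a Linux host with useradd and install available, and that it is run from the extracted node_exporter-1.2.2.linux-amd64 directory. prepare_node_exporter is a hypothetical helper name.

```sh
#!/bin/sh
# Prepare the account and binary path that node_exporter.service refers to.
# Assumes the current directory contains the extracted node_exporter binary.
prepare_node_exporter() {
  # System account with its own group, no home directory, and no login shell.
  if ! id node_exporter >/dev/null 2>&1; then
    sudo useradd --system --user-group --no-create-home --shell /bin/false node_exporter
  fi
  # Install the binary where ExecStart expects it.
  sudo install -m 0755 node_exporter /usr/local/bin/node_exporter
}

# Usage (after sourcing this file, from the extracted directory):
#   prepare_node_exporter
```

Once the service is running, sudo systemctl status node_exporter should report it as active.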
Deploy the Stack
- Navigate to the directory containing the docker-compose.yml file.
- Run the following command to deploy the stack:
  docker stack deploy -c docker-compose.yml monitoring_stack
Verify Deployment
- Check the status of the services using:
  docker stack services monitoring_stack
- Verify that all services are running and properly configured.
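The status check can be scripted by comparing the running and desired replica counts that docker stack services reports. This is a minimal sketch: unconverged_services is a hypothetical helper that reads "NAME REPLICAS" lines on standard input, and the --format template in the usage comment assumes the Name and Replicas placeholders of docker stack services.

```sh
#!/bin/sh
# Read "NAME REPLICAS" lines (e.g. "monitoring_stack_grafana 1/1") on stdin and
# print the services whose running count differs from the desired count.
unconverged_services() {
  while read -r name replicas rest; do
    running=${replicas%%/*}
    desired=${replicas##*/}
    [ "$running" = "$desired" ] || echo "$name $replicas"
  done
}

# Usage (after sourcing this file):
#   docker stack services monitoring_stack --format '{{.Name}} {{.Replicas}}' \
#     | unconverged_services
```

Empty output means every service has reached its desired replica count. It does not confirm that the services are configured correctly.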
Access the Services
- Grafana should be accessible at http://grafana.domain.com.
- Prometheus should be accessible at http://prometheus.domain.com.
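Reachability can also be checked from the command line. The sketch below prints the HTTP status code for each URL it is given. http_status and check_endpoints are hypothetical helper names, and the sketch assumes curl is installed. Grafana exposes a health endpoint at /api/health and Prometheus at /-/healthy. Because both routers use the traefikae-auth middleware, a 401 or a redirect rather than 200 may simply mean that authentication is being enforced.

```sh
#!/bin/sh
# http_status URL: print the HTTP status code returned for URL.
http_status() {
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1"
}

# Print each URL followed by its status code.
check_endpoints() {
  for url in "$@"; do
    echo "$url $(http_status "$url")"
  done
}

# Usage (after sourcing this file):
#   check_endpoints http://grafana.domain.com/api/health \
#                   http://prometheus.domain.com/-/healthy
```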
Logs and Debugging
- To view logs for a specific service, use:
  docker service logs monitoring_stack_<service_name>
- Replace <service_name> with the name of the service as defined in the Compose file (cadvisor, node_exporter, grafana, or prometheus).