[TA/admin]Grafana: Setup - JOJ3

Table of Contents

set up grafana, loki, and prometheus on joj-dev
nginx reverse proxy for prometheus on joj-dev, and on the host machines outside the LXC containers
install and set up promtail and prometheus-node-exporter on the course machine

set up `grafana`, `loki`, and `prometheus` on `joj-dev`

Follow the Grafana installation guide for Debian at https://grafana.com/docs/grafana/latest/setup-grafana/installation/debian/, then install Loki and Prometheus with `apt install loki prometheus`.

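nginx reverse proxy for `prometheus` on `joj-dev`, and on the host machines outside the LXC containers

The go-judge instances run inside LXC containers on the host machines. Each host serves its containers' metrics under a path on port 9100, and nginx on `joj-dev` maps each of those paths to its own local port (9102 to 9104) so that Prometheus can scrape them as ordinary `host:port` targets. The configuration files below show both sides.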

tt@dev:~$ cat /etc/nginx/sites-enabled/metrics.conf
server {
    listen 9102;

    location /metrics {
        proxy_pass http://111.186.58.48:9100/lxc/engr151/go-judge/metrics;
    }
}

server {
    listen 9103;

    location /metrics {
        proxy_pass http://111.186.58.48:9100/lxc/ece477/go-judge/metrics;
    }
}

server {
    listen 9104;

    location /metrics {
        proxy_pass http://111.186.59.59:9100/lxc/ece482/go-judge/metrics;
    }
}
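
The three server blocks above follow one pattern: a local port on `joj-dev` forwards `/metrics` to a container-specific path on port 9100 of a host machine. As a rough illustration only (this generator is not part of the original setup, and `CONTAINERS`, `TEMPLATE`, and `render_metrics_conf` are hypothetical names), the same file could be rendered from a small table, which may help when another container is added:

```python
# Hypothetical helper: render one nginx server block per go-judge container.
# The (local port, host IP, course) rows mirror the metrics.conf shown above.
CONTAINERS = [
    (9102, "111.186.58.48", "engr151"),
    (9103, "111.186.58.48", "ece477"),
    (9104, "111.186.59.59", "ece482"),
]

# Doubled braces are literal braces in str.format templates.
TEMPLATE = """server {{
    listen {port};

    location /metrics {{
        proxy_pass http://{host}:9100/lxc/{course}/go-judge/metrics;
    }}
}}
"""


def render_metrics_conf(containers):
    """Return the nginx config text, one server block per container."""
    return "\n".join(
        TEMPLATE.format(port=port, host=host, course=course)
        for port, host, course in containers
    )


conf = render_metrics_conf(CONTAINERS)
print(conf)
```

Adding a row to the table yields one more server block; the matching Prometheus job and the host-side location still have to be added by hand.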

tt@dev:~$ cat /etc/prometheus/prometheus.yml
# Sample config for Prometheus.

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
      monitor: 'example'

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ['localhost:9093']

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ['localhost:9090']

  - job_name: focs
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['10.0.3.1:9100'] # from focs host machine
  - job_name: joj1
    static_configs:
      - targets: ['202.121.180.22:9100']
  - job_name: ve482
    static_configs:
      - targets: ['111.186.59.59:9100']
  - job_name: engr151
    static_configs:
      - targets: ['111.186.58.48:9100']
  - job_name: lxc_engr151_go-judge
    static_configs:
      - targets: ['127.0.0.1:9102']
  - job_name: lxc_ece477_go-judge
    static_configs:
      - targets: ['127.0.0.1:9103']
  - job_name: lxc_ece482_go-judge
    static_configs:
      - targets: ['127.0.0.1:9104']
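
As the comments in the file note, `metrics_path` defaults to `/metrics` and `scheme` defaults to `http`, so each `host:port` target expands to a full scrape URL. The sketch below only illustrates that expansion for the three `lxc_*` jobs; `scrape_url` and `LXC_JOBS` are hypothetical names, not part of Prometheus:

```python
def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Build the URL Prometheus requests for a static target, using the defaults."""
    return f"{scheme}://{target}{metrics_path}"


# Job name -> target, copied from the prometheus.yml above.
LXC_JOBS = {
    "lxc_engr151_go-judge": "127.0.0.1:9102",
    "lxc_ece477_go-judge": "127.0.0.1:9103",
    "lxc_ece482_go-judge": "127.0.0.1:9104",
}

for job, target in LXC_JOBS.items():
    print(f"{job}: {scrape_url(target)}")
```

Each of these URLs lands on one of the local nginx ports on `joj-dev`, which then forwards the request to the corresponding host machine.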

ta@engr151:~$ cat /etc/nginx/sites-enabled/metrics.conf
server {
    listen 9100;

    location /metrics {
        proxy_pass http://localhost:9101/metrics;
    }

    location /node-exporter/metrics {
        proxy_pass http://localhost:9101/metrics;
    }

    location /lxc/engr151/go-judge/metrics {
        proxy_pass http://10.0.3.151:5052/metrics;
    }

    location /lxc/ece477/go-judge/metrics {
        proxy_pass http://10.0.3.180:5052/metrics;
    }
}
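
On the host, a single nginx server on port 9100 routes by path: the node exporter appears to sit behind `localhost:9101`, and each container's go-judge metrics are reached at its `10.0.3.x` address. The following is a simplified model of how such prefix locations resolve, assuming plain prefix matching only (no regex locations, no query strings); `LOCATIONS` and `resolve` are hypothetical names used for illustration:

```python
# Location prefix -> proxy_pass URL, copied from the host metrics.conf above.
LOCATIONS = {
    "/metrics": "http://localhost:9101/metrics",
    "/node-exporter/metrics": "http://localhost:9101/metrics",
    "/lxc/engr151/go-judge/metrics": "http://10.0.3.151:5052/metrics",
    "/lxc/ece477/go-judge/metrics": "http://10.0.3.180:5052/metrics",
}


def resolve(path):
    """Return the upstream URL for a request path, or None if no location matches.

    The longest matching prefix wins. Because every proxy_pass here carries a
    URI, the matched prefix is replaced by that URI and the rest is appended.
    """
    matches = [prefix for prefix in LOCATIONS if path.startswith(prefix)]
    if not matches:
        return None
    prefix = max(matches, key=len)
    return LOCATIONS[prefix] + path[len(prefix):]


for request_path in ("/metrics", "/lxc/ece477/go-judge/metrics", "/unknown"):
    print(request_path, "->", resolve(request_path))
```

Combined with the `joj-dev` side, a scrape of `127.0.0.1:9102/metrics` is forwarded to `111.186.58.48:9100/lxc/engr151/go-judge/metrics` and ends at `10.0.3.151:5052/metrics` inside the container.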

install and set up `promtail` and `prometheus-node-exporter` on the course machine

Follow the Promtail installation guide at https://grafana.com/docs/loki/latest/send-data/promtail/installation/, then install the node exporter with `apt install prometheus-node-exporter`.

tt@engr151-24fa:~$ cat /etc/promtail/config.yml
# This minimal config scrape only single log file.
# Primarily used in rpm/deb packaging where promtail service can be started during system init process.
# And too much scraping during init process can overload the complete system.
# https://github.com/grafana/loki/issues/11398

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml
  sync_period: 10s

clients:
- url: http://focs.ji.sjtu.edu.cn:3100/loki/api/v1/push
  batchwait: 1s
  batchsize: 102400

scrape_configs:
- job_name: engr151-24fa
  static_configs:
  - targets:
      - localhost
    labels:
      job: engr151-24fa_joj3-logs
      __path__: /home/tt/.cache/joj3/**/*.ndjson

  pipeline_stages:
    - json:
        expressions:
          time: time
    - timestamp:
        source: time
        format: RFC3339Nano
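
The two pipeline stages read the `time` field from each JSON log line and use it as the entry's timestamp instead of the moment Promtail read the line. The sketch below mimics that behaviour outside Promtail so the expected log shape is easier to see. It is an approximation, not Promtail's implementation: the sample line is invented (only the `time` field is implied by the config), and `parse_rfc3339nano` and `extract_timestamp` are hypothetical helpers.

```python
import json
import re
from datetime import datetime

RFC3339_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})"
)


def parse_rfc3339nano(value):
    """Parse an RFC 3339 timestamp with up to nanosecond precision.

    Python datetimes hold microseconds, so extra fractional digits are dropped.
    """
    match = RFC3339_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, offset = match.groups()
    micros = (fraction or "")[:6].ljust(6, "0")
    if offset == "Z":
        offset = "+00:00"
    return datetime.strptime(f"{base}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z")


def extract_timestamp(line):
    """Mimic the json + timestamp stages: read `time`, then parse it."""
    entry = json.loads(line)
    return parse_rfc3339nano(entry["time"])


# Invented example line; real JOJ3 log entries may carry different fields.
sample = '{"time":"2024-09-01T08:30:00.123456789+08:00","level":"INFO","msg":"hypothetical entry"}'
ts = extract_timestamp(sample)
print(ts.isoformat())
```

A line without a parseable `time` field makes this sketch raise an error; how Promtail itself handles that case is not covered here.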
