Monitoring Stack

Our monitoring stack provides deep visibility into the health and performance of all homelab services, from hardware metrics to application-level traces.

📊 Overview

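The stack follows the approach of Grafana's LGTM stack (Loki, Grafana, Tempo, Mimir), scaled down for a single-node homelab. Prometheus fills the metrics role that Mimir plays in LGTM, and Tempo is not among the components listed here.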

Component	Role
Prometheus	Metric collection and time-series database.
Grafana	Central visualization and alerting dashboard.
Loki	Log aggregation with label-based indexing.
Promtail	Log shipping agent for Docker containers.
Beyla	eBPF-based auto-instrumentation for HTTP/gRPC services.
JSON Exporter	Scrapes custom APIs (e.g., Speedtest Tracker).
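
To make the JSON Exporter row concrete, the sketch below shows the kind of translation it performs: selected fields of a JSON response become samples in the Prometheus text format. This is an illustration in Python, not the exporter's actual configuration. The payload shape, field names, and metric names are all assumptions; the real Speedtest Tracker API may differ.

```python
import json

# Hypothetical payload; the real Speedtest Tracker response may look different.
SAMPLE = '{"data": {"download": 940.5, "upload": 42.1, "ping": 11.3}}'

# Metric names (our own choice) mapped to the key path inside the JSON document.
METRIC_PATHS = {
    "speedtest_download_mbps": ("data", "download"),
    "speedtest_upload_mbps": ("data", "upload"),
    "speedtest_ping_ms": ("data", "ping"),
}


def extract(doc, path):
    """Walk a tuple of keys into nested dicts and return the leaf as a float."""
    for key in path:
        doc = doc[key]
    return float(doc)


def to_exposition(raw_json):
    """Render the selected JSON fields as Prometheus text-format gauge samples."""
    doc = json.loads(raw_json)
    lines = []
    for name, path in METRIC_PATHS.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {extract(doc, path)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_exposition(SAMPLE), end="")
```

Prometheus then scrapes this text like any other target, so a custom API needs no native metrics endpoint of its own.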

🛠️ Architecture

graph TD
    subgraph Services
        S1[Bridge]
        S2[Glance]
        S3[Jellyfin]
    end

    S1 -- Logs --> PT[Promtail]
    S2 -- Metrics --> B[Beyla]
    S3 -- Metrics --> B

    PT -- Push --> L[Loki]
    B -- Pull --> P[Prometheus]

    L -- Query --> G[Grafana]
    P -- Query --> G
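
The two edge labels in the graph reflect different transport models: Promtail pushes log batches to Loki, while Prometheus pulls (scrapes) metrics from Beyla. As a rough illustration of the push side, the sketch below builds the JSON body that Loki's `/loki/api/v1/push` endpoint accepts. The label names and the log line are made up for the example, and nothing is sent over the network.

```python
import json
import time


def build_push_payload(labels, lines, now_ns=None):
    """Build a JSON-serializable body for Loki's /loki/api/v1/push endpoint."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [timestamp in nanoseconds as a string, log line].
    # Timestamps are offset by 1 ns per line so entries keep their order.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


if __name__ == "__main__":
    # Hypothetical labels; Promtail derives its own from Docker metadata.
    payload = build_push_payload(
        {"job": "docker", "container": "bridge"},
        ["bridge started", "listening on :8080"],
    )
    print(json.dumps(payload, indent=2))
```

Only the label set in `stream` is indexed by Loki; the log lines themselves are stored as compressed chunks, which is why labels should stay low in cardinality.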

🚨 Alerting

Alerting is handled through centralized configuration files that are automatically provisioned into Grafana.

High CPU/Memory: Triggers when a container stays above 90% of its allocated CPU or memory for 5 minutes.
Service Down: Triggers when a service fails 2 consecutive health checks.
Notification Routing: Alerts are routed through Apprise to Telegram.
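
The two trigger conditions above differ in kind: one is a threshold that must hold for a duration, the other counts consecutive failures. The sketch below models both in plain Python to show the intended semantics. It is not how Grafana evaluates rules internally, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThreshold:
    """Fires when a value stays above `limit` for at least `hold_seconds`."""

    limit: float = 90.0          # percent of allocation
    hold_seconds: float = 300.0  # 5 minutes
    breached_since: Optional[float] = None

    def observe(self, timestamp: float, value: float) -> bool:
        if value <= self.limit:
            self.breached_since = None  # any recovery resets the timer
            return False
        if self.breached_since is None:
            self.breached_since = timestamp
        return timestamp - self.breached_since >= self.hold_seconds


@dataclass
class ConsecutiveFailures:
    """Fires once `needed` health checks in a row have failed."""

    needed: int = 2
    streak: int = 0

    def observe(self, healthy: bool) -> bool:
        self.streak = 0 if healthy else self.streak + 1
        return self.streak >= self.needed


if __name__ == "__main__":
    cpu = SustainedThreshold()
    for t, pct in [(0, 95), (150, 97), (300, 96)]:
        print(f"t={t}s cpu={pct}% firing={cpu.observe(t, pct)}")

    down = ConsecutiveFailures()
    for ok in [False, False]:
        print(f"healthy={ok} firing={down.observe(ok)}")
```

Requiring the condition to persist keeps a brief spike or a single missed poll from paging anyone, at the cost of a slightly delayed notification.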