Hey r/selfhosted,

Like many of you, I use the standard Grafana/Prometheus/Loki stack to monitor my home services. It works great, but honestly, whenever something actually breaks (like Nextcloud eating all my RAM or Nginx throwing 502s), I hate having to manually align timestamps across different panels and dig through raw logs. I also constantly forget PromQL syntax.

I wanted to use an LLM to help me debug and write queries without piping my private server logs to a cloud SaaS. So, to scratch my own itch, I built an open-source plugin: **Grafana Lens**.

- GitHub: [https://github.com/awsome-o/grafana-lens](https://github.com/awsome-o/grafana-lens)

**How it works (and why OpenClaw):**

Instead of building a standalone app, I built this as a plugin on top of **OpenClaw** (an open-source AI agent engine). Thanks to OpenClaw, your AI agent is accessible from anywhere (via its chat UI/API). It connects directly to your existing Grafana setup and runs locally without spinning up any new databases.

**How it helps you debug and manage your stack:**

* **Out-of-the-box data access:** Because it hooks directly into your Grafana instance, **any data source you've already configured works out of the box**. The agent simply uses your existing Grafana API to read the data.
* **Automated troubleshooting:** When a container crashes, you can just ask the agent "What happened?". It runs a `grafana_investigate` tool that fetches metrics, logs, and traces for that time window in parallel, runs a Z-score anomaly check against your 7-day baseline, and gives you a concrete hypothesis of what broke.
* **Manage and query via chat:** You can ask things like, *"Check the memory usage of my postgres container over the last 3 hours"* or *"Alert me if the AdGuard DNS latency goes over 50ms."* It generates the queries dynamically (no more memorizing PromQL/LogQL) and can even provision native Grafana alert rules for you directly from the chat.
* **Monitor the agent itself via OTLP (no AI black box):** A huge problem with AI agents is that they can get stuck in loops and fry your CPU, so I built in hard guardrails. To give you full visibility, **the plugin uses OTLP specifically to push the OpenClaw agent's** ***own*** **telemetry natively into your Tempo**. You can see the exact waterfall trace (`Session -> LLM Call -> Tool Execution`) of what the AI is actually thinking and doing behind the scenes.

**How to run it:**

Assuming you have a Grafana LGTM stack and OpenClaw running, it's just a quick plugin install plus your Grafana Service Account Token:

```bash
openclaw plugins install openclaw-grafana-lens
export GRAFANA_URL=http://localhost:3000
export GRAFANA_SERVICE_ACCOUNT_TOKEN=glsa_xxxxxxxxxxxx
openclaw gateway restart
```

I built this primarily to make my own homelab maintenance less annoying. If anyone wants to test it out, I'd love to hear your feedback, or whether there are other management features you'd want the agent to handle!

*(Note: just a community-built open-source project, not affiliated with Grafana Labs.)*
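For anyone curious what the "Z-score anomaly check against your 7-day baseline" boils down to conceptually, here's a minimal sketch (not the plugin's actual code; the function name and the example numbers are made up for illustration):

```python
# Minimal sketch of a Z-score anomaly check: flag points in an incident
# window that deviate strongly from a baseline's mean/stddev.
from statistics import mean, stdev

def zscore_anomalies(baseline, window, threshold=3.0):
    """Return (index, zscore) pairs for points in `window` whose absolute
    Z-score against `baseline` exceeds `threshold`."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Perfectly flat baseline: any deviation at all would be "anomalous",
        # so bail out rather than divide by zero.
        return []
    return [(i, (v - mu) / sigma)
            for i, v in enumerate(window)
            if abs(v - mu) / sigma > threshold]

# Flat memory baseline around 512 MiB, then a spike in the incident window:
baseline = [510, 512, 511, 513, 512, 511, 510, 512]
window = [512, 511, 900, 513]
print(zscore_anomalies(baseline, window))  # flags the 900 MiB spike
```

The real tool presumably does this per-series over Prometheus range data, but the core "compare the incident window to a rolling baseline" idea is the same.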
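And to make the "uses your existing Grafana API to read the data" part concrete, here's roughly what a query through Grafana's data-source proxy looks like. This is a hedged sketch of my own, not the plugin's code: the `/api/ds/query` endpoint and service-account bearer auth are standard Grafana, but the data-source UID (`prometheus`) and the PromQL expression are illustrative assumptions.

```python
# Sketch: querying an existing Prometheus data source through Grafana's
# /api/ds/query endpoint with a service-account token, instead of talking
# to Prometheus directly. UID and expr below are placeholders.
import json
import urllib.request

def build_ds_query(expr, from_ms, to_ms, ds_uid="prometheus"):
    """Build the JSON body for Grafana's /api/ds/query endpoint."""
    return {
        "from": str(from_ms),
        "to": str(to_ms),
        "queries": [{
            "refId": "A",
            "datasource": {"uid": ds_uid},   # placeholder UID
            "expr": expr,                    # PromQL the agent generated
            "format": "time_series",
        }],
    }

def query_grafana(base_url, token, body):
    """POST the query body to Grafana and return the decoded response."""
    req = urllib.request.Request(
        f"{base_url}/api/ds/query",
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# e.g. "memory usage of my postgres container over the last 3 hours":
body = build_ds_query('container_memory_usage_bytes{name="postgres"}',
                      from_ms=0, to_ms=3 * 60 * 60 * 1000)
```

The nice side effect of going through Grafana rather than each backend directly is exactly the bullet above: one token, and every data source you've already wired up is reachable.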