In the 9.1 release, we've made significant upgrades to alerting to help SREs and operators cut through the noise, understand what's happening faster, and take meaningful action with less guesswork.
Here's what's new:
Improved Related Alert Grouping with Relevance Scoring & Reasoning
We've enhanced our related alert detection to go beyond surface-level correlations. Alerts are now grouped based on a relevance score that reflects the strength of their relationship across dimensions like:
- Shared entities or resources (e.g. same host, pod, or service)
- Temporal proximity (alerts firing within a suspiciously short window)
- Signal similarity (e.g. spikes in logs, metrics, and traces that point to the same failure mode)
More importantly, we now show the why. You'll see the reason an alert was grouped, whether it shares the same Kubernetes pod, has similar log patterns, or was triggered by the same upstream anomaly. This gives users confidence in the grouping logic and accelerates root cause analysis.
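To make the idea of a relevance score with attached reasons concrete, here is a minimal Python sketch. It is not Elastic's implementation: the `Alert` fields, the Jaccard overlap, the five-minute window, and the weights are all assumptions chosen for illustration. It only shows how the three dimensions above could be combined into one score while recording why two alerts were related.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Alert:
    # Hypothetical alert shape; field names are illustrative, not an Elastic schema.
    name: str
    started_at: datetime
    entities: set[str] = field(default_factory=set)  # e.g. {"pod:checkout-7f"}
    signals: set[str] = field(default_factory=set)   # e.g. {"log:OOMKilled"}


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def relevance(a: Alert, b: Alert, window: timedelta = timedelta(minutes=5)):
    """Return (score, reasons) describing how strongly alert b relates to alert a."""
    reasons = []

    entity_score = jaccard(a.entities, b.entities)
    if entity_score > 0:
        shared = ", ".join(sorted(a.entities & b.entities))
        reasons.append(f"shared entities: {shared}")

    # Linear decay: 1.0 when simultaneous, 0.0 at or beyond the window.
    gap = abs(a.started_at - b.started_at)
    time_score = max(0.0, 1.0 - gap / window)
    if time_score > 0:
        reasons.append(f"fired {int(gap.total_seconds())}s apart")

    signal_score = jaccard(a.signals, b.signals)
    if signal_score > 0:
        shared = ", ".join(sorted(a.signals & b.signals))
        reasons.append(f"similar signals: {shared}")

    # Assumed weights, picked only for this example.
    score = 0.5 * entity_score + 0.2 * time_score + 0.3 * signal_score
    return round(score, 2), reasons


latency = Alert(
    "High latency",
    datetime(2025, 1, 1, 12, 0, 0),
    {"service:checkout", "pod:checkout-7f"},
    {"metric:latency_spike"},
)
restarts = Alert(
    "Pod restarts",
    datetime(2025, 1, 1, 12, 1, 0),
    {"pod:checkout-7f"},
    {"log:OOMKilled"},
)

score, reasons = relevance(latency, restarts)
print(score, reasons)
```

Returning the reasons alongside the score is the point of the sketch: the same evidence that raises the score is what a responder would read to decide whether to trust the grouping.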
Link Dashboards to Alert Rules and Get Smart Suggestions
You can now link dashboards directly to your alert rules, giving responders an instant visual lens into the metrics or logs that matter most for that alert. No more scrambling to remember which dashboard to check — just click and go.
And we've made this smarter too: Elastic will now suggest relevant dashboards based on the alert's source, rule logic, or monitored entities, helping users land on the right view without needing to configure anything upfront.
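This post does not describe how the suggestions are computed, so the following Python sketch is purely hypothetical. It assumes each dashboard carries a set of tags and ranks dashboards by how many of those tags overlap with terms taken from the alert (its source, rule logic, or monitored entities). The `Dashboard` type, the tag format, and `suggest_dashboards` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dashboard:
    # Hypothetical shape: a title plus the entities and metrics the dashboard covers.
    title: str
    tags: frozenset[str]


def suggest_dashboards(
    alert_terms: set[str], dashboards: list[Dashboard], limit: int = 3
) -> list[str]:
    """Rank dashboards by the number of terms they share with the alert."""
    scored = [(len(alert_terms & d.tags), d.title) for d in dashboards]
    # Drop dashboards with no overlap; sort by overlap (descending), then title.
    matches = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: (-item[0], item[1]),
    )
    return [title for _, title in matches[:limit]]


alert_terms = {"service:checkout", "metric:latency", "host:web-1"}
dashboards = [
    Dashboard("Checkout service overview", frozenset({"service:checkout", "metric:latency"})),
    Dashboard("Host metrics", frozenset({"host:web-1"})),
    Dashboard("Billing pipeline", frozenset({"service:billing"})),
]

print(suggest_dashboards(alert_terms, dashboards))
```

A real system would likely weigh more than tag overlap, but even this simple ranking shows why no upfront configuration is needed: the alert already carries the terms used for matching.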
Investigation Guides Embedded into Alerts
Every alert can now carry an investigation guide: a set of predefined, context-aware instructions or next steps tailored to that alert. Think of it as a playbook that's embedded right where and when you need it.
Use it to:
- Document your team's runbooks and standard triage steps or link to existing runbooks
- Guide junior engineers or on-call responders through unfamiliar territory
- Automate the first few steps of root cause analysis
Why This Matters
These changes are all about reducing mean time to detect (MTTD) and mean time to resolve (MTTR). We get there by:
- Grouping alerts more intelligently (and transparently)
- Giving you the dashboards you need, when you need them
- Embedding action-oriented guides in every alert
Together, these bring you closer to a truly streamlined incident response workflow: no swivel-chairing, no guesswork, just clarity.
Additionally, look at some of our other articles on Elastic Observability Labs related to analysis: