🧠 What Happens When You Trigger Debug with AI?
- Clicking “Debug with AI” navigates to the Alert Details page.
- The AI agent:
- Parses alert metadata: service, severity, tags, message, source, timestamps
- Uses the alert as the entry point for the investigation
- Generates a High-Level Debugging Plan:
- Lists Potential Root Causes
- Suggests Debugging Steps
- Recommends Remediation Actions
- If integrations are available, the agent attempts to verify each root cause.
🪄 Root Cause Evaluation Flow
Each root cause goes through a lifecycle of verification:Status | Description |
---|---|
Pending | The hypothesis is created but not yet verified (often due to no integrations) |
Evaluating | The agent is currently verifying the root cause by analyzing relevant data |
Unverified | Agent attempted verification but couldn’t complete (e.g., permission errors, timeout) |
Ruled Out | The agent checked and confirmed that this cause is not valid |
Detected | The hypothesis was verified and is considered the most likely root cause |
🚀 Key Benefits
- Faster Resolution: Starts debugging immediately with alert context
- Proactive Analysis: AI suggests potential issues before they escalate
- Structured Approach: Follows a methodical debugging process
- Context-Aware: Leverages existing alert metadata and integrations
🔍 Example
Alert:
🧭 Generated Plan (shown on Alert Details page):
Potential Root Cause A: Memory limits too low
- Status: ✅ Detected
- How verified: Pod logs show OOMKilled; memory usage spiked to 270MiB; limit set to 256MiB
- Suggested Remediation: Increase memory limit to 512MiB
Potential Root Cause B: Invalid container image
- Status: ❌ Ruled Out
- Reason: Image pulled and container ran for 30s before crashing — not a pull/startup issue
Potential Root Cause C: Secret volume missing
- Status: ⚠️ Unverified
- Reason: Permission denied when accessing secret volume logs
🎯 Why Use Alert-based Debugging?
- Faster Triage: No need to manually describe the problem — just click and let AI investigate
- Intelligent Plans: Structured and explainable investigations, not black-box suggestions
- Integration-Aware: Leverages real-time data where possible; gracefully degrades to guidance