Agentic Debugging
Alert-based
Alert-based Debugging lets the AI start an investigation directly from an alert's context. When you click “Debug with AI” on an alert in the Alerts Inbox, the platform opens the Alert Details page, where the agent immediately initiates a structured investigation.
This mode requires no input from the user: the AI uses the alert's available metadata, tags, and context to propose likely causes and remediation paths.
🧠 What Happens When You Trigger Debug with AI?
- Clicking “Debug with AI” navigates to the Alert Details page.
- The AI agent:
  - Parses alert metadata: service, severity, tags, message, source, timestamps
  - Uses the alert as the entry point for the investigation
  - Generates a High-Level Debugging Plan that:
    - Lists Potential Root Causes
    - Suggests Debugging Steps
    - Recommends Remediation Actions
- If integrations are available, the agent attempts to verify each root cause.
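The flow above can be sketched roughly in code. Note that every name here (`debug_with_ai`, `RootCause`, `DebuggingPlan`, the field names) is an illustrative assumption, not the platform's actual API:

```python
# Illustrative sketch of the alert-based debugging flow.
# All class, function, and field names are hypothetical.
from dataclasses import dataclass


@dataclass
class RootCause:
    hypothesis: str
    status: str = "Pending"  # verified later only if integrations exist


@dataclass
class DebuggingPlan:
    root_causes: list
    debugging_steps: list
    remediations: list


def debug_with_ai(alert: dict, integrations_available: bool) -> DebuggingPlan:
    # 1. Parse alert metadata: service, severity, tags, message, source, timestamps.
    context = {key: alert.get(key) for key in
               ("service", "severity", "tags", "message", "source", "timestamp")}

    # 2. Use the alert as the entry point and generate a high-level plan.
    plan = DebuggingPlan(
        root_causes=[RootCause(f"Hypothesis derived from: {context['message']}")],
        debugging_steps=["Inspect recent logs", "Check resource limits"],
        remediations=["Apply a fix once a root cause is Detected"],
    )

    # 3. If integrations are available, attempt to verify each root cause.
    if integrations_available:
        for cause in plan.root_causes:
            cause.status = "Evaluating"
    return plan
```

Without integrations, every hypothesis stays `Pending`; with them, the agent moves each one into `Evaluating` and works toward a final status.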
🪄 Root Cause Evaluation Flow
Each root cause goes through a lifecycle of verification:
| Status | Description |
|---|---|
| Pending | The hypothesis is created but not yet verified (often due to no integrations) |
| Evaluating | The agent is currently verifying the root cause by analyzing relevant data |
| Unverified | The agent attempted verification but couldn't complete it (e.g., permission errors, timeout) |
| Ruled Out | The agent checked and confirmed that this cause is not valid |
| Detected | The hypothesis was verified and is considered the most likely root cause |
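One way to picture this lifecycle is as a small state machine. The transitions below are inferred from the table and are illustrative assumptions only:

```python
# Hypothetical state machine for the root-cause verification lifecycle.
from enum import Enum


class Status(Enum):
    PENDING = "Pending"        # hypothesis created, not yet verified
    EVALUATING = "Evaluating"  # agent is analyzing relevant data
    UNVERIFIED = "Unverified"  # verification attempted but could not complete
    RULED_OUT = "Ruled Out"    # cause confirmed not valid
    DETECTED = "Detected"      # hypothesis verified as most likely root cause


# Assumed transitions: a hypothesis starts Pending, moves to Evaluating once
# an integration is queried, and ends in one of three terminal states.
TRANSITIONS = {
    Status.PENDING: {Status.EVALUATING},
    Status.EVALUATING: {Status.UNVERIFIED, Status.RULED_OUT, Status.DETECTED},
    Status.UNVERIFIED: set(),
    Status.RULED_OUT: set(),
    Status.DETECTED: set(),
}


def advance(current: Status, target: Status) -> Status:
    if target not in TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target
```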
🚀 Key Benefits
- Faster Resolution: Starts debugging immediately with alert context
- Proactive Analysis: AI suggests potential issues before they escalate
- Structured Approach: Follows a methodical debugging process
- Context-Aware: Leverages existing alert metadata and integrations
🔍 Example
Alert:
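A hypothetical alert payload consistent with the generated plan below might look like this; every value is invented for illustration:

```python
# A hypothetical alert payload; all values are invented for illustration
# and mirror the OOMKilled scenario in the generated plan.
alert = {
    "service": "checkout-api",
    "severity": "critical",
    "message": "Pod restarting repeatedly: CrashLoopBackOff (OOMKilled)",
    "source": "kubernetes",
    "tags": ["k8s", "memory", "production"],
    "timestamp": "2024-01-01T00:00:00Z",
}
```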
🧭 Generated Plan (shown on Alert Details page):
Potential Root Cause A: Memory limits too low
- Status: ✅ Detected
- How verified: Pod logs show OOMKilled; memory usage spiked to 270MiB; limit set to 256MiB
- Suggested Remediation: Increase memory limit to 512MiB
Potential Root Cause B: Invalid container image
- Status: ❌ Ruled Out
- Reason: Image pulled and container ran for 30s before crashing — not a pull/startup issue
Potential Root Cause C: Secret volume missing
- Status: ⚠️ Unverified
- Reason: Permission denied when accessing secret volume logs
🎯 Why Use Alert-based Debugging?
- Faster Triage: No need to manually describe the problem — just click and let AI investigate
- Intelligent Plans: Structured and explainable investigations, not black-box suggestions
- Integration-Aware: Leverages real-time data where possible; gracefully degrades to guidance