🛠️ How It Works
DroidAgent collects evidence from your system and correlates it to identify the likely root cause of an issue and how it can be fixed. Here’s how it works:
Collects context: The agent gathers context from multiple places, including but not limited to:
- Data Sources: Give the platform access to your telemetry data sources and map them to your services, so it knows where to query metrics, logs, deployments, or any other information for each service.
- Alerts: Access to alerts lets the agent decide when to investigate.
- Runbooks: Provide an additional set of prompts or wiki pages that the agent can draw on during investigations. Runbooks are not required for standard or common scenarios; they are meant for situations where you already have a preferred way of investigating.
Correlates & iterates:
- After each piece of evidence is collected, the agent connects it back to the hypothesis it formed and evaluates whether it is getting closer to identifying the root cause or resolving the issue.
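The collect-then-evaluate loop described above can be sketched roughly as follows. This is only an illustration, not DroidAgent's actual implementation: the `Hypothesis` class, the `investigate` function, the confidence scoring, and the `threshold` and `max_steps` parameters are all hypothetical names and assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Hypothesis:
    """Hypothetical container for a suspected root cause."""
    description: str
    confidence: float = 0.0  # 0.0 = no support, 1.0 = fully confirmed
    evidence: List[str] = field(default_factory=list)


# A collector inspects the hypothesis and returns (finding, confidence_delta).
Collector = Callable[[Hypothesis], Tuple[str, float]]


def investigate(
    hypothesis: Hypothesis,
    collectors: List[Collector],
    threshold: float = 0.8,
    max_steps: int = 10,
) -> Hypothesis:
    """Collect evidence one step at a time and re-evaluate after each step."""
    for collect in collectors[:max_steps]:
        finding, delta = collect(hypothesis)
        hypothesis.evidence.append(finding)
        # Clamp so contradicting evidence cannot push confidence out of [0, 1].
        hypothesis.confidence = min(1.0, max(0.0, hypothesis.confidence + delta))
        if hypothesis.confidence >= threshold:
            break  # close enough to a root cause; stop collecting
    return hypothesis


# Usage with stand-in collectors (real ones would query logs, metrics, etc.).
steps = [
    lambda h: ("pod logs show the container was killed", 0.5),
    lambda h: ("memory usage reached the configured limit", 0.4),
    lambda h: ("recent deployment changed the image tag", 0.3),
]
result = investigate(Hypothesis("container is running out of memory"), steps)
```

In this sketch the loop stops after the second collector, because the accumulated confidence has already crossed the assumed threshold and the third step is never needed.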
Data Sources & Integrations
Our platform supports integration with 50+ tools for evidence collection and investigation, from your logs in ELK to dashboards in Grafana. You can find the full list here. Each data source is converted into an MCP server and made accessible to the agent for use during investigations.
📋 Example
For example, if a Kubernetes pod is stuck in CrashLoopBackOff and you have observability integrations configured:
- The agent may fetch pod logs, check deployment configurations, and diagnose the root cause (e.g., OOMKilled, bad image).
- Based on the pod’s name, it might also look for a service with a similar name. Once it finds one, it can decide to investigate that service’s data sources further (such as its associated dashboards and metrics, or its deployment history).
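One way the pod-to-service lookup in the last step could work is sketched below. It assumes the pod was created by a Deployment, so its name follows the usual `<deployment>-<replicaset-hash>-<suffix>` pattern, and it uses fuzzy string matching from Python's standard library as a stand-in for whatever matching the agent really does. The function names `pod_base_name` and `find_service` and the `cutoff` value are hypothetical.

```python
import re
from difflib import get_close_matches
from typing import List, Optional


def pod_base_name(pod_name: str) -> str:
    """Strip the trailing '-<replicaset-hash>-<suffix>' from a Deployment pod name."""
    return re.sub(r"-[a-z0-9]{5,10}-[a-z0-9]{5}$", "", pod_name)


def find_service(pod_name: str, services: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the configured service whose name best matches the pod, if any."""
    base = pod_base_name(pod_name)
    if base in services:
        return base  # exact match on the workload name
    # Otherwise fall back to the closest name above the similarity cutoff.
    matches = get_close_matches(base, services, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Usage: the service names here are made up for illustration.
known_services = ["checkout-api", "payments", "checkout-service"]
exact = find_service("checkout-api-7d9f8b6c5d-x2x9k", known_services)
fuzzy = find_service("checkout-7d9f8b6c5d-x2x9k", ["checkout-service", "payments"])
missing = find_service("zzz-7d9f8b6c5d-x2x9k", ["checkout-api"])
```

Once a service is found, the agent would know which mapped data sources (dashboards, metrics, deployment history) to query next; when nothing matches, the lookup returns `None` and the investigation can continue with pod-level evidence only.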