Chaos Minimisation Framework (CMF)

[WIP document]

If you're an engineer who has worked on building LLM-based applications, you know that the biggest challenge is making them reliable & consistent.

At Dr. Droid, we are in the business of enabling engineers to get a first-responder agent for every issue, so that automation can help save time & debug production issues faster.

This means:

  • If an engineer asks "How's the health of service X? or when did service X get last deployed?", the answers needs to be (a) accurate (b) data backed (c) insightful (d) contextual.
  • If an engineer asks "Run an analysis of why my user isn't able to do action A1", the agent needs to understand what is a "user" in context of the company, what does A1 mean and how it can check it.

LLMs come with their own set of challenges, including but not limited to hallucinations, token limits, non-trivial data requirements for high-quality fine-tuning, prompt injection, misuse, and a lack of expertise in getting value from large volumes of structured data.


This is where our CMF comes into the picture. At Dr. Droid, we've solved many of these challenges by building an agentic framework designed around minimising chaos. Here's a quick glimpse of how the framework tackles these problems:

  1. Runbook automation framework: Under the hood, Dr. Droid deeply leverages the capabilities of Playbooks -- a runbook automation framework (see the playbook sketch after this list).
  2. Contextual data access (see the context sketch after this list):
    1. Tools
    2. Memory
    3. Catalog
    4. SOPs
  3. Request & Response Guardrails:
    1. Isolated AI & backend services – The AI agent can request data but cannot execute actions directly. All execution requests pass through a backend review for correctness & safety.