Doctor Droid is building a platform to help you translate your existing observability data into insights that are contextual to your product.
Businesses that depend on software functioning in real time to generate revenue mostly rely on engineers to debug any deviations in business metrics or customer-facing issues. This forces engineers to switch context frequently, which reduces the time they can spend on new features. Doctor Droid is solving this pain by adding an auto-triaging layer that connects business outcomes to events generated from within the code. In effect, it correlates application logs & metrics with business success.
Solving this pain point will make sure engineers do what they do best, i.e. build. Companies often hire tech support teams specifically to triage customer-facing issues by tracing them back to the data flowing through their applications. We are automating that.
We used to work at a fast-paced logistics company that specialises in food and grocery deliveries in under 30 minutes. Its operations were spread across hundreds of merchants, hundreds of cities & their operations teams, and thousands of delivery partners. The product was built from more than 50 micro-services, and there were many points of failure that could break in specific scenarios for different stakeholders. We found that triaging issues at a delivery partner, merchant or order level was extremely hard without a mental map connecting the questions to the answers. Rarely were we told “API x is down”. Most often, we would face questions like:
- Why is delivery partner A not able to upload their documents?
- Why are we receiving 14% fewer orders from New Delhi today than on the same day last week?
- How much should we improve our delivery partner allocation algorithm to increase orders by 5%?
Only senior engineers who had built those products could debug these issues, and that created a bottleneck in both tech support and new development. These questions can only be answered through complex analysis of data from different sources, captured through different lenses for the same product, and sometimes involving multiple teams. We solved it back then by writing custom scripts that proactively detected such issues before they were reported, so that the work didn’t have to be done on demand. We also hired junior engineers specifically to follow a process & check five different parts of our observability and monitoring stack to find the root cause of each issue.
There should be a simpler & faster way to do this.
And that’s what we are working on.
Rooting for a more productive future,