Alerts-Insights v0.1

by Siddarth Jain

Summary: Before December 13th, integrations with various tools were implemented, charts were published as images, and 1-click data fetching was enabled. On December 13th, a Streamlit app with aggregate insights at a channel level was published. On December 14th, the structure was changed to alert types, an option to remove specific alerts was added, the dashboard was consolidated into a single file, and integration with Google Chat was added.

Pre-13th December:

  • Integrations: Slack[SJ], NewRelic[MG], Datadog[MG], Sentry[MG/AA], Honeybadger[MG/AA], Squadcast[SJ/AA].
  • Published charts as images to users.[SJ]
  • 1-click data fetching from all integrations [MG]

[13th December, 2023]

  • Published a Streamlit app with aggregate insights at a channel level, with different data sources for every channel.

[14th December, 2023]

  • Changed structure to alert_type (infra, apm, error, container)
  • Added an option for users to remove a specific alert (useful when a single noisy alert is skewing the entire dataset)
  • The entire dashboard now runs from a single file.
  • Added integration with Google Chat.
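
The alert-removal option above can be pictured as a filter applied before aggregation. The following is a minimal sketch, not the dashboard's actual code: the record fields (`alert_type`, `title`), the sample alert titles, and the function name `count_by_alert_type` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical alert records; the field names are assumptions for illustration.
ALERTS = [
    {"alert_type": "infra", "title": "disk-usage-high"},
    {"alert_type": "infra", "title": "disk-usage-high"},
    {"alert_type": "infra", "title": "node-unreachable"},
    {"alert_type": "apm", "title": "latency-p99-high"},
    {"alert_type": "error", "title": "unhandled-exception"},
]


def count_by_alert_type(alerts, removed_titles=()):
    """Count alerts per alert_type, skipping any titles the user removed."""
    removed = set(removed_titles)
    return Counter(a["alert_type"] for a in alerts if a["title"] not in removed)


if __name__ == "__main__":
    # Without removal, the noisy "disk-usage-high" alert dominates the infra count.
    print(count_by_alert_type(ALERTS))
    # With the noisy alert removed, the remaining alerts are easier to compare.
    print(count_by_alert_type(ALERTS, removed_titles=["disk-usage-high"]))
```

Filtering before counting keeps the aggregate view consistent: every chart built from the filtered list reflects the same exclusion.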

Here are some key updates since our last update:

  • We made our platform open source -- accessible here!
  • We have launched an upgrade to dashboards such that multiple panels can be added within the same dashboard.
  • We have created a sandbox so users can try the Doctor Droid platform without needing to create an account.

Key Releases:

  • Setting up rules to track entities
  • Giving visibility of branches in a funnel

We have introduced 3 new triggers:

  • Detect a transaction stuck at a state
    • This is useful when you know that an object is not supposed to stay in a certain state for more than "x" duration.
  • Detect an event's occurrence
    • This is useful when you want to get alerted whenever a specific event occurs.
  • Detect an event occurring multiple times
    • This is useful when you want to get alerted on an event happening more than "n" times within a specific time window.
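
The three trigger conditions above can be sketched as simple predicates. This is only an illustration of the logic, not the platform's implementation: the event shape (`name`, `timestamp`), the sample event names, and the function names are assumptions.

```python
from datetime import datetime, timedelta


def is_stuck(entered_state_at, now, max_duration):
    """Trigger 1: the object has stayed in its current state longer than allowed."""
    return now - entered_state_at > max_duration


def event_occurred(events, name):
    """Trigger 2: at least one event with the given name has been seen."""
    return any(e["name"] == name for e in events)


def occurred_more_than(events, name, n, window, now):
    """Trigger 3: the event happened more than n times within the window ending at now."""
    cutoff = now - window
    count = sum(
        1 for e in events
        if e["name"] == name and cutoff <= e["timestamp"] <= now
    )
    return count > n


if __name__ == "__main__":
    now = datetime(2023, 1, 1, 12, 0)
    # Hypothetical events: three failures in the last 10 minutes, one older.
    events = [
        {"name": "payment_failed", "timestamp": now - timedelta(minutes=m)}
        for m in (1, 5, 9, 40)
    ]
    print(is_stuck(now - timedelta(minutes=45), now, timedelta(minutes=30)))
    print(event_occurred(events, "payment_failed"))
    print(occurred_more_than(events, "payment_failed", 2, timedelta(minutes=10), now))
```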

Here's a demo video of the triggers:

Branching in Funnels

  • Until now, we enabled you to monitor funnels and identify drops at different steps.
  • When you notice a drop in the funnel, you can now see which other paths the events took. This lets you investigate the issue further.
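
One way to picture branching is to count which event follows a given funnel step across all transactions. The sketch below assumes each transaction is available as an ordered list of event names; the event names and the function `next_steps_after` are hypothetical and not part of the platform's API.

```python
from collections import Counter


def next_steps_after(paths, step):
    """Count which event immediately follows `step` across all transaction paths."""
    branches = Counter()
    for path in paths:
        if step in path:
            i = path.index(step)
            # A path that ends at `step` is recorded as having no further event.
            nxt = path[i + 1] if i + 1 < len(path) else "(no further event)"
            branches[nxt] += 1
    return branches


if __name__ == "__main__":
    # Hypothetical ordered event sequences, one list per transaction.
    paths = [
        ["order_created", "payment_started", "payment_completed"],
        ["order_created", "payment_started", "payment_failed"],
        ["order_created", "payment_started"],
        ["order_created", "order_cancelled"],
    ]
    for event, count in next_steps_after(paths, "payment_started").items():
        print(event, count)
```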

To try the platform, sign up and try the playground now: https://drdroid.io/

Two features have been released since the previous update:

Workflows

With workflows, you can correlate metrics related to a specific workflow or user journey on the same screen -- bring performance metrics (from Datadog), product metrics (from custom events), error rates (from Sentry), and more onto the same page.

Integrations Released for Workflows:

Self-Hosting

You can now host Doctor Droid in your own environment using our Docker Compose setup.

  • This is in private beta. Because it is early and different environments may need different support, we are not rolling it out as a public beta yet. To try self-hosting, please reach out to us directly.

Additional feature updates:

  • Saving funnels and workflows in panels
  • Adding filters on funnels

We have added 2 key updates to our platform since the previous update:

  • You can now visualise funnels and filter them by any attribute.
    What does this mean? Let's say you're joining 5 unique events and want to filter them by an attribute of the first event. You can now do that.
  • You can now see the transactions which are dropped between consecutive steps.
    For example, if the transaction in your product is expected to go from A --> B --> C, and there's a 2% drop from A --> B, you can click on the connecting line and see the transactions for which the drop happened.
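
The drop between two consecutive steps, and the list of transactions behind it, can be sketched as follows. This assumes each transaction is represented as the list of steps it reached; the transaction IDs and the function `dropped_between` are hypothetical.

```python
def dropped_between(transactions, step_from, step_to):
    """Return the transactions that reached step_from but never reached step_to,
    together with the drop rate for that connection."""
    reached = [tid for tid, steps in transactions.items() if step_from in steps]
    dropped = [tid for tid in reached if step_to not in transactions[tid]]
    rate = len(dropped) / len(reached) if reached else 0.0
    return dropped, rate


if __name__ == "__main__":
    # Hypothetical transactions for an expected flow A --> B --> C.
    transactions = {
        "t1": ["A", "B", "C"],
        "t2": ["A"],
        "t3": ["A", "B"],
        "t4": ["A", "B", "C"],
    }
    dropped, rate = dropped_between(transactions, "A", "B")
    print(dropped, f"{rate:.0%}")
```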

Here's a short demo video walking you through this feature.

Additional releases:

  • Extensive search on transactions:
    [block:image]
    {
    "images": [
    {
    "image": [
    "https://files.readme.io/29015f8-Screenshot_2023-08-04_at_11.50.16_AM.png",
    "",
    ""
    ],
    "align": "center"
    }
    ]
    }
    [/block]
  • Extensive search on Entities (List View):
    Similar search capabilities on the entities page are also enabled now.
  • Auto-refresh on custom time-window:
    You can now set up a time window other than the past 30 minutes. For example, you can set the time window to "from 9pm yesterday until now" and still get auto-refreshes on the dashboard.
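
The "from 9pm yesterday until now" example differs from a rolling window in that its start is fixed while its end moves forward on every refresh. A minimal sketch of that computation, with a hypothetical function name and no claim about how the dashboard does it internally:

```python
from datetime import datetime, time, timedelta


def window_from_9pm_yesterday(now):
    """Return (start, end) where start is 21:00 on the previous day and end is now."""
    start = datetime.combine(now.date() - timedelta(days=1), time(21, 0))
    return start, now


if __name__ == "__main__":
    # On each auto-refresh only the end of the window advances.
    start, end = window_from_9pm_yesterday(datetime.now())
    print(start, end)
```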

We are releasing the flow view in Entities under beta. ☺️

  • You can expect it to work for your pre-existing entities.

A lot of experimentation is underway on this one!

We might be quiet while we put it through rigorous beta testing, but expect something in a month or two!

Hello folks, 2 key releases are coming along this time:

  • Capability to plot a metric based on a property of a "monitor".
    [block:image]
    {
    "images": [
    {
    "image": [
    "https://files.readme.io/1909222-monitor-based-metrics.avif",
    null,
    null
    ],
    "align": "center"
    }
    ]
    }
    [/block]
  • Capability to save the metric as a dashboard in the platform. These can now be accessed under the "Dashboards" section.

In this release, we have launched a Metrics Explorer -- a way to visualize the events sent to the Dr. Droid platform.

Additionally, for every metric, you can:

  • Group by any of the variables, or multiple variables
    When you group by multiple variables, the result becomes a non-time-series table for the selected time duration.
  • Aggregate by multiple types, including but not limited to SUM, Count, AVG.
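
Grouping and aggregation of this kind can be illustrated with a small sketch. The event fields (`service`, `status`, `latency_ms`) and the function `aggregate` are assumptions for illustration, and only the three aggregation types named above are covered.

```python
from collections import defaultdict


def aggregate(events, group_by, field, how):
    """Group events by one or more variables and aggregate `field` per group."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(event[name] for name in group_by)
        groups[key].append(event[field])
    functions = {
        "SUM": sum,
        "COUNT": len,
        "AVG": lambda values: sum(values) / len(values),
    }
    fn = functions[how]
    return {key: fn(values) for key, values in groups.items()}


if __name__ == "__main__":
    # Hypothetical events with two grouping variables and one numeric field.
    events = [
        {"service": "checkout", "status": "ok", "latency_ms": 100},
        {"service": "checkout", "status": "ok", "latency_ms": 300},
        {"service": "checkout", "status": "error", "latency_ms": 900},
        {"service": "search", "status": "ok", "latency_ms": 50},
    ]
    # Grouping by two variables yields one table row per (service, status) pair.
    for key, value in aggregate(events, ["service", "status"], "latency_ms", "AVG").items():
        print(key, value)
```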

This week, we are adding entities to the platform. An entity helps you define product goals and track intermediate steps proactively. You can think of an entity as a business trace or business object.

Creating an entity

While creating an entity, you can select any number of events that could be relevant in the context of your product workflow. For example, the sample below shows how an entity related to a payment has been created.

Monitoring an entity

Our platform automatically picks up any trigger rules created on any of the events within an entity. This means that as soon as you create an entity, you can already see its health, out of the box.

For instance, in the example below, two rules had already been created to monitor "wallet-api-failing" and "bank-server-unresponsive".

They are automatically correlated with the entity, hence giving a very simple way to debug or track an issue associated with a critical flow.
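
The correlation described here can be thought of as matching rules to an entity through the events they share. The sketch below is only an illustration under that assumption: the two rule names come from the example above, while the event names, the data shapes, and the function `rules_for_entity` are hypothetical.

```python
def rules_for_entity(entity_events, rules):
    """Return the names of rules defined on any event that belongs to the entity."""
    return [rule["name"] for rule in rules if rule["event"] in entity_events]


if __name__ == "__main__":
    # Hypothetical events selected when creating a payment entity.
    payment_entity_events = {"payment_started", "wallet_debited", "bank_callback_received"}
    # Rules are assumed to be defined per event; only the first two touch the entity.
    rules = [
        {"name": "wallet-api-failing", "event": "wallet_debited"},
        {"name": "bank-server-unresponsive", "event": "bank_callback_received"},
        {"name": "signup-spike", "event": "user_signed_up"},
    ]
    print(rules_for_entity(payment_entity_events, rules))
```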

While we do have SDKs and APIs, we realised that some of you are already generating events for critical checkpoints. To stream this data into Doctor Droid, we have launched connectors with the following sources:

  • SQS, AWS Kinesis: To forward events from your existing event buses
  • Segment: To import events from teams using Segment to route events
  • Cloudwatch logs: To import existing logs
  • Sentry: To import error messages and correlate them with other events on Doctor Droid

To read more about the implementation specifics, check out our docs or blog.

Additional releases:

  • Capability to filter events based on attribute properties.
  • Automated metric visualisations for individual events.
    A dashboard with auto-generated metrics for any event created in the system.