This will enable you to [see grouping of alerts](https://aiops.drdroid.io/issues) in real time, with no effort beyond sending alerts via webhook or using the Slack App.
This will enable you to understand how issues, teams, routing, notifications and alerts work in the platform.
This will also show you how our platform, on a daily basis, gives you suggestions, updates your wiki, creates RCAs when needed and summarises your noisy alerts.
This will help you understand how DrDroid is designed for real-time noise reduction.
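As a rough illustration of what "sending alerts via webhook" can look like from your side, here is a minimal Python sketch that builds a JSON POST request for a single alert. The webhook URL and the payload fields (`title`, `service`, `severity`) are placeholders assumed for this example, not the platform's documented schema; use the URL and format shown in your own workspace.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the webhook URL from your own workspace.
WEBHOOK_URL = "https://example.invalid/alerts/webhook"


def build_alert_request(title, service, severity, url=WEBHOOK_URL):
    """Build (but do not send) a JSON POST request describing one alert.

    The field names below are assumptions made for illustration only.
    """
    payload = {"title": title, "service": service, "severity": severity}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_alert_request("High p99 latency", "checkout-api", "critical")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually deliver the alert, you would send the request, for example:
# urllib.request.urlopen(request, timeout=10)
```

The request is only constructed here, so the sketch runs without network access. In practice most alerting tools (Grafana, APM tools, etc.) can post to a webhook directly, so hand-written code like this is only needed for custom alert sources.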
Define a couple of services in the catalog, along with the data sources where the relevant data for each service lies: e.g. where its metrics / logs are (Grafana / APM tool), where its deployment / code is (GitHub / Jenkins), or where it is deployed (k8s, ECS, etc.).
This will let you give the agent context about a service, so that it can use that context to debug issues better after they are raised.
This will improve alert correlation and root cause analysis while investigating issues.
Once the service-to-team mapping is in place, it'll be easy to move from an alert to the respective engineer or document, and vice versa, in a couple of clicks.
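To make the idea of a service catalog entry more concrete, the following Python sketch models a service, its owning team and its data sources as plain data structures. All class names, field names and sample values are hypothetical and chosen for illustration; they do not describe how the platform stores this information internally.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    kind: str     # what the source holds, e.g. "metrics", "logs", "code", "runtime"
    tool: str     # where it lives, e.g. "Grafana", "GitHub", "Jenkins", "k8s"
    locator: str  # hypothetical pointer: a dashboard, repository or namespace


@dataclass
class Service:
    name: str
    team: str
    data_sources: List[DataSource] = field(default_factory=list)

    def sources_for(self, kind: str) -> List[DataSource]:
        """Return every data source of the given kind for this service."""
        return [s for s in self.data_sources if s.kind == kind]


def owning_team(catalog: List[Service], service_name: str) -> Optional[str]:
    """Map an alert's service name to the team that owns it, if known."""
    for service in catalog:
        if service.name == service_name:
            return service.team
    return None


catalog = [
    Service(
        name="checkout-api",
        team="payments",
        data_sources=[
            DataSource("metrics", "Grafana", "dashboards/checkout"),
            DataSource("code", "GitHub", "example-org/checkout-api"),
            DataSource("runtime", "k8s", "namespace/checkout"),
        ],
    ),
]

print(owning_team(catalog, "checkout-api"))
print([s.tool for s in catalog[0].sources_for("metrics")])
```

The lookup helper mirrors the "alert to respective engineer" path described above: once each service carries a team and its data sources, both an on-call engineer and an automated agent can resolve where to look from the service name alone.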
Define your first team in the Team catalog. You can define the escalation and notification policies here, along with the on-call rotations and the mapping to the respective projects in JIRA/Linear/etc.
Create a service and start filling in information about it. You might notice smart recommendations and auto-fill options on this page. These can be leveraged if you complete the next step before this one.
Add integrations:
Decide which integrations are most relevant for that service and add them from the integrations page. We also support integrations with self-hosted tools. Please refer to the relevant document for how to set these up.
Add integration as data source in service:
Within a service, you will be required to define data sources. After you have added integrations, you can define them in the UI against that service. This ensures that, when investigating the service, the agent knows exactly where to look for which data.
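The ordering in the steps above (add an integration first, then attach it to a service as a data source) can be sketched as a small in-memory model. This is only an illustration of the dependency between the steps; the `Workspace` class and its methods are hypothetical and are not part of any DrDroid API or SDK.

```python
class Workspace:
    """Toy model of the setup flow: integrations, services, data sources."""

    def __init__(self):
        self.integrations = set()
        self.services = {}

    def add_integration(self, name):
        # Step: add the tool from the integrations page.
        self.integrations.add(name)

    def create_service(self, name, team):
        # Step: create a service owned by a team.
        self.services[name] = {"team": team, "data_sources": {}}

    def attach_data_source(self, service, kind, integration):
        # Step: an integration can only be attached once it has been added.
        if integration not in self.integrations:
            raise ValueError(f"integration {integration!r} has not been added yet")
        self.services[service]["data_sources"][kind] = integration

    def where_to_look(self, service, kind):
        """Return the integration holding this kind of data, or None."""
        return self.services[service]["data_sources"].get(kind)


ws = Workspace()
ws.add_integration("Grafana")
ws.create_service("checkout-api", team="payments")
ws.attach_data_source("checkout-api", "metrics", "Grafana")

print(ws.where_to_look("checkout-api", "metrics"))
print(ws.where_to_look("checkout-api", "logs"))
```

Attaching a data source whose integration has not been added raises an error in this model, which is the same reason the guide suggests adding integrations before filling in the service details.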