Explore these practical examples to get started with creating your own runbooks. Each example includes a description, use case, and the complete runbook definition.
1. Service Health Check
Use Case: Verify the health of a web service
    name: Web Service Health Check
    description: Check if web service is responding and healthy
    variables:
      SERVICE_URL: "https://api.example.com/health"
      TIMEOUT_SECONDS: 10
    tasks:
      - name: Check HTTP Status
        type: http
        method: GET
        url: ${SERVICE_URL}
        timeout: ${TIMEOUT_SECONDS}
        expect:
          status: 200
          body:
            status: "healthy"
      - name: Check Response Time
        type: metric
        query: 'http_request_duration_seconds{service="api"}'
        condition: 'value < 1'
        message: "Response time is within acceptable range"
    alerts:
      - name: Service Unhealthy
        condition: tasks["Check HTTP Status"].status != "success"
        severity: critical
        message: "Service is not responding as expected"
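To make the `expect` block concrete, here is a minimal Python sketch of the logic the "Check HTTP Status" task appears to describe: a GET request that passes only when the status is 200 and the JSON body reports `"status": "healthy"`. The function names are hypothetical, and treating the body as a JSON object is an assumption; the runbook engine may compare bodies differently.

```python
import json
import urllib.request

SERVICE_URL = "https://api.example.com/health"  # taken from the example's variables
TIMEOUT_SECONDS = 10


def matches_expectation(status_code: int, body_text: str) -> bool:
    """Return True for HTTP 200 with a JSON object whose "status" is "healthy".

    Assumption: the `body` expectation is a field match on a JSON object.
    """
    if status_code != 200:
        return False
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return False
    return isinstance(body, dict) and body.get("status") == "healthy"


def check_service(url: str = SERVICE_URL, timeout: float = TIMEOUT_SECONDS) -> bool:
    """Perform the GET request and apply the expectation to the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            text = response.read().decode("utf-8", errors="replace")
            return matches_expectation(response.status, text)
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses;
        # any of them is treated as "unhealthy" here.
        return False
```

Keeping the comparison in `matches_expectation` separate from the network call makes the pass/fail rule easy to exercise without reaching the real service.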
2. Database Maintenance
Use Case: Perform routine database maintenance
    name: Database Maintenance
    description: Run routine maintenance tasks on the database
    variables:
      DB_HOST: "db-prod"
      DB_NAME: "myapp_production"
      MAX_CONNECTIONS: 100
    tasks:
      - name: Check Connection Count
        type: sql
        database: postgres
        query: |
          SELECT count(*) as active_connections
          FROM pg_stat_activity
          WHERE datname = '${DB_NAME}'
        condition: 'result.rows[0].active_connections < ${MAX_CONNECTIONS}'
      - name: Run Vacuum
        type: sql
        database: postgres
        query: VACUUM ANALYZE;
        timeout: 3600 # 1 hour
      - name: Reindex Database
        type: sql
        database: postgres
        query: REINDEX DATABASE ${DB_NAME};
        timeout: 7200 # 2 hours
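The `${DB_NAME}` and `${MAX_CONNECTIONS}` placeholders are filled in before each task runs. As a rough illustration only, Python's `string.Template` uses the same `${NAME}` syntax, so it can show what the rendered query and condition would look like. Whether the runbook engine substitutes values this way is an assumption.

```python
from string import Template

# Values copied from the example's `variables` block.
variables = {"DB_NAME": "myapp_production", "MAX_CONNECTIONS": 100}

query = Template(
    "SELECT count(*) as active_connections\n"
    "FROM pg_stat_activity\n"
    "WHERE datname = '${DB_NAME}'"
)
condition = Template("result.rows[0].active_connections < ${MAX_CONNECTIONS}")

# substitute() raises KeyError if a placeholder has no matching variable.
rendered_query = query.substitute(variables)
rendered_condition = condition.substitute(variables)

print(rendered_query)
print(rendered_condition)
```

Because the value is pasted into the SQL text as-is, variables used inside queries should only ever hold trusted values.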
3. Kubernetes Deployment Rollout
Use Case: Deploy a new version of a service
    name: Kubernetes Service Deployment
    description: Deploy a new version of a service with health checks
    inputs:
      - name: IMAGE_TAG
        type: string
        required: true
        description: Docker image tag to deploy
      - name: NAMESPACE
        type: string
        default: "default"
    tasks:
      - name: Deploy New Version
        type: kubernetes
        action: apply
        manifest: |
          apiVersion: apps/v1
          kind: Deployment
          metadata:
            name: myapp
            namespace: ${NAMESPACE}
          spec:
            replicas: 3
            selector:
              matchLabels:
                app: myapp
            template:
              metadata:
                labels:
                  app: myapp
              spec:
                containers:
                  - name: myapp
                    image: myregistry.com/myapp:${IMAGE_TAG}
                    ports:
                      - containerPort: 8080
      - name: Verify Rollout
        type: kubernetes
        action: rollout-status
        resource: deployment/myapp
        namespace: ${NAMESPACE}
        timeout: 300 # 5 minutes
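For readers who want to reproduce the "Verify Rollout" step by hand, the sketch below builds the `kubectl rollout status` command that most closely corresponds to it. Mapping the `rollout-status` action onto this command is an assumption about the runbook engine, and the helper names are hypothetical.

```python
import subprocess
from typing import List


def rollout_status_command(deployment: str, namespace: str, timeout_seconds: int) -> List[str]:
    """Build the kubectl command that waits for a deployment rollout to finish."""
    return [
        "kubectl", "rollout", "status", f"deployment/{deployment}",
        "--namespace", namespace,
        f"--timeout={timeout_seconds}s",
    ]


def verify_rollout(deployment: str, namespace: str = "default", timeout_seconds: int = 300) -> bool:
    """Run the command; a zero exit code means the rollout completed in time."""
    command = rollout_status_command(deployment, namespace, timeout_seconds)
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # kubectl is not installed or not on PATH.
        return False
    return result.returncode == 0
```

Calling `verify_rollout("myapp", "default", 300)` mirrors the values in the example: the `myapp` deployment, the default namespace, and a five-minute timeout.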
4. Incident Response
Use Case: Automated response to high CPU usage
    name: High CPU Response
    description: Investigate and mitigate high CPU usage
    trigger:
      condition: 'avg(cpu_usage{service="api"}) > 90'
      duration: 5m
    tasks:
      - name: Identify Top Processes
        type: command
        command: |
          ps -eo pid,ppid,cmd,%cpu --sort=-%cpu | head -n 10
        host: ${AFFECTED_HOST}
      - name: Capture Thread Dump
        type: command
        command: jstack -l ${JAVA_PID} > /tmp/thread-dump-$(date +%s).txt
        when: tasks["Identify Top Processes"].output contains "java"
      - name: Scale Out Service
        type: kubernetes
        action: scale
        namespace: production
        deployment: api
        replicas: ${CURRENT_REPLICAS * 2}
        when: tasks["Identify Top Processes"].output contains "api"
    alerts:
      - name: High CPU Investigation Required
        severity: warning
        message: |
          High CPU usage detected. Investigation required.
          Top processes:
          ${tasks["Identify Top Processes"].output}
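The two `when` clauses decide which follow-up tasks run based on the `ps` output. The Python sketch below parses output in the format produced by the `ps` command above and evaluates both conditions, assuming `contains` means a plain substring match on the whole output. The sample output, the helper names, and the way a Java PID is picked are all hypothetical; the example itself does not show where `${JAVA_PID}` comes from.

```python
from typing import List, NamedTuple, Optional


class ProcessRow(NamedTuple):
    pid: int
    ppid: int
    cmd: str
    cpu: float


# Hypothetical output of: ps -eo pid,ppid,cmd,%cpu --sort=-%cpu | head -n 10
SAMPLE_OUTPUT = """\
    PID    PPID CMD                         %CPU
   4211       1 java -jar /opt/api/api.jar  187.0
    912       1 /usr/bin/dockerd             3.2
"""


def parse_ps_output(ps_output: str) -> List[ProcessRow]:
    """Parse rows of pid, ppid, cmd, %cpu; cmd may contain spaces, %cpu is last."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 4:
            continue
        rows.append(ProcessRow(int(parts[0]), int(parts[1]), " ".join(parts[2:-1]), float(parts[-1])))
    return rows


def first_java_pid(rows: List[ProcessRow]) -> Optional[int]:
    """Return the PID of the first process whose executable name ends in 'java'."""
    for row in rows:
        if row.cmd.split()[0].endswith("java"):
            return row.pid
    return None


rows = parse_ps_output(SAMPLE_OUTPUT)
capture_thread_dump = "java" in SAMPLE_OUTPUT  # when: ... output contains "java"
scale_out = "api" in SAMPLE_OUTPUT             # when: ... output contains "api"
```

Under the substring assumption, `contains "api"` also matches paths such as `/opt/api/` or unrelated commands that merely include those letters, so a more specific match string may be safer in a real runbook.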
Getting Started with These Examples
- Copy the YAML for the runbook you want to use
- Paste it into the DrDroid runbook editor
- Modify the variables and parameters as needed
- Save the runbook and test it in a non-production environment first
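When modifying variables, it helps to confirm that every `${NAME}` placeholder in a runbook has a value. The Incident Response example, for instance, references `AFFECTED_HOST`, `JAVA_PID`, and `CURRENT_REPLICAS` without declaring them in the snippet. The following Python sketch lists such names; it is a rough text scan rather than a DrDroid feature, and treating `tasks` as a built-in reference is an assumption.

```python
import re
from typing import List, Set

# Matches the leading identifier of a ${...} placeholder.
# Shell substitutions such as $(date +%s) are not matched.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)")

# Assumption: `tasks` refers to earlier task results and needs no declaration.
BUILT_IN_NAMES = {"tasks"}


def undeclared_names(runbook_text: str, declared: Set[str]) -> List[str]:
    """Return placeholder names used in the text but not declared, sorted."""
    referenced = set(PLACEHOLDER.findall(runbook_text))
    return sorted(referenced - declared - BUILT_IN_NAMES)


# Lines taken from the Incident Response example.
snippet = '''
host: ${AFFECTED_HOST}
command: jstack -l ${JAVA_PID} > /tmp/thread-dump-$(date +%s).txt
replicas: ${CURRENT_REPLICAS * 2}
message: ${tasks["Identify Top Processes"].output}
'''

missing = undeclared_names(snippet, declared=set())
print(missing)
```

Pass the names from the runbook's `variables` or `inputs` section as `declared`; anything left in the result needs a value from somewhere else, such as the trigger context.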
Next Steps