The practical examples below are a starting point for writing your own runbooks. Each one states its use case and gives the complete runbook definition.

1. Service Health Check

Use Case: Verify the health of a web service

name: Web Service Health Check
description: Check if web service is responding and healthy

variables:
  SERVICE_URL: "https://api.example.com/health"
  TIMEOUT_SECONDS: 10

tasks:
  - name: Check HTTP Status
    type: http
    method: GET
    url: ${SERVICE_URL}
    timeout: ${TIMEOUT_SECONDS}
    expect:
      status: 200
      body:
        status: "healthy"

  - name: Check Response Time
    type: metric
    query: 'http_request_duration_seconds{service="api"}'
    condition: 'value < 1'
    message: "Response time is within acceptable range"

alerts:
  - name: Service Unhealthy
    condition: tasks["Check HTTP Status"].status != "success"
    severity: critical
    message: "Service is not responding as expected"
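
To make the `expect` block above more concrete, here is a minimal Python sketch of how such a check could be evaluated. It assumes that `status` must match exactly and that `body` is a subset match against the JSON response, which is an assumption about the runbook engine rather than documented behavior. The function names `matches_expectation` and `_is_subset` are hypothetical.

```python
from typing import Any


def _is_subset(expected: Any, actual: Any) -> bool:
    """Check that every expected key/value also appears in the actual body."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return False
        return all(
            key in actual and _is_subset(value, actual[key])
            for key, value in expected.items()
        )
    return expected == actual


def matches_expectation(status_code: int, body: Any, expect: dict) -> bool:
    """Return True when the response satisfies every key present in `expect`."""
    # Assumption: the status code must match exactly.
    if "status" in expect and status_code != expect["status"]:
        return False
    # Assumption: extra fields in the response body are ignored.
    if "body" in expect and not _is_subset(expect["body"], body):
        return False
    return True


# Mirrors the `expect` block of the Check HTTP Status task.
expect = {"status": 200, "body": {"status": "healthy"}}

print(matches_expectation(200, {"status": "healthy", "uptime": 1234}, expect))  # True
print(matches_expectation(503, {"status": "healthy"}, expect))  # False
print(matches_expectation(200, {"status": "degraded"}, expect))  # False
```

Under this reading, a response with additional fields still passes, while a wrong status code or a different `status` value in the body fails the task and would fire the Service Unhealthy alert.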

2. Database Maintenance

Use Case: Perform routine database maintenance

name: Database Maintenance
description: Run routine maintenance tasks on the database

variables:
  DB_HOST: "db-prod"
  DB_NAME: "myapp_production"
  MAX_CONNECTIONS: 100

tasks:
  - name: Check Connection Count
    type: sql
    database: postgres
    query: |
      SELECT count(*) as active_connections 
      FROM pg_stat_activity 
      WHERE datname = '${DB_NAME}'
    condition: 'result.rows[0].active_connections < ${MAX_CONNECTIONS}'

  - name: Run Vacuum
    type: sql
    database: postgres
    query: VACUUM ANALYZE;
    timeout: 3600  # 1 hour

  - name: Reindex Database
    type: sql
    database: postgres
    query: REINDEX DATABASE ${DB_NAME};
    timeout: 7200  # 2 hours
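
The `${DB_NAME}` and `${MAX_CONNECTIONS}` placeholders above are filled in from the `variables` block before the task runs. The sketch below illustrates that idea with Python's standard `string.Template`, which happens to use the same `${NAME}` syntax. It is only an illustration, assuming plain textual substitution; the actual substitution rules of the runbook engine may differ. The helpers `render` and `connection_check_passes` are hypothetical.

```python
from string import Template

# Values taken from the runbook's `variables` block.
variables = {"DB_NAME": "myapp_production", "MAX_CONNECTIONS": 100}

query_template = Template(
    "SELECT count(*) AS active_connections\n"
    "FROM pg_stat_activity\n"
    "WHERE datname = '${DB_NAME}'"
)


def render(template: Template, values: dict) -> str:
    """Fill in placeholders; raises KeyError if a placeholder has no value."""
    return template.substitute({key: str(value) for key, value in values.items()})


def connection_check_passes(active_connections: int, values: dict) -> bool:
    """Mirror of the task condition: active_connections < MAX_CONNECTIONS."""
    return active_connections < int(values["MAX_CONNECTIONS"])


print(render(query_template, variables))
print(connection_check_passes(42, variables))   # True
print(connection_check_passes(100, variables))  # False
```

Because the substitution is textual, values that end up inside SQL should come from trusted sources only. A misspelled placeholder fails loudly here with a `KeyError` instead of being sent to the database unresolved.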

3. Kubernetes Deployment Rollout

Use Case: Deploy a new version of a service

name: Kubernetes Service Deployment
description: Deploy a new version of a service with health checks

inputs:
  - name: IMAGE_TAG
    type: string
    required: true
    description: Docker image tag to deploy
  - name: NAMESPACE
    type: string
    default: "default"

tasks:
  - name: Deploy New Version
    type: kubernetes
    action: apply
    manifest: |
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: myapp
        namespace: ${NAMESPACE}
      spec:
        replicas: 3
        selector:
          matchLabels:
            app: myapp
        template:
          metadata:
            labels:
              app: myapp
          spec:
            containers:
            - name: myapp
              image: myregistry.com/myapp:${IMAGE_TAG}
              ports:
              - containerPort: 8080

  - name: Verify Rollout
    type: kubernetes
    action: rollout-status
    resource: deployment/myapp
    namespace: ${NAMESPACE}
    timeout: 300  # 5 minutes
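
The `inputs` block above marks `IMAGE_TAG` as required and gives `NAMESPACE` a default. A rough Python sketch of how those declarations could be resolved at run time follows. It assumes that a supplied value wins, then the default, and that a missing required input is an error. The function `resolve_inputs` and the tag `v1.4.2` are hypothetical and used for illustration only.

```python
def resolve_inputs(declared: list[dict], supplied: dict) -> dict:
    """Combine caller-supplied values with declared defaults."""
    resolved = {}
    for spec in declared:
        name = spec["name"]
        if name in supplied:
            resolved[name] = supplied[name]
        elif "default" in spec:
            resolved[name] = spec["default"]
        elif spec.get("required", False):
            # Assumption: the run is rejected before any task executes.
            raise ValueError(f"missing required input: {name}")
    return resolved


# Mirrors the `inputs` block of the deployment runbook.
declared_inputs = [
    {"name": "IMAGE_TAG", "type": "string", "required": True},
    {"name": "NAMESPACE", "type": "string", "default": "default"},
]

inputs = resolve_inputs(declared_inputs, {"IMAGE_TAG": "v1.4.2"})
print(inputs)  # {'IMAGE_TAG': 'v1.4.2', 'NAMESPACE': 'default'}
print(f"myregistry.com/myapp:{inputs['IMAGE_TAG']}")  # myregistry.com/myapp:v1.4.2
```

Failing early on a missing `IMAGE_TAG` avoids applying a manifest whose image reference still contains an unresolved placeholder.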

4. Incident Response

Use Case: Automated response to high CPU usage

name: High CPU Response
description: Investigate and mitigate high CPU usage

trigger:
  condition: 'avg(cpu_usage{service="api"}) > 90'
  duration: 5m

tasks:
  # AFFECTED_HOST, JAVA_PID, and CURRENT_REPLICAS are not declared in this runbook;
  # they are assumed to come from the trigger context or a variables block.
  - name: Identify Top Processes
    type: command
    command: |
      ps -eo pid,ppid,cmd,%cpu --sort=-%cpu | head -n 10
    host: ${AFFECTED_HOST}

  - name: Capture Thread Dump
    type: command
    command: jstack -l ${JAVA_PID} > /tmp/thread-dump-$(date +%s).txt
    when: tasks["Identify Top Processes"].output contains "java"

  - name: Scale Out Service
    type: kubernetes
    action: scale
    namespace: production
    deployment: api
    replicas: ${CURRENT_REPLICAS * 2}
    when: tasks["Identify Top Processes"].output contains "api"

alerts:
  - name: High CPU Investigation Required
    severity: warning
    message: |
      High CPU usage detected. Investigation required.
      Top processes:
      ${tasks["Identify Top Processes"].output}
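
The `trigger` block above combines a threshold (`> 90`) with a `duration` of `5m`. One plausible reading is that the runbook fires only when the condition has held for the whole window. The Python sketch below illustrates that reading. It is an assumption about the trigger semantics, not a description of the engine, and it ignores gaps between samples. The function `condition_sustained` is hypothetical.

```python
from datetime import datetime, timedelta


def condition_sustained(samples, threshold, duration, now):
    """Return True if every sample in the last `duration` exceeds `threshold`.

    `samples` is a list of (timestamp, value) pairs sorted oldest first.
    """
    window_start = now - duration
    # Require data reaching back to the start of the window; otherwise
    # we cannot claim the condition held for the full duration.
    if not samples or samples[0][0] > window_start:
        return False
    in_window = [value for ts, value in samples if ts >= window_start]
    return bool(in_window) and all(value > threshold for value in in_window)


now = datetime(2024, 1, 1, 12, 0, 0)
five_minutes = timedelta(minutes=5)

# One sample per minute for the last six minutes, all at 95% CPU.
hot = [(now - timedelta(minutes=m), 95.0) for m in range(6, -1, -1)]
# Same series, but CPU dipped to 85% two minutes ago.
dipped = [(ts, 85.0 if ts == now - timedelta(minutes=2) else v) for ts, v in hot]
# Only the last three samples: not enough history to cover five minutes.
short = hot[-3:]

print(condition_sustained(hot, 90, five_minutes, now))     # True
print(condition_sustained(dipped, 90, five_minutes, now))  # False
print(condition_sustained(short, 90, five_minutes, now))   # False
```

Requiring the condition to hold for the full window keeps short CPU spikes from starting the runbook and scaling the service unnecessarily.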

Getting Started with These Examples

  1. Copy the YAML for the runbook you want to use
  2. Paste it into the DrDroid runbook editor
  3. Modify the variables and parameters as needed
  4. Save the runbook and test it in a non-production environment first

Next Steps