Runbook automation platform with deep observability integrations & Jupyter-notebooks style interface

DrDroidLab/PlayBooks

Docs | Sandbox | Community


tl;dr Enrich your Slack alerts with contextual observability data, helping on-call engineers investigate faster.

About PlayBooks

PlayBooks are executable notebooks that automate preliminary investigations in production for engineers. Watch demo video.

Using PlayBooks, a user can configure steps as data queries or actions within their observability stack. Here are the integrations we currently support:

  1. Run bash commands on a remote server;
  2. Fetch logs from AWS CloudWatch and Azure Log Analytics;
  3. Fetch metrics from any PromQL-compatible database, AWS CloudWatch, Datadog and New Relic;
  4. Query PostgreSQL, ClickHouse or any other JDBC-compatible database;
  5. Write a custom API call;
  6. Query events from EKS / GKE;
  7. Add an iFrame.
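
As a rough illustration of the idea above (a playbook as an ordered list of steps, each a data query or an action), here is a minimal Python sketch. The `Step` and `Playbook` classes, their fields, and the source labels are hypothetical names chosen for this example; they are not the PlayBooks data model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One playbook step: a named data query or action (hypothetical model)."""
    name: str
    source: str             # illustrative label, e.g. "cloudwatch_logs" or "postgresql"
    run: Callable[[], str]  # callable that returns the step's output as text


@dataclass
class Playbook:
    """An ordered collection of steps (hypothetical model)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def execute(self) -> Dict[str, str]:
        """Run every step in order and collect outputs keyed by step name."""
        results: Dict[str, str] = {}
        for step in self.steps:
            try:
                results[step.name] = step.run()
            except Exception as exc:
                # Record the failure and continue, so one broken query
                # does not hide the output of the remaining steps.
                results[step.name] = f"error: {exc}"
        return results


def failing_query() -> str:
    """Stand-in for a query whose backend is unreachable."""
    raise RuntimeError("connection refused")


playbook = Playbook(
    name="checkout-latency",
    steps=[
        Step("recent errors", "cloudwatch_logs", lambda: "3 errors in last 15m"),
        Step("slow queries", "postgresql", failing_query),
    ],
)
results = playbook.execute()
```

In this sketch a failed step is recorded next to the successful ones rather than aborting the run, which suits a preliminary investigation where partial context is still useful.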

Automating Playbook Executions

  1. Define a playbook with your enrichment logic
  2. Configure the playbook to trigger automatically when a Slack alert is received in a channel
  3. Receive an automated investigation summary in the Slack thread of the same alert
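
The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: `Trigger`, `find_trigger`, `build_thread_reply` and `handle_alert` are hypothetical names, keyword matching is an assumed matching rule, and the code builds the reply payload without calling Slack. The `thread_ts` key mirrors the field Slack uses to post a message into an existing thread.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Trigger:
    """Links a channel and an alert keyword to a playbook (hypothetical model)."""
    channel: str
    keyword: str
    playbook: Callable[[], Dict[str, str]]  # returns step name -> step output


def find_trigger(triggers: List[Trigger], channel: str, text: str) -> Optional[Trigger]:
    """Return the first trigger whose channel matches and whose keyword is in the alert text."""
    for trigger in triggers:
        if trigger.channel == channel and trigger.keyword.lower() in text.lower():
            return trigger
    return None


def build_thread_reply(results: Dict[str, str]) -> str:
    """Format playbook results as a plain-text investigation summary."""
    lines = ["Investigation summary:"]
    lines += [f"- {name}: {output}" for name, output in results.items()]
    return "\n".join(lines)


def handle_alert(triggers: List[Trigger], channel: str, text: str,
                 thread_ts: str) -> Optional[Dict[str, str]]:
    """Run the matching playbook and build a reply for the alert's thread."""
    trigger = find_trigger(triggers, channel, text)
    if trigger is None:
        return None  # no playbook configured for this alert
    return {
        "channel": channel,
        "thread_ts": thread_ts,  # reply in the thread of the original alert
        "text": build_thread_reply(trigger.playbook()),
    }


triggers = [Trigger("#alerts", "latency", lambda: {"recent errors": "3 errors"})]
reply = handle_alert(triggers, "#alerts", "High LATENCY on checkout", "1700000000.000100")
ignored = handle_alert(triggers, "#other", "latency spike", "1700000000.000200")
```

Keeping the matching, execution and formatting in separate functions makes each part testable without a Slack workspace.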

Playground:

  • The sandbox has a sample Playbook created. Check out how it works.
  • You can also check out the #demo-alerts channel in the community Slack workspace to see how automated replies are received for alerts.

Capabilities

  • Enrichment library: The tool currently supports fetching 50+ types of enrichment data from metric sources (Datadog, New Relic, Grafana+Prometheus, CloudWatch Metrics), logs and events (CloudWatch Logs, EKS) and databases (PostgreSQL, ClickHouse).

  • Past Executions: See the historical runs of a playbook and go back to an investigation from a specific point in time.

  • Continuous monitoring: Set up a continuous monitoring cron for specific use cases (e.g. post-deployment, peak hours, after a bug fix). Read the docs for the list of allowed configurations.

  • Interpretation Layer: Configure ML modules that analyse and interpret data from your investigation playbooks.
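
To make the continuous monitoring item above more concrete, the sketch below computes when a playbook would run if it were repeated at a fixed interval for a limited window, for example after a deployment. The function `monitoring_runs` and its parameters are assumptions for illustration; the configurations PlayBooks actually allows are listed in the docs.

```python
from datetime import datetime, timedelta
from typing import List


def monitoring_runs(start: datetime, every: timedelta, duration: timedelta) -> List[datetime]:
    """Return the run times from `start` to `start + duration`, spaced `every` apart.

    Hypothetical helper: it only computes the schedule and runs nothing.
    """
    if every <= timedelta(0):
        raise ValueError("interval must be positive")
    end = start + duration
    runs: List[datetime] = []
    current = start
    while current <= end:
        runs.append(current)
        current += every
    return runs


# Example: re-run a playbook every 15 minutes for one hour after a deployment.
deployed_at = datetime(2024, 1, 1, 12, 0)
runs = monitoring_runs(deployed_at, timedelta(minutes=15), timedelta(hours=1))
```

With these inputs the schedule contains five runs, at 12:00, 12:15, 12:30, 12:45 and 13:00.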

Coming Soon:

  • Templates: Common investigation and troubleshooting logic that can be used out of the box.
  • Conditionals: Create decision trees in your playbooks based on the evaluation of a playbook step.
  • More integrations: Find something missing? Request here.
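
Since Conditionals are not released yet, the following is only a guess at what a decision tree over playbook steps could look like, not a description of the planned feature. `Node` and `walk` are hypothetical names, and each check is a plain Python callable standing in for the evaluation of a playbook step.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    """A decision point: evaluate a check, then follow one of two branches (hypothetical)."""
    name: str
    check: Callable[[], bool]
    if_true: Optional["Node"] = None
    if_false: Optional["Node"] = None


def walk(node: Optional[Node]) -> List[str]:
    """Evaluate checks from the root down and return the path that was taken."""
    path: List[str] = []
    while node is not None:
        outcome = node.check()
        path.append(f"{node.name}={outcome}")
        node = node.if_true if outcome else node.if_false
    return path


# Example tree: only look at the database when the error rate is high.
db_slow = Node("db slow", check=lambda: False)
root = Node("error rate high", check=lambda: True, if_true=db_slow)
path = walk(root)
```

Here `walk` visits the database check only because the first check returned `True`; had it returned `False`, the walk would have ended after the root.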

Getting Started with alert enrichment

Step 1: Follow this guide to set up Playbooks using docker-compose or helm.

Step 2: Follow this Step-by-Step guide to do your first alert enrichment.

Have feedback or queries?

Ask questions in the Slack Community or write to us at founders [at] drdroid [dot] io

Want to contribute?

Read our contribution guidelines

Roadmap

Read our roadmap here