The Open Source DevOps Assistant - solve problems twice as fast with an AI teammate
Updated Jun 11, 2024 - Python
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Terraform Pull Request Automation
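⭐ [Open-source book] An in-depth guide to kernel networking, Kubernetes, service mesh, containers, and other cloud-native technologies. A practice-tested DevOps and SRE guide. If you find an error, please open an issue.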
A Prometheus exporter for pg-promise
A Prometheus exporter for node-postgres
A Prometheus exporter exposing metrics for KafkaJS
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
Kaytu's AI platform boosts cloud efficiency by analyzing historical usage and delivering intelligent recommendations—such as optimizing instance sizes—that maintain reliability. Pay for what you need, without compromising your apps.
Active monitoring software that detects failures before your customers do.
A collection of open source runbooks / playbooks
🐒 🔥 Datadog Failure Injection System for Kubernetes
SREnovate is aimed at providing a comprehensive platform for SRE enthusiasts to learn, collaborate, and work on projects related to Site Reliability Engineering.
Log data preprocessing in Python
Cloud-ops automation runbooks that are ready to use. Build your own automations using the hundreds of drag and drop actions included in the repository. Built on Jupyter Notebooks, our automation platform jumpstarts your SRE RunBook creation. 😎 published by the unSkript community.
A blazing fast tool for building data pipelines: read, process and output events. Our community: https://t.me/file_d_community
DevOps Roadmap for 2024, with learning resources
DevOps Tutorials
A curated list of Site Reliability and Production Engineering resources.