proposal: runtime: surface silent panic-recovers #67243

Open
r-hang opened this issue May 7, 2024 · 0 comments

r-hang commented May 7, 2024

Proposal Details

We propose a feature that exposes whether Go programs are frequently recovering from panics without explicitly handling the recovered values.

Background

In very large Go codebases, we observe that application-crashing panics frequently cause production issues. In response, teams have created and proliferated a wide variety of solutions, ranging from frameworks such as code-generated recovery decorators to recovery-wrapped goroutine libraries paired with linters against plain goroutine usage. Unfortunately, these solutions are not standardized and are often combined without teams realizing it.

Over the years, we’ve experienced a fundamental tension with these “automagic” panic-recover patterns. They are a safety net that can prevent application crashes, but they also frequently allow serious underlying panics and problems to go unnoticed in production.
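To make the problem concrete, here is a minimal sketch of a recovery-wrapped goroutine helper that swallows a panic. The names `safeGo` and `runSilently` are hypothetical and are not taken from any real library; they only illustrate the pattern described above.

```go
package main

import (
	"fmt"
	"sync"
)

// safeGo is a hypothetical recovery-wrapped goroutine helper.
// It runs fn in a new goroutine and discards any panic fn raises.
func safeGo(wg *sync.WaitGroup, fn func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer func() {
			_ = recover() // the recovered value is dropped: nothing is logged or counted
		}()
		fn()
	}()
}

// runSilently reports whether the wrapped work ran to completion.
// The panic that aborts the work leaves no other trace.
func runSilently() (completed bool) {
	var wg sync.WaitGroup
	safeGo(&wg, func() {
		var m map[string]int
		m["k"] = 1 // panics: assignment to entry in nil map
		completed = true
	})
	wg.Wait()
	return completed
}

func main() {
	fmt.Println("work completed:", runSilently())
}
```

The process keeps running and prints `work completed: false`, but nothing tells an operator that a nil-map write happened or where.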

Solution Idea

Adding logging and metric emission to thousands of recover sites is often not feasible internally, and it is not easy to do for the third-party libraries we depend on. At Uber, we’re experimenting with changing Go’s built-in recover function to automatically print a standard message with configurable formatting, much as the GOTRACEBACK setting helps Go users debug the causes of panics. Printing the recovered value helps surface issues, and printing the stack trace helps users meaningfully debug the underlying panic.
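For illustration only, this is roughly what a single recover site has to do by hand today to get that information. It is a sketch in user code, not the runtime change we are experimenting with; `reportRecover`, `elementAt`, and `demoReport` are hypothetical names, and the message format is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"runtime/debug"
	"strings"
)

// reportRecover is a hypothetical helper meant to be used as `defer reportRecover(w)`.
// It must be the deferred function itself so that its recover call takes effect.
func reportRecover(w io.Writer) {
	if r := recover(); r != nil {
		// Print the recovered value, then the stack of the panicking goroutine.
		fmt.Fprintf(w, "recovered from panic: %v\n%s", r, debug.Stack())
	}
}

// elementAt returns xs[i], or the zero value if indexing panics.
func elementAt(w io.Writer, xs []int, i int) (v int) {
	defer reportRecover(w)
	return xs[i]
}

// demoReport triggers an out-of-range panic and returns what was written.
func demoReport() string {
	var sb strings.Builder
	elementAt(&sb, []int{1, 2, 3}, 5)
	return sb.String()
}

func main() {
	fmt.Fprint(os.Stderr, demoReport())
}
```

Because `debug.Stack` runs in the deferred call while the goroutine is still panicking, the trace includes the frame that panicked. Repeating this at every recover site, including sites inside dependencies, is the cost the proposal aims to avoid.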

Metrics exported by the Go runtime are a possible alternative that would give Go users a sense of overall application health, but they can’t easily surface where individual panics are happening.
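The limitation of a count-only signal can be sketched with an application-level counter. This is an assumption made for illustration: `recoveredPanics` and `countRecover` are hypothetical, and the sketch does not use or imply any existing runtime metric.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// recoveredPanics is a hypothetical process-wide counter of recovered panics.
var recoveredPanics atomic.Int64

// countRecover is meant to be used as `defer countRecover()`.
// It records that a panic was recovered, but not what it was or where it came from.
func countRecover() {
	if r := recover(); r != nil {
		recoveredPanics.Add(1)
	}
}

// lookup panics with an index-out-of-range error when key is absent.
func lookup(m map[string][]int, key string) (v int) {
	defer countRecover()
	return m[key][0]
}

// divide panics with an integer-divide-by-zero error when b is 0.
func divide(a, b int) (q int) {
	defer countRecover()
	return a / b
}

func main() {
	lookup(map[string][]int{}, "missing")
	divide(1, 0)
	fmt.Println("recovered panics:", recoveredPanics.Load())
}
```

Two unrelated bugs collapse into the single number 2, which signals that something is wrong without pointing at either call site.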

r-hang added the Proposal label May 7, 2024
ianlancetaylor changed the title from "proposal runtime: surface silent panic-recovers" to "proposal: runtime: surface silent panic-recovers" May 7, 2024
gopherbot added this to the Proposal milestone May 7, 2024