-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce pyroscope integration for continuous profiling on demand #4254
Conversation
I personally don't like the idea of attaching Also, it looks like the pyroscope server sometimes fails to shutdown if I call any API (say |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would help if you could also include an instruction of how to enable profiler and see the actual flamegraph
When I try to run with config in settings, i get this:
Which doesn't look too good. And it is also not related to this: grafana/pyroscope-rs#165, which is also doesn't look good. |
2a1627d
to
d194f0f
Compare
31c0d02
to
112e4d1
Compare
TLS tests seem to be failing because we already use See this. Is this debug actually doing something? Because it isn't a part of the current Settings struct. |
IIRC we used it before, but deprecated it some time ago. That would mean you can safely remove it. One problem though is that users might still have |
Sadly, this still crashes on my machine when enabling pyroscope through 2024-05-27T08:14:57.765853Z INFO actix_web::middleware::logger: 127.0.0.1 "PATCH /debug HTTP/1.1" 200 53 "http://localhost:6333/dashboard" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0" 2.632543
2024-05-27T08:15:00.390173Z ERROR qdrant::startup: Panic backtrace:
0: qdrant::startup::setup_panic_hook::{{closure}}
at ./src/startup.rs:19:25
1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2034:9
2: std::panicking::rust_panic_with_hook
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:783:13
3: std::panicking::begin_panic_handler::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:649:13
4: std::sys_common::backtrace::__rust_end_short_backtrace
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:171:18
5: rust_begin_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:645:5
6: core::panicking::panic_nounwind_fmt::runtime
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panicking.rs:110:18
7: core::panicking::panic_nounwind_fmt
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panicking.rs:123:9
8: core::panicking::panic_nounwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panicking.rs:156:5
9: core::slice::raw::from_raw_parts::precondition_check
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/intrinsics.rs:2799:21
10: core::slice::raw::from_raw_parts
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/slice/raw.rs:98:9
11: <pprof::collector::TempFdArrayIterator<T> as core::iter::traits::iterator::Iterator>::next
at /home/timvisee/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pprof-0.12.1/src/collector.rs:225:26
12: core::iter::traits::iterator::Iterator::fold
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/iter/traits/iterator.rs:2586:29
13: <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::fold
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/iter/adapters/chain.rs:93:19
14: core::iter::traits::iterator::Iterator::for_each
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/iter/traits/iterator.rs:817:9
15: pprof::report::ReportBuilder::build
at /home/timvisee/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pprof-0.12.1/src/report.rs:110:17
16: pyroscope_pprofrs::Pprof::dump_report
at /home/timvisee/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyroscope_pprofrs-0.2.7/src/lib.rs:202:22
17: <pyroscope_pprofrs::Pprof as pyroscope::backend::backend::Backend>::report
at /home/timvisee/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyroscope_pprofrs-0.2.7/src/lib.rs:159:9
18: pyroscope::pyroscope::PyroscopeAgent<pyroscope::pyroscope::PyroscopeAgentReady>::start::{{closure}}
at /home/timvisee/.cargo/registry/src/index.crates.io-6f17d22bba15001f/pyroscope-0.5.7/src/pyroscope.rs:676:38
19: std::sys_common::backtrace::__rust_begin_short_backtrace
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys_common/backtrace.rs:155:18
20: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:528:17
21: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/panic/unwind_safe.rs:272:9
22: std::panicking::try::do_call
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:552:40
23: __rust_try
24: std::panicking::try
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panicking.rs:516:19
25: std::panic::catch_unwind
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/panic.rs:146:14
26: std::thread::Builder::spawn_unchecked_::{{closure}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/thread/mod.rs:527:30
27: core::ops::function::FnOnce::call_once{{vtable.shim}}
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/core/src/ops/function.rs:250:5
28: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
29: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/alloc/src/boxed.rs:2020:9
30: std::sys::pal::unix::thread::Thread::new::thread_start
at /rustc/9b00956e56009bab2aa15d7bff10916599e3d6d6/library/std/src/sys/pal/unix/thread.rs:108:17
31: start_thread
at ./nptl/pthread_create.c:447:8
32: __GI___clone3
at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
2024-05-27T08:15:00.390249Z ERROR qdrant::startup: Panic occurred in file library/core/src/panicking.rs at line 156: unsafe precondition(s) violated: slice::from_raw_parts requires the pointer to be aligned and non-null, and the total size of the slice not to exceed `isize::MAX`
2024-05-27T08:15:00.398996Z DEBUG reqwest::connect: starting new connection: https://staging-telemetry.qdrant.io/
thread caused non-unwinding panic. aborting.
fish: Job 1, 'cargo run $argv' terminated by signal SIGABRT (Abort) Update: it only crashes when running the binary directly, it is fine when running through our default Docker image. I'm assuming some system library mismatch. |
* feat: Working pyroscope profiler integration * fix: Remove UpdateDebugConfigResponse * fix: Format * fix: Simplify debug settings type * fix: Remove /debug from API spec and clean pyrostate state * fix: Update openapi.json * refactor: Move models to suitable paths * refactor: Improve API and pyroscope state structure * fix: Format * feat: Use null debug for pyroscope config * fix: Pyroscope shouldnt be used and enabled in windows * fix: Format * fix: Remove pyroscope for macos * fix: Add more compile conditions * fix: Try to fix OS target issue * feat: Move DebugConfig and Pyroscope to common::debug * feat: Introduce and use DebugState instead of PyroscopeState * feat: Simplify debug_api.rs and move logic to debug.rs * fix: format * fix: Use debug patch enum * feat: Use parking lot Mutex * fix: Propagate errors instead of panic * fix: Format * feat: Forward stop agent errors * feat: Use enums * fix: For non linux OS * fix: Missing import for non linux os * fix: Remove redundant logs * feat: Take lock throughout the patch * Don't unwrap pyroscope state, it may be None * fix: Remove debug: true from tls config tests * refactor: Rename debug setting to debugger * refactor: Rename debug to debugger everywhere * Improve logs --------- Co-authored-by: timvisee <tim@visee.me>
This will allow us to understand the bottlenecks (through chaos testing) and debug customer nodes when required.
Steps to run:
process_cpu:cpu:nanoseconds:cpu:nanoseconds{service_name="qdrant"}
as the query to see only Qdrant profiling resultsAlternatively you may call the API:
All Submissions:
dev
branch. Did you create your branch fromdev
?Changes to Core Features: