
probes: BPF uprobe service for entry/exit duration tracking via OTLP logs #3181

Open
gnurizen wants to merge 1 commit into main from simple-probes-v1

@gnurizen gnurizen commented May 14, 2026

Contributor

Summary

Adds a generic eBPF uprobe service that attaches paired entry/exit probes to declared symbols and emits per-call duration records through the OTLP logs pipeline introduced in #3190.

The motivating use case is finding long JS execution blocks in Node.js applications. A 200 ms callback hogging the libuv event loop is exactly the "phantom" tail latency that doesn't show up in CPU profiles. The reference config attaches to node::InternalCallbackScope's ctor + dtor, brackets each outer libuv callback, and emits one record per scope close with a precise duration measured in BPF.

What's added

  • probes/ package:
    • YAML config schema: {id, file_match (regex), entry_symbol, exit_symbol, main_thread_only, min_duration_ms}.
    • eBPF uprobe program (paired entry + exit) with a per-tid scope counter that emits only when the outer scope closes. Nested scopes roll into the outer's measured duration. Main-thread filter via tid == tgid directly in BPF; libuv worker-pool / V8 background threads are silently dropped.
    • User-space drains the BPF ringbuf and emits one OTel log.Record per completed outer scope via Logger("parca-agent.probes"), so consumers can filter on attributes_scope.name to slice probe records vs other agent logs.
  • --probe-config <path> CLI flag (default: disabled). Requires a remote-store.
  • Reference probes/testdata/probes.yaml.sample with the Node.js InternalCallbackScope ctor + dtor pair pre-verified on Node v18.20 / v22.17 / v24.4 (same mangled symbol on all three, exported via .dynsym so stripped builds work too).
  • Build system: single arch-agnostic probe.bpf.o (no PT_REGS_* macros used). Makefile + .goreleaser.yml before: hook + CI workflow + probes/bpf/README.md set up to produce it on every release.
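
For reviewers who want to reason about the outer-scope accounting without reading the BPF source, the per-tid counter described above can be modeled in plain user-space Go. This is only a sketch of the logic, not the BPF program in this PR: the `event`, `record`, `scopeState`, and `tracker` names are hypothetical, and the real program keeps this state in a BPF map rather than a Go map.

```go
package main

import "fmt"

// event is a hypothetical stand-in for one uprobe firing.
type event struct {
	tgid, tid uint32
	ktimeNs   uint64
	exit      bool // false = entry probe, true = exit probe
}

// record is what the model emits when an outermost scope closes.
type record struct {
	tid            uint32
	startNs, endNs uint64
}

// scopeState is the per-thread nesting depth plus the outer scope's start time.
type scopeState struct {
	depth   uint32
	startNs uint64
}

// tracker models the per-tid state (assumption: a BPF map keyed by tid).
type tracker struct {
	mainThreadOnly bool
	perTid         map[uint32]*scopeState
}

func newTracker(mainThreadOnly bool) *tracker {
	return &tracker{mainThreadOnly: mainThreadOnly, perTid: map[uint32]*scopeState{}}
}

// handle processes one event and returns a record only when the
// outermost scope on that thread closes.
func (t *tracker) handle(ev event) (record, bool) {
	// Main-thread filter: the main thread is the one whose tid equals its tgid.
	if t.mainThreadOnly && ev.tid != ev.tgid {
		return record{}, false
	}
	st := t.perTid[ev.tid]
	if !ev.exit {
		if st == nil {
			st = &scopeState{}
			t.perTid[ev.tid] = st
		}
		if st.depth == 0 {
			st.startNs = ev.ktimeNs // only the outer entry sets the start time
		}
		st.depth++
		return record{}, false
	}
	if st == nil || st.depth == 0 {
		return record{}, false // exit without a matching entry: ignore
	}
	st.depth--
	if st.depth > 0 {
		return record{}, false // nested scope closed; outer still open
	}
	delete(t.perTid, ev.tid)
	return record{tid: ev.tid, startNs: st.startNs, endNs: ev.ktimeNs}, true
}

func main() {
	t := newTracker(true)
	events := []event{
		{tgid: 100, tid: 100, ktimeNs: 1_000},             // outer entry
		{tgid: 100, tid: 100, ktimeNs: 2_000},             // nested entry
		{tgid: 100, tid: 101, ktimeNs: 2_500},             // worker thread, dropped
		{tgid: 100, tid: 100, ktimeNs: 3_000, exit: true}, // nested exit
		{tgid: 100, tid: 100, ktimeNs: 9_000, exit: true}, // outer exit
	}
	for _, ev := range events {
		if rec, ok := t.handle(ev); ok {
			fmt.Printf("tid=%d duration_ns=%d\n", rec.tid, rec.endNs-rec.startNs)
		}
	}
}
```

In this model the nested scope contributes no record of its own; its time is simply part of the outer scope's duration, which matches the behavior the description claims for the BPF side.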

Data on the wire

Each probe-fire record:

| Field | Value |
|---|---|
| body | "node.callback_scope" (stable; queryable) |
| timestamp | exit ktime → unix ns via shared times.KTime offset (aligned with CPU sample timestamps; range queries over [start_ns, end_ns] against the profile-samples table don't drift) |
| attributes | start_ns, end_ns, duration_ns, pid, tid, comm, is_main, spec_id, probe_id, level |
| resource | inherited from the OTel SDK LoggerProvider: service.name, service.version, host.name |
| scope | "parca-agent.probes" |
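
To make the timestamp handling concrete, here is a minimal sketch of how monotonic ktime values could be mapped to wall-clock nanoseconds with a single fixed offset. It does not call the shared times.KTime helper; `ktimeToUnix`, `timingAttrs`, and the `bootOffsetNs` parameter are hypothetical names, and it assumes that start_ns and end_ns are stored as unix nanoseconds (the description implies this but does not state it outright).

```go
package main

import (
	"fmt"
	"time"
)

// ktimeToUnix converts a monotonic kernel timestamp (ns since boot) to
// wall-clock time by adding a fixed offset. bootOffsetNs is a hypothetical
// parameter: the wall-clock time, in unix ns, at which ktime was zero.
func ktimeToUnix(ktimeNs, bootOffsetNs int64) time.Time {
	return time.Unix(0, ktimeNs+bootOffsetNs)
}

// timingAttrs builds the three timing attributes for one record.
// duration_ns is computed from the raw ktime values, so it does not
// depend on the offset at all.
func timingAttrs(startKtimeNs, endKtimeNs, bootOffsetNs int64) map[string]int64 {
	return map[string]int64{
		"start_ns":    startKtimeNs + bootOffsetNs,
		"end_ns":      endKtimeNs + bootOffsetNs,
		"duration_ns": endKtimeNs - startKtimeNs,
	}
}

func main() {
	// Illustrative values only: an arbitrary offset and a 200 ms scope.
	const bootOffsetNs = int64(1_700_000_000_000_000_000)
	start, end := int64(5_000_000_000), int64(5_200_000_000)

	attrs := timingAttrs(start, end, bootOffsetNs)
	fmt.Println("record timestamp:", ktimeToUnix(end, bootOffsetNs).UTC())
	fmt.Println("duration:", time.Duration(attrs["duration_ns"]))
}
```

Because the same offset is applied to both ends of the interval, any other data source converted with that offset stays aligned with the [start_ns, end_ns] range.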

Test plan

  • make probes-bpf produces probes/bpf/probe.bpf.o
  • go build ./... clean
  • go test ./probes/... ./reporter/... -count=1 — all pass
  • Local end-to-end on linux/amd64 against the PolarSignals backend: ran node blocker.js, which performs synthetic 200 ms blocks. The agent attached and the probe fired 142 times over 8 seconds. Duration histogram: 112 sub-ms callbacks (libuv noise), 27 records at 199 ms, and 3 at 200 ms (the blocker). Exactly the expected signal.
  • Reviewer: verify on an arm64 build. BPF bytecode is arch-agnostic but exercising the cross-compile pipeline this PR sets up is worthwhile.

Stacked on

Was stacked on #3190 (OTel logs support), which has since merged. This branch is now rebased directly on top of main.

@gnurizen gnurizen force-pushed the simple-probes-v1 branch from 0779bed to 791b9b4 May 26, 2026 15:17
@gnurizen gnurizen force-pushed the simple-probes-v1 branch 3 times, most recently from 8fa8124 to 1c1c011 June 12, 2026 07:28
@gnurizen gnurizen changed the base branch from main to otlp-log June 12, 2026 07:29
@gnurizen gnurizen force-pushed the simple-probes-v1 branch 6 times, most recently from a178b36 to 2955851 June 12, 2026 09:28
@gnurizen gnurizen mentioned this pull request Jun 17, 2026
Add a probes package that:
  - parses a YAML config of (symbol, file_match regex) pairs and assigns
    a 1-based spec_id per entry;
  - loads an embedded BPF uprobe program (probe.bpf.amd64, built by
    `make probes-bpf`) that emits one ringbuf record per fire carrying
    ktime/pid/tid/comm/spec_id;
  - on each newly-observed executable, regex-matches its path and
    attaches an exec.Uprobe per matching spec, encoding the spec_id in
    the uprobe cookie;
  - drains the ringbuf in a goroutine and forwards each event as a
    reporter.LogEvent (Body=symbol, attrs=pid/tid/comm/spec_id) via
    reporter.ParcaReporter.ReportLogEvents. The BPF service no longer
    owns the Arrow log stream — that lives in the reporter package now.

Reporter integration is via two small additions: a ProbesHook interface
(OnExecutable) plus a SetProbes setter on arrowReporter so ReportExecutable
can notify the BPF service to attach to fresh binaries.

Wires the existing --probe-config flag through main.go: when set, the
service is started with the parca reporter; offline mode is rejected
since log streaming needs a gRPC conn.
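
The spec-matching step from the commit message (assign a 1-based spec_id per config entry, then regex-match each newly observed executable path) can be sketched in stdlib Go as follows. This is an illustration under assumptions, not the package's code: `rawSpec`, `spec`, `compileSpecs`, and `matchSpecs` are hypothetical names, YAML decoding is left out, the symbols are placeholders, and the sketch uses the entry/exit symbol pair from the PR description rather than the single symbol the commit message mentions.

```go
package main

import (
	"fmt"
	"regexp"
)

// rawSpec is a hypothetical view of one config entry after YAML decoding.
type rawSpec struct {
	fileMatch   string // regex applied to the executable path
	entrySymbol string
	exitSymbol  string
}

// spec is a validated entry with its compiled regex and 1-based ID.
type spec struct {
	ID          uint64
	FileMatch   *regexp.Regexp
	EntrySymbol string
	ExitSymbol  string
}

// compileSpecs validates each entry and assigns spec IDs starting at 1,
// in config order.
func compileSpecs(raw []rawSpec) ([]spec, error) {
	specs := make([]spec, 0, len(raw))
	for i, r := range raw {
		re, err := regexp.Compile(r.fileMatch)
		if err != nil {
			return nil, fmt.Errorf("spec %d: invalid file_match %q: %w", i+1, r.fileMatch, err)
		}
		specs = append(specs, spec{
			ID:          uint64(i + 1),
			FileMatch:   re,
			EntrySymbol: r.entrySymbol,
			ExitSymbol:  r.exitSymbol,
		})
	}
	return specs, nil
}

// matchSpecs returns every spec whose regex matches the executable path.
// A real service would attach a uprobe pair per returned spec.
func matchSpecs(specs []spec, path string) []spec {
	var out []spec
	for _, s := range specs {
		if s.FileMatch.MatchString(path) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	specs, err := compileSpecs([]rawSpec{
		// Placeholder symbols; the real config uses mangled C++ names.
		{fileMatch: `(^|/)node$`, entrySymbol: "example_entry", exitSymbol: "example_exit"},
	})
	if err != nil {
		panic(err)
	}
	for _, path := range []string{"/usr/bin/node", "/usr/bin/python3"} {
		fmt.Printf("%s -> %d matching spec(s)\n", path, len(matchSpecs(specs, path)))
	}
}
```

Compiling the regexes once at config load keeps the per-executable check cheap and surfaces a bad file_match at startup rather than at attach time.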
@gnurizen gnurizen changed the base branch from otlp-log to main June 17, 2026 17:51
@gnurizen gnurizen marked this pull request as ready for review June 17, 2026 17:52
@gnurizen gnurizen changed the title from "simple probes v1" to "probes: BPF uprobe service for paired entry/exit duration tracking via OTLP logs" Jun 17, 2026
@gnurizen gnurizen changed the title from "probes: BPF uprobe service for paired entry/exit duration tracking via OTLP logs" to "probes: BPF uprobe service for entry/exit duration tracking via OTLP logs" Jun 17, 2026
@gnurizen gnurizen requested review from brancz and umanwizard June 17, 2026 17:58