OpenTelemetry logging support #3190

Merged: gnurizen merged 2 commits into main from otlp-log on Jun 17, 2026
@gnurizen (Contributor)

Summary

Adds OTLP/gRPC logs support to parca-agent so any in-process producer
(the eBPF probes service in PR #3181, the agent's own logrus output via
the new --otlp-logging flag) can ship structured log records over the
same gRPC connection already used for profile data.

Built on the upstream OTel Go SDK (sdklog.LoggerProvider +
otlploggrpc.Exporter), so retry, backoff, queue, batch, and
PartialSuccess handling come from the SDK rather than being
hand-rolled.

Commits

  • reporter: add ParcaReporter log-event interface and per-sample relabel

    • New ParcaReporter interface extending OTel's TraceReporter with
      Logger(scope string) log.Logger. Producers obtain a per-scope
      log.Logger and emit records directly.
    • The Reporter struct is renamed to arrowReporter and now owns a
      *sdklog.LoggerProvider. The provider uses
      otlploggrpc.New(... WithGRPCConn(conn)) so it shares the agent's
      existing gRPC connection.
    • sdklog.NewBatchProcessor configured with size 512 / age 250 ms /
      queue depth 4096 — tighter than the SDK default so individual events
      land on the server within a few hundred ms.
    • sdklog.WithResource fills service.name, service.version,
      host.name as required attributes.
    • Offline mode (no gRPC conn) returns the OTel no-op logger; emit
      calls become inert and producers don't need an extra nil check.
    • Also: per-sample relabel pass for probe-origin trace samples
      (labelsForTID runs a second relabel.ProcessBuilder against
      patched per-sample labels when meta.Origin == TraceOriginProbe).
  • reporter: --otlp-logging flag forwards agent logs over OTLP

    • New --otlp-logging flag, default off. When set, installs a logrus
      hook that converts each entry to an OTel log.Record and emits via
      Logger("parca-agent.agent").
    • Severity / SeverityText come from the logrus level; a level
      attribute is always emitted so consumers can filter by level even
      though the OTLP server doesn't store severity_text directly.
    • OTLPSkipField ("otlp_skip") is an opt-out: entries tagged with
      this field set to true are dropped before emit. Used by the probes
      service's per-fire debug logging to avoid double-shipping.
    • In offline mode the flag logs a one-shot warning and behaves as
      off, since Logger() returns the no-op variant.
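
The hook behaviour listed for the second commit can be modelled without either library. The sketch below is a dependency-free stand-in, not the agent's code: `Entry`, `Record`, `convert`, and `levelInfo` are hypothetical names, the level ordering and level names mirror logrus, and the severity numbers follow the OpenTelemetry log data model (TRACE=1, DEBUG=5, INFO=9, WARN=13, ERROR=17, FATAL=21). Mapping the logrus panic level into the FATAL range is an assumption, as is rendering field values with `fmt.Sprint`.

```go
package main

import "fmt"

// Level mirrors the ordering of logrus levels (panic=0 ... trace=6).
type Level int

const (
	PanicLevel Level = iota
	FatalLevel
	ErrorLevel
	WarnLevel
	InfoLevel
	DebugLevel
	TraceLevel
)

// levelInfo pairs the logrus level name with an OTel severity number.
var levelInfo = map[Level]struct {
	Text     string
	Severity int
}{
	PanicLevel: {"panic", 21}, // assumption: panic is reported in the FATAL range
	FatalLevel: {"fatal", 21},
	ErrorLevel: {"error", 17},
	WarnLevel:  {"warning", 13},
	InfoLevel:  {"info", 9},
	DebugLevel: {"debug", 5},
	TraceLevel: {"trace", 1},
}

// skipField is the opt-out key named in the PR description.
const skipField = "otlp_skip"

// Entry is a hypothetical stand-in for a logrus entry.
type Entry struct {
	Level   Level
	Message string
	Fields  map[string]any
}

// Record is a hypothetical stand-in for an OTel log record.
type Record struct {
	Severity     int
	SeverityText string
	Body         string
	Attrs        map[string]string
}

// convert returns the record to emit, or false when the entry opted out.
func convert(e Entry) (Record, bool) {
	if skip, ok := e.Fields[skipField].(bool); ok && skip {
		return Record{}, false // dropped before emit
	}
	info := levelInfo[e.Level]
	rec := Record{
		Severity:     info.Severity,
		SeverityText: info.Text,
		Body:         e.Message,
		Attrs:        make(map[string]string, len(e.Fields)+1),
	}
	for k, v := range e.Fields {
		if k == skipField {
			continue // the opt-out marker itself is not forwarded
		}
		rec.Attrs[k] = fmt.Sprint(v)
	}
	// The level attribute is always set, so consumers can filter on it
	// even when severity_text is not stored.
	rec.Attrs["level"] = info.Text
	return rec, true
}

func main() {
	r, kept := convert(Entry{Level: WarnLevel, Message: "map full", Fields: map[string]any{"pid": 42}})
	fmt.Println(kept, r.Severity, r.SeverityText, r.Attrs["pid"])
}
```

Writing the `level` attribute after copying the fields means a user field named `level` cannot shadow the logrus level in this sketch; whether the real hook makes the same choice is not stated in the PR.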

gRPC codec extension

flags/codec.go gains handling for pdata's SizeProto/MarshalProto
shape so the global vtproto codec doesn't reject the pdata-typed
requests otlploggrpc sends. No change in behavior for existing
profile-data paths (which use vtproto or gogoproto).
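
As an illustration of the dispatch this section describes, here is a minimal sketch of a marshal function that accepts three message shapes. It is not the contents of flags/codec.go: `marshalAny`, `sizedMessage`, and `fakeLogsRequest` are hypothetical names, the `SizeProto() int` / `MarshalProto(buf []byte) int` signatures are an assumption about the pdata shape, and the order in which the shapes are tried is also assumed.

```go
package main

import "fmt"

// vtMessage is the marshal method exposed by vtproto-generated types.
type vtMessage interface{ MarshalVT() ([]byte, error) }

// gogoMessage is the marshal method exposed by gogoproto-generated types.
type gogoMessage interface{ Marshal() ([]byte, error) }

// sizedMessage is the ASSUMED pdata shape: the caller sizes the buffer,
// then the message writes itself into it and reports the bytes written.
type sizedMessage interface {
	SizeProto() int
	MarshalProto(buf []byte) int
}

// marshalAny tries each known shape in turn instead of rejecting
// everything that is not a vtproto message.
func marshalAny(v any) ([]byte, error) {
	switch m := v.(type) {
	case vtMessage:
		return m.MarshalVT()
	case sizedMessage:
		buf := make([]byte, m.SizeProto())
		if n := m.MarshalProto(buf); n != len(buf) {
			return nil, fmt.Errorf("codec: short marshal, wrote %d of %d bytes", n, len(buf))
		}
		return buf, nil
	case gogoMessage:
		return m.Marshal()
	default:
		return nil, fmt.Errorf("codec: unsupported message type %T", v)
	}
}

// fakeLogsRequest stands in for a pdata-typed request in this sketch.
type fakeLogsRequest struct{ payload []byte }

func (r fakeLogsRequest) SizeProto() int              { return len(r.payload) }
func (r fakeLogsRequest) MarshalProto(buf []byte) int { return copy(buf, r.payload) }

func main() {
	out, err := marshalAny(fakeLogsRequest{payload: []byte{0x0a, 0x00}})
	fmt.Println(len(out), err)
}
```

Because the existing vtproto and gogoproto cases keep their own branches, adding the new branch leaves the profile-data paths untouched, which matches the "no change in behavior" claim above.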

Test plan

  • `go build ./...` clean
  • `go test ./reporter/... -count=1` — all pass; new unit tests
    cover the logrus hook (severity / attribute / skip-field / level
    mapping) using an in-process `captureExporter` + SDK
    `SimpleProcessor`
  • Local end-to-end run on linux/amd64 against PolarSignals backend:
    agent logs flow as OTLP records under `attributes_resource.host.name`
  • Reviewer: confirm gRPC connection is correctly shared with the
    profile-data path (no second TCP/TLS handshake) — `WithGRPCConn`
    should make this trivial to verify with `ss -tnp`

Stacked on

PR #3181 (`simple-probes-v1`) rebases on top of this branch and adds
the eBPF probes service as a third Logger consumer.

gnurizen added 2 commits June 17, 2026 11:35

reporter: add ParcaReporter log-event interface and per-sample relabel

Introduce a reporter.ParcaReporter interface that extends otel's
TraceReporter with Logger(scope string) log.Logger so any producer
(uprobes or otherwise) can ship OTel log records through a shared gRPC
connection without owning its own pipeline.

The existing struct is renamed arrowReporter and now owns a
*sdklog.LoggerProvider built from:
  - go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc as the
    transport, configured with WithGRPCConn so it shares the connection
    already established for profile data;
  - sdklog.NewBatchProcessor for queue + batch + retry (size 512, age
    250ms, queue depth 4096), tuned tighter than the SDK default so
    individual events show up on the server within a few hundred ms;
  - sdklog.WithResource fills service.name = "parca-agent",
    service.version = build VCS revision, host.name = agent --node.

Constructed only when a gRPC conn is provided; offline mode returns the
OTel no-op Logger so callers can use Logger() unconditionally and emit
calls become inert.
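
The offline behaviour is a null-object pattern: callers always receive something they can emit to. A minimal dependency-free sketch follows, assuming a simplified `Logger` interface with a single `Emit` method (the real interface is OTel's log.Logger). `reporter`, `newReporter`, `scopedLogger`, and `noopLogger` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Logger is a simplified stand-in for OTel's log.Logger.
type Logger interface{ Emit(body string) }

// noopLogger discards everything; it is what offline mode hands out.
type noopLogger struct{}

func (noopLogger) Emit(string) {}

// scopedLogger appends records to a shared sink, tagged with its scope.
type scopedLogger struct {
	scope string
	out   *[]string
}

func (l scopedLogger) Emit(body string) { *l.out = append(*l.out, l.scope+": "+body) }

// reporter owns the sink only when a connection exists.
type reporter struct {
	exported *[]string // nil in offline mode
}

func newReporter(online bool) *reporter {
	if !online {
		return &reporter{}
	}
	return &reporter{exported: &[]string{}}
}

// Logger never returns nil, so producers can call Emit unconditionally.
func (r *reporter) Logger(scope string) Logger {
	if r.exported == nil {
		return noopLogger{}
	}
	return scopedLogger{scope: scope, out: r.exported}
}

func main() {
	newReporter(false).Logger("parca-agent.agent").Emit("ignored") // inert

	online := newReporter(true)
	online.Logger("parca-agent.agent").Emit("started")
	fmt.Println(*online.exported)
}
```

The producer code is identical in both modes; only the reporter decides whether emits go anywhere.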

Also fold in per-sample relabeling for probe-origin trace samples:
labelsForTID now runs a second relabel.ProcessBuilder pass against the
patched per-sample labels (thread_id, thread_name, cpu) when meta.Origin
is TraceOriginProbe, so relabel rules can derive custom labels from
per-sample fields without touching the cached per-PID fast path used by
CPU/off-CPU/memory/cuda samples.
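
The two-path structure described above can be sketched as follows. This is a simplified model rather than the agent's implementation: only `labelsForTID` and the label names thread_id, thread_name, and cpu come from the commit message, while `Labels`, `SampleMeta`, `labeler`, and the `relabelFn` callback (standing in for a compiled relabel pipeline) are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

type Labels map[string]string

type Origin int

const (
	OriginSampling Origin = iota // CPU, off-CPU, memory, ... samples
	OriginProbe                  // probe-origin trace samples
)

// SampleMeta carries the per-sample fields used for patching.
type SampleMeta struct {
	PID, TID, CPU int
	Comm          string
	Origin        Origin
}

// relabelFn stands in for a second relabel pass over patched labels.
type relabelFn func(Labels) Labels

type labeler struct {
	perPID       map[int]Labels // cached result of the per-PID relabel pass
	probeRelabel relabelFn
}

// labelsForTID returns cached labels on the fast path and runs the
// second pass only for probe-origin samples.
func (l *labeler) labelsForTID(m SampleMeta) Labels {
	cached := l.perPID[m.PID]
	if m.Origin != OriginProbe {
		return cached // fast path: the cache is returned untouched
	}
	// Copy before patching so the cached entry is never mutated.
	patched := make(Labels, len(cached)+3)
	for k, v := range cached {
		patched[k] = v
	}
	patched["thread_id"] = strconv.Itoa(m.TID)
	patched["thread_name"] = m.Comm
	patched["cpu"] = strconv.Itoa(m.CPU)
	if l.probeRelabel == nil {
		return patched
	}
	return l.probeRelabel(patched)
}

func main() {
	l := &labeler{
		perPID: map[int]Labels{7: {"comm": "db"}},
		// Example rule: derive a custom label from a per-sample field.
		probeRelabel: func(in Labels) Labels { in["worker"] = in["thread_name"]; return in },
	}
	fmt.Println(l.labelsForTID(SampleMeta{PID: 7, TID: 70, CPU: 3, Comm: "db-io", Origin: OriginProbe}))
}
```

Copying the cached map before patching is what keeps the per-PID fast path unaffected by per-sample rules in this sketch.
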

reporter: --otlp-logging flag forwards agent logs over OTLP

When --otlp-logging is set, install a logrus hook on the global logger
that converts each entry to an OTel log.Record and emits it via a Logger
obtained from the shared LoggerProvider (scope "parca-agent.agent"), so
logrus calls flow alongside probe events and any other log producers
through the same batch processor and gRPC connection. The hook captures
every level logrus emits; actual filtering is left to the logger's own
configuration.

Severity/SeverityText come straight from the logrus level, and a `level`
attribute is always emitted alongside (the OTLP server doesn't store
severity_text, so the attribute is the only way to filter by level
downstream).

OTLPSkipField ("otlp_skip") is a tagged opt-out: entries with this
field set to true are dropped before emit. Used by the probes service's
per-fire debug logging to avoid double-shipping the same event (once as
a probe record, once as an agent log).

In offline mode (no remote-store), the flag logs a one-shot warning at
startup and otherwise behaves as off, since the LoggerProvider is itself
absent in that case and Logger() returns the no-op variant.

gnurizen merged commit 6324893 into main on Jun 17, 2026
12 checks passed