otel: per-query SQL tracing, sampled only on manual sync#21

Merged
dotwaffle merged 1 commit into main from otelsql-db-spans
Jun 4, 2026

@dotwaffle

Owner

Summary

Adds per-query SQL tracing so a trace shows the SQL behind a request, not just the HTTP envelope. Ships as v1.19.5.

What

  • The shared *sql.DB is opened through XSAM/otelsql, so every statement — ent's queries and the raw sync_status statements — emits a DB span nested under the active request/sync span. (otelsql adds no new transitive modules — it depends only on OTel packages already in the graph.)
  • On by default (PDBPLUS_OTEL_SQL, set =false to disable) — the data is useful and its volume is bounded by the trace sampler.
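A minimal sketch of the default-on toggle described above, using only the standard library. Only the variable name `PDBPLUS_OTEL_SQL` and its default come from this PR; the function names and the choice to treat an unparseable value as "on" are assumptions for illustration, not the project's actual config code.

```go
package main

import (
	"strconv"
	"strings"
)

// sqlTracingEnabled reports whether per-query DB spans should be enabled.
// getenv is injected so the function can be exercised without touching the
// real environment; production code would pass os.Getenv.
//
// Assumption: an unset, empty, or unparseable value keeps the default (on).
func sqlTracingEnabled(getenv func(string) string) bool {
	raw := strings.TrimSpace(getenv("PDBPLUS_OTEL_SQL"))
	if raw == "" {
		return true // on by default
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true // hypothetical policy: bad input does not disable tracing
	}
	return enabled
}

// envFrom adapts a map to the getenv signature (illustration helper).
func envFrom(m map[string]string) func(string) string {
	return func(key string) string { return m[key] }
}
```

When this returns true, the caller would presumably open the shared handle through otelsql (for example `otelsql.Open` in place of `sql.Open`); when false, it would fall back to a plain `database/sql` handle.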

Volume control (the historical concern was the sync path)

DB spans inherit their parent's sampling via the existing ParentBased per-route sampler. Two new sampler gates on the sync root span:

  • pdbplus.origin=sync → scheduled sync cycles are dropped (no trace, so no DB spans);
  • pdbplus.force_sample → a manual POST /sync is force-sampled (traced by default; ?trace=0 opts out), via a WithForceTrace context flag threaded from the handler.

So: API/rest/graphql/connect query spans follow PDBPLUS_OTEL_SAMPLE_RATE; scheduled syncs stay trace-free; manual syncs are observable end-to-end including SQL.
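The decision order implied by these gates can be sketched as a pure function. This is a simplified model, not the code in internal/otel/sampler.go: the `spanStart` type, the `shouldSample` name, and the `draw` parameter (a uniform value in [0,1) standing in for a trace-ID-based ratio decision) are all hypothetical.

```go
package main

// spanStart holds the inputs this sketch needs when a span starts.
type spanStart struct {
	HasParent     bool   // false for a root span
	ParentSampled bool   // parent's sampling decision, if HasParent
	Origin        string // value of pdbplus.origin, "" if unset
	ForceSample   bool   // value of pdbplus.force_sample
}

// shouldSample models the gating described above:
//  1. child spans (including DB spans) inherit the parent's decision;
//  2. a root span with force_sample set is always sampled;
//  3. a root span with origin=sync is otherwise dropped;
//  4. any other root span follows the per-route rate.
func shouldSample(s spanStart, rate, draw float64) bool {
	if s.HasParent {
		return s.ParentSampled
	}
	if s.ForceSample {
		return true
	}
	if s.Origin == "sync" {
		return false
	}
	return draw < rate
}
```

Checking force_sample before origin=sync matters because a manual sync carries both attributes; the reverse order would drop it.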

Verification

  • go build ./..., -race tests for internal/{config,database,otel,sync} + cmd (incl. new TestPerRouteSampler_SyncTraceGating and TestOpen_TracedSQL), golangci-lint (0 issues) — all green.
  • Post-deploy: confirm fleet health, that scheduled-sync traces have stopped, and that an /api request now produces nested DB spans (allowing for batch-export lag).

Note

Trace volume is worth a Grafana check in a few days against the 50 GB/mo budget; dial back via PDBPLUS_OTEL_SAMPLE_RATE or PDBPLUS_OTEL_SQL=false if it trends high.

🤖 Generated with Claude Code

Flat /api traces showed the request but not the SQL behind it, and there
was no way to see ent's queries or the raw sync_status statements as
nested spans.

Add per-query DB spans via XSAM/otelsql, opened on the shared *sql.DB so
ent and the raw queries are both covered. On by default (the data is
useful and its volume is bounded); set PDBPLUS_OTEL_SQL=false to disable.
otelsql adds no new transitive modules; it depends only on the
OpenTelemetry packages already in the graph.

Volume is bounded by the existing ParentBased per-route sampler: DB spans
inherit their parent request/sync span's decision. The historical concern
was the sync path, so scheduled sync cycles are no longer traced: the
sampler gains two gates read from the root span's start attributes --
pdbplus.origin=sync drops a span by default, pdbplus.force_sample forces
it. The sync worker stamps origin=sync on every cycle and force_sample
only when the cycle was started by a manual POST /sync; that flag rides an
app-root context value (WithForceTrace) set by the handler. Manual syncs
are traced by default; POST /sync?trace=0 opts out.

Net: on by default; API-read DB spans follow the per-route rate, scheduled
syncs stay trace-free, and a manually-triggered sync is observable end to
end including its SQL.
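
A hedged sketch of how a force-trace flag might ride a context value from the handler to the sync worker. Only the name WithForceTrace and the trace=0 opt-out come from this commit; the signature, the reader function, the key type, and the handling of a malformed query are assumptions, and context.Background() stands in for the app-root context.

```go
package main

import (
	"context"
	"net/url"
)

// forceTraceKey is an unexported key type so other packages cannot collide
// with this context value.
type forceTraceKey struct{}

// WithForceTrace returns a context carrying the force-trace flag.
// (Signature assumed; the real helper may differ.)
func WithForceTrace(ctx context.Context, force bool) context.Context {
	return context.WithValue(ctx, forceTraceKey{}, force)
}

// forceTraceFrom reports whether the flag is set; absent means false,
// which is the scheduled-sync case.
func forceTraceFrom(ctx context.Context) bool {
	force, ok := ctx.Value(forceTraceKey{}).(bool)
	return ok && force
}

// traceRequested applies the manual-sync default: traced unless trace=0.
func traceRequested(rawQuery string) bool {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return true // assumption: a malformed query keeps the default
	}
	return q.Get("trace") != "0"
}

// manualSyncContext builds the context a POST /sync handler would hand to
// the worker, given the request's raw query string.
func manualSyncContext(rawQuery string) context.Context {
	return WithForceTrace(context.Background(), traceRequested(rawQuery))
}

// scheduledSyncContext is what a timer-driven cycle would use: no flag set.
func scheduledSyncContext() context.Context {
	return context.Background()
}
```

The worker would then read the flag once per cycle and stamp pdbplus.force_sample on the root span only when it is true.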

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 4, 2026

Code Metrics Report

| Coverage | Test Execution Time |
| --- | --- |
| 80.9% | 4m12s |

Code coverage of files in pull request scope (73.1%)

| Files | Coverage |
| --- | --- |
| cmd/peeringdb-plus/main.go | 35.1% |
| internal/config/config.go | 92.2% |
| internal/database/database.go | 92.3% |
| internal/otel/sampler.go | 100.0% |
| internal/sync/worker.go | 89.2% |

Reported by octocov

@dotwaffle dotwaffle merged commit dee32db into main Jun 4, 2026
1 of 2 checks passed
@dotwaffle dotwaffle deleted the otelsql-db-spans branch June 4, 2026 17:07