otel: per-query SQL tracing, sampled only on manual sync #21
Merged
Flat /api traces showed the request but not the SQL behind it, and there was no way to see ent's queries or the raw sync_status statements as nested spans.
Add per-query DB spans via XSAM/otelsql, opened on the shared *sql.DB so ent and the raw queries are both covered. On by default (the data is useful and its volume is bounded); set PDBPLUS_OTEL_SQL=false to disable. otelsql adds no new transitive modules; it depends only on the OpenTelemetry packages already in the graph.
Volume is bounded by the existing ParentBased per-route sampler: DB spans inherit their parent request/sync span's decision. The historical concern was the sync path, so scheduled sync cycles are no longer traced: the sampler gains two gates read from the root span's start attributes -- pdbplus.origin=sync drops a span by default, pdbplus.force_sample forces it. The sync worker stamps origin=sync on every cycle and force_sample only when the cycle was started by a manual POST /sync; that flag rides an app-root context value (WithForceTrace) set by the handler. Manual syncs are traced by default; POST /sync?trace=0 opts out.
Net: on by default; API-read DB spans follow the per-route rate, scheduled syncs stay trace-free, and a manually-triggered sync is observable end to end including its SQL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code Metrics Report
Code coverage of files in pull request scope (73.1%)
Reported by octocov
Summary
Adds per-query SQL tracing so a trace shows the SQL behind a request, not just the HTTP envelope. Ships as v1.19.5.
What
- *sql.DB is opened through XSAM/otelsql, so every statement (ent's queries and the raw sync_status statements) emits a DB span nested under the active request/sync span. (otelsql adds no new transitive modules; it depends only on OTel packages already in the graph.)
- On by default (PDBPLUS_OTEL_SQL; set =false to disable): the data is useful and its volume is bounded by the trace sampler.
Volume control (the historical concern was the sync path)
DB spans inherit their parent's sampling via the existing ParentBased per-route sampler. Two new sampler gates on the sync root span:
- pdbplus.origin=sync → scheduled sync cycles are dropped (no trace, so no DB spans);
- pdbplus.force_sample → a manual POST /sync is force-sampled (traced by default; ?trace=0 opts out), via a WithForceTrace context flag threaded from the handler.
So: API/rest/graphql/connect query spans follow PDBPLUS_OTEL_SAMPLE_RATE; scheduled syncs stay trace-free; manual syncs are observable end-to-end including SQL.
Verification
- go build ./..., -race tests for internal/{config,database,otel,sync} + cmd (incl. new TestPerRouteSampler_SyncTraceGating and TestOpen_TracedSQL), golangci-lint (0 issues): all green.
- /api request now produces nested DB spans (allowing for batch-export lag).
Note
Trace volume is worth a Grafana check in a few days against the 50 GB/mo budget; dial back via PDBPLUS_OTEL_SAMPLE_RATE or PDBPLUS_OTEL_SQL=false if it trends high.
🤖 Generated with Claude Code
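For the PDBPLUS_OTEL_SQL=false escape hatch mentioned in the note, one way to read an on-by-default toggle is sketched below. This is an illustration only. The function name sqlTracingEnabled and the choice to fall back to "enabled" on an unset or unparseable value are assumptions; the PR states only that the feature is on by default and that =false disables it.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// sqlTracingEnabled reports whether per-query SQL tracing should be wired in.
// lookup has the same shape as os.LookupEnv so it can be swapped out in tests.
func sqlTracingEnabled(lookup func(string) (string, bool)) bool {
	raw, ok := lookup("PDBPLUS_OTEL_SQL")
	if !ok || raw == "" {
		return true // unset: on by default, as the PR describes
	}
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return true // assumption: an unparseable value keeps the default
	}
	return enabled
}

func main() {
	fmt.Println("SQL tracing enabled:", sqlTracingEnabled(os.LookupEnv))
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the default-on behaviour testable without mutating the process environment.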