feat: consolidate tracer, fees & CRM into the midaz v4 monorepo #2159
Open
fredcamaral wants to merge 428 commits into
X-Lerian-Ref: 0x1
…ysis Redis backup_queue hash is the durable WAL of authorized transactions (atomic seed in the Lua script, no TTL, AOF everysec, post-persist HDel) so the RabbitMQ DLQ is flow-control only; the financial fix is Epic 4.4 quarantine-then-delete for poison backup records. F21 remediation reframed accordingly (delete-only would destroy the last copy). X-Lerian-Ref: 0x1
Nine tasks across five epics per D5-v2: shared retry engine in pkg/rabbitmq, DLQ as flow-control, Postgres quarantine for poison backup records, panic conversions, HMAC hard-fail + PARTIAL status. X-Lerian-Ref: 0x1
…rdening
- Epic 4.1: reporter retry machinery generalized into pkg/rabbitmq (classifier interface, RetryManager engine, republish hook seam, header-name lock test); reporter re-pointed, behavior tests unchanged.
- Epic 4.4 (D5-v2 layer 2): transaction_backup_quarantine table + repository; poison backup records quarantined after 3 cycles with insert-before-delete invariant guarded by tests; backup-queue depth/age + quarantine + tenant-skip metrics.
- Epic 4.5 (D7): HMAC hard-fail with typed 401/0310 permanent error on both pipeline and reconciler paths; PARTIAL report status with per-section classified error codes.
- Task 4.3.1: newValidator panic converted to error return.
X-Lerian-Ref: 0x1
transaction.dlx/dlq topology mirroring reporter (TTL 7d, max-len 10k); the three blanket Nack(requeue=true) sites route through the shared pkg/rabbitmq retry engine (DefaultClassifier, maxRetries 3, exponential backoff, ST republish hook with lazy channel resolution); constructor panic converted to propagated error. DLQ is flow-control only - the durable copy lives in the Redis backup hash. Also lands files missed by the previous Phase 4 commit: PARTIAL status constant, metric definitions, quarantine/metrics test files, go mod tidy artifacts. X-Lerian-Ref: 0x1
X-Lerian-Ref: 0x1
X-Lerian-Ref: 0x1
Six parallel territory passes applying T3/T5/T6/T7/T8/T13/E10:
- ~830 fmt.Sprintf-in-logger sites converted to constant messages with typed libLog fields;
- ~400 per-request Info narration lines deleted or demoted to Debug (Info reduced to the sanctioned milestone list);
- ~150 leaf child-span ctx rebinds flipped to non-rebinding form (one live mis-nesting bug fixed in reporter template find_list);
- ~135 business-class span records switched to HandleSpanBusinessErrorEvent via pkg.IsBusinessError (CRM 0->24, fees 46, ledger-http 65 boundary conversions with class-checked helper);
- ~330 duplicate inner-layer error logs dropped (single-point logging);
- all 90+ reflect.TypeOf entity sites moved to constant.Entity*;
- import aliases normalized repo-wide;
- SQL args dropped from Debug logs.
Span-status contract test added (business failure = UNSET, infra = Error). Absorbs the pre-existing format-pass noise entangled in swept files.
X-Lerian-Ref: 0x1
domain_operations_total{component,operation,result} +
domain_operation_duration_ms{component,operation} emitted at every
public use-case exit boundary: ledger 45 ops, crm 10, fees 11,
tracer 15, reporter 19 - catalog documented under T11. Shared
RecordDomainOperation helper classifies result via pkg.IsBusinessError;
nil factory is a no-op. Epic 5.6 verified already-satisfied: tracer
has been on MetricsFactory via the OTel-Prometheus bridge since before
this plan; allowlist cardinality model intact - no migration needed.
X-Lerian-Ref: 0x1
X-Lerian-Ref: 0x1
make ci = lint + check-telemetry + test-unit (pre-existing test-matrix target renamed ci-tests); five rg enforcement gates (Sprintf-in-logger, SetSpanAttributesFromValue, prefixed wire codes, Info narration, reflect entity names) each proven to fire on a planted violation; dogsled max-blank-identifiers=3 for the lib's 4-return tracking API; ~135 mechanical lint fixes across components. Docs synced to the post-normalization reality: standards file:line refs re-resolved, future-tense process language removed, PROJECT_RULES T5/T8 contradictions fixed, tracer CLAUDE.md rewritten, stale transitional docstrings corrected. AGENTS.md includes entangled pre-existing edits. X-Lerian-Ref: 0x1
X-Lerian-Ref: 0x1
Update the LerianStudio/github-actions-shared-workflows references from v1.27.5 and v1.28.9 to v1.33.0 across all GitHub Actions workflows to leverage the latest CI/CD pipeline features and fixes. Also, bump the golangci_lint_version from v2.4.0 to v2.12.2 in the Go analysis workflow to apply updated linting rules and improve code quality checks.
Locked decisions + before/after architecture + engine port→reporter infra mapping + phase breakdown for embedding fetcher's pkg/engine in-process, replacing the remote-HTTP fetcher coupling.
…engine
Phase 1 of the reporter→engine migration: the critical-path spike.
- pkg/reporter/engine/: ConnectorRegistry adapter over the embedded github.com/LerianStudio/fetcher/pkg/engine (standalone module, empty require: zero third-party dep inheritance; lib-commons stays v5.5.0).
- TenantResolver seam with multi/single-tenant implementations. MT resolves per-tenant PG/Mongo handles from engine TenantContext via lib-commons tenant managers, validates tenant shape (tmcore.IsValidTenantID), and rejects empty/malformed tenant in MT mode, so there is no cross-tenant read path.
- sqlQuerier seam (satisfied by *sql.DB and dbresolver.DB) collapses the former DirectProvider/FetcherProvider split into one tenant-aware path.
- Streaming bounded-memory cursors (one row at a time, table-by-table) for Postgres and Mongo; ctx-cancel honored with category-correct errors.
- Filters fail closed (CategoryValidation) until WHERE translation lands.
- bson.Binary/UUID conversion mirrors the reporter's existing semantics.
Unit + testcontainers integration tests (tenant isolation, projection, ctx-cancel, 50k-row bounded-memory). Engine require is tidy-stable/direct.
…MT-postgres
Phase 2 of the reporter→engine migration: construct the engine in the reporter-worker bootstrap with fail-fast validation. The engine is built but not yet driven by the job handler (that is Phase 3).
Adapters (pkg/reporter/engine/):
- ConnectionStore: read-mostly over the env-configured datasources; FindConnection stamps the tenant via WithTenantID (the load-bearing link ExecuteExtraction depends on); write methods return unsupported.
- Observability: over the worker's OTel tracer; nil-tracer no-op.
- SchemaCache: optional, Redis-backed, tenant-scoped keys; a Redis fault degrades to cache-miss (fresh discovery), never fails extraction.
Bootstrap (components/reporter-worker/internal/bootstrap/):
- config_engine.go assembles engine.New(WithConnectorRegistry/ConnectionStore/Observability/Limits); a nil/typed-nil required port aborts boot (fail-fast). No CredentialProtector, no encrypted persistence: transport-S3 is dead.
- Real multi-tenant PostgreSQL: tenantPostgresAdapter over lib-commons tenant-manager/postgres.Manager (GetDB→dbresolver.DB satisfies SQLQuerier), mirroring the tmmongo wiring. Per-tenant pools drain on shutdown.
Tenant isolation invariant: an MT request resolves ONLY its tenant's pool. GetDB errors fail closed (CategoryUnavailable); a nil connection is guarded at both the adapter and resolver seams; there is no fallback to the single-tenant pool or another tenant's DB. Confirmed by production using MT-postgres reporting; the earlier fail-closed stub is removed entirely.
ExecutionStore deferred (the reporter already persists report status). Phase 1 seam types SQLQuerier/SingleTenantDatasources exported via type alias so the bootstrap adapter (a different package) can name them.
Unit + bootstrap tests (tenant forwarding, error propagation without shared-pool fallback, fail-fast on nil registry, no encrypted persistence).
Build, CI-version lint (v2.12.2), and tests green; lib-commons stays v5.5.0, no new third-party module entered the graph.
…6.4) The Makefile pins fell out of sync with .github/workflows/go-combined-analysis.yml (which they are commented to track): golangci-lint was v2.4.0 vs CI's v2.12.2, and GO_VERSION was 1.26.3 vs CI's 1.26.4 (go.mod's go directive is already 1.26.4). Under the stale v2.4.0, `make lint` reported prealloc/wsl_v5 false-positives that do not exist under the v2.12.2 the CI gate actually runs, so local lint disagreed with CI. Bumping both pins makes `make lint` and `make`-driven builds reproduce CI.
…e 3)
Replace the remote-HTTP fetcher extraction path with the embedded engine.
Reports now extract synchronously through pkg/reporter/engine instead of
dispatching to fetcher over RabbitMQ + transport-S3.
Engine-driven extraction:
- generate-report-extraction.go drives the engine, decodes planner filters,
and re-keys engine output (dot-notation schema.table) to the Pongo2
renderer contract (schema__table) via resolveTableKeys autodiscovery.
- connector_{postgres,mongo}.go gain filter translation with legacy parity
(Equals single->Eq / multi->IN, Between date upper-bound expansion to
end-of-day), replacing the fail-closed rejectUnsupportedFilters stub.
- pkg/reporter/engine/filters.go decodes the planner's map[string]any back
into typed datasource filters with arity validation.
plugin_crm parity (routed OUTSIDE the generic engine, which queries literal
collection names only):
- plugincrm/ module: TransformFilters (document->search.document hash
pre-transform), FanOutOrgCollections (holders_* org fan-out +
organization_id injection), DecryptRecords (field decryption).
- uc.extractPluginCRM composes them; tenant context is MT-aware ("default"
placeholder applies in single-tenant mode only — never substituted under
multi-tenant, which would resolve a real tenant's DB).
- worker-level integration test proves decryption round-trip, org fan-out +
org-id injection, hash-filter subset selection, and fail-closed on missing
key against a live testcontainers MongoDB.
Deletions (HTTP path now dead): notification_consumer, reconciler,
process-notification, extraction-request dispatch, data-pipeline/decrypt/hmac,
config_fetcher, generate-report-data, and their tests. The FETCHER_ENABLED
code gate is removed; the env var + Helm values are cleaned up in a later
phase. redisRequired is now gated on MultiTenantEnabled only.
BREAKING CHANGE: the reporter CRM datasource is now named "crm" everywhere
the legacy "plugin_crm" identifier was used. CRM lives inside the ledger
component; "plugin" was legacy residue from the standalone-plugin era.
Renamed across the reporter subsystem (worker + manager + pkg/reporter + tests):
- Package components/reporter-worker/internal/services/plugincrm -> .../crm.
- Datasource config-name literal "plugin_crm" -> "crm" (crm.DatasourceName,
crmDataSourceID in pkg/reporter/datasource + reporter-manager, e2e DSCRM).
- Go identifiers: PluginCRM -> CRM fragment (CryptoHashSecretKeyCRM,
CryptoEncryptSecretKeyCRM, GetDatabaseSchemaForCRM, extractCRM, etc.).
- Env vars: CRYPTO_HASH_SECRET_KEY_PLUGIN_CRM -> CRYPTO_HASH_SECRET_KEY_CRM,
CRYPTO_ENCRYPT_SECRET_KEY_PLUGIN_CRM -> CRYPTO_ENCRYPT_SECRET_KEY_CRM,
DATASOURCE_PLUGIN_CRM_* -> DATASOURCE_CRM_* (struct tags, .env.example,
validation, error-message strings, e2e env map).
- The crm extraction parity integration test tracks the new literals and
stays green as the regression guard.
"crm" is now a RESERVED datasource token: the handler routes a section to the
crm decrypt + org-fan-out path when the datasource name Is("crm"). A generic
datasource may no longer be named "crm" (the multi-datasource parity test's
generic mongo source was renamed to "mongo" accordingly).
Out of scope (left verbatim): the ledger authz namespace "plugin-crm" (hyphen,
the X1 migration), dated historical docs, and docker-compose container names.
Operator migration (owned outside this commit): existing report templates
referencing {{ plugin_crm.* }} and the datasource registered as plugin_crm
must move to crm; deployed secrets must move to the *_CRM / DATASOURCE_CRM_*
env names.
…e nolint The five Go component Makefiles each defined their own GOLANGCI_LINT_VERSION := v2.4.0, which the root var does not export. So `make lint`/`make ci` linted the components at v2.4.0 while the root tree (tests/, pkg/) and CI (go-combined-analysis.yml) used v2.12.2 — local CI was weaker than the real gate. Bump all five pins (plus the two hardcoded go install lines) to v2.12.2 so make ci faithfully mirrors CI. Under v2.12.2 the prealloc linter no longer flags the nil-kept overdraft items slice, so its //nolint:prealloc directive became unused (nolintlint). Drop the directive; the explanatory comment above the declaration already documents why the slice must stay nil.
…ant, retire fetcher HTTP path
Phase 4 (Option B) of the reporter→engine migration. The reporter-manager's schema discovery and validation now run in-process and resolve per-tenant connection pools through lib-commons tenant managers, mirroring the worker's Phase 2b wiring. The remote FetcherProvider HTTP path is deleted.
What changed:
- New tenant_schema_source.go resolves the per-tenant pool via tmpostgres.Manager (GetDB→dbresolver.DB) and tmmongo.Manager (GetDatabaseForTenant→*mongo.Database) behind narrow TenantPostgresManager/TenantMongoManager seams, then feeds the existing DirectProvider schema/validation/CRM logic from the tenant-scoped snapshot. CRM prefix-grouping, org-suffix filtering, postgres schema-ambiguity detection and the D7 unavailable→warning behavior are preserved unchanged in single-tenant mode.
- DirectProvider gains NewMultiTenantDirectProvider; the four schema-read paths dispatch to the tenant source when MT, bypassing the env-pool lazy-connect.
- Bootstrap initManagerSchemaTenantManagers builds both managers off one shared Tenant Manager client; factory.go drops the FetcherEnabled gate.
- NewDataSourceRepositoryFromDatabase injects a *mongo.Database without pool ownership for the tenant-scoped repository.
Deleted (manager-side fetcher HTTP retirement):
- pkg/reporter/fetcher (whole package), pkg/reporter/auth (M2M + credential providers), datasource/fetcher_provider.go, readyz FetcherChecker, the service-layer FetcherEnabled flag + isFetcherMode gate, and the dead FETCHER_*/M2M_* config block (+ .env.example entries).
Tenant-isolation invariant (third-rail) upheld at every seam: tenant ID read from context, never substituted under MT; resolution/nil errors fail closed with no shared- or cross-tenant fallback; the MT validation dispatch sits before the D7 softening so a resolution failure surfaces as a hard error, not a masked Valid:true warning.
BREAKING CHANGE: MULTI_TENANT_ENABLED=true no longer requires FETCHER_ENABLED. The FETCHER_URL, FETCHER_ENABLED and M2M_* environment variables are removed from the reporter-manager; manager schema discovery is now always in-process.
…mponents/reporter, RUN_MODE)
Phase 5 of the reporter→engine migration. The two reporter deploy units, reporter-manager (REST API, :4005) and reporter-worker (RabbitMQ consumer + health server, :4006, PDF/Chromium), are now ONE Go component at components/reporter, with the active surface selected at runtime by RUN_MODE=api|worker|all. Production still deploys SPLIT (two Deployments, one image); RUN_MODE=all is dev-only.
Structure (history preserved via git mv):
- reporter-manager/internal → components/reporter/internal/manager
- reporter-worker/internal → components/reporter/internal/worker
- reporter-manager/api → components/reporter/api
- new internal/app/app.go orchestrator: ParseRunMode (default all, rejects typos fast), InitService gating, Service.Run registering each selected surface's runnable in ONE libCommons launcher.
- new cmd/app/main.go reading RUN_MODE.
Two deliberate design calls (documented so they aren't mistaken for accidents):
1. The two surfaces keep SEPARATE bootstrap trees composed by a thin orchestrator, rather than ledger's single merged bootstrap. Each surface owns its own lib-commons tenant managers; collapsing them into one bootstrap would risk cross-tenant manager scoping. The orchestrator gates construction by RUN_MODE and runs both runnables under one launcher: same single-binary, single-launcher, split-deploy outcome with the tenant-isolation invariant preserved by construction. An unselected surface stays nil and opens no connections.
2. The old components/reporter-manager and components/reporter-worker dirs survive as Dockerfile-ONLY image-name anchors. The shared CI build workflow derives the published image name from the build-context directory basename, so keeping these dirs (each with just a Dockerfile that builds the unified binary) keeps the midaz-reporter-manager / midaz-reporter-worker image names and the .manager / .worker Helm value keys stable, leaving the Helm chart untouched for devops. A header comment in each Dockerfile explains this.
Worker graceful shutdown is preserved verbatim: the full ordered teardown (reconciler cancel, health checker, health server, PDF pool, event listener, multi-tenant resources, RabbitMQ, MongoDB, telemetry flush) moved into worker bootstrap Service.Shutdown(), invoked by both the standalone Run() and the orchestrator, so SIGTERM drains identically either way.
Build: root Makefile gains a single `reporter` component (one build target → .bin/reporter); the two Dockerfiles build the same binary differing only by base image + default RUN_MODE; CI build.yml adds components/reporter to shared_paths (a source change rebuilds both images) while filter_paths, image names, and Helm key mappings are unchanged; go-combined-analysis and pr-security-scan filter_paths repointed.
BREAKING CHANGE: the reporter is now a single binary selected by RUN_MODE. A deployment that previously ran the reporter-worker binary must set RUN_MODE=worker, and the reporter-manager deployment must set RUN_MODE=api (baked as the default in each respective image). Helm charts are unchanged; devops applies the per-Deployment RUN_MODE.
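The RUN_MODE gating described in this commit can be illustrated with a small parser. This is a sketch, not the code in internal/app/app.go: the `RunMode` type, its helper methods, and the trimming and case-folding are assumptions; the three accepted values, the `all` default, and fast rejection of typos come from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// RunMode selects which reporter surface a process starts.
type RunMode string

const (
	RunModeAPI    RunMode = "api"
	RunModeWorker RunMode = "worker"
	RunModeAll    RunMode = "all"
)

// ParseRunMode defaults to "all" when unset and rejects anything else
// that is not a known mode, so a typo fails at boot instead of silently
// starting the wrong surface.
func ParseRunMode(raw string) (RunMode, error) {
	switch mode := RunMode(strings.ToLower(strings.TrimSpace(raw))); mode {
	case "":
		return RunModeAll, nil
	case RunModeAPI, RunModeWorker, RunModeAll:
		return mode, nil
	default:
		return "", fmt.Errorf("invalid RUN_MODE %q: want api, worker or all", raw)
	}
}

// RunsAPI reports whether the REST surface should be constructed.
func (m RunMode) RunsAPI() bool { return m == RunModeAPI || m == RunModeAll }

// RunsWorker reports whether the RabbitMQ consumer should be constructed.
func (m RunMode) RunsWorker() bool { return m == RunModeWorker || m == RunModeAll }

func main() {
	mode, err := ParseRunMode(os.Getenv("RUN_MODE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("mode=%s api=%t worker=%t\n", mode, mode.RunsAPI(), mode.RunsWorker())
}
```

Gating construction on `RunsAPI`/`RunsWorker` is what lets an unselected surface stay nil and open no connections.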
…ync docs to unified binary Phase 6/7 cleanup of the reporter→engine migration. With the remote fetcher HTTP path gone, this removes the worker-side remnants and reconciles docs to the unified components/reporter (RUN_MODE=api|worker|all) reality. Code (components/reporter/internal/worker/bootstrap): - Remove 10 dead fetcher/M2M config fields (FetcherEnabled, FetcherURL, AppEncKey, FetcherStorageBucket, FetcherStorageEndpoint, M2M client/secret, M2MTargetService, M2M cache TTLs) — each verified to have zero read sites. - Narrow the SaaS-TLS Redis dependency gate from `FetcherEnabled || MultiTenantEnabled` to `MultiTenantEnabled` only. The reconciler (the sole single-tenant Redis consumer) was deleted at cutover, so Redis is now required only under multi-tenancy — this matches the already MT-only gate in BuildWorkerCheckers. - Drop the dead reconcilerCancel stub (field + shutdown branch) and the obsolete fetcher-gated TLS tests, keeping the live MT-Redis cases. readyz: scrub the stale FETCHER_ENABLED operator-facing reason string and self-probe/aggregation fixtures (the worker dep set is five, no fetcher); add a NotContains regression guard. Docs: STRUCTURE.md, AGENTS.md, CLAUDE.md, docs/PROJECT_RULES.md, and the worker .env.example now describe one reporter binary deployed split via RUN_MODE rather than two services; the migration plan is marked complete. Load-bearing items left untouched: the ModuleManager/ModuleWorker tenant-scope constants, datasource/factory.go, the WorkerContainer integration-test default, and the Dockerfile image-name anchor stubs.
…umer test narrative
Final cleanup of the reporter→engine migration. With the remote fetcher HTTP
path, its worker consumer/reconciler, and the manager FetcherProvider already
gone, this removes the last unreferenced remnants of the skeleton. Each target
was grep-proven to have zero live callers repo-wide (excluding the file being
deleted and its own tests) before removal.
Code deleted (pkg/reporter):
- mongodb/extraction/ — whole package (7 files): the jobID→reportID mapping
subsystem for the deleted async fetcher path. No non-self callers.
- crypto/ — whole package (key_deriver.go): fetcher TRANSPORT key derivation
(HMAC verify + S3 blob decrypt). Distinct from CRM crypto, which uses
lib-commons libCrypto.Crypto via CryptoEncryptSecretKeyCRM — untouched.
- storage/fetcher_adapter.go (+ test): the transport download adapter. Only
these two files; the rest of storage (s3-client, seaweedfs, ports, config)
is live and kept.
- datasource/types.go: remove the ExtractionJobRequest and ExtractionMapping
payload structs (+ companion test), the last consumers of which were the
deleted extraction package. Ripple: dropped the now-orphaned `import "time"`.
- constant/mongo.go: remove MongoCollectionExtractionMapping ("extraction_mapping").
Test narrative (components/reporter/internal/worker/bootstrap/retry_guard_test.go):
- Exorcise the dead Consumer-2 (fetcher notification) narrative left behind by a
prior phase: rename TestNotificationHandler_* → TestReportHandler_*, repoint
the comment from the deleted ProcessFetcherNotification to the live
handlerGenerateReport, and relabel the scenarios/fixtures that named the dead
consumer (extraction_mapping → report, "parse notification" → "parse report
request", "stale extraction" → report generation). Pure rename/relabel —
every error value and assertion is preserved, no coverage change.
Load-bearing item left untouched: ErrExtractionJobFailed (0287) — the engine's
in-process extraction-failure sentinel, live in five sites (retry_guard,
generate-report-data, pkg/errors, rabbitmq classifier, datasource alias). It is
NOT a fetcher vestige; the in-process engine still extracts datasource rows.
Verified green: go build ./..., go vet (reporter + pkg/reporter), the full
reporter unit suites, and golangci-lint v2.12.2 (0 issues). Repo-wide grep
confirms zero surviving references to any deleted symbol.
The remote fetcher is fully retired; the reporter→engine migration is complete.
…eck-docs guardrail (Phase 1)
Phase 1 of the OpenAPI documentation quality plan (docs/plans/2026-06-10-openapi-doc-quality.md), resolving the audit's pipeline + parity findings (docs/openapi/AUDIT-2026-06-10.md). All edits are swag annotations, generator tooling, and the regenerated specs they produce.
Pipeline (H5, H6):
- generate-docs.sh + sync-postman.sh: COMPONENTS "reporter-manager" -> "reporter". The dead reporter-manager (Dockerfile-only CI anchor, no cmd/app/main.go) made `make generate-docs` fail at the reporter step on a clean tree; it now resolves the real components/reporter binary.
- convert-openapi.js: the COMPONENT_PORTS key was still "reporter-manager", so after the rename the reporter port fell through to the 3002 ledger default. Renamed the key to "reporter" -> reporterPort now correctly resolves to 4005. Fixed three stale reporter-manager comments alongside.
- Retired the stale, drifted postman/specs/reporter-manager/ (git mv -> reporter/); the old hub copy had already lost the Partial status enum.
General-info parity (M4, M5, M7, L11, L14, L15, L16) across the three cmd/app/main.go headers, now byte-identical on the shared info fields:
- @Version -> 4.0.0 (dropped ledger's v-prefix).
- @title -> "Midaz {Ledger,Tracer,Reporter} API" (added the Midaz prefix).
- @termsofservice -> the Elastic License URL (was the swagger.io scaffold).
- @schemes -> "http https" (ledger gained https; reporter gained the line).
- @contact + @license -> added to tracer and reporter (were contact:{}, license:null).
- reporter Bearer description aligned to ledger's canonical wording; reporter @description now states REST serves only in api/all mode (worker is health-only).
- tracer @description enriched to name its bounded contexts.
- Deliberately untouched: tracer's ApiKeyAuth/X-API-Key scheme is correct (lib-auth v2 API Key), not a parity defect.
Guardrail (L17):
- postman/generator/check-docs.sh: parity half (always) asserts the shared info fields are identical across the three swagger.json via jq; drift half (CHECK_DOCS_REGEN=1) regenerates and asserts git-clean against committed specs.
- `make check-docs` target at the repo root (next to generate-docs, per the repo's docs-target convention); wired as a "Check Docs" job in pr-validation.yml.
- postman/README.md documents the parity fields.
Includes the regenerated specs for all three components, the refreshed Postman collection/environment, and the governing audit + plan docs.
Verified: `make generate-docs` exit 0; `make check-docs` parity green; jq confirms the six parity fields identical; reporterPort=4005; regeneration is idempotent (second run byte-identical, so the drift gate will pass); all three cmd/app binaries build.
Phase 2 of the OpenAPI doc-quality initiative. Brings annotation hygiene to quality parity across ledger, tracer and reporter.
@name sweep (M-series): 22 swag @name directives relocated onto each struct's closing brace (`} // @name X`); swag v1.16.6 ignores @name when placed as a leading comment above `type X struct`. Resolves package-dotted definition keys for reporter (23 -> 5 dotted) and tracer api types (40 -> 36 dotted); the remainder are intentionally deferred to Phase 5 (feeshared billing, tracer pkg/model, reporter unexported types, HTTPError).
Tag taxonomy + groups (M8): @tags normalized to Title-Case plural across all three components; @router HTTP methods lowercased for consistency. @tag.name/@tag.description group blocks added to each general-info header AND relocated BEFORE @securityDefinitions, because swag drops @tag.* directives that follow the security-scheme @description. Emitted .tags now populated: ledger 21, tracer 7, reporter 6.
Text fixes: commit/cancel 400 descriptions corrected (were "cannot be reverted"); report status example capitalized to match the persisted constant; "plugin" wording replaced with "reporter"; single-quote array examples converted to swag comma form; stale retired TRC- prefixes in two tracer files replaced with canonical sentinel constant names.
Param examples: 15 query-parameter `example(...)` tokens removed. The Swagger 2.0 Parameter Object does not support `example`; emitting it broke the openapi-generator conversion. These were never present in the generated spec before (swag silently ignored the prior malformed tokens), so this is zero-regression. Param-level examples belong to a future OpenAPI 3.0 migration.
Specs regenerated; parity guardrail green.
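The two placement rules from this commit, shown on a toy file. The tag name, descriptions and `AccountResponse` type are hypothetical; the directive ordering and the trailing `} // @name X` form are the ones the commit message describes for swag v1.16.6.

```go
package main

import "fmt"

// AccountResponse is a hypothetical payload type. The @name directive sits on
// the closing brace, the placement the commit reports swag honoring.
type AccountResponse struct {
	ID string `json:"id"`
} // @name AccountResponse

// General-info header: the @tag.* group block comes BEFORE the
// @securityDefinitions block, since the commit reports that tags placed after
// the security-scheme @description are dropped.
//
// @title           Midaz Ledger API
// @version         4.0.0
// @tag.name        Accounts
// @tag.description Hypothetical tag group, for illustration only.
// @securityDefinitions.apikey BearerAuth
// @in                         header
// @name                       Authorization
// @description                Bearer token authentication.
func main() {
	fmt.Println(AccountResponse{ID: "acc-123"}.ID)
}
```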
Resolve audit finding C1: the ledger declared a BearerAuth securityDefinition
that zero operations referenced (dangling auth). Apply per-operation
`@Security BearerAuth` to all 111 ledger operations and drop the 111 ad-hoc
optional `@Param Authorization` header lines.
Mechanism: model (a) was framed as a global security requirement, but swag
v1.16.6 does not emit a top-level `.security` from a general-info `@security`
directive — it only honors per-operation `@Security`, emitting
`[{"BearerAuth":[]}]` per op. This is the pattern tracer (28/31) and reporter
(22/22) already use, so per-op delivers identical secure-by-default behavior
plus true cross-component source-style parity. C1 was ledger-only.
- 25 handler files: in-place swap `@Param Authorization` -> `@Security
BearerAuth`; `@Param X-Request-Id` tracing header preserved; path/body/query
params untouched.
- Regenerated ledger specs: 111/111 operations carry
`.security == [{"BearerAuth":[]}]`, 0 Authorization params remain,
securityDefinitions byte-identical, definitions/summaries unchanged.
- check-docs.sh: new always-on security-coverage guard (ledger-only) that fails
listing any ledger operation lacking `.security`. tracer's public
/health,/readyz,/version and reporter (already fully secured) are out of scope.
Verified: 111/111 secured, ledger builds, parity green, spec diff provably
security-only (definitions + non-Authorization params identical to HEAD).
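The security-coverage guard lives in check-docs.sh and uses jq; the same check can be sketched in Go for readers who want to see the logic spelled out. `UnsecuredOperations` is a hypothetical name and the inline spec is made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// httpMethods lists the path-item keys that are operations; other keys such
// as "parameters" or "$ref" are skipped.
var httpMethods = map[string]bool{
	"get": true, "put": true, "post": true, "delete": true,
	"options": true, "head": true, "patch": true,
}

// UnsecuredOperations returns "METHOD /path" for every operation in a
// Swagger/OpenAPI JSON document that has no non-empty .security array.
func UnsecuredOperations(spec []byte) ([]string, error) {
	var doc struct {
		Paths map[string]map[string]json.RawMessage `json:"paths"`
	}
	if err := json.Unmarshal(spec, &doc); err != nil {
		return nil, fmt.Errorf("parse spec: %w", err)
	}
	var missing []string
	for path, item := range doc.Paths {
		for method, raw := range item {
			if !httpMethods[strings.ToLower(method)] {
				continue
			}
			var op struct {
				Security []map[string][]string `json:"security"`
			}
			if err := json.Unmarshal(raw, &op); err != nil {
				return nil, fmt.Errorf("parse %s %s: %w", method, path, err)
			}
			if len(op.Security) == 0 {
				missing = append(missing, strings.ToUpper(method)+" "+path)
			}
		}
	}
	sort.Strings(missing) // stable output regardless of map iteration order
	return missing, nil
}

func main() {
	spec := []byte(`{"paths":{
		"/v1/accounts":{"get":{"security":[{"BearerAuth":[]}]},"post":{}},
		"/v1/ledgers":{"get":{"security":[{"BearerAuth":[]}]}}
	}}`)
	missing, err := UnsecuredOperations(spec)
	fmt.Println(missing, err) // [POST /v1/accounts] <nil>
}
```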
…B resolver T1: buildTracerReserver fails fast when MULTI_TENANT_ENABLED && TRACER_BASE_URL is set but no M2M auth provider is wired (none exists yet) — refuses to ship unauthenticated, tenant-less reserve calls on the transaction hot path. F1: inject the fees Mongo manager into TransactionHandler so the fee seam resolves the tenant fee DB (test in transaction_fee_tenant_test.go).
…ient seams T4: InjectHTTPContext in do() so all five reservation ops continue the ledger trace instead of starting orphaned roots. T5/T6: remove the orphaned circuit-breaker seam (test-only, never wired). T7: remove dead WithHTTPClient option.
…econ-epic41) Slice Epic 4.1: wave 4.1a (rewire generate-docs/check-docs onto the Huma OAS 3.1 dumps + de-risk the redocly join/lint, swaggo fully intact/additive) then wave 4.1b (retire swaggo annotations + runtime wiring + generated files, preserving tracer/api/types.go; go mod tidy; delete pkg.HTTPError + fix the 5 compile-breakers). Anchors from recon-epic41 baked in; version test->4.0.0. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Promote the native Huma OAS 3.1 dump version from the placeholder "test"
to the contract version "4.0.0" on both server planes, then regenerate
the two committed goldens.
- tracer: buildTracerHumaAPI openapi.Config.Version test -> 4.0.0
- ledger: buildUnifiedHumaAPI openapi.Config.Version test -> 4.0.0
- regenerated components/{tracer,ledger}/api/openapi.huma.yaml goldens
The value stays hardcoded (never os.Getenv) so the golden dump remains
hermetic and drift-deterministic. This lets the docs pipeline switch its
source to the Huma dump: check-docs.sh requires .info.version to match
^4.0.0$, currently satisfied by the swaggo main.go @Version that will be
retired in wave 4.1b. Additive: swaggo annotations and generated artifacts
are untouched.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Repoint postman/generator/generate-docs.sh spec-gen at the native Huma OAS 3.1 dumps instead of swag+openapi-generator:
- generate_openapi_spec now runs the golden TestOpenAPISpecDump with -update per plane (regenerates components/<c>/api/openapi.huma.yaml); resolve_swag_bin, generate_openapi_yaml (Docker), and SWAG_BIN removed.
- publish_specs copies openapi.huma.yaml (was the swagger triplet).
- consolidate_openapi joins the two openapi.huma.yaml inputs (ledger first, --prefix-tags-with-info-prop title, same output). Version-parity, security-scheme, and orphan-ref guards preserved; both dumps are 3.1.0 and the tracer dump declares BearerAuth + ApiKeyAuth. Stale "ApiKeyAuth (tracer)" comment corrected: the tracer declares both.
- Drop the now-unmaintained tracked swagger triplet under postman/specs/<c>/, publish the Huma dumps in their place, and refresh the consolidated specs + Postman collection.
Swaggo (annotations, generated api/*, /swagger routes, go.mod) untouched; retirement is a later wave.
Verified: make generate-docs runs with no swag/Docker, emits postman/specs/midaz.openapi.yaml (openapi: 3.1.0), deterministic across three runs.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Point the docs guardrail at components/<c>/api/openapi.huma.yaml instead of the swaggo swagger.json. jq cannot read YAML, so read_field/read_field_raw and security_coverage_check now project the dump to JSON via the same bundled js-yaml the generator uses. Parity check drops swaggo-era fields that OAS 3.1 / the Huma dump no longer carry: .schemes (absent in 3.1) and .info.contact/.license/.termsOfService (Huma emits only title + version). The ^Midaz title assertion is dropped too: title is per-plane, not shared metadata, and the ledger dump still carries the contract-spec golden-test placeholder title. Parity now asserts .info.version is byte-identical across planes and matches ^4.0.0$. Security coverage (ledger 113/113) and the redocly consolidated lint are unchanged in intent — only the source file and its YAML->JSON read path move. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The Huma rewire (2bd6f3e) switched publish_specs to copy each plane's openapi.huma.yaml into postman/specs/<c>/, but sync-postman.sh still fed convert-openapi.js the dead swagger.json path. Every component hit the 'spec not found' branch, ledger came back SKIPPED, and convert_to_postman failed the whole generate-docs run.
One line: point the converter at the published openapi.huma.yaml (convert-openapi.js already reads YAML natively). Regenerated MIDAZ.postman_collection.json is the resulting artifact (28 folders). The joined midaz.openapi.{yaml,json} were committed in the prior wave and reproduce byte-identically (drift check green), so no delta there.
Verification (both makes green):
- make generate-docs: full pipeline to Postman collection, no failures.
- CHECK_DOCS_REGEN=1 make check-docs: parity (info.version 4.0.0 identical across ledger/tracer), security coverage (113/113 ledger ops secured), redocly lint on joined spec EXECUTED (81ms, valid, 29 inherited warnings, not skipped) and PASSED, drift check reproduces committed artifacts.
- Joined spec: openapi 3.1.0, 141 ops (ledger 113 + tracer 28), components.schemas.Error present (RFC 9457), no raw Detail schema.
- Swaggo intact: authoritative components/*/api/{docs.go,swagger.json,swagger.yaml} + 49 annotated source files untouched by the wave.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The postman collection generation used wall-clock timestamps and random UUIDs, so every 'make generate-docs' run produced a different collection: the committed artifact never sat clean and any drift check over the full generator output would flag spurious drift on every run.
- convert-openapi.js: replace new Date().toISOString() at the three date-time example sites with a fixed EXAMPLE_DATE_TIME constant.
- lib/workflow-processor.js: replace random uuidv4() for Postman element ids (event script ids and the workflow folder _postman_id) with a content-seeded uuidv5(), stable across runs.
The regenerated collection is now byte-identical across consecutive runs (sha eff7b002... == eff7b002..., 0 diff lines). check-docs.sh passes with 'Regeneration reproduces committed docs artifacts (no drift)'. Swaggo and the drift-gated spec dumps (components/*/api, postman/specs) are untouched.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
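The content-seeded id idea can be sketched in Go, although the actual change lives in the JavaScript generator. This is a minimal illustration, not the generator's code: stableID, the seed strings, and the choice of the RFC 4122 DNS namespace are assumptions made for the example. A version-5 UUID is the SHA-1 of a namespace plus a name, so the same seed always yields the same id.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// namespace holds the RFC 4122 DNS namespace UUID bytes.
// Using this particular namespace is an assumption for the sketch.
var namespace = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// stableID (hypothetical name) derives a version-5, name-based UUID from a
// seed string: same seed in, same id out, on every run.
func stableID(seed string) string {
	h := sha1.New()
	h.Write(namespace[:])
	h.Write([]byte(seed))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // set the version nibble to 5
	u[8] = (u[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits

	s := hex.EncodeToString(u[:])
	return s[0:8] + "-" + s[8:12] + "-" + s[12:16] + "-" + s[16:20] + "-" + s[20:32]
}

func main() {
	a := stableID("folder:workflows") // hypothetical seed
	b := stableID("folder:workflows")
	fmt.Println(a == b) // the two ids are identical because the seed is identical
}
```

Seeding from content rather than from a counter keeps ids stable even when unrelated elements are added or reordered, which is what a byte-level drift check needs.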
…y guard
The ledger golden fixture buildUnifiedHumaAPI (contract_spec_routes_test.go)
set info.title "contract-spec" — a divergence from the production humaMount
closure (unified-server.go:127), which serves "Midaz Ledger API". Wave 4.1a
wired the docs pipeline onto that golden dump, so the fixture placeholder
leaked into the published consumer-facing spec: postman/specs/midaz.openapi.
{yaml,json} carried title "contract-spec" and — because the ledger-first
redocly join uses --prefix-tags-with-info-prop title — all 22 ledger tags
were prefixed "contract-spec_" instead of the swaggo-baseline "Midaz_Ledger_
API_".
- contract_spec_routes_test.go:116: Title "contract-spec" -> "Midaz Ledger
API", so the fixture mirrors production and the regenerated dump/join carry
the runtime title + baseline-identical tag prefixes.
- check-docs.sh parity_check: re-enable the ^Midaz title assertion (now that
no fixture placeholder can leak), asserted per-plane so each keeps its own
"Midaz ..." name. contact/license/termsOfService/schemes stay honestly
dropped (Huma emits only info.{title,version}; OAS 3.1 has no .schemes).
- Regenerated ledger golden + joined specs; zero residual "contract-spec".
Determinism (uuidv5 + frozen example date) preserved; swaggo untouched.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
…ecorded
Wave 4.1a (pipeline → Huma 3.1 native, additive) marked Done. Records the
supervisor gate over the HEALED_NEEDS_REVERIFY return: L1 determinism healed
at root (uuidv5 + frozen example date), reverified clean; the orphaned Medium
title-leak closed by aligning the ledger golden fixture title to the runtime
("Midaz Ledger API") and re-enabling the ^Midaz parity guard. Wave 4.1b
(swaggo retirement + Epic 3.3) is now the current wave.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
… general-info
Wave 4.1b swaggo retirement. Strips all swaggo annotation comments (@Summary/@Router/@Tags/@Param/@Success/@Failure/@Accept/@Produce/@Security/@ID/@description) from the 27 ledger HTTP handlers under components/ledger/internal/adapters/http/in and the general-info block (@title/@version/@host/@BasePath/@securityDefinitions/...) from components/ledger/cmd/app/main.go. Genuine Go doc-comments are preserved; only annotation lines and their now-orphan `//` separators (those sitting immediately before the func) are removed. The Huma-native OAS 3.1 dumps (components/ledger/api/openapi.huma.yaml) are the sole spec source now; swaggerEnabled() -> openapi.ServeSpec is untouched. No runtime behavior change.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
… general-info Retire swaggo from tracer now that the docs pipeline consumes the native Huma dump (components/tracer/api/openapi.huma.yaml). Delete the @Summary/ @Router/@Param/@Success/@Failure/@Security/@ID/@Tags/@Accept/@Produce/ @description godoc blocks from the 8 http/in handlers, the two struct-level //@name comments in transaction_validation_handler.go, and the general-info block (title/version/tags/contact/license/securityDefinitions) in cmd/app/main.go. Real Go doc-comments are preserved. No runtime behavior change. readyz.go keeps its tracer/api import and all executable code (api.ReadyzResponse/ReadyzCheck) — only its swaggo annotations are removed. swaggerEnabled()+openapi.ServeSpec (ledger) and the Huma OAS dumps are untouched. Verified: build ./components/tracer/... EXIT0; go vet http/in clean; handlers+main have zero swaggo annotations; @Router/@Security grep over components/tracer is empty. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
…rgets
Retire swaggo across both planes now that the docs pipeline consumes the
native Huma OAS 3.1 dumps (components/{ledger,tracer}/api/openapi.huma.yaml).
Ledger:
- unified-server.go: drop the blank api import, the fiber-swagger import, and
the legacy /swagger + /swagger/* mount; keep swaggerEnabled() gating the Huma
openapi.ServeSpec surface.
- delete bootstrap/swagger.go (WithSwaggerEnvConfig/initSwaggerFromEnv) and the
generated api/{docs.go,swagger.json,swagger.yaml,openapi.yaml}.
Tracer:
- routes.go: drop the fiber-swagger import and the /swagger/* mount; keep
SwaggerEnabled gating openapi.ServeSpec.
- delete adapters/http/in/swagger.go, scripts/verify-api-docs.sh, and the
generated api/{docs.go,swagger.json,swagger.yaml,openapi.yaml}. api/types.go
survives (LIVE in readyz.go).
Build tooling: go mod tidy drops swaggo/{fiber-swagger,swag,files} directs (and
the now-unused go-openapi/swag/* indirects); remove swag install steps and the
tracer verify-api-docs target; refresh root/component Makefile doc-comments.
Also refresh now-stale swaggo references left in handler and test doc-comments
(the annotations they describe were already removed), retire the DC-3
swagger.json route-diff gate and the /swagger UI-asset unit test (both asserted
the deleted swaggo surface), and drop the /swagger cases from the tracer auth
integration test — preserving the shared buildUnifiedHumaAPI seam the Huma
golden dump depends on.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
pkg.HTTPError had zero runtime constructors; its 10 swaggo @failure annotations were already removed in wave 4.1.5-ledger. Delete the struct and its Error() method, drop the HTTPError-only TestHTTPError_Error, and remove the four dead `err.(*pkg.HTTPError)` type-assert branches in fee tests (ValidateParameters returns *pkg.ValidationError, never HTTPError). Prune the now-orphaned pkg import in httputils_test.go. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Remove residual swaggo/go-swagger doc-comment annotations left behind by the swaggo retirement wave. The contrarian pass flagged 49 inert `@Description` comment-directives that violated the wave's empty-annotation assertion; a full sweep also found orphaned `swagger:model`/`swagger:response`/`@name` directives and go-swagger `in: body` markers across 27 files. These comments are dead: the repo has no swag parser (no swaggo import, no docs.go/swagger.json, no comment-parsing tooling), so nothing consumes them. The Huma OAS 3.1 pipeline reads struct-field tags (swaggertype/enums/example/ format), which are left untouched — `CHECK_DOCS_REGEN=1 make check-docs` reproduces the committed dumps with zero drift. Preserved: swaggerEnabled()/openapi.ServeSpec wiring, tracer/api/types.go hand-written types (ErrorResponse/VersionResponse/ReadyzResponse/ReadyzCheck) and their /readyz usage, both openapi.huma.yaml dumps, all struct-field tags. No runtime behavior change. Verification: go build ./... EXIT 0, go vet clean, gofmt clean, existing suites green, check-docs no-drift, absence greps empty. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Wave 4.1b's swaggo retirement orphaned three items its handler-scoped tasks did not cover; this closes them so the retirement is clean. - components/tracer/api/types.go: delete ErrorResponse and VersionResponse. Both were documented "Used in Swagger documentation" and referenced ONLY by the swaggo @Failure/@success annotations + generated docs.go, all deleted this wave. Zero live consumers (the Version handler emits ad-hoc JSON; the error path uses lib-commons RFC 9457 problem). Deleting them honors the file's own ReadyzCheck manifesto against phantom-documentation types. ReadyzResponse/ReadyzCheck stay (live in readyz.go — invariant B). - components/ledger/pkg/feeshared/nethttp/httputils_test.go: the errCode table field was asserted only via the deleted `err.(*pkg.HTTPError)` branch (dead: pkg.HTTPError was never constructed), so 13 populated codes went unchecked. Re-point the assertion at the LIVE error: ValidateParameters surfaces failures via ValidateBusinessError -> pkg.ValidationError{Code: constant.Error()}, so errors.As + assert on .Code restores the per-code coverage the table intended. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Records the 4.1b supervisor gate over HEALED_NEEDS_REVERIFY: L4 contrarian defect (both swaggo + go-swagger annotation dialects) swept by self-heal; 3 orphan Lows closed by the supervisor (dead ErrorResponse/VersionResponse deleted, errCode test repointed to the live pkg.ValidationError). Invariants A/B/C verified. Epic 4.2 (parity lock + redocly re-enable + DC-3 route-diff gate reinstatement) is now the current wave. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Redocly rule re-enablement (empirical), joined-spec Error lock complementing the Go closure test, DC-3 route-diff gate reinstated against openapi.huma.yaml, make ci verify. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The redocly lint config relaxed 8 rules that predated the swaggo -> Huma migration. With the native Huma OAS 3.1 dumps now the sole source, re-verify each rule against the current joined spec:
Re-enabled (0 findings on the Huma dumps):
- no-server-trailing-slash
- no-server-example.com
- security-defined
- no-unused-components
- no-invalid-schema-examples
Kept off (still trip on the Huma output; comments rewritten to the real Huma-era cause, no longer 'swag emits ...'):
- no-empty-servers: join artifact (root servers emptied by design)
- operation-4xx-response: 71 ledger+tracer ops without an explicit 4xx entry
- no-ambiguous-paths: 2 structural ledger balances/operations sub-paths
CHECK_DOCS_REGEN=1 make check-docs passes green (parity, version 4.0.0, security-coverage 113, redocly lint, no drift).
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The Go test tests/openapi (error_schema_parity_test.go) is the primary lock on the Error closure, but it reads the per-plane Huma dumps, not the joined artifact. The joined spec (postman/specs/midaz.openapi.json, consumed by the Plan B SDK) is the output of redocly join; if the join ever collides two non-identical Error schemas, redocly de-dups by suffixing (Error, Error2). That would slip past the Go test. Add error_schema_singleton_check to check-docs.sh: assert the joined json has components.schemas.Error, no dedup-suffixed siblings (^Error[-_]?[0-9]+$, so ErrorDetail is unaffected), and the RFC 9457 problem fields the SDK relies on. Skip-with-warning when the artifact is absent, mirroring consolidated_lint_check. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
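The singleton rule can be sketched in Go as follows. The real check is a shell function in check-docs.sh; checkErrorSingleton and its input shape (a flat list of schema names from components.schemas) are assumptions for illustration, and the RFC 9457 field assertions are left out. Only the regular expression is taken from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// dedupSuffixed matches names such as Error2, Error-2, or Error_10,
// but not Error itself and not ErrorDetail.
var dedupSuffixed = regexp.MustCompile(`^Error[-_]?[0-9]+$`)

// checkErrorSingleton (hypothetical name) takes the schema names found
// under components.schemas in the joined spec and enforces that exactly
// one Error schema exists, with no dedup-suffixed siblings.
func checkErrorSingleton(schemaNames []string) error {
	hasError := false
	var dups []string
	for _, name := range schemaNames {
		switch {
		case name == "Error":
			hasError = true
		case dedupSuffixed.MatchString(name):
			dups = append(dups, name)
		}
	}
	if !hasError {
		return fmt.Errorf("components.schemas.Error is missing")
	}
	if len(dups) > 0 {
		sort.Strings(dups)
		return fmt.Errorf("dedup-suffixed Error schemas found: %v", dups)
	}
	return nil
}

func main() {
	fmt.Println(checkErrorSingleton([]string{"Error", "ErrorDetail", "Account"}))
	fmt.Println(checkErrorSingleton([]string{"Error", "Error2"}))
}
```

Anchoring the pattern at both ends is what keeps legitimately distinct schemas such as ErrorDetail out of the match.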
Restore the DC-3 contract gate removed in eca60ed, now reading the generated Huma OAS 3.1 dump (api/openapi.huma.yaml) instead of the deleted swaggo swagger.json. TestContractSpecMatchesRoutes asserts the Fiber-mounted route surface equals the dump's published (method, path) set in both directions.
Adaptations from the base swaggo version:
- collectSpecRoutes parses YAML via gopkg.in/yaml.v3 and prefixes /v1 to each server-relative path, since the Huma spec carries the base path in its servers block, not in the path keys.
- canonicalizePath normalizes both Fiber ':param' and OpenAPI '{param}' to a positional token so the surfaces compare on structure, not label.
- Locked exempts (const, one comment each): GET /health, /version, /readyz public probes, and the intentional Fiber-only multipart POST .../transactions/dsl route (sunset 2026-08-01). No /swagger* (retired).
mounted=113, spec=113, zero divergence.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
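A minimal sketch of the path canonicalization and two-way comparison, assuming nothing about the real test beyond what the message states. The function bodies, the "{pN}" token format, routeKey, missingFrom, and the example routes are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalizePath replaces Fiber ":name" and OpenAPI "{name}" segments
// with a positional token, so two paths compare equal when they have the
// same structure regardless of parameter labels.
func canonicalizePath(p string) string {
	segs := strings.Split(p, "/")
	n := 0
	for i, s := range segs {
		isFiber := strings.HasPrefix(s, ":")
		isOAS := strings.HasPrefix(s, "{") && strings.HasSuffix(s, "}")
		if isFiber || isOAS {
			n++
			segs[i] = fmt.Sprintf("{p%d}", n)
		}
	}
	return strings.Join(segs, "/")
}

// routeKey builds the comparable (method, path) key.
func routeKey(method, path string) string {
	return strings.ToUpper(method) + " " + canonicalizePath(path)
}

// missingFrom returns the keys present in a but absent from b.
func missingFrom(a, b map[string]bool) []string {
	var out []string
	for k := range a {
		if !b[k] {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// Hypothetical mounted route, as a router would report it.
	mounted := map[string]bool{
		routeKey("GET", "/v1/organizations/:id/ledgers/:ledger_id"): true,
	}
	// Hypothetical spec path: server-relative, so the base path is
	// prefixed before comparing.
	spec := map[string]bool{
		routeKey("get", "/v1"+"/organizations/{organization_id}/ledgers/{id}"): true,
	}
	// Both directions must be empty for the surfaces to be equal.
	fmt.Println(len(missingFrom(mounted, spec)), len(missingFrom(spec, mounted)))
}
```

Checking both directions matters: one direction catches routes that are mounted but undocumented, the other catches documented routes that no longer exist.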
…iber Task 4.2.4 (verify make ci) surfaced a latent `unparam` lint failure that predates Epic 4.2: createTransactionFiber declared an `isRevert bool` param that every one of its five callers passes as false. The revert path reaches the money-write orchestrator independently — createRevertTransaction calls executeCreateTransaction(..., true, ...) directly — so the helper's param was speculative dead surface (YAGNI) from the Wave-4 transaction migration. Remove the param; hardcode false at the executeCreateTransaction call site. Behavior-preserving: all callers already passed false, so the value reaching executeCreateTransaction is unchanged, and the isRevert=true revert semantics (applyFees skip, reverse-transaction branch) remain wired through createRevertTransaction untouched. Surfaced-by: make ci lint stage (golangci unparam), which the earlier targeted-test gates (4.1a/4.1b) did not exercise. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Second latent lint failure surfaced by task 4.2.4 (verify make ci), masked behind the createTransactionFiber unparam failure until that was fixed: make lint iterates scopes fail-fast (components -> tests -> pkg), so the pkg scope was never reached while ledger still failed. classifyForProblem's ten consecutive `if errors.As(...)` type-dispatch blocks had no blank line between them; wsl_v5 flags each `if` that follows a closing statement block. Add a blank line between each arm. Pure formatting — the error classification order and RFC 9457 status mapping are byte-for-byte unchanged. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Task 4.2.4 (verify make ci) exposed that tests/openapi, the offline cross-plane locks over the committed Huma OAS 3.1 dumps, including the byte-identical RFC 9457 Error closure the SDK (Plan B) consumes, never ran in the gate. test-unit discovers packages with `go list ./...` then drops everything under ./tests (that path is otherwise integration-only, needing Docker), and nothing re-added the offline openapi package. So the Go closure lock that 4.2.2's joined-spec singleton check was written to COMPLEMENT was itself unenforced. Add a test-openapi-locks target (offline: yaml only, no server/DB/Docker) and invoke it from ci after check-docs, so the locks read the freshly regenerated-and-drift-verified dumps. The joined-spec singleton check (check-docs) covers the published artifact; this covers per-plane closure byte-identity. Both now gate. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The LLM-judge review of Plan A found a latent fourth false-green: test-unit
discovers packages via `go list ./...`, which exits non-zero with empty stdout
in a git worktree (VCS-stamp error). The old empty-pkgs branch then printed
"No unit test packages found" and exited 0 — silently skipping all ~13.9k unit
tests while make ci reported success. Same fail-open class as the tee-pipe and
poisoned-lint-cache traps found earlier this session.
Two-part fix:
- Self-sufficient: export GOFLAGS="-buildvcs=false ${GOFLAGS}" in the recipe so
`go list` and `go test` work in a worktree without an external flag (prepended
to preserve any caller GOFLAGS). Also add -buildvcs=false to test-openapi-locks.
- Fail-CLOSED: empty discovery is now an error (exit 1) with a diagnostic, not a
vacuous pass. The repo root always has unit packages; empty means discovery
broke, which must fail loudly rather than green-wash.
Production CI (normal clone) was never affected; this hardens local runs and
any future discovery breakage.
Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
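The fail-closed rule from the commit above can be expressed as a small Go sketch. The real logic is a Makefile recipe; selectUnitPackages, errNoPackages, and the example package paths are hypothetical, and only the behavior (drop ./tests, treat an empty result as an error) follows the message.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoPackages signals that discovery broke; it must never be
// reported as a successful, empty test run.
var errNoPackages = errors.New("unit test discovery returned no packages; refusing to report success")

// selectUnitPackages (hypothetical) filters a `go list ./...`-style
// listing, dropping everything under <module>/tests, and fails closed
// when nothing is left.
func selectUnitPackages(listed []string, module string) ([]string, error) {
	testsRoot := module + "/tests"
	var pkgs []string
	for _, p := range listed {
		p = strings.TrimSpace(p)
		if p == "" || p == testsRoot || strings.HasPrefix(p, testsRoot+"/") {
			continue
		}
		pkgs = append(pkgs, p)
	}
	if len(pkgs) == 0 {
		return nil, errNoPackages
	}
	return pkgs, nil
}

func main() {
	module := "github.com/LerianStudio/midaz/v4"
	pkgs, err := selectUnitPackages(
		[]string{module + "/pkg/fee", module + "/tests/openapi"}, module)
	fmt.Println(pkgs, err)

	// Empty discovery output (for example, `go list` failing) is an error.
	_, err = selectUnitPackages(nil, module)
	fmt.Println(err)
}
```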
The swaggo retirement (Epic 4.1) swept handlers + postgres adapters + some model packages, but missed pkg/mmodel and pkg/mtransaction, the shared model types the SDK consumes. The LLM-judge review of Plan A flagged ~700 inert annotation lines still contradicting the "fully retired" criterion at source level. Remove them all: `// swagger:model|response`, go-swagger `} // @name X` trailing markers (reduced to bare `}`), `// @description|@example|@type|@format` blocks, and orphaned `// in: body` field markers. Comment-only across 25 files (132 insertions / 881 deletions); verified no struct field, tag, or type changed: every removed line is a comment/`} // ...`, every added line a bare `}`. Provably inert: CHECK_DOCS_REGEN=1 make check-docs regenerates both Huma OAS dumps byte-identically (no drift), so nothing parsed these annotations. Also rewrite postman/README.md from the retired `swag init` + Docker openapi-generator flow to the live TestOpenAPISpecDump + redocly-join pipeline. pkg/ now carries zero swaggo/go-swagger annotation residue. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
Flip 4.2.1-4.2.4 to Done with landed commits + mutation-proof outcomes; Epic 4.2 + Phase 4 → Complete. Record the gate deviations (2 latent lint defects the make-ci verify surfaced, 3 false-greens neutralized, the parity lock wired in, the annotation sweep) and the independent LLM-judge verdict (CONDITIONAL PASS → PASS after the test-unit fail-closed fix; zero blockers; accepted documented residuals). Plan A closed; handoff to Plan B stands. Claude-Session: https://claude.ai/code/session_01P4Zy5DofM3BxGLwivcRJi4
The shared-engine KMS merge (c3e2be4) changed three contracts that only the //go:build integration tag exercises, so `make ci` stayed green while the integration suite was red (8 build errors + 24 failures):
- crm Mongo repos now take encryption.FieldEncryptor (which needs DecryptField); 4 tests still passed the raw *lib-commons/crypto.Crypto. Wrap it via NewEncryptionService -> NewFieldEncryptorAdapter, mirroring the passing holder integration test.
- OrganizationKeyset.Validate now requires Version >= 1; createValidKeyset omitted it. Set Version: 1 to match production provisioning.
- The registry revision-conflict test asserted LegacyReadable stayed false, but NewOrganizationRegistryRecord defaults it true, so the assertion never held. Flip the mutation so the test actually proves a revision-rejected update does not persist the change (production Update was already correct).
- The keyset unique index is compound (tenant_id, organization_id, version); the dup-insert test never set TenantID, so it dodged the index. Stamp the saved tenant onto the direct insert.
Test-only; no production code changed. Each affected package re-run green.
Claude-Session: https://claude.ai/code/session_01SH3ADYS6hGU91Dy67B96oG
637730d added a node `require("js-yaml")` to postman/generator/check-docs.sh (Huma OAS dump -> JSON conversion) but never wired its install into CI. Since postman/generator/node_modules is gitignored, the check-docs job's `node -e` fails with "Cannot find module 'js-yaml'" — the job passed before only on runners that happened to have the module ambient. Add `npm ci --prefix postman/generator` before the verify step so the gate is hermetic. Claude-Session: https://claude.ai/code/session_01SH3ADYS6hGU91Dy67B96oG
The PR-validation golangci-lint pin (v2.4.0) trailed the local Makefile pin (v2.12.2), so the merge gate enforced an older ruleset than developers ran. Align the CI gate to v2.12.2; the tree already passes it locally. Claude-Session: https://claude.ai/code/session_01SH3ADYS6hGU91Dy67B96oG
Reconcile the doc surface with two large merges that landed on this branch:
the completed swaggo->Huma / RFC 9457 error-envelope migration, and the CRM
field-encryption + Vault-Transit KMS subsystem.
- Canonical agent docs (CLAUDE.md, AGENTS.md, PROJECT_RULES.md): fix the stale
streaming API block (ToEvent->ToEmitRequest, Builder-owned source), dependency
versions (lib-commons v5.8.0, lib-observability v1.1.0, lib-streaming v1.6.2),
the Huma+problem+json HTTP layer, CRM error count (16->28, CRM-0006..CRM-0041),
the CI workflow table, and swaggo->Huma; add a CRM Field Encryption / KMS section.
- Standards: rewrite error-handling E13 to the RFC 9457 problem+json wire contract
(code/status tuple preserved); add the crm_protection_* metrics to telemetry D6;
fix line-rot and dead plan links (replaced with git-history notes).
- Runbook: add the KMS envelope-encryption rollback one-way door + Vault env
surface (the data-safety claim was mode-dependent).
- New docs/architecture/crm-field-encryption.md documenting the subsystem.
- New components/ledger and components/infra READMEs (tracer parity).
- Godoc on the encryption/crypto packages (comment-only, no behavior change).
- Rewrite llms.txt/llms-full.txt (root + tracer) to match.
- Archive LEDGER.md -> docs/branch-review-campaign.md.
Known gap (documented, not fixed): pkg/net/http/withRecover.go panic path still
emits the legacy {code,title,message} envelope, diverging from the RFC 9457
WithError path.
Claude-Session: https://claude.ai/code/session_01SH3ADYS6hGU91Dy67B96oG
…lope
The WithRecover middleware hand-built a legacy fiber.Map{code,title,message}
body on panic, diverging from every other error path which serializes as RFC
9457 application/problem+json via WithError. A client parsing problem+json
mis-parsed a panic response. Route the recovered panic through WithError as an
internal-server error (pkg.ValidateInternalError) so both paths emit the
identical envelope — one producer, one shape. Status stays 500 and the panic
message / stack frames remain scrubbed (verified green by the CRMCollapse
panic integration test). Updates error-handling.md E9/E13 to match.
Claude-Session: https://claude.ai/code/session_01SH3ADYS6hGU91Dy67B96oG
What
Consolidates tracer, plugin-fees, and CRM into the midaz v4 monorepo: unified components/ledger binary (onboarding + transaction + CRM + fees on :3002) plus co-located components/tracer (:4020). Single root go.mod (github.com/LerianStudio/midaz/v4), no go.work.
Supersedes the closed #2156 / #2154. The load-bearing difference from those PRs:
⭐ The reporter is NO LONGER part of this consolidation
Earlier attempts folded the reporter in too. That decision was reversed: the reporter is a separately-sellable product coupled to the ledger only over the wire (RabbitMQ + API), not in-process, so it belongs in its own repo. It has been extracted back to LerianStudio/reporter (PR #696) and removed from this monorepo (626 files).
A direct consequence: the private fetcher/pkg/engine dependency is gone from midaz's go.mod, restoring a clean source-available build for external clones (the ledger core never imported it, but a single root module made it a build-time dep for everyone).
Scope
pkg/fee, shared types pkg/feeshared, use cases internal/services/fees; applied at the transaction_create.go fee seam.
components/ledger/internal/crm (package tree, no cmd/ / internal/); routes register under the midaz authz namespace; the tenant-manager policy migration is the X1 release gate (docs/auth/RBAC-NAMESPACES.md, docs/runbooks/v4-x1-rbac-migration-and-rollback.md).
components/tracer; ledger↔tracer seam over gRPC + mTLS (docs/architecture/ledger-tracer-topology.md).
Verification
go build ./... + go vet ./... → exit 0 (post-extraction).
https://claude.ai/code/session_01RzpM5cJt1wAqEZQ1mHL63n