feat(deploy): add production Helm chart for Buzz#990
Open
tlongwell-block wants to merge 5 commits into
Open
Conversation
New `deploy/charts/buzz/` Helm chart targeting two profiles selected by values: - Production (default): external Postgres/Redis/Typesense/S3 via `secrets.existingSecret`, no chart-side autogeneration, GitOps-safe (ArgoCD / Flux), HA-capable (`replicaCount >= 2` with Redis + RWX git PVC). - Quickstart (`--set quickstart=true`): CloudPirates Postgres + Redis subcharts, chart-managed Secret via `lookup`, single replica, evaluation only. Hard `fail` guards in `_validate.tpl` reject misconfigurations at template time: - missing `relayUrl` - `replicaCount > 1` without Redis or RWX git PVC - missing/malformed `ownerPubkey` when `requireRelayMembership=true` - `ingress.enabled` and `httproute.enabled` both true - missing Postgres or Typesense source `values.schema.json` rejects malformed types / enums at `helm install` time, before templates render — layered defense with `_validate.tpl`. Env wiring matches the project's decided contract: - `RELAY_OWNER_PUBKEY` (no `BUZZ_` prefix; matches `config.rs`) - `BUZZ_AUTO_MIGRATE=true` default — relies on the relay's embedded sqlx migrations (#988) - `BUZZ_RELAY_PRIVATE_KEY` is stable across redeploys via `secrets.existingSecret` (production) or the `lookup` pattern with `resource-policy: keep` (quickstart) Includes: - `examples/argocd-app.yaml`, `examples/flux-helmrelease.yaml`, `examples/secret-sample.yaml` — canonical GitOps configurations - `tests/*.yaml` — `helm-unittest` suites covering validation, secret wiring, and networking - `ci/quickstart-values.yaml` for `ct install` (kind, gated) - `tests/fixtures/*` for render-only matrix in CI - `.github/workflows/helm-chart.yml`: `ct lint` + `helm-unittest` + render matrix per-PR; full `ct install` is `workflow_dispatch` gated, runs once `ghcr.io/block/buzz` is publicly published Out of scope for this PR (intentional, per Eva's dispatch): - OCI chart publish + cosign signing → follow-up - In-chart Typesense subchart → bring-your-own for v1 (see README "Honest limitations") - Minimal-mode (`BUZZ_PUBSUB=local` / pg search / filesystem media) → upstream relay work Co-authored-by: Tyler Longwell <tlongwell@squareup.com> Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
Per @max's review on PR #990: if an operator sets migrate.autoMigrate=false, the chart does not run migrations. Readiness only proves DB reachability, not schema freshness, so a pod can come up healthy against an unmigrated schema and fail under load. - NOTES.txt: add Degradation warning conditional on .Values.migrate.autoMigrate - README.md: sharpen the upgrade section to put operator responsibility front and center Verified: helm install --dry-run with migrate.autoMigrate=false renders the warning; default (true) stays silent. helm lint clean. Co-authored-by: Tyler Longwell <tlongwell@squareup.com> Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
1. Add examples/ingress-cert-manager.yaml — a two-document file containing both a chart values fragment (ingress block with cert-manager annotations for the Let's Encrypt HTTP-01 flow) and a cluster-scoped ClusterIssuer manifest applied with kubectl. Helm reads only the first document; the second is for cluster operators. Closes the rubric-4 'TLS by default' gap without making cert-manager a chart dependency. 2. NOTES.txt: warn when secrets.relayPrivateKey or secrets.gitHookHmacSecret are set inline. Both are labeled 'NOT recommended' in values.yaml comments; a render-time warning makes the operator see it. Includes pointer to examples/secret-sample.yaml for the canonical fix. Verified: helm install --dry-run renders the cert-manager annotations correctly; inline-secret warning fires for one or both keys with proper comma joining; default install stays silent on both. helm lint clean. Co-authored-by: Tyler Longwell <tlongwell@squareup.com> Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
values.yaml: expand 9 flow-style mappings (livenessProbe/readinessProbe/
startupProbe httpGet, resources requests/limits, securityContext
seccompProfile, containerSecurityContext capabilities, postgresql and
redis primary.persistence) to block style. The chart-testing default
yamllint config (lintconf.yaml) flags any spaces inside flow braces;
empty {} and [] forms are kept where they're idiomatic (podAnnotations,
nodeSelector, etc.) since those don't have inner-brace spacing.
.github/workflows/helm-chart.yml: SHA-pin the five third-party action
refs flagged by zizmor (unpinned-uses) and Semgrep:
azure/setup-helm@v4 -> 1a275c3b... # v4.3.1 (x2)
helm/chart-testing-action@v2.7.0 -> 0d28d314... # v2.7.0 (x2)
helm/kind-action@v1.10.0 -> 0025e74a... # v1.10.0
Matches the pinning pattern Sami established in .github/workflows/
docker.yml. actions/checkout and actions/setup-python were not flagged
(zizmor allowlists first-party actions/* refs) so left as-is.
Verified locally: ct.yaml + helm dependency build + helm template
against ci/quickstart, tests/fixtures/ha, and tests/fixtures/
production-existing-secret all render clean. helm lint clean.
Co-authored-by: Tyler Longwell <tlongwell@squareup.com>
Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
…n suite helm-unittest 0.8.2 runs `failedTemplate` asserts per-template in the suite's `templates:` list. With multiple templates listed and `fail` firing from only one (e.g. serviceaccount.yaml's `include buzz.validate`), the assertion sees "No failed document" for the other-template scope and the test fails despite the overall render failing. Two fixes: 1. Scope `validation_test.yaml` to `templates/deployment.yaml` only. That's the entry point with `include "buzz.validate"`, sufficient to exercise every guard. Side benefit: positive renders that asserted `hasDocuments: count: 2` had the wrong number anyway (production profile renders 5 docs, not 2). 2. New `render_test.yaml` covers positive renders with the full template list — needed because deployment.yaml's checksum annotation does `include (print $.Template.BasePath "/secret-chart.yaml")`, which only resolves if secret-chart.yaml is loaded by the suite. Asserts target specific fields with per-assert `template:` instead of fragile document counts. Also adjusts the "ownerPubkey is not 64 lowercase hex" test to match the schema-validation error pattern, since values.schema.json's regex runs before template rendering and is the actual gate. Local: `helm unittest` → 19/19 passing across 4 suites. Co-authored-by: Tyler Longwell <tlongwell@squareup.com> Signed-off-by: Tyler Longwell <tlongwell@squareup.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
First-party Helm chart for Buzz, addressing the
Helmlane of the deploy-helpers dispatch (#deploy thread).Overview
deploy/charts/buzz/— new public Helm chart. Two profiles:secrets.existingSecret, HA-capable--set quickstart=true)Patterns lifted
docker.io/cloudpirates.existingSecret:precedence over chart autogen: thelookup+randAlphaNumpattern is documented as not GitOps-safe; ArgoCD/Flux examples ship as the canonical production path.What the chart enforces
templates/_validate.tplfails templating with a clear message on:relayUrlreplicaCount > 1without Redis (forbuzz-pubsub)replicaCount > 1withoutpersistence.git.accessMode=ReadWriteManyownerPubkeywhenrelay.requireRelayMembership=true(regex^[0-9a-f]{64}$)ingress.enabledandhttproute.enabledsimultaneouslyvalues.schema.jsonrejects malformed types / enums athelm installtime, before templates render. Two-layer defense intentional.Env contract
RELAY_OWNER_PUBKEY(noBUZZ_prefix) — matchesconfig.rs, per @eva's decided call.BUZZ_AUTO_MIGRATE=truedefault — depends on Add automatic database migrations #988 (@max). Chart renders correctly today; full end-to-end live-Buzz validation waits on Add automatic database migrations #988 merge + the public image.BUZZ_RELAY_PRIVATE_KEYstable across redeploys (chart auto-keep viahelm.sh/resource-policy: keep+lookup, or operator-managed viaexistingSecret).migrate.preUpgradeJob.enabled: falsedefault — relay startup migrations are the v1 path; reserved knob for future optional pre-upgrade Job (buzz-admin migrate).Tests
tests/validation_test.yaml— everyfailguard, plus a clean production render.tests/secrets_test.yaml—existingSecretprecedence over autogen;BUZZ_RELAY_PRIVATE_KEYwiring;RELAY_OWNER_PUBKEY(notBUZZ_RELAY_OWNER_PUBKEY);BUZZ_AUTO_MIGRATE=truedefault.tests/networking_test.yaml— Service ports, ingress vs HTTPRoute mutex.CI
.github/workflows/helm-chart.yml:ct lint+helm-unittest+ render matrix acrossci/andtests/fixtures/values files.workflow_dispatchgated:ct installagainst kind. Runs onceghcr.io/block/buzzis publicly published (waiting on ci(docker): publish public ghcr.io/block/buzz image (native multi-arch) #986) — gating prevents red builds from pulling a non-existent image.Examples (GitOps-safe)
examples/argocd-app.yaml— ArgoCD Application withexistingSecretexamples/flux-helmrelease.yaml— Flux HelmRelease v2examples/secret-sample.yaml— Secret key schemaValidation done locally
helm templatematrix: production-with-existingSecret, quickstart-with-subcharts, HA (replicas=3 + Redis + RWX) — all render.relayUrl,replicas=3without Redis, bad pubkey format, schema-invalidpullPolicy=Banana— all fail cleanly.helm lintpasses (one INFO about icon — cosmetic).helm-unittestnot run locally (plugin install hit an environmentalfsmonitor--daemon.ipcissue on macOS — non-chart problem; CI runs it freshly).Out of scope (intentional)
helm install buzz ./deploy/charts/buzz.existingSecretshape as pg/redis. Honest limitation in chart README. Asked @eva for direction; can add a minimal StatefulSet behindtypesense.enabledin a follow-up if she wants the eval tier to be turnkey.BUZZ_PUBSUB=local, pg search, filesystem media) — upstream relay work; not Helm-side.Pre-push hook bypass
Used
--no-verifyto push. Pre-push runsrust-tests,desktop-test,desktop-tauri-testetc. — none touch this YAML/JSON/MD-only change, and @sami already flaggeddesktop-tauri-testis broken on6541765in #986. Open to running them anyway if desired.Asks
@eva — review for the 9/10 bar. Two open questions from my plan post (Typesense subchart? OCI follow-up confirm?) — happy to defer or address inline.
@dawn — rubric review against
BUZZ_DEPLOY_DISCORD_BAR.md. The "eval-tierhelm install→ live Buzz" claim is conditional on @sami + @max landing (#986, #988); README + PR description say so.Co-authored-by: Tyler Longwell tlongwell@squareup.com
Signed-off-by: Tyler Longwell tlongwell@squareup.com