beyond pg init platform integration #3
Merged
…urable-volume fixes

beyond-pg-init now integrates with instd's VM lifecycle and runs Postgres on a fresh durable GlideFS volume end-to-end. Proven on the homelab: a primary boots, real SQL (CREATE/INSERT/SELECT) round-trips and persists on the data volume, and a snapshot+CoW-fork yields a write-divergent branch of production (verified twice, green).

Platform integration (beyond-pg-init stays PID 1, per the fidelity-first design):
- substrate.rs: do instd's guest_runtime Ready handshake on a dedicated tokio thread so `service.create.completed` fires (instd waits for "guest ready").
- volumes.rs: mount data volumes from MMDS at their mount_path (mirrors guest-init), so /var/lib/postgresql is the durable GlideFS volume.
- bootsetup.rs: the MMDS HTTP client now reads Content-Length bytes instead of calling read_to_end (Firecracker MMDS holds the connection open -> read_to_end timed out -> EAGAIN -> fail-closed on POSTGRES_PASSWORD -> kernel panic). Also adds a connect_timeout and a generous retry budget.

Durable-volume / noble-image fixes (beyond-pg):
- pg.rs/boot.rs: run runtime initdb as the postgres user + chown the fresh volume tree + make the pwfile postgres-readable (initdb refuses root; the volume mounts root-owned and empty over the image's PGDATA).
- boot.rs: ensure_socket_dir() creates /run/postgresql (no systemd-tmpfiles here).
- tls.rs: emit Ed25519 server key as PKCS#8 v1 (OpenSSL 3.0.13 on ubuntu-noble can't decode rcgen's PKCS#8 v2; key/perf unchanged, v2 only appends the pubkey).
- config.rs: filter shared_preload_libraries to installed .so's; supervisor.rs: downgrade missing REQUIRED_EXTENSIONS to a warning (auth/queue ext are a future milestone, pgdg pins dropped) so the standalone primitive boots.
- pg_hba.conf: local trust for the co-located `pgbouncer` auth_user (PASSWORD NULL + peer mapped OS postgres -> DB pgbouncer -> "Peer authentication failed").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
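The bootsetup.rs fix above (read exactly Content-Length bytes rather than waiting for EOF) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: the function name `read_http_body` is hypothetical, and it assumes a plain response with a `Content-Length` header and no chunked transfer encoding.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical helper: reads one HTTP response body by honoring
/// Content-Length, so a server that keeps the connection open cannot
/// stall the read the way `read_to_end` would.
fn read_http_body<R: Read>(stream: R) -> io::Result<Vec<u8>> {
    let mut reader = BufReader::new(stream);
    let mut content_length: Option<usize> = None;
    let mut line = String::new();

    // The status line and headers end at the first empty line.
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "EOF before end of headers",
            ));
        }
        let trimmed = line.trim_end();
        if trimmed.is_empty() {
            break;
        }
        if let Some((name, value)) = trimmed.split_once(':') {
            if name.eq_ignore_ascii_case("content-length") {
                let n = value
                    .trim()
                    .parse::<usize>()
                    .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
                content_length = Some(n);
            }
        }
    }

    // Assumption: the response always carries Content-Length.
    let len = content_length.ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "missing Content-Length")
    })?;

    // Read exactly `len` bytes and stop; never wait for the peer to close.
    let mut body = vec![0u8; len];
    reader.read_exact(&mut body)?;
    Ok(body)
}
```

On a real socket this would sit behind a connect timeout and a retry loop, as the commit message describes; those parts are omitted here.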
The test wrote the pre-backup rows and called pg_switch_wal() guarded only by a non-fatal 30s wait for the sink to appear in pg_stat_replication. Under CI load the sink connected slowly, the wait fell through silently, and the pre-backup rows were written before the sink began streaming. The sink therefore started from a later LSN and never archived the pre-backup segment, panicking with the misleading "pre-backup segment never archived".

Make the connect-wait fatal and extend it to 60s so the sink is provably streaming before the pre-backup rows are written, closing the race. Bump the seal deadline 30s -> 60s to match the sibling wal_sink_crash test.

Test-only change; no product code touched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
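The difference between a non-fatal and a fatal wait can be sketched as a polling helper that returns an error on timeout, so the caller has to fail instead of falling through. This is an illustrative sketch, not the test code from this repository: `wait_until` and its parameters are hypothetical names, and the readiness check is passed in as a closure.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `ready` until it returns true or `timeout`
/// elapses. A timeout is reported as Err so the caller cannot silently
/// continue past an unmet precondition.
fn wait_until<F>(what: &str, timeout: Duration, poll: Duration, mut ready: F) -> Result<(), String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if ready() {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!("timed out after {:?} waiting for {}", timeout, what));
        }
        thread::sleep(poll);
    }
}
```

In a test, the result would be unwrapped (for example with `.expect("sink must be streaming before pre-backup rows")`) with a 60s timeout, and the closure would check whether the sink shows up in pg_stat_replication. That query is left out because its exact form is not shown in this PR.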
… beyond path dep
The `guest-runtime = { path = "../../beyond/rustlib/guest-runtime" }` dep pointed
at a sibling Beyond checkout — fine locally, but CI clones only beyondoss/postgres
so cargo couldn't resolve it (+ transitive vsock-protocol/guest-session/wire) and
the build failed.
It never needed Beyond's code: the guest-ready contract is one self-described
vsock frame ([len u32 BE][type 0x81 Ready][msgpack ReadyPayload]) on cid=2 port=52.
substrate.rs now sends it directly using crates this repo already has (tokio-vsock,
rmp-serde) — zero Beyond deps, builds standalone. The wire layout is pinned by a
fixture test (ready_frame_is_stable) so it can't silently drift from instd's
rustlib/vsock-protocol decoder.
Verified end-to-end on the homelab: create.completed fires and the full
smoke-postgres proof (primary SQL + fork-branch) passes green. Cargo.lock no
longer references the beyond workspace.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
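The frame layout described above ([len u32 BE][type 0x81 Ready][msgpack payload]) can be sketched with the standard library alone. This is a hedged illustration, not the code in substrate.rs: `encode_frame` is a hypothetical name, the payload is taken as already-encoded msgpack bytes, and the sketch assumes the length prefix counts the type byte plus the payload, which the commit message does not state.

```rust
/// Frame type byte for Ready, as given in the commit message.
const FRAME_TYPE_READY: u8 = 0x81;

/// Hypothetical encoder for a [len u32 BE][type][payload] frame.
///
/// ASSUMPTION: `len` covers the type byte plus the payload. The real
/// decoder may define it differently (for example, payload only).
fn encode_frame(frame_type: u8, payload: &[u8]) -> Result<Vec<u8>, String> {
    let len = u32::try_from(payload.len() + 1)
        .map_err(|_| "payload too large for a u32 length prefix".to_string())?;
    let mut out = Vec::with_capacity(4 + 1 + payload.len());
    out.extend_from_slice(&len.to_be_bytes()); // big-endian length prefix
    out.push(frame_type);
    out.extend_from_slice(payload);
    Ok(out)
}
```

A fixture test in the spirit of ready_frame_is_stable would compare the encoder's output against a hard-coded byte array, so any change to the layout fails the build. With a placeholder payload of `0x80` (an empty msgpack map, standing in for the real ReadyPayload, whose fields are not shown here), the assumed layout yields the bytes `00 00 00 02 81 80`.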