beyond-pg-init: report PSI memory pressure to the host over vsock #7

Merged: jaredLunde merged 1 commit into main from psi-memory-pressure-reporting on Jun 21, 2026

@jaredLunde

Contributor

Why

The host memory controller (instd) right-sizes each VM's balloon/hotplug from a guest distress signal. Its sharpest signal is Linux PSI (/proc/pressure/memory), which measures memory-stall time directly. PSI therefore catches the buffered read() cache misses a database generates, which the previous major_faults signal misses. That is exactly the thrash a Postgres VM hits when its page cache is squeezed; in one real incident, a Postgres VM pinned both vCPUs at 99% for 12h.

But this primitive never reported any guest resource stats over vsock, so the controller was flying blind on the workload that needs it most.

What

  • Add a periodic (30s) GuestResourceStats (0xA2) report carrying PSI some.avg10 / full.avg10, multiplexed onto the existing substrate connection alongside heartbeats and log relay.
  • PSI-only (omits disk_total_bytes) so the host skips disk billing for this report — Postgres disk usage isn't tracked here.
  • Byte-compatible with instd's vsock_protocol::GuestResourceStatsPayload decoder. resource_stats_frame_is_stable pins the wire format the same way ready_frame_is_stable does, so it can't silently drift.
  • A pure parse_memory_pressure() with a unit test; read_memory_pressure() returns None when PSI is unavailable.
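
As a rough illustration of the pure parser mentioned in the last bullet, here is a minimal sketch. The function name comes from this PR, but the signature, the MemoryPressure struct, and its field names are assumptions, not the code that was merged. It relies only on the documented /proc/pressure/memory format: one "some" line and one "full" line, each carrying avg10=, avg60=, avg300= and total= fields.

```rust
/// PSI averages over the last 10 seconds, as percentages.
/// Struct and field names are hypothetical, chosen for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MemoryPressure {
    pub some_avg10: f32,
    pub full_avg10: f32,
}

/// Parse the text of /proc/pressure/memory.
/// Returns None if either line or its avg10 field is missing or malformed.
pub fn parse_memory_pressure(text: &str) -> Option<MemoryPressure> {
    let mut some = None;
    let mut full = None;
    for line in text.lines() {
        let mut fields = line.split_whitespace();
        // First token is the line kind: "some" or "full".
        let kind = fields.next();
        // Remaining tokens are key=value pairs; pick out avg10.
        let avg10 = fields
            .find_map(|f| f.strip_prefix("avg10="))
            .and_then(|v| v.parse::<f32>().ok());
        match kind {
            Some("some") => some = avg10,
            Some("full") => full = avg10,
            _ => {} // ignore unknown lines
        }
    }
    Some(MemoryPressure {
        some_avg10: some?,
        full_avg10: full?,
    })
}
```

Keeping the parser free of I/O is what makes it unit-testable with a string literal, which is presumably what parses_psi_memory exercises.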

Companion change

The host side (wire field, shared collector for the in-repo primitives, instd ingestion, and the controller treating PSI as the primary distress signal) lands in the beyond repo. instd sets psi=1 on the guest kernel cmdline (the kernel already ships CONFIG_PSI=y), so no kernel rebuild is needed. If PSI is unavailable the reporter sends nothing and the controller falls back to its balloon-stat signals.
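
The fallback described above (send nothing when PSI is unavailable) can be sketched as follows. This is an assumption-laden illustration, not the merged code: read_memory_pressure_from, avg10_of, and report_tick are hypothetical names, the path is a parameter so the sketch can be exercised without a real /proc, and the send callback stands in for whatever writes the frame to the substrate connection.

```rust
use std::fs;
use std::path::Path;

/// Extract the avg10 value from the PSI line starting with `kind`
/// ("some" or "full"). Hypothetical helper for this sketch.
fn avg10_of(text: &str, kind: &str) -> Option<f32> {
    text.lines()
        .find(|l| l.split_whitespace().next() == Some(kind))?
        .split_whitespace()
        .find_map(|f| f.strip_prefix("avg10="))?
        .parse::<f32>()
        .ok()
}

/// Returns (some_avg10, full_avg10), or None when the file cannot be read
/// (for example, PSI is disabled) or does not parse.
pub fn read_memory_pressure_from(path: &Path) -> Option<(f32, f32)> {
    let text = fs::read_to_string(path).ok()?;
    Some((avg10_of(&text, "some")?, avg10_of(&text, "full")?))
}

/// One reporter tick: call `send` only if a reading is available.
/// Returns whether a report was sent, so the caller can stay silent otherwise.
pub fn report_tick<F: FnMut(f32, f32)>(path: &Path, mut send: F) -> bool {
    match read_memory_pressure_from(path) {
        Some((some, full)) => {
            send(some, full);
            true
        }
        None => false, // PSI unavailable: send nothing, host falls back
    }
}
```

In the real reporter this tick would run every 30 seconds against /proc/pressure/memory; a missing or unreadable file simply results in no frame being sent.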

Test

cargo test -p beyond-pg-init substrate: ready_frame_is_stable, resource_stats_frame_is_stable, and parses_psi_memory all pass.
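
For readers unfamiliar with the "frame is stable" style of test, the idea is to compare the encoder's output against hard-coded golden bytes, so any accidental change to the wire format fails the test. The sketch below shows the technique only. The real GuestResourceStatsPayload layout is not shown in this PR description, so the layout here (a 0xA2 tag byte followed by two little-endian f32 values) and the names encode_resource_stats and GOLDEN_FRAME are purely hypothetical.

```rust
/// Message type tag for GuestResourceStats (0xA2, per the PR description).
pub const MSG_GUEST_RESOURCE_STATS: u8 = 0xA2;

/// HYPOTHETICAL layout, for illustration only:
/// [tag: u8][some_avg10: f32 little-endian][full_avg10: f32 little-endian]
pub fn encode_resource_stats(some_avg10: f32, full_avg10: f32) -> Vec<u8> {
    let mut frame = Vec::with_capacity(9);
    frame.push(MSG_GUEST_RESOURCE_STATS);
    frame.extend_from_slice(&some_avg10.to_le_bytes());
    frame.extend_from_slice(&full_avg10.to_le_bytes());
    frame
}

/// Golden bytes for encode_resource_stats(1.5, 0.25) under the layout above:
/// 1.5f32 is 0x3FC00000 and 0.25f32 is 0x3E800000, both stored little-endian.
pub const GOLDEN_FRAME: [u8; 9] = [
    0xA2, // tag
    0x00, 0x00, 0xC0, 0x3F, // some_avg10 = 1.5
    0x00, 0x00, 0x80, 0x3E, // full_avg10 = 0.25
];

/// Fails if the encoder's output ever drifts from the pinned bytes.
#[test]
fn resource_stats_frame_is_stable() {
    assert_eq!(encode_resource_stats(1.5, 0.25), GOLDEN_FRAME.to_vec());
}
```

Pinning exact bytes rather than round-tripping through a decoder is the design choice that matters: a round-trip test still passes when encoder and decoder change together, while a golden-bytes test catches any divergence from what the host already expects.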

🤖 Generated with Claude Code

The host memory controller (instd) right-sizes each VM's balloon/hotplug from a
distress signal. Its sharpest signal is Linux PSI (/proc/pressure/memory), which
measures memory-stall time directly and catches the buffered read() cache misses
a database generates — exactly the thrash a Postgres VM hits when its page cache
is squeezed. But this primitive never reported any guest resource stats, so the
controller was flying blind on the workload that needs it most.

Add a periodic (30s) GuestResourceStats (0xA2) report carrying PSI some/full
avg10, multiplexed onto the existing substrate connection alongside heartbeats
and log relay. We send PSI only (disk_total omitted) so the host skips disk
billing for this report. The frame is byte-compatible with instd's
vsock_protocol::GuestResourceStatsPayload decoder; resource_stats_frame_is_stable
pins the wire format the same way ready_frame_is_stable does, so it can't drift.

Requires the guest kernel booted with psi=1 (instd sets this on the cmdline);
if PSI is unavailable the reporter simply sends nothing and the controller falls
back to its balloon-stat signals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jaredLunde jaredLunde merged commit a1a5d9f into main Jun 21, 2026
1 check passed
@jaredLunde jaredLunde deleted the psi-memory-pressure-reporting branch June 21, 2026 19:05