Summary
fork/clone uses a copy-on-write fast path for native arm64 guests but falls back to a full region-by-region guest-memory copy for x86 (Rosetta) guests. Every fork in an x86 OCI image therefore pays an O(resident memory) copy, while the same image on arm64 does not.
Surfaced while reviewing OCI image support (#31, PR #34), where x86 images run via Rosetta and container userspace (shells, build systems, package managers) is fork-heavy.
Details
The CoW fast path is gated off for Rosetta guests:
// src/runtime/forkipc.c:1253
bool use_shm = (g->shm_fd >= 0) && !g->is_rosetta;
with the rationale documented inline (forkipc.c:1248-1251):
Rosetta guests are excluded from CoW even when shm-backed: rosetta's JIT state (TLS slabs, code caches, indirect-call tables, block lists) is process-local and corrupts when COW-shared. The legacy region-copy path preserves the parent's JIT state independently per child.
When use_shm is false, the child receives the guest RAM by copying every used region over the IPC socket in 1 MiB chunks:
// src/runtime/fork-state.c:170-189 (fork_ipc_send_memory_regions, use_shm == false)
uint8_t *src = (uint8_t *) g->host_base + used[i].offset;
size_t remaining = used[i].size;
while (remaining > 0) { ... fork_ipc_write_all(ipc_sock, src, chunk); ... }
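The elided loop above can be illustrated with a minimal, self-contained sketch. It is not the code from fork-state.c: `write_all` is a hypothetical stand-in for `fork_ipc_write_all`, `send_region_chunked` and `demo_copy` are invented names, and a temporary file replaces the IPC socket so the example runs without a peer process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define COPY_CHUNK ((size_t)1 << 20) /* 1 MiB, the chunk size described above */

/* Hypothetical stand-in for fork_ipc_write_all: keep writing until every
 * byte is out, retrying on EINTR and short writes. 0 on success, -1 on error. */
static int write_all(int fd, const uint8_t *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one region in chunks of at most COPY_CHUNK bytes.
 * Returns the number of chunks written, or -1 on error. */
static long send_region_chunked(int fd, const uint8_t *src, size_t size)
{
    long chunks = 0;
    size_t remaining = size;
    while (remaining > 0) {
        size_t chunk = remaining < COPY_CHUNK ? remaining : COPY_CHUNK;
        if (write_all(fd, src, chunk) != 0)
            return -1;
        src += chunk;
        remaining -= chunk;
        chunks++;
    }
    return chunks;
}

/* Demo: copy a zero-filled region of `size` bytes into an anonymous temp
 * file (standing in for the socket) and report how many bytes arrived. */
static long demo_copy(size_t size, long *bytes_out)
{
    assert(bytes_out != NULL);
    uint8_t *region = calloc(size ? size : 1, 1);
    FILE *tmp = tmpfile();
    if (!region || !tmp) {
        free(region);
        if (tmp)
            fclose(tmp);
        return -1;
    }
    long chunks = send_region_chunked(fileno(tmp), region, size);
    *bytes_out = (long)lseek(fileno(tmp), 0, SEEK_END);
    fclose(tmp);
    free(region);
    return chunks;
}
```

Every byte of every used region passes through `write`, which is why the cost of this path grows with resident memory.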
The native arm64 path instead sends only the shm_fd (via SCM_RIGHTS); the child maps it MAP_PRIVATE for an instant CoW snapshot and zero bytes of RAM cross the socket (forkipc.c:1253-1268, fork-state.c:153-157).
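The CoW behaviour described above can be sketched in isolation. This is an illustration under stated assumptions, not the runtime's code: a temporary file stands in for the shm-backed guest RAM, both mappings live in one process, and the SCM_RIGHTS fd transfer is omitted. `cow_snapshot_isolated` is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a write through a MAP_PRIVATE mapping stays invisible to the
 * MAP_SHARED mapping of the same fd, 0 if it leaks, -1 on setup failure. */
static int cow_snapshot_isolated(void)
{
    const size_t len = 16384;
    FILE *backing = tmpfile(); /* stand-in for the guest RAM shm fd */
    if (!backing)
        return -1;
    int fd = fileno(backing);
    if (ftruncate(fd, (off_t)len) != 0) {
        fclose(backing);
        return -1;
    }

    /* "Parent" view: writes go through to the backing object. */
    char *parent = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (parent == MAP_FAILED) {
        fclose(backing);
        return -1;
    }
    strcpy(parent, "parent");

    /* "Child" view: no bytes are copied up front; a page is duplicated
     * only when the child first writes to it. */
    char *child = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (child == MAP_FAILED) {
        munmap(parent, len);
        fclose(backing);
        return -1;
    }

    int saw_parent_data = strcmp(child, "parent") == 0;
    strcpy(child, "child"); /* triggers a private copy of this page */
    int parent_unchanged = strcmp(parent, "parent") == 0;

    munmap(child, len);
    munmap(parent, len);
    fclose(backing);
    return saw_parent_data && parent_unchanged;
}
```

The sketch only demonstrates isolation of the child's writes. POSIX leaves it unspecified whether later writes to the underlying object become visible in private pages the child has not yet touched, so how the runtime keeps the snapshot stable on the parent side is outside what this example shows.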
Impact
- x86 OCI images run via Rosetta hit a per-fork cost proportional to resident memory, so fork-heavy workloads (e.g. sh -c pipelines, ./configure, make, apk/apt) degrade sharply relative to arm64 images.
- This is a correctness-driven trade-off, not an outright bug: CoW-sharing Rosetta's process-local JIT state would corrupt the translator. The goal of this issue is to track the limitation and possible optimizations.
Possible directions
- Selective CoW: CoW-share the non-JIT guest regions and copy only the Rosetta JIT-owned regions per child, if those regions can be identified cheaply.
- Document the limitation in the usage/compat docs so users know x86 images carry a fork-cost penalty under Rosetta.
- Measure the actual cost on a representative fork-heavy x86 image to decide whether selective CoW is worth the complexity.
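To make the selective-CoW direction concrete, here is a rough sketch of how a fork could be split into copied and shared bytes. Everything in it is hypothetical: the `jit_owned` flag assumes the Rosetta JIT regions can be identified, which is exactly the open question, and `struct region`, `struct fork_plan`, and `plan_selective_cow` are invented names rather than anything in the runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region descriptor. `jit_owned` is an assumed flag; nothing
 * here says how JIT-owned regions would actually be detected. */
struct region {
    size_t offset;
    size_t size;
    bool jit_owned;
};

/* Bytes that would still be copied per child vs. bytes that could be
 * CoW-shared. */
struct fork_plan {
    size_t copied_bytes;
    size_t shared_bytes;
};

/* Classify each used region: JIT-owned regions stay on the copy path,
 * everything else becomes eligible for CoW sharing. */
static struct fork_plan plan_selective_cow(const struct region *used, size_t n)
{
    assert(used != NULL || n == 0);
    struct fork_plan plan = {0, 0};
    for (size_t i = 0; i < n; i++) {
        if (used[i].jit_owned)
            plan.copied_bytes += used[i].size;
        else
            plan.shared_bytes += used[i].size;
    }
    return plan;
}
```

Comparing `copied_bytes` with the total on a real workload would give a first estimate of how much of the per-fork copy this direction could remove.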
Refs #31, PR #34.