resolve volumes by stable name (logical→physical owned by GlideFS) + node-scoped recovery + ublk idle-spin fix by jaredLunde · Pull Request #81 · beyondoss/glidefs

jaredLunde · 2026-06-23T04:35:11Z

GlideFS now owns the logical→physical mapping: callers address volumes by stable name and never supply s3_prefix/manifest_name/snapshot_sequence. Three related changes:

1. Resolve-by-name logical API (`12a6fa7`)

- block/registry.rs: FromRef (image:/volume:/snapshot:), ResolvedSource, durable image + snapshot indexes (sibling of export.json).
- resolve_export + GET /api/resolve/{name} (reads export.json from S3, works on any node). resolve_source turns a logical `from` ref into physical coords feeding the existing fork machinery; the new volume lands in the source's pool for CoW. Re-attach-by-name on PUT when not held locally.
- PUT body is fully logical (CreateVolumeRequest{size_gb, from, …}); physical knobs removed.
- Snapshots return a stable snapshot_id + write the snapshot index; image index written by HTTP bless, `glidefs bless` CLI, promote-base, and tags; GET /api/images/{name}; lineage in ExportConfig::source.
- Physical S3 layout unchanged (no migration). Docs: ARCHITECTURE.md + README.md.
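
To illustrate the logical `from` ref, here is a minimal sketch of parsing the three prefixes named above (image:/volume:/snapshot:). The enum shape, the `parse_from_ref` function name, and the error strings are assumptions for illustration; the actual `FromRef` in block/registry.rs may differ.

```rust
/// Hypothetical mirror of the three logical source kinds named in the PR.
#[derive(Debug, PartialEq, Eq)]
enum FromRef {
    Image(String),
    Volume(String),
    Snapshot(String),
}

/// Parse a logical ref such as "image:base" into a typed value.
/// The function name and error type are illustrative assumptions.
fn parse_from_ref(s: &str) -> Result<FromRef, String> {
    // Split at the first ':' so the name itself may contain further colons.
    let (kind, name) = s
        .split_once(':')
        .ok_or_else(|| format!("missing ref kind in {:?}", s))?;
    if name.is_empty() {
        return Err(format!("empty name in {:?}", s));
    }
    match kind {
        "image" => Ok(FromRef::Image(name.to_string())),
        "volume" => Ok(FromRef::Volume(name.to_string())),
        "snapshot" => Ok(FromRef::Snapshot(name.to_string())),
        other => Err(format!("unknown ref kind {:?}", other)),
    }
}
```

The point of a typed ref is that a caller can only name a logical source; there is no variant through which an s3_prefix or manifest_name could be passed.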

2. Node-scoped boot recovery (`10484d3`)

discover_exports() listed every export.json under the single global {db_path}/exports/ prefix, so a node resurrected every export ever created on the shared bucket as a live ublk device (observed: 350+ devices when ~3 VMs existed). New discover_local_exports() recovers only the node's working set from the local device maps (ublk_devices.json/nbd_devices.json); everything else stays dormant in S3 and attaches on demand by name. Enabled by (1).
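
A rough sketch of the node-scoped selection, assuming the two device maps have already been read into lists of export names. The `recover_local` name, the closure standing in for load_export(), and the choice to skip names that fail to load are all assumptions, not the PR's code.

```rust
use std::collections::BTreeSet;

/// Recover only the exports named in the node-local device maps.
/// `load_export` is a stand-in closure (assumption) that returns `None`
/// when an export cannot be loaded; such names are skipped here.
fn recover_local<F>(ublk_map: &[&str], nbd_map: &[&str], mut load_export: F) -> Vec<String>
where
    F: FnMut(&str) -> Option<String>,
{
    // Union of both maps, deduplicated and ordered for stable output.
    let names: BTreeSet<&str> = ublk_map.iter().chain(nbd_map).copied().collect();
    // Nothing outside this set is touched, however many exports the bucket holds.
    names.into_iter().filter_map(|n| load_export(n)).collect()
}
```

With empty maps the result is empty, which matches the fresh-node case: nothing is recovered at boot and volumes attach by name when requested.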

3. ublk idle-worker spin fix (`e94b6a7`)

The worker loop used `to_wait = if all_done() { 0 } else { 1 }`, so a worker with no hosted queues busy-spun a full core. With N idle workers that burned ~N cores (observed ~1576% CPU with few/no devices). The eventfd watcher daemon already wakes blocked workers on queue assignment, so idle workers now always block (`to_wait=1`). Verified live: 1590%→0% CPU with 0 devices; 8MB direct-IO round-trip intact.
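
The block-until-woken pattern can be sketched with a standard-library channel standing in for the eventfd and io_uring_enter. This is an analogue only: `Cmd`, `run_worker`, and `WORKER_IDLE` are hypothetical names, and the 250 ms value is borrowed from the WORKER_IDLE_NSEC timeout mentioned in the commit message.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Idle timeout; 250 ms echoes the WORKER_IDLE_NSEC value cited in the commit.
const WORKER_IDLE: Duration = Duration::from_millis(250);

/// Hypothetical control messages, loosely modelled on AddQueue/Shutdown.
enum Cmd {
    AddQueue(u32),
    Shutdown,
}

/// An idle worker always blocks. A sent command wakes it at once, much as the
/// eventfd write unblocks the real worker. Returns (hosted queues, idle ticks).
fn run_worker(rx: Receiver<Cmd>) -> (Vec<u32>, u64) {
    let mut queues = Vec::new();
    let mut idle_ticks = 0u64;
    loop {
        // Blocking receive with a timeout: no CPU is spent while nothing arrives.
        match rx.recv_timeout(WORKER_IDLE) {
            Ok(Cmd::AddQueue(q)) => queues.push(q),
            Ok(Cmd::Shutdown) => break,
            Err(RecvTimeoutError::Timeout) => idle_ticks += 1,
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    (queues, idle_ticks)
}
```

The old behaviour corresponds to polling with a zero timeout whenever no queue is hosted, which returns immediately and spins; the sketch never takes that path.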

Testing

467 lib tests pass, clippy clean, all feature-gated targets compile.
Verified e2e on the homelab: VM boots via from:"image:…" fork; node-scoped recovery + detach behavior; worker idle CPU 0%.

Note: requires the coordinated instd cutover (separate beyond PR) — instd must speak the logical API.

🤖 Generated with Claude Code

GlideFS now owns the name→location mapping; callers address everything by logical name and never supply an s3_prefix/manifest_name/snapshot_sequence.

- block/registry.rs: FromRef (image:/volume:/snapshot:), ResolvedSource, durable image + snapshot indexes (sibling of export.json).
- router: resolve_export + GET /api/resolve/{name} (reads export.json from S3, works on any node); resolve_source turns a logical `from` ref into physical coords feeding the existing fork machinery; new volume lands in the source's pool for CoW. Re-attach-by-name on PUT when not held locally.
- api: PUT body is fully logical (CreateVolumeRequest{size_gb, from, ...}); physical knobs removed. create_or_attach_volume is the shared core.
- Phase 3: snapshot_export returns a stable snapshot_id + writes the snapshot index; image index written by HTTP bless, `glidefs bless` CLI, promote-base, and tags; GET /api/images/{name}; lineage in ExportConfig::source.
- Physical S3 layout unchanged (no migration); explicit-s3_prefix admin endpoints (manifests/profile HEAD/GET) retained for build-time use.
- Docs: ARCHITECTURE.md + README.md updated to the logical model.

465 lib tests pass, clippy clean, all feature-gated targets compile. Verified e2e on live ublk: blank→snapshot→fork-by-snapshot:→resolve.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…orking set

discover_exports() lists every export.json under the single global {db_path}/exports/ prefix, and boot re-attaches + binds a kernel ublk device for all of them. On a shared bucket a node thus resurrects every export ever created (deep-sleeping/detached/dead/other-node) as a live /dev/ublkbN: a restart re-attached 350+ devices when ~3 VMs existed.

Add discover_local_exports(): read the node-local device maps (cache_dir/ublk_devices.json + nbd_devices.json, rewritten on every device add/remove, so they name exactly the exports this node owns a device for) and load_export() only those. Swap the cold-start call (cli/server.rs).

Everything else a node doesn't hold locally stays dormant in S3 and attaches on demand by name, which resolve-by-name makes possible. A fresh node (no maps) recovers nothing and attaches by name when asked (dead-node recovery).

Verified live: boot recovers from the device map (not S3); a detached export (export.json present, dropped from the map) is not resurrected on restart but remains resolvable by name.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The worker loop used `to_wait = if all_done() { 0 } else { 1 }`, so a worker with no hosted queues passed to_wait=0 to io_uring_enter, which returned instantly and busy-looped a full core. With N idle workers (N = pool size − workers hosting queues) that burned ~N cores doing nothing. On a host running few VMs that's most of the pool (observed: ~1576% CPU with no/few devices), and the waste is inversely proportional to device count.

The eventfd watcher is a daemon task that keeps a PollAdd permanently armed on the worker's eventfd and re-arms forever, independent of any hosted queue. Every WorkerHandle::send (AddQueue/RemoveQueue/Shutdown) writes the eventfd, generating a PollAdd CQE that unblocks io_uring_enter immediately. So a queue-less worker has nothing to gain from busy-polling: no I/O can target it until a queue is assigned, and that assignment wakes it.

glidefs already blocks with a WORKER_IDLE_NSEC (250ms) timeout (it is not a busy-poll-for-latency design), so always blocking is strictly better. Channel close is still noticed within one idle tick.

Verified live: CPU 1590%→0% with 0 devices; 8MB direct-IO write/read round-trip intact (workers still service I/O); idles to 0% after I/O.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jaredLunde and others added 3 commits June 22, 2026 20:39

jaredLunde changed the title ~~Resolve volumes by stable name (logical→physical owned by GlideFS) + node-scoped recovery + ublk idle-spin fix~~ resolve volumes by stable name (logical→physical owned by GlideFS) + node-scoped recovery + ublk idle-spin fix Jun 23, 2026

chore: allow(dead_code) on retained discover_exports (boot uses disco…

3852c82

…ver_local_exports) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jaredLunde merged commit adad677 into main Jun 23, 2026
25 checks passed

jaredLunde deleted the jared/resolve-by-name branch June 23, 2026 05:01

jaredLunde merged 4 commits into main from jared/resolve-by-name
