OAK-12280: Provide a way to reconcile the offline GC result back #2977

Open
royteeuwen wants to merge 1 commit into apache:trunk from royteeuwen:fix/ghost-blob-gc-datastore-exception

royteeuwen commented Jun 25, 2026

The online GC uses a blob ID tracker, which the offline GC run command should reconcile after it deletes blobs. In addition, if the tracker has not been reconciled, the online GC should still remove blob IDs that no longer exist in the datastore from the tracker.

royteeuwen changed the title from "OAK-12280: Provide a way to reconcile the offline GC result back to t…" to "OAK-12280: Provide a way to reconcile the offline GC result back" on Jun 25, 2026

After an offline `datastore --collect-garbage` (non mark-only) run, the
BlobIdTracker used by online GC is now reconciled automatically: blob IDs
deleted by this run are removed from the local tracker files and the
SharedDataStore snapshot (the `*.refs` record). This stops online GC from
repeatedly logging `DataStoreException: Record ... does not exist` for ghost
blobs and failing to prune them from the tracker.
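
The reconciliation step described above can be illustrated with a minimal sketch. It is not the code from this PR: it assumes, purely for illustration, that a tracker file is a plain text file with one blob ID per line, and `TrackerReconcileSketch` and `removeDeleted` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Oak: models a tracker file as one blob ID per line.
final class TrackerReconcileSketch {

    /**
     * Rewrites trackerFile without the IDs in deletedIds.
     * Returns the number of lines that were dropped.
     */
    static int removeDeleted(Path trackerFile, Set<String> deletedIds) throws IOException {
        List<String> all = Files.readAllLines(trackerFile);
        // Keep only IDs the offline GC run did not delete.
        List<String> kept = all.stream()
                .filter(id -> !deletedIds.contains(id))
                .collect(Collectors.toList());
        Files.write(trackerFile, kept);
        return all.size() - kept.size();
    }
}
```

IDs in the deleted set that the tracker never contained are simply ignored, so under this assumption running the step twice is harmless.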

The online sweep is also made tolerant of ghost blobs: if `countDeleteChunk`
returns -1, or a DataStoreException is caught, the id is removed from the
tracker instead of failing the sweep, so pre-existing stale entries get
cleaned up.
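
A rough sketch of the tolerant sweep follows. The `ChunkDeleter` interface and `StoreException` class are hypothetical stand-ins for the blob store call and `DataStoreException`, not Oak's actual API. Treating a return value of 0 as "not deleted, keep tracking" is also an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the PR's code.
final class TolerantSweepSketch {

    // Stand-in for DataStoreException.
    static final class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    // Stand-in for the delete call: returns the number of chunks deleted,
    // or -1 when the record does not exist (a ghost blob).
    interface ChunkDeleter {
        long delete(String id) throws StoreException;
    }

    /** Returns the IDs that should be pruned from the tracker after the sweep. */
    static List<String> sweep(List<String> candidates, ChunkDeleter deleter) {
        List<String> toPrune = new ArrayList<>();
        for (String id : candidates) {
            try {
                long n = deleter.delete(id);
                // Prune both real deletions (n > 0) and ghosts (n == -1).
                // Assumption: n == 0 means nothing was deleted, so the ID stays tracked.
                if (n > 0 || n == -1) {
                    toPrune.add(id);
                }
            } catch (StoreException e) {
                // A missing record no longer aborts the sweep; the stale ID is pruned.
                toPrune.add(id);
            }
        }
        return toPrune;
    }
}
```

The point of the sketch is that a missing record is handled per ID, so one ghost entry cannot stop the remaining candidates from being processed.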

The tracker lives at `<repository-home>/blobids`. The repository home is derived
from the segment store path (its parent) and can be overridden with the new
optional `--repository-home`. This replaces the earlier `--blobids-path` option,
which pointed one level too deep: `BlobIdTracker.build()` appends "blobids", so
passing the blobids dir resolved to `<...>/blobids/blobids` and left the live
local tracker untouched (ghosts returned after a restart).
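
The path resolution can be sketched with plain `java.nio.file` calls. `TrackerPathSketch` and `trackerDir` are hypothetical names, and the `/repo/...` paths in the usage note are made up. The sketch only mirrors the rule stated above (parent of the segment store unless overridden, then append "blobids").

```java
import java.nio.file.Path;

// Hypothetical sketch of the path rule described above, not Oak's code.
final class TrackerPathSketch {

    /**
     * Resolves the tracker directory. repositoryHomeOverride may be null,
     * in which case the parent of the segment store path is used.
     */
    static Path trackerDir(Path segmentStore, Path repositoryHomeOverride) {
        Path home = repositoryHomeOverride != null
                ? repositoryHomeOverride
                : segmentStore.toAbsolutePath().getParent();
        if (home == null) {
            throw new IllegalArgumentException("segment store path has no parent: " + segmentStore);
        }
        // "blobids" is appended here, so callers must pass the repository home,
        // not the blobids directory itself.
        return home.resolve("blobids");
    }
}
```

With a segment store at `/repo/segmentstore` and no override, this yields `/repo/blobids`. Passing `/repo/blobids` as the override reproduces the old mistake and yields `/repo/blobids/blobids`.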

royteeuwen force-pushed the fix/ghost-blob-gc-datastore-exception branch from 6f3a459 to 75c873c on June 25, 2026 18:00