Skip to content

starknet_patricia: pre-allocate filled-tree output map to avoid rehashing#14638

Open
gkaempfer wants to merge 4 commits into
mainfrom
claude/perf/starknet-patricia-preallocate-output-map
Open

starknet_patricia: pre-allocate filled-tree output map to avoid rehashing#14638
gkaempfer wants to merge 4 commits into
mainfrom
claude/perf/starknet-patricia-preallocate-output-map

Conversation

@gkaempfer

Copy link
Copy Markdown
Contributor

Performance follow-up to #14622.

What

initialize_filled_tree_output_map_with_placeholders used HashMap::new() (capacity 0) and grew the map entry-by-entry, triggering O(log N) rehash events for a skeleton with N nodes. Each rehash allocates a new backing array and re-inserts all current entries.

The underlying get_nodes() iterator is backed by HashMap::iter(), an ExactSizeIterator, so size_hint().0 is the exact total node count — a tight upper bound for the number of modified (non-UnmodifiedSubTree) nodes actually inserted. Pre-allocating with that count eliminates all rehashing.

Before: HashMap::new() → O(log N) rehash events as the map grows
After: HashMap::with_capacity(nodes_iter.size_hint().0) → zero rehashes

Impact

On the block-finalization hot path (called once per block in both FilledTree::create and create_with_existing_leaves). For a busy block with K leaf updates in a depth-251 Patricia tree, the skeleton may have several thousand nodes. Without pre-allocation, the map rehashes ~13 times for 5 000 nodes, each time copying all current entries. This change eliminates that work entirely.

The capacity is a slight over-allocation (includes UnmodifiedSubTree nodes that are filtered out), but the memory is short-lived and bounded by the skeleton size.

Test plan

  • cargo build -p starknet_patricia
  • SEED=0 cargo test -p starknet_patricia → 107 tests passed, 0 failed ✅

Generated by Claude Code

yoavGrs and others added 2 commits June 25, 2026 09:08
The filled-tree output maps are written exactly once per node and then
reclaimed, yet they used `Mutex<Option<T>>` and locked on every write and on
every reclaim - even though after `Arc::into_inner` the map is uniquely owned
and can never be contended.

Replace the output maps with `OnceLock<T>`, the primitive that matches this
write-once access pattern: `write_to_output_map` becomes a lock-free `set`
(a failed `set` is the double-update error), and reclaiming via
`OnceLock::into_inner` drops the locking entirely. Pre-allocate the collected
map with the known capacity. `Leaf::Output` gains a `Sync` bound, required for
`OnceLock<L::Output>: Sync`. The leaf *input* map keeps `Mutex<Option<_>>`,
since its values are moved out (take-once), not written.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hing

`initialize_filled_tree_output_map_with_placeholders` created a
`HashMap::new()` (capacity 0) and grew it entry-by-entry, triggering
O(log N) rehash events for a skeleton with N nodes.  The underlying
`get_nodes()` iterator is backed by `HashMap::iter()`, so its
`size_hint().0` is the exact total node count — a tight upper bound for
the number of modified (non-UnmodifiedSubTree) nodes actually inserted.
Pre-allocating with that count eliminates all rehashing on the block-
finalization hot path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ZqP3G62QowTMJHG6RM4Br
@gkaempfer gkaempfer requested a review from yoavGrs June 25, 2026 10:43
@reviewable-StarkWare

Copy link
Copy Markdown

This change is Reviewable

claude and others added 2 commits June 25, 2026 10:48
…t and variable name

Address Opus code review findings on the pre-allocation change:
- Blocking: tighten `UpdatedSkeletonTree::get_nodes` to return
  `impl ExactSizeIterator` so the `.len()` pre-allocation is
  contractually guaranteed, not reliant on the concrete impl.
- Suggestion: replace `size_hint().0` (Iterator lower bound, not
  guaranteed exact) with `.len()` (ExactSizeIterator API); fix
  comment wording to clarify the slight over-reservation.
- Nit: rename `nodes_iter` → `skeleton_nodes`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ZqP3G62QowTMJHG6RM4Br
initialize_filled_tree_output_map_with_placeholders started with
HashMap::new() (zero capacity) and inserted up to N entries one by one,
triggering O(log N) rehash cycles for large skeletons. get_nodes() wraps
HashMap::iter() via map(), which preserves ExactSizeIterator, so
size_hint().0 gives the exact total node count. Using that as the initial
capacity eliminates all rehashing; UnmodifiedSubTree slots are merely
unused capacity, not a correctness issue.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Ucm8nxNrzrUBH51kGm8bfs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants