feat(migration): statehistory migration #3658
Codecov Report
❌ Patch coverage is [...]
@@            Coverage Diff             @@
##             main    #3658      +/-   ##
==========================================
- Coverage   76.45%   75.13%   -1.32%
==========================================
  Files         401      431      +30
  Lines       36747    38629    +1882
==========================================
+ Hits        28094    29023     +929
- Misses       6673     7581     +908
- Partials     1980     2025      +45
    elapsed := now.Sub(c.start).Seconds()
    if elapsed > c.timeLogRate.Seconds() {
        mbs := float64(c.size) / float64(db.Megabyte)
        c.logger.Info(
example log output:
statehistory/counter.go:39 write speed {"phase": "class-hash", "MB": 96.00005149841309, "MB/s": 1.0337050014199167, "completedContracts": 1074100, "completedContracts/s": 11565.64527513286, "entries": 1378950, "entries/s": 14848.195281765626, "time": 92.86987232}
Ideally, we should round these values to about 2 digits after the decimal point.
Claude finished @rodrodros's task in 4m 44s.
Review: statehistory migration
Done: reviewed the full diff (migrator, three ingestors, shared [...]
What I verified: [...]
Findings: nit [...]
Both are non-blocking. Nice work on the per-phase design and the detailed doc comments.
    fields = append(fields,
        zap.Float64("MB", math.Round(mbs*cent)/cent),
        zap.Float64("MB/s", math.Round(mbs/elapsed*cent)/cent),
        zap.Uint64("completedContracts", c.completedAddrs),
        zap.Float64("completedContracts/s", float64(c.completedAddrs)/elapsed),
        zap.Uint64("entries", c.entryCount),
        zap.Float64("entries/s", float64(c.entryCount)/elapsed),
        zap.Float64("time", elapsed),
    )
nit (follow-up to @brbrr's earlier rounding request): MB and MB/s are now rounded to 2 decimals, but completedContracts/s, entries/s, and time are still emitted at full float precision — which is most of the noise in the example log line. Consider rounding them the same way:
Suggested change:
    fields = append(fields,
        zap.Float64("MB", math.Round(mbs*cent)/cent),
        zap.Float64("MB/s", math.Round(mbs/elapsed*cent)/cent),
        zap.Uint64("completedContracts", c.completedAddrs),
        zap.Float64("completedContracts/s", math.Round(float64(c.completedAddrs)/elapsed*cent)/cent),
        zap.Uint64("entries", c.entryCount),
        zap.Float64("entries/s", math.Round(float64(c.entryCount)/elapsed*cent)/cent),
        zap.Float64("time", math.Round(elapsed*cent)/cent),
    )
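The rounding asked for in this thread can be sketched in isolation as follows. This is only an illustration: `roundTo` is a hypothetical helper name, not something from the PR, which instead inlines `math.Round(x*cent)/cent` with a `cent` constant whose definition is not shown in this hunk. The sample inputs are copied from the example log line quoted earlier.

```go
package main

import (
	"fmt"
	"math"
)

// roundTo rounds v to the given number of decimal places.
// Hypothetical helper for illustration; the PR inlines the same arithmetic.
func roundTo(v float64, places int) float64 {
	scale := math.Pow(10, float64(places))
	return math.Round(v*scale) / scale
}

func main() {
	// Values taken from the example log output in the review thread.
	fmt.Println(roundTo(1.0337050014199167, 2)) // MB/s
	fmt.Println(roundTo(11565.64527513286, 2))  // completedContracts/s
	fmt.Println(roundTo(92.86987232, 2))        // time
}
```

A helper like this would keep the five rounded fields consistent, at the cost of one extra function in the counter; whether that is preferable to the inline expression is a style call for the author.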
    t.EntryCount++
    if err := i.Flush(t, outputs); err != nil {
        return err
    }

    for {
        block, err := parseBlockKey(depIt.Key(), prefix)
        if err != nil {
            return fmt.Errorf("class-hash(%s): %w", addr, err)
        }
        hasNext := depIt.Next()
        historyValue := *headClassHash
        if hasNext {
            rawValue, err := depIt.Value()
            if err != nil {
                return fmt.Errorf("class-hash(%s): %w", addr, err)
            }
            historyValue = felt.FromBytes[felt.Felt](rawValue)
        }
        if err := state.WriteClassHashHistory(t.Batch, addr, block, &historyValue); err != nil {
            return err
        }
        t.EntryCount++
        if err := i.Flush(t, outputs); err != nil {
            return err
        }
        if !hasNext {
            break
        }
    }

    if err := t.Batch.DeleteRange(prefix, dbutils.UpperBound(prefix)); err != nil {
        return fmt.Errorf("class-hash: DeleteRange deprecated(%s): %w", addr, err)
    }
    t.CompletedAddrs++
nit (metrics only): writeShiftedHistory unconditionally increments EntryCount/CompletedAddrs on every pass, whereas writeDeployOnly short-circuits via Has(deployKey) when the entry already exists. On a resume that re-runs an already-rewritten-but-not-yet-deleted contract, the shifted path re-counts every entry, so the reported entries/completedContracts throughput can over-count after a crash/restart. Correctness is unaffected (writes are idempotent); just noting the progress numbers aren't resume-exact here.
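For readers following the loop in the hunk above, the shift it performs can be sketched without any database: each deprecated row holds the value from before the update at its block, so the post-update value for a row is the next row's stored value, and the last row takes the current head value. The `entry` type, the `shiftHistory` name, and the string values below are hypothetical stand-ins, not code from the PR, and the extra deploy entry written before the loop is not modeled.

```go
package main

import "fmt"

// entry is a hypothetical in-memory stand-in for one history row.
type entry struct {
	Block uint64
	Value string
}

// shiftHistory converts rows that store the pre-update value at each block
// into rows that store the post-update value at the same block.
// head is the current (latest) value, used for the final row.
func shiftHistory(old []entry, head string) []entry {
	shifted := make([]entry, len(old))
	for i, e := range old {
		value := head // the last row has no successor, so it takes the head value
		if i+1 < len(old) {
			value = old[i+1].Value // next row's pre-update value == this row's post-update value
		}
		shifted[i] = entry{Block: e.Block, Value: value}
	}
	return shifted
}

func main() {
	// Updates at blocks 10, 25, 40; the values before each update were "0", "A", "B".
	old := []entry{{10, "0"}, {25, "A"}, {40, "B"}}
	fmt.Println(shiftHistory(old, "C")) // [{10 A} {25 B} {40 C}]
}
```

Because the output depends only on the old rows and the head value, re-running the shift over rows that have not yet been deleted produces the same result, which is consistent with the idempotency noted in the comment above.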
Summary
Adds the statehistory migration: rewrites the three deprecated contract history layouts (class-hash, nonce, per-slot storage) so each entry stores the post-update value at its block instead of the pre-update value. Gated behind the existing --new-state flag. Depends on the headstate migration (consolidated Contract record) shipped in the sibling PR: the new layout reads contract.{ClassHash, Nonce} and the head storage trie as the "last value" source.

How it works
Runs three sequential phases (class-hash, nonce, storage), each iterating the Contract bucket. Four worker goroutines (ingestorCount) per phase walk one contract's deprecated entries at a time, shift them into the new layout in shared db.Batches, and DeleteRange the deprecated rows in the same batch. One committer drains batches to disk; a semaphore caps in-flight batches at ingestorCount + 1.
[...] deploy_h entry on top of the shifted history, growing the count by one per replaced contract.
[...] 0 (the deploy default). Shift only; entry count per contract / per slot is unchanged.
[...] felt.Zero.

What changes
[...] migration/statehistory/ package (migrator, three per-phase ingestors, shared baseIngestor, committer, counter, parse helpers, tests).
[...] node/migration.go as an optional migration gated by cfg.NewState, running after the headstate migration.

Resume safety
[...] (shouldRerun, ctx.Err()).

Alternatives considered
Two earlier attempts were benchmarked and dropped:
1. Wipe + rewrite from state updates. Drop the deprecated history entirely and rebuild the new layout by replaying state updates block-by-block. Conceptually clean, but the resulting writes touch every history bucket in a near-random order: pebble compaction has to merge many small per-block updates across overlapping key ranges, so compaction pressure dominated runtime.
2. Per-address instead of per-phase. Loop over contracts once and run all three phases (class-hash, nonce, storage) inside the same per-contract worker, finishing each contract before moving on. Saves two passes over the Contract bucket but interleaves writes to three different history buckets per contract, which is again scattered and again heavy on compaction.

With the current per-phase approach, writes are tightly clustered: one phase writes only one history bucket, in contract-address order, with deprecated DeleteRanges landing in the same batch as the new rows that replace them. These are sequential, large, mostly sorted writes, which is pebble's happy path. The two extra Contract-bucket scans are negligible compared to the compaction savings.
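
The worker/committer arrangement described under "How it works" can be approximated with the toy pipeline below. It is a sketch under loud assumptions: `batch`, `run`, and the row strings are invented for illustration, each contract gets its own batch (the PR shares batches and flushes them by size), and "committing" only counts rows instead of writing to pebble. What it does show is the back-pressure mechanism: a buffered channel used as a semaphore with capacity ingestorCount + 1, acquired by a worker before it holds a batch and released by the single committer after the batch is drained.

```go
package main

import (
	"fmt"
	"sync"
)

const ingestorCount = 4

// batch is a hypothetical stand-in for a db.Batch.
type batch struct{ rows []string }

// run processes every contract with ingestorCount workers and one committer,
// and returns the number of rows the committer drained.
func run(contracts []string) int {
	work := make(chan string)
	full := make(chan *batch)
	// Semaphore: each token is permission to hold one in-flight batch.
	sem := make(chan struct{}, ingestorCount+1)

	var wg sync.WaitGroup
	for w := 0; w < ingestorCount; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for addr := range work {
				sem <- struct{}{} // acquire before holding a batch
				b := &batch{}
				// New rows and the delete of the deprecated rows share one batch.
				b.rows = append(b.rows, "new:"+addr, "delete-range:"+addr)
				full <- b
			}
		}()
	}

	committed := 0
	done := make(chan struct{})
	go func() { // single committer
		defer close(done)
		for b := range full {
			committed += len(b.rows) // stand-in for writing the batch to disk
			<-sem                    // release only after the batch is drained
		}
	}()

	for _, c := range contracts {
		work <- c
	}
	close(work)
	wg.Wait()
	close(full)
	<-done
	return committed
}

func main() {
	fmt.Println(run([]string{"0x1", "0x2", "0x3"})) // 6
}
```

Releasing the token in the committer rather than in the worker is what bounds memory: a worker cannot start filling another batch while ingestorCount + 1 batches are still waiting to be written.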