apollo_l1_events,apollo_l1_events_config: chunk L1 getLogs range into bounded windows (M-25) by asaf-sw · Pull Request #14602 · starkware-libs/sequencer

asaf-sw · 2026-06-23T09:51:17Z

M-25 — Chunk the L1 getLogs range into bounded windows

Security review finding: M-25 (Low–Medium). The L1 events scraper built the inclusive range scraping_start..=latest_l1_block and asked the base layer for all tracked events across that entire range in a single events() / eth_getLogs call, with no per-iteration cap and no chunking (a "// If this gets too high, send in batches." comment acknowledged the gap). After downtime, an L1 outage, or a large startup rewind, that range can span thousands of L1 blocks. The dominant failure mode is a liveness stall: the single call is wrapped in a 1 s timeout and comes with an N+1 per-event header fetch, so an oversized range times out and the scraper spins in its retry loop without ever advancing. Via the cyclic wrapper, the stall also misfires the primary-down-since signal. Unbounded Vec<L1Event> / Vec<Event> materialization is a secondary memory risk.

What changed

New config param l1_events_scraper_config.max_blocks_per_fetch: u64 (apollo_l1_events_config), default 1000, validated >= 1.
Bounded fetch window (apollo_l1_events/src/l1_scraper.rs): fetch_events now requests scraping_start..=min(latest, scraping_start + max_blocks_per_fetch - 1) and returns the L1BlockReference for the window end, so the cursor advances by exactly one window per poll and the steady-state loop drains a backlog over successive polls. The finality ceiling and the once-per-poll reorg check are preserved. initialize sends the first bounded window via the provider's initialize() and lets the steady loop drain the rest via add_events, so the provider's initialize-once contract is unchanged. Saturating arithmetic guards against a 0 cap even though validation rejects it.
Deployment presets updated: l1_events_scraper_config.max_blocks_per_fetch: 1000 was added to the hand-maintained l1_events_scraper_config.json and replacer_l1_events_scraper_config.json presets, and config_schema.json was regenerated. A new required schema param that is missing from the presets would panic the deployed node at startup with MissingParam (CrashLoopBackOff in system_test_hybrid), which is exactly the bug that L-16 (apollo_l1_events: bound catch-up commit-block backlog with a cap and metric #14590) hit.
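
The window arithmetic described above can be sketched as follows. This is a minimal illustration and not the code in l1_scraper.rs: next_window and drain_backlog are hypothetical helpers, the cursor is assumed to resume at the block right after the previous window end, and treating a 0 cap as a one-block window is an assumption about how the saturating guard behaves.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: the inclusive block range for one bounded fetch.
/// Returns None when there is nothing to fetch (start is past `latest`).
fn next_window(
    scraping_start: u64,
    latest: u64,
    max_blocks_per_fetch: u64,
) -> Option<RangeInclusive<u64>> {
    if scraping_start > latest {
        return None;
    }
    // scraping_start + cap - 1, written so that neither a 0 cap nor a start
    // near u64::MAX can wrap. Assumption: a 0 cap degrades to one block.
    let span = max_blocks_per_fetch.saturating_sub(1);
    let end = scraping_start.saturating_add(span).min(latest);
    Some(scraping_start..=end)
}

/// Hypothetical simulation of successive polls: each iteration stands for
/// one poll that fetches a single window and moves the cursor past its end.
fn drain_backlog(
    mut cursor: u64,
    latest: u64,
    max_blocks_per_fetch: u64,
) -> Vec<RangeInclusive<u64>> {
    let mut windows = Vec::new();
    while let Some(window) = next_window(cursor, latest, max_blocks_per_fetch) {
        let end = *window.end();
        windows.push(window);
        match end.checked_add(1) {
            Some(next) => cursor = next,
            None => break, // reached u64::MAX, nothing left to scan
        }
    }
    windows
}
```

Under these assumptions, a backlog of 2500 blocks with the default cap of 1000 is drained in three polls (two full windows and one partial window), and no single request ever covers more than max_blocks_per_fetch blocks.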

Assumptions (documented in code)

Default 1000 is conservative versus common public-RPC eth_getLogs caps (~1k–10k) and the 1 s timeout; operators on private RPCs may raise it.
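
For illustration, the default and the >= 1 validation could look roughly like the sketch below. It is standard-library only and purely hypothetical: ScraperConfigSketch and its validate method are stand-ins, and the real config in apollo_l1_events_config has other fields and may use a different validation mechanism.

```rust
/// Hypothetical stand-in for the scraper config; only the new field is shown.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ScraperConfigSketch {
    /// Maximum number of L1 blocks covered by a single fetch.
    max_blocks_per_fetch: u64,
}

impl Default for ScraperConfigSketch {
    fn default() -> Self {
        // Default taken from the PR description.
        Self { max_blocks_per_fetch: 1000 }
    }
}

impl ScraperConfigSketch {
    /// Rejects a 0 cap, mirroring the "validated >= 1" rule.
    fn validate(&self) -> Result<(), String> {
        if self.max_blocks_per_fetch >= 1 {
            Ok(())
        } else {
            Err("max_blocks_per_fetch must be at least 1".to_string())
        }
    }
}
```

An operator on a private RPC with a higher eth_getLogs limit would raise max_blocks_per_fetch above the default; validation only enforces the lower bound.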

Tests

SEED=0 cargo nextest run -p papyrus_base_layer -p apollo_l1_events -p apollo_l1_events_config: 125 passed (new tests: range-cap, partial-window, finality-ceiling, multi-poll backlog drain).
cargo nextest run -p apollo_deployments (incl. deployment_files_are_up_to_date) and cargo nextest run -p apollo_node_config (default_config_file_is_up_to_date): both pass.

🤖 Generated with Claude Code


asaf-sw · 2026-06-23T09:51:29Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

… bounded windows (M-25)

The L1 events scraper fetched the entire range from its cursor to the latest (finality-adjusted) L1 block in a single events()/eth_getLogs request. After downtime, an L1 outage, or with a large startup rewind, that range can span thousands of blocks, materializing an unbounded Vec<L1Event>/Vec<Event> and a single oversized RPC request that most providers reject or that hits the 1s base-layer timeout, wedging the scraper and (via the cyclic wrapper) misfiring the primary-down-since alert.

Cap each fetch to a configurable max_blocks_per_fetch (default 1000, validated >= 1). fetch_events now requests scraping_start..=min(latest, start + cap - 1) and returns the L1BlockReference for the window end, so the cursor advances by exactly one window per poll and the steady-state loop drains a backlog over successive polls. The finality ceiling and the once-per-poll reorg check are preserved. initialize sends the first bounded window via the provider's initialize() and lets the steady loop drain the rest via add_events, so the provider's initialize-once contract is unchanged. Regenerated config_schema.json for the new field.

Assumptions (documented in code): default 1000 is conservative versus common public-RPC eth_getLogs caps (~1k-10k) and the 1s timeout; operators on private RPCs may raise it. Saturating arithmetic guards against a 0 cap even though validation rejects it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

asaf-sw mentioned this pull request Jun 23, 2026

papyrus_base_layer,apollo_l1_events: classify transient vs permanent L1 errors (L-13) #14601

Open

asaf-sw force-pushed the asaf/m25-l1-getlogs-chunking branch from 5910da2 to 1ddf871 Compare June 23, 2026 12:15

asaf-sw force-pushed the asaf/l13-l1-error-classification branch from 66ffac0 to 9821da4 Compare June 23, 2026 12:15

asaf-sw wants to merge 1 commit into asaf/l13-l1-error-classification from asaf/m25-l1-getlogs-chunking
