
feat(dht)!: add trust quarantine thresholds #119

Open
mickvandijke wants to merge 8 commits into main from feat/trust-quarantine-thresholds

@mickvandijke commented May 19, 2026

Collaborator

Summary

  • Add trust quarantine policy to DHT routing with three thresholds: 0.35 lazy swap eligibility, 0.20 quarantine/automatic-avoidance, and 0.45 new-peer admission/readmission.
  • Evict peers when they are in the current K-closest-to-self set, their trust drops below 0.20, and removal still leaves at least K routing-table peers. If the table is at K, keep the peer in the table but continue avoiding it in automatic lookup policy.
  • Re-run close-group quarantine after new routing-table admissions, including non-close additions, so deferred low-trust close-group peers are evicted once the table has surplus above K.
  • Gate all new routing-table admissions at 0.45, including peers that would occupy non-close routing slots and peers attempting to re-enter after quarantine.
  • Preserve existing routing-table peers with trust in [0.20, 0.45): they may remain in the table and can move into the close group, while peers below 0.20 are avoided by automatic lookup/dial paths.
  • Filter quarantined or below-threshold peers from local lookup results, FIND_NODE responses, automatic lookup, bootstrap, bucket refresh, and self-lookup paths.
  • Keep explicit sends unblocked; quarantine is routing-table and automatic lookup policy, not a blanket transport block.
  • Update docs and tests for the new trust policy and threshold semantics.
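
A minimal sketch of how the three thresholds could interact for admission and automatic-lookup filtering. `TrustThresholds`, `PeerState`, and the method names are hypothetical and are not the PR's API; only the numeric defaults and the policy rules come from the summary above. The sketch also assumes that a zero quarantine threshold means the policy is disabled.

```rust
/// Hypothetical stand-in for the PR's threshold configuration.
#[derive(Debug, Clone, Copy)]
pub struct TrustThresholds {
    /// Below this score a peer is avoided by automatic lookups (0.20 in the PR).
    pub quarantine: f64,
    /// Lazy swap eligibility threshold (0.35 in the PR); unused in this sketch.
    pub swap: f64,
    /// New or returning peers need at least this score to be admitted (0.45 in the PR).
    pub readmit: f64,
}

impl Default for TrustThresholds {
    fn default() -> Self {
        Self { quarantine: 0.20, swap: 0.35, readmit: 0.45 }
    }
}

/// Where a peer stands relative to the local routing table (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PeerState {
    /// Already present in the routing table.
    InTable,
    /// Previously evicted by quarantine and not yet readmitted.
    Quarantined,
    /// Never seen in the routing table.
    Unknown,
}

impl TrustThresholds {
    /// Assumption: a zero quarantine threshold disables the policy.
    pub fn enabled(&self) -> bool {
        self.quarantine > 0.0
    }

    /// Admission gate for peers that are not yet in the routing table.
    pub fn admits_new_peer(&self, trust: f64) -> bool {
        !self.enabled() || trust >= self.readmit
    }

    /// Whether automatic lookup, bootstrap, and refresh paths should skip the peer.
    /// Existing table members are only skipped below the quarantine threshold;
    /// quarantined and unknown peers must clear the readmission threshold.
    pub fn avoid_in_automatic_lookup(&self, state: PeerState, trust: f64) -> bool {
        if !self.enabled() {
            return false;
        }
        match state {
            PeerState::InTable => trust < self.quarantine,
            PeerState::Quarantined | PeerState::Unknown => trust < self.readmit,
        }
    }
}
```

With the defaults, an existing table peer at 0.30 is still used by automatic lookups, while an unknown peer at 0.44 is neither admitted nor dialed automatically. Explicit sends are outside this check, matching the "not a blanket transport block" rule.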

Breaking Changes

  • AdaptiveDhtConfig now includes quarantine_threshold and quarantine_readmit_threshold.
  • New routing-table peers must meet quarantine_readmit_threshold before admission when trust quarantine is enabled.
  • Peers below quarantine_threshold are no longer handed out through DHT lookup results or used by automatic lookup/dial maintenance paths.
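
For callers that build the config with struct literals, the migration might look like the sketch below. `AdaptiveDhtConfigSketch` is a hypothetical mirror of `AdaptiveDhtConfig`: only the two quarantine field names are taken from this PR, `swap_threshold` is an assumed name, and the validation rules (scores in [0.0, 1.0], quarantine not above readmit) are assumptions about what the real validation checks.

```rust
/// Hypothetical mirror of `AdaptiveDhtConfig`; not the crate's real struct.
#[derive(Debug, Clone, PartialEq)]
pub struct AdaptiveDhtConfigSketch {
    /// Assumed field name for the existing 0.35 lazy-swap threshold.
    pub swap_threshold: f64,
    /// New field from this PR (default 0.20).
    pub quarantine_threshold: f64,
    /// New field from this PR (default 0.45).
    pub quarantine_readmit_threshold: f64,
}

impl Default for AdaptiveDhtConfigSketch {
    fn default() -> Self {
        Self {
            swap_threshold: 0.35,
            quarantine_threshold: 0.20,
            quarantine_readmit_threshold: 0.45,
        }
    }
}

impl AdaptiveDhtConfigSketch {
    /// Assumed validation: every threshold is a finite score in [0.0, 1.0],
    /// and the quarantine threshold does not exceed the readmission threshold.
    pub fn validate(&self) -> Result<(), String> {
        let fields = [
            ("swap_threshold", self.swap_threshold),
            ("quarantine_threshold", self.quarantine_threshold),
            ("quarantine_readmit_threshold", self.quarantine_readmit_threshold),
        ];
        for (name, value) in fields {
            if !value.is_finite() || !(0.0..=1.0).contains(&value) {
                return Err(format!("{} must be in [0.0, 1.0], got {}", name, value));
            }
        }
        if self.quarantine_threshold > self.quarantine_readmit_threshold {
            return Err(
                "quarantine_threshold must not exceed quarantine_readmit_threshold".to_string(),
            );
        }
        Ok(())
    }
}
```

A literal such as `AdaptiveDhtConfigSketch { quarantine_threshold: 0.10, ..Default::default() }` keeps compiling when further fields are added, which is the same `..Default::default()` pattern the updated integration tests use.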

Tests

  • cargo fmt --all -- --check
  • cargo check --all-features
  • cargo clippy --all-features -- -D clippy::panic -D clippy::unwrap_used -D clippy::expect_used
  • cargo test --all-features quarantine --lib
  • cargo test --all-features --lib
  • cargo test --all-features --test trust_flow --test sybil_protection
  • cargo test --all-features: not rerun for this amendment. A prior run on this PR failed in the tests/dht_self_advertisement.rs wildcard-bind self-entry assertions after all library tests passed; the failure appears unrelated to this change.

Greptile Summary

This PR adds a three-tiered trust quarantine policy to the DHT routing layer: a 0.20 eviction/avoidance threshold, a 0.45 new-peer admission threshold, and the existing 0.35 lazy-swap threshold. The implementation gates new routing-table admissions, close-group evictions, local lookup results, automatic lookup/dial paths, bootstrap, and bucket refresh.

  • Adds quarantine_threshold and quarantine_readmit_threshold fields to AdaptiveDhtConfig, DhtNetworkConfig, and DhtCoreEngine, wired end-to-end from AdaptiveDHT through DhtNetworkManager to the core routing engine.
  • Introduces enforce_close_group_trust_gate, check_new_peer_admission, should_avoid_for_lookup, and should_avoid_automatic_candidate as new trust-gating primitives guarded by quarantine_enabled().
  • Adds broadcast_routing_events_with_quarantine to run deferred close-group evictions after any routing-table admission and emit a single merged KClosestPeersChanged event.

Confidence Score: 4/5

The new quarantine logic is well-tested, correctly gated behind quarantine_enabled(), and does not block explicit sends. All automatic lookup paths consistently apply the new filters.

The three-threshold eviction and lookup-filtering paths are covered by purpose-built unit tests exercising boundary conditions, deferred eviction, and readmission. The _previous_close_group dead parameter and quarantined_peers growth under adversarial churn are notable open questions but neither breaks current behaviour.

src/dht/core_engine.rs — the enforce_close_group_trust_gate dead parameter and quarantined_peers unbounded growth warrant a second look before long-running node deployments.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/dht/core_engine.rs | Core trust-gating primitives added. _previous_close_group parameter in enforce_close_group_trust_gate is accepted at every call site but unused. quarantined_peers HashSet has no upper-bound or expiry mechanism. |
| src/dht_network_manager.rs | Quarantine enforcement threaded through all automatic lookup paths. broadcast_routing_events_with_quarantine correctly merges admission and quarantine-eviction events into a single KClosestPeersChanged. |
| src/adaptive/dht.rs | New thresholds added to AdaptiveDhtConfig with validation, defaults, and propagation. Tests cover defaults, validation, and threshold ordering. |
| src/network.rs | trust_enforcement(false) zeroes all three thresholds; trust_enforcement(true) uses full AdaptiveDhtConfig::default(). |
| tests/sybil_protection.rs | Struct literals updated with ..Default::default() for the two new fields; test intent unchanged. |
| tests/trust_flow.rs | Same struct-literal fixups; existing trust-flow tests unaffected. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Trust event reported] --> B[Update TrustEngine score]
    B --> C[enforce_trust_quarantine]
    C --> D{trust_engine present?}
    D -- No --> Z[return false]
    D -- Yes --> E[enforce_close_group_trust_gate]
    E --> F{quarantine enabled?}
    F -- No --> G[return empty events]
    F -- Yes --> H{node_count > K AND K-closest peer below quarantine_threshold?}
    H -- No --> I[return empty events]
    H -- Yes --> J[evict peer, add to quarantined_peers]
    J --> H
    J --> K[emit PeerRemoved + KClosestPeersChanged]
    K --> L[broadcast_routing_events]
    M[New peer admission] --> N{peer already known?}
    N -- No --> O[check_new_peer_admission]
    O --> P{quarantine enabled?}
    P -- No --> Q[admit peer]
    P -- Yes --> R{trust >= readmit_threshold?}
    R -- No --> S[Reject]
    R -- Yes --> T[remove from quarantined_peers, admit]
    T --> U[broadcast_routing_events_with_quarantine]
    U --> V{PeerAdded or KClosestPeersChanged?}
    V -- No --> W[broadcast directly]
    V -- Yes --> X[enforce_close_group_trust_gate deferred pass]
    X --> Y[merge KClosestPeersChanged, broadcast]
    AA[Automatic lookup / bootstrap / bucket refresh] --> AB[should_avoid_automatic_peer]
    AB --> AC{quarantine enabled?}
    AC -- No --> AD[include peer]
    AC -- Yes --> AE{trust below quarantine_threshold OR quarantined and below readmit?}
    AE -- Yes --> AF[skip peer]
    AE -- No --> AG{unknown peer below readmit_threshold?}
    AG -- Yes --> AF
    AG -- No --> AD
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
src/dht/core_engine.rs:1607
**Dead parameter silently dropped**

`_previous_close_group` is accepted at every call site (including `broadcast_routing_events_with_quarantine` which extracts the original old close group expressly to pass here) but the implementation never reads it. The eviction loop operates solely on the live K-closest snapshot, so the "which peers were newly promoted vs. long-standing" distinction the parameter implies is never applied. If future logic needs it (e.g., giving just-promoted peers a grace period), the current signature already supports it; if it is not needed, removing the parameter avoids misleading callers about the function's contract.

### Issue 2 of 4
src/dht/core_engine.rs:1396
**`quarantined_peers` grows without a bound or expiry**

Entries are added in `enforce_close_group_quarantine` / `enforce_close_group_trust_gate` and removed only in `check_new_peer_admission` when a peer successfully re-enters the routing table at or above the readmit threshold. A peer that is quarantined and then permanently leaves the network (or is never rediscovered) keeps its entry in this `HashSet` forever. For a long-running node operating under adversarial churn this can accumulate a large number of stale entries. A periodic sweep removing entries whose last-seen timestamp exceeds the `live_threshold`, or capping the set at a generous but finite size, would bound the growth.
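
One possible shape for a bounded set, as a sketch only. `QuarantineSet` is hypothetical, `u64` stands in for the real peer ID type, and entries expire by time since quarantine rather than by a last-seen timestamp, which is a simplification of the sweep suggested above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical bounded replacement for a plain `HashSet` of quarantined peers.
pub struct QuarantineSet {
    /// Peer ID (u64 is a stand-in) mapped to the time it was quarantined.
    entries: HashMap<u64, Instant>,
    max_entries: usize,
    ttl: Duration,
}

impl QuarantineSet {
    pub fn new(max_entries: usize, ttl: Duration) -> Self {
        Self {
            entries: HashMap::new(),
            // Keep room for at least one entry so inserts always succeed.
            max_entries: max_entries.max(1),
            ttl,
        }
    }

    /// Drops every entry that has been quarantined for `ttl` or longer.
    pub fn sweep(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.entries
            .retain(|_, at| now.saturating_duration_since(*at) < ttl);
    }

    /// Records a quarantined peer, evicting the oldest entry when the cap is hit.
    pub fn insert(&mut self, peer: u64, now: Instant) {
        self.sweep(now);
        if self.entries.len() >= self.max_entries && !self.entries.contains_key(&peer) {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, at)| **at)
                .map(|(id, _)| *id);
            if let Some(id) = oldest {
                self.entries.remove(&id);
            }
        }
        self.entries.insert(peer, now);
    }

    /// Removes a peer, for example on successful readmission.
    pub fn remove(&mut self, peer: u64) -> bool {
        self.entries.remove(&peer).is_some()
    }

    pub fn contains(&self, peer: u64) -> bool {
        self.entries.contains_key(&peer)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Passing `now` explicitly keeps the sketch deterministic under test. Dropping an entry early only means the peer is treated as unknown rather than quarantined, and the flowchart applies the same readmission threshold to both cases.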

### Issue 3 of 4
src/dht/core_engine.rs:1443-1444
**Inconsistent initial `quarantine_readmit_threshold` in `DhtCoreEngine::new()`**

`quarantine_threshold` is initialised to `0.0` (quarantine disabled) but `quarantine_readmit_threshold` is initialised to `DEFAULT_QUARANTINE_READMIT_THRESHOLD` (0.45). When `DhtNetworkManager` constructs the engine it immediately calls `set_trust_quarantine_thresholds(config.quarantine_threshold, config.quarantine_readmit_threshold)` which overrides the 0.45, so production code is unaffected. However, a future caller that constructs `DhtCoreEngine` directly and introspects the threshold before calling `set_trust_quarantine_thresholds` would observe an asymmetric disabled state (threshold=0.0, readmit=0.45). Initialising `quarantine_readmit_threshold: 0.0` here would make the default state symmetrically disabled.

### Issue 4 of 4
src/dht/core_engine.rs:1618-1634
**`trust_score` closure called twice per candidate in the eviction loop**

Inside the `while` loop the closure is invoked once in the `find` iterator (with the `is_finite()` guard) and then a second time unconditionally for the `quarantined_peers.insert` decision. Because the `find` predicate already requires `is_finite() && score < quarantine_threshold`, the second call will always produce the same result and the `if` branch will always be taken. Capturing the score from the first call into a local variable would eliminate the duplicate lookup and make the invariant explicit.
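
A sketch of the single-lookup variant, under stated assumptions: `evict_low_trust_close_peers` is a hypothetical free function, `u64` stands in for the peer ID type, and `table` is assumed to be sorted by distance to self so that its first `k` entries are the close group. It is not the code in `core_engine.rs`.

```rust
use std::collections::HashSet;

/// Evicts close-group peers whose trust is below `quarantine_threshold`,
/// but only while the table holds more than `k` peers.
/// Returns each evicted peer together with the score that triggered eviction.
pub fn evict_low_trust_close_peers<F>(
    table: &mut Vec<u64>,
    quarantined: &mut HashSet<u64>,
    k: usize,
    quarantine_threshold: f64,
    trust_score: F,
) -> Vec<(u64, f64)>
where
    F: Fn(u64) -> f64,
{
    let mut evicted = Vec::new();
    while table.len() > k {
        // `find_map` evaluates the score once per candidate and carries it
        // out of the search, so no second lookup is needed afterwards.
        let hit = table.iter().take(k).enumerate().find_map(|(idx, &peer)| {
            let score = trust_score(peer);
            if score.is_finite() && score < quarantine_threshold {
                Some((idx, score))
            } else {
                None
            }
        });
        match hit {
            Some((idx, score)) => {
                let peer = table.remove(idx);
                quarantined.insert(peer);
                evicted.push((peer, score));
            }
            // No remaining close-group peer is below the threshold.
            None => break,
        }
    }
    evicted
}
```

Because the loop stops as soon as the table shrinks to `k`, a low-trust peer can survive one pass and be evicted on a later pass once a new admission creates surplus, which is the deferred behaviour described in the PR summary.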

Reviews (1): Last reviewed commit: "fix(docs): document adaptive trust enfor..."

Greptile also left 4 inline comments on this PR.

Add close-group quarantine below 0.20 with natural readmission at 0.45.

Keep 0.35 as lazy swap eligibility, avoid quarantined peers in automatic lookups, and keep explicit sends unblocked.

BREAKING CHANGE: AdaptiveDhtConfig now includes quarantine_threshold and quarantine_readmit_threshold fields.
@mickvandijke mickvandijke force-pushed the feat/trust-quarantine-thresholds branch from c5f7387 to 8721feb Compare May 19, 2026 19:26
Implement stricter trust gating for K-closest set admission and readmission, adding support for filtering newly promoted peers below the readmission threshold. Update routing table logic and wire compatibility to stabilize behavior across nodes. Extend related tests and documentation.

BREAKING CHANGE: Adjusts close-group thresholds affecting trust-based peer routing and admission policies.
Gate all new routing-table admissions at quarantine_readmit_threshold while allowing existing routing-table peers above the quarantine threshold to stay and move into the close group.

BREAKING CHANGE: new peers below quarantine_readmit_threshold are no longer admitted to the routing table, even for non-close routing slots.
@mickvandijke mickvandijke marked this pull request as ready for review May 21, 2026 12:15
Copilot AI review requested due to automatic review settings May 21, 2026 12:15

Copilot AI left a comment


Pull request overview

This PR introduces a trust-based quarantine policy for DHT routing, adding separate thresholds for (1) swap eligibility, (2) quarantine/automatic-avoidance, and (3) new-peer admission/readmission, and propagates that policy through routing-table admission, lookup candidate selection, and DHT maintenance flows.

Changes:

  • Extend AdaptiveDhtConfig/DhtNetworkConfig to include quarantine thresholds and validate/enforce them throughout DHT operations.
  • Add close-group quarantine enforcement (evict low-trust close peers when RT has surplus above K) and filter quarantined/below-threshold peers from local lookup results and automatic maintenance paths.
  • Update integration tests and documentation to reflect the new trust threshold semantics and breaking config changes.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/trust_flow.rs | Updates tests to build AdaptiveDhtConfig with new fields via ..Default::default(). |
| tests/sybil_protection.rs | Updates docs/comments and config construction to align with quarantine + swap semantics. |
| src/network.rs | Updates node config docs and builder toggle to enable/disable all adaptive thresholds coherently. |
| src/dht/core_engine.rs | Adds quarantine thresholds/state and enforcement logic + extensive unit tests for threshold behavior. |
| src/dht_network_manager.rs | Applies quarantine filtering to lookup/bootstrap/refresh/self-lookup paths and re-runs quarantine after admissions. |
| src/adaptive/dht.rs | Adds new config fields, validation, propagation into the manager config, and triggers quarantine enforcement after trust updates. |
| README.md | Updates high-level trust system description to reflect quarantine model. |
| docs/trust-signals-api.md | Re-documents API semantics around the new thresholds and behavior. |
| docs/SECURITY_MODEL.md | Updates security model docs to match quarantine-based routing enforcement. |
| docs/ROUTING_TABLE_DESIGN.md | Updates routing-table design invariants/threshold descriptions to reflect quarantine + admission gating. |

Comment thread src/dht_network_manager.rs Outdated
Comment thread src/dht/core_engine.rs Outdated
Comment thread src/dht_network_manager.rs Outdated
Comment thread src/dht/core_engine.rs Outdated
Comment thread src/dht/core_engine.rs
Comment thread src/dht/core_engine.rs Outdated
Comment thread src/dht/core_engine.rs
Copilot AI review requested due to automatic review settings May 21, 2026 15:40

Copilot AI left a comment


Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Comment thread src/dht/core_engine.rs Outdated
Comment thread docs/ROUTING_TABLE_DESIGN.md