[INS-468] Add improved lob detector to defaults.go#4971

Open
mustansir14 wants to merge 4 commits into ins-465-add-datadogapikey-detector-to-defaults from ins-468-add-lob-detector-to-defaults-list

Conversation

@mustansir14
Contributor

@mustansir14 mustansir14 commented May 18, 2026

Summary

The Lob detector existed in the codebase but was never registered in the default detector list in defaults.go. This PR adds it to the defaults and, after discovering via corpora testing that the original regex was too loose and produced significant noise, refactors the detector to be more precise and follow current practices.

Regex tightened to reduce noise (the core fix):

The original regex relied on a loose proximity-based prefix match against the word "lob" and matched any 40-character alphanumeric string:

PrefixRegex([]string{"lob"}) + `\b([a-zA-Z0-9_]{40})\b`

Corpora testing showed this was extremely noisy. Lob API keys have a well-defined format — they always begin with live_ or test_ — so the new regex anchors on that structure:

`\b((live|test)_[a-zA-Z0-9_]{35})\b`

Keywords updated to match key prefix:

  • Before: ["lob"]
  • After: ["live_", "test_"]

This makes pre-filtering align with the actual key format rather than relying on a nearby context word.
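A rough sketch of what keyword pre-filtering means here, assuming a plain case-sensitive substring check. The function `mightContainKey` is hypothetical, and the real engine's matching strategy may well differ; the point is only that a chunk mentioning "lob" without a key-shaped prefix is no longer handed to the regex.

```go
package main

import (
	"fmt"
	"strings"
)

// keywords mirrors the "After" list above.
var keywords = []string{"live_", "test_"}

// mightContainKey (hypothetical) reports whether chunk contains any keyword,
// so the more expensive regex only runs on plausible chunks.
func mightContainKey(chunk string) bool {
	for _, kw := range keywords {
		if strings.Contains(chunk, kw) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(mightContainKey("lob client initialised"))   // no key prefix present
	fmt.Println(mightContainKey("key = live_abc123"))        // prefix present
}
```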

Additional improvements (following current detector practices):

  • Scanner struct now accepts an injectable *http.Client (via getClient() helper) to support test mocking without a global variable.
  • Package-level client renamed to defaultClient to avoid shadowing.
  • Verification logic extracted into a dedicated verify() method.
  • Verification endpoint changed from GET /v1/addresses to POST /v1/us_verifications. The old endpoint returns 401 Unauthorized both for invalid keys and for active keys with no billing method on file, making it impossible to distinguish between the two cases. The new endpoint returns 403 Forbidden for active keys with no billing method, allowing a correct verification signal. Status code handling:
    • 403 Forbidden → verified (active key, no billing method on file)
    • 422 Unprocessable Entity → verified (active key, request body is invalid — expected for an empty POST)
    • 401 Unauthorized → not verified
    • anything else → verification error
  • Duplicate matches are now deduplicated before result construction.
  • ExtraData field added to expose the key environment (live or test).

Gating behind feature flag

Since this is considered a new detector addition, it is gated behind a feature flag. This is why the PR is based on #4969, which contains some required plumbing for this.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint; this requires golangci-lint)?

Note

Medium Risk
Medium risk because it changes default scanning behavior by enabling a new detector and updates its verification HTTP behavior, which could affect scan performance and result accuracy.

Overview
Adds the Lob detector to the default detector set (now included in defaults.go) and introduces a new feature.LobDetectorEnabled flag that is enabled by default in main.go and used to gate the detector in the default list.

Refactors the Lob detector to reduce noise and improve verification: updates the regex/keywords to match live_/test_ key formats, deduplicates matches, adds ExtraData.environment, and replaces the old verification call with a new POST /v1/us_verifications status-code based verifier (with injectable HTTP client). Tests are updated to reflect the new patterns, metadata, and verification expectations.

Reviewed by Cursor Bugbot for commit 01cd8bf. Bugbot is set up for automated code reviews on this repo.

@github-actions

github-actions Bot commented May 18, 2026

Corpora Test Results

Scans a corpus of real-world public code against only the detectors changed in this PR, then compares unique match counts between the PR build and the main baseline to catch regex regressions. Verification is disabled — each detector's regex is measured independently.

1 new · 0 clean  |  Scoped to: lob

Status Detector Unique matches (main) Unique matches (PR) New Removed
🆕 lob 0
  • 🔴 regression: >5 new, >20% increase over main, or any removed
  • ⚠️ warning: 1–5 new and ≤20% increase over main
  • ✅ clean
  • 🆕 new detector (no baseline)
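One way to read the legend's thresholds is sketched below. The function `classify` is hypothetical and is not the bot's actual logic. It assumes "increase over main" means new matches divided by the main count, and that a detector with a baseline of zero matches treats any new match as exceeding 20%.

```go
package main

import "fmt"

// classify (hypothetical) applies the legend's thresholds to one detector.
// hasBaseline is false when the detector does not exist on main.
func classify(hasBaseline bool, mainCount, added, removed int) string {
	if !hasBaseline {
		return "new detector"
	}
	// Assumption: increase = added / mainCount; a zero baseline with any
	// addition counts as more than 20%.
	over20 := added > 0 && (mainCount == 0 || added*100 > 20*mainCount)
	switch {
	case removed > 0 || added > 5 || over20:
		return "regression"
	case added >= 1:
		return "warning"
	default:
		return "clean"
	}
}

func main() {
	fmt.Println(classify(false, 0, 0, 0))  // no baseline, as for lob in this PR
	fmt.Println(classify(true, 100, 3, 0)) // 3 new, 3% increase
}
```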

@mustansir14 mustansir14 marked this pull request as ready for review May 19, 2026 08:32
@mustansir14 mustansir14 requested a review from a team May 19, 2026 08:32
@mustansir14 mustansir14 requested a review from a team as a code owner May 19, 2026 08:32
@mustansir14 mustansir14 changed the title [INS-468] Add lob detector to defaults.go [INS-468] Add improved lob detector to defaults.go May 19, 2026
Comment thread pkg/detectors/lob/lob.go
Comment thread pkg/detectors/lob/lob.go
@mustansir14 mustansir14 force-pushed the ins-468-add-lob-detector-to-defaults-list branch from d67397f to 5d7fa71 Compare May 21, 2026 10:04
@mustansir14 mustansir14 changed the base branch from main to ins-465-add-datadogapikey-detector-to-defaults May 21, 2026 10:05
@mustansir14 mustansir14 added the review/product-eng Team integrations reviewed, awaiting product-eng review label May 22, 2026