Skip to content

fix: allows nullable source_id in taxonomy#88

Merged
BhagyaAmarasinghe merged 2 commits into
mainfrom
fix/taxonomy-source-id
Jun 16, 2026
Merged

fix: allows nullable source_id in taxonomy#88
BhagyaAmarasinghe merged 2 commits into
mainfrom
fix/taxonomy-source-id

Conversation

@pandeymangg

Copy link
Copy Markdown
Contributor

Problem

feedback_records.source_id is nullable in the base Hub schema (001_initial_schema.sql), so feedback can be ingested with no attributed source. But taxonomy treated source_id as required everywhere, so those records could never get a taxonomy — they were silently invisible to discovery and rejected at run creation.

Approach

Taxonomy now supports a nullable source scope instead of forcing feedback_records.source_id to NOT NULL (which would be a broader ingestion contract change).

Key decision: rather than making source_id nullable inside the composite PK/FK/UNIQUE constraints — where NULL != NULL breaks key matching and forces a source_key/surrogate-PK workaround — taxonomy canonicalizes "no source" to the empty string '' at the Go boundary. The Go layer always sends a string, never NULL, so
source_id stays NOT NULL in taxonomy tables and '' is a valid, comparable key value. NULL stays confined to feedback_records, and only the queries that read it use null-safe matching.

Changes

  • migrations/010_taxonomy_schema.sql — dropped the three *_source_id_required blank-check constraints (taxonomy_runs, taxonomy_active_runs, taxonomy_node_events).
    source_id stays NOT NULL (satisfied by ''); composite keys unchanged. Added a header note documenting the empty-string convention. (Edited in place — taxonomy is unreleased,
    no new migration needed.)
  • internal/models/taxonomy.goTaxonomyScope.SourceID validation relaxed from required,min=1 to omitempty,no_null_bytes,max=255.
  • internal/service/id_validation.go — added normalizeOptionalIdentifier (empty → ""); shared checks extracted into normalizeIdentifierValue.
  • internal/service/taxonomy_service.gonormalizeTaxonomyScope now treats source_id as optional.
  • internal/repository/taxonomy_repository.goListFieldOptions removed the source_id IS NOT NULL AND btrim(...) <> '' filter and now groups on and emits
    COALESCE(NULLIF(btrim(source_id), ''), '') so NULL and blank collapse into a single "no source" bucket; CountScopeInput and GetRunInput match feedback_records with
    NULLIF(btrim(fr.source_id), '') IS NOT DISTINCT FROM NULLIF(btrim($3), '').

Tests

  • tests/taxonomy_no_source_test.go (new) — NULL + blank feedback collapse into one bucket; empty-source scope null-safe matches both rows; run creation and the in-progress guard work with the '' scope.
  • tests/taxonomy_schema_test.go (existing) — still passes, confirming the constraint drop didn't break schema relationships/cascades.

Verification

  • go build ./..., go vet ./..., golangci-lint — clean.
  • Migrations roll back and re-apply cleanly.
  • Taxonomy integration tests pass against a real Postgres.

Out of scope / follow-ups

  • feedback_records.source_id stays nullable — ingestion contract unchanged.
  • openapi.yaml — no taxonomy schemas there yet (unreleased), nothing to update.
  • ListRuns filter still treats empty source_id as "no filter" (history query). Listing only no-source runs would need an explicit flag — not added here.
  • Frontend renders the empty-source_id bucket as "No source" / "Unattributed"; that label lives in the frontend repo. The API now emits an empty source_id for the bucket.

@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Copy link
Copy Markdown

Review Change Stack

Walkthrough

This PR introduces a canonical "no source" bucket for taxonomy scopes where feedback_records.source_id is NULL or blank. The empty string "" is adopted as the internal representation. Three CHECK constraints requiring btrim(source_id) <> '' are removed from the migration schema. TaxonomyScope.SourceID validation is changed from required to omitempty. In the service layer, a new normalizeOptionalIdentifier function accepts blank values as "" without error, and normalizeTaxonomyScope switches to using it for SourceID. Repository SQL is updated to normalize source_id via COALESCE(NULLIF(btrim(...), ''), '') in projections and grouping, and uses IS NOT DISTINCT FROM with NULLIF for NULL-safe predicate matching in CountScopeInput and GetRunInput. A new integration test validates the end-to-end behavior across ListFieldOptions, CountScopeInput, and CreateRunIfAvailable.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: making source_id optional/nullable in taxonomy validation and schema constraints.
Description check ✅ Passed The PR description is comprehensive, covering problem statement, approach, detailed changes by file, tests, verification steps, and out-of-scope items. All required template sections are well-addressed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/models/taxonomy.go`:
- Line 64: The change to make TaxonomyScope.SourceID optional with the omitempty
tag creates a filtering problem in the list-runs path where empty source_id
values are dropped, making it impossible to distinguish between "no filter
applied" and "filter by empty source_id". Update the filter contract across the
handler, service, and repository layers for the list-runs operation to use a
tri-state filter mechanism (such as a pointer to string (*string) or an explicit
presence flag) instead of a plain string filter. This ensures that empty-string
filtering is preserved and passed through the entire stack, allowing callers to
properly scope run history to the no-source bucket while maintaining consistent
tenant access rules across all resource access paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fc25f77f-26bb-4355-98dc-2bfe074cc837

📥 Commits

Reviewing files that changed from the base of the PR and between 9d175ee and d396a51.

📒 Files selected for processing (6)
  • internal/models/taxonomy.go
  • internal/repository/taxonomy_repository.go
  • internal/service/id_validation.go
  • internal/service/taxonomy_service.go
  • migrations/010_taxonomy_schema.sql
  • tests/taxonomy_no_source_test.go

Comment thread internal/models/taxonomy.go
@BhagyaAmarasinghe BhagyaAmarasinghe added this pull request to the merge queue Jun 16, 2026
Merged via the queue into main with commit 54abd6e Jun 16, 2026
12 checks passed
@BhagyaAmarasinghe BhagyaAmarasinghe deleted the fix/taxonomy-source-id branch June 16, 2026 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants