Add definitely_all_null() for cheap all-null detection #8475

Draft
joseph-isaacs wants to merge 5 commits into develop from claude/epic-hawking-3ixvms-phase1

@joseph-isaacs

Contributor

Summary

This PR adds a definitely_all_null() method to ArrayRef that detects entirely-null arrays cheaply and statically, without executing compute. Compute kernels can use it to short-circuit all-null inputs and skip unnecessary work and canonicalization.

The method returns true only when all-null-ness can be proven without computation:

  • Constant-null arrays
  • Arrays with Validity::AllInvalid
  • Arrays with a constant-false validity array

A false result means "not provably all-null" (conservative), not "contains valid values", so callers must fall back to their normal path.
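The contract above can be sketched with a toy model. This is a minimal illustration, not the vortex-array implementation: `ToyArray`, `ValidityArray`, and this reduced `Validity` enum are hypothetical stand-ins, and the real method may differ in signature (a later commit in this PR calls it with `?`, which suggests it returns a result).

```rust
// Toy model for illustration only; these types are NOT the vortex-array API.

/// Hypothetical stand-in for a validity child array.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum ValidityArray {
    /// One boolean repeated for every row.
    Constant(bool),
    /// One materialized boolean per row.
    Bits(Vec<bool>),
}

/// Hypothetical stand-in for `Validity`.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Validity {
    NonNullable,
    AllValid,
    AllInvalid,
    Array(ValidityArray),
}

/// Hypothetical stand-in for an array: a repeated scalar, or values plus validity.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum ToyArray {
    Constant { value: Option<i32>, len: usize },
    Primitive { values: Vec<i32>, validity: Validity },
}

impl ToyArray {
    /// Returns true only when all-null-ness is provable from static structure.
    /// It never inspects per-row bits, so `false` means "not provably all-null".
    fn definitely_all_null(&self) -> bool {
        match self {
            // Constant-null array.
            ToyArray::Constant { value, .. } => value.is_none(),
            // Statically all-invalid validity, or a constant-false validity array.
            ToyArray::Primitive { validity, .. } => matches!(
                validity,
                Validity::AllInvalid | Validity::Array(ValidityArray::Constant(false))
            ),
        }
    }
}
```

In this sketch, a `Primitive` whose validity is `Bits(vec![false, false])` is entirely null in fact, yet `definitely_all_null()` returns `false` for it, because proving it would require scanning the bits. That is the conservative behavior callers have to account for by falling back to their normal path.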

Changes

  1. New definitely_all_null() method (vortex-array/src/array/erased.rs):

    • Added public method with comprehensive documentation
    • Checks for constant-null scalars and static validity patterns
    • Includes unit tests covering constant-null detection and validity-based detection
  2. Removed unused Validity::not() method (vortex-array/src/validity.rs):

    • Removed dead code: the method had no callers in the workspace
  3. Updated compute kernels to use the new method:

    • is_null.rs: Short-circuit entirely-null inputs to return all-true
    • is_not_null.rs: Short-circuit entirely-null inputs to return all-false
    • struct/compute/take.rs: Return the canonical all-null constant array instead of manually constructing an array with Validity::AllInvalid
    • list/compute/filter.rs: Return the canonical all-null constant array instead of manually constructing a concrete all-null array
  4. Added ConstantArray::null() helper (vortex-array/src/arrays/constant/array.rs):

    • Convenience method to construct the canonical all-null representation
    • Documented as the standard representation for all-null arrays
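The is_null / is_not_null short-circuit described in item 3 might look roughly like the following. This is a sketch under stated assumptions: `Input`, `BoolResult`, and the helper methods are hypothetical toy types, not the kernels in is_null.rs or is_not_null.rs.

```rust
// Toy model for illustration only; these types are NOT the vortex-array API.

/// Hypothetical nullable integer input.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum Input {
    /// Concrete array whose validity is statically all-invalid.
    AllInvalid { len: usize },
    /// Concrete array with one optional value per row.
    Values(Vec<Option<i32>>),
}

/// Hypothetical boolean result: a repeated constant or materialized bits.
#[derive(Debug, Clone, PartialEq)]
enum BoolResult {
    Constant { value: bool, len: usize },
    Bits(Vec<bool>),
}

impl Input {
    fn len(&self) -> usize {
        match self {
            Input::AllInvalid { len } => *len,
            Input::Values(rows) => rows.len(),
        }
    }

    /// Cheap static check: never scans rows.
    fn definitely_all_null(&self) -> bool {
        matches!(self, Input::AllInvalid { .. })
    }

    /// The "normal path": materialize every row.
    fn rows(&self) -> Vec<Option<i32>> {
        match self {
            Input::AllInvalid { len } => vec![None; *len],
            Input::Values(rows) => rows.clone(),
        }
    }
}

impl BoolResult {
    /// Expand to per-row booleans so differently encoded results can be compared.
    fn to_bits(&self) -> Vec<bool> {
        match self {
            BoolResult::Constant { value, len } => vec![*value; *len],
            BoolResult::Bits(bits) => bits.clone(),
        }
    }
}

fn is_null(input: &Input) -> BoolResult {
    if input.definitely_all_null() {
        // Short-circuit: every row is null, so the answer is all-true.
        return BoolResult::Constant { value: true, len: input.len() };
    }
    BoolResult::Bits(input.rows().iter().map(|r| r.is_none()).collect())
}

fn is_not_null(input: &Input) -> BoolResult {
    if input.definitely_all_null() {
        // Short-circuit: every row is null, so the answer is all-false.
        return BoolResult::Constant { value: false, len: input.len() };
    }
    BoolResult::Bits(input.rows().iter().map(|r| r.is_some()).collect())
}
```

An input such as `Input::Values(vec![None, None])` is not caught by the static check and takes the normal path, but `to_bits()` on its result matches the short-circuited one, which is the sense in which the change is behavior-preserving.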

Testing

Added comprehensive unit tests in vortex-array/src/array/erased.rs:

  • definitely_all_null_detects_constant_null: Verifies detection of constant-null and constant-non-null arrays
  • definitely_all_null_via_validity: Verifies detection via AllInvalid validity, constant-false validity arrays, and non-nullable/all-valid cases

Existing tests pass with the updated compute kernels using the new short-circuit path.

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw

Foundation for representing all-null arrays as Constant(null) and removing
Validity::AllInvalid (#8443).

- ConstantArray::null(dtype, len) constructs the canonical all-null array: a
  single null scalar repeated, with no values buffer or validity child.
- ArrayRef::all_null() is a cheap, non-executing, conservative check for
  "entirely null": true for a constant-null array or a statically all-invalid
  validity (including a constant-false validity array, the representation
  all-null arrays will use once AllInvalid is gone). It runs no compute, so a
  false result means "not provably all-null", not "has valid values".

Compute entry points will call all_null() to short-circuit an entirely-null
input to Constant(null) and skip canonicalization.

Signed-off-by: Joseph Isaacs <joe.isaacs@live.co.uk>

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw
Validity::not had no callers anywhere in the workspace: a repo-wide audit of
every `.not()` site found only Mask, BitBuffer, and ArrayRef receivers, with
no UFCS Validity::not call and no `impl Not for Validity`.

It is also the only place that constructs Validity::AllInvalid without a length
in scope (AllValid -> AllInvalid). Removing it eliminates the one structural
blocker to deleting the AllInvalid variant (#8443): every remaining producer
already has a length, so no length-threading through the Validity algebra is
required.
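To illustrate why a length-free negation blocks removal of the variant, here is a toy sketch. It assumes a hypothetical two-variant `Validity` with no `AllInvalid`, where an all-invalid validity has to be spelled as an explicit all-false array (a plain `Vec<bool>` stands in for the constant-false validity array mentioned in this PR). The functions `all_invalid` and `not` are illustrative and are not part of vortex-array.

```rust
// Toy model for illustration only; these types are NOT the vortex-array API.

/// Hypothetical validity after `AllInvalid` is removed.
#[derive(Debug, Clone, PartialEq)]
enum Validity {
    AllValid,
    Array(Vec<bool>),
}

/// Without an `AllInvalid` variant, "everything is null" needs a length.
fn all_invalid(len: usize) -> Validity {
    Validity::Array(vec![false; len])
}

/// Negation can no longer be a length-free method: the `AllValid` arm must
/// produce `len` false bits, so the caller has to supply `len`.
fn not(validity: &Validity, len: usize) -> Validity {
    match validity {
        Validity::AllValid => all_invalid(len),
        Validity::Array(bits) => Validity::Array(bits.iter().map(|b| !b).collect()),
    }
}
```

With `AllInvalid` available, the `AllValid` arm could simply return the variant and no `len` parameter would be needed. Deleting the unused method avoids threading a length through every such operation.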

Signed-off-by: Joseph Isaacs <joe.isaacs@live.co.uk>

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw
…8443)

Step 3 of removing Validity::AllInvalid. Exercises the foundation helpers:

- list filter and struct take now return ConstantArray::null(...) for the
  all-null result instead of constructing an all-null concrete array.
- is_null / is_not_null gain a cheap ArrayRef::all_null() short-circuit for
  entirely-null concrete inputs (the constant-input case is already handled).

All changes are logically behavior-preserving: an all-null result is the same
values and null mask whether encoded as Constant(null) or a concrete array.
This also confirms the previously-unused all_null() and ConstantArray::null
helpers now have real (non-test) call sites.

Signed-off-by: Joseph Isaacs <joe.isaacs@live.co.uk>

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw
The check is conservative and non-executing: a false result means "not
provably all-null", not "has valid values". Rename to definitely_all_null to
make that contract explicit and mirror the existing Validity::definitely_no_nulls.

Signed-off-by: Joseph Isaacs <joe.isaacs@live.co.uk>

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw
@joseph-isaacs joseph-isaacs added the changelog/chore label (A trivial change) on Jun 17, 2026, with Claude
…null (#8443)

Replace the explicit `matches!(validity, Validity::AllInvalid)` check in the
fill_null precondition with `array.definitely_all_null()?`. Behavior-preserving
and slightly more general: it also short-circuits a constant-null input or a
constant-false validity array (the representations all-null arrays move to),
without matching the variant directly. Prepares the consumer for the eventual
.validity() pivot.

Signed-off-by: Joseph Isaacs <joe.isaacs@live.co.uk>

https://claude.ai/code/session_01Q8K741TL4zABgsL1N4kLWw
@codspeed-hq

codspeed-hq Bot commented Jun 17, 2026


Merging this PR will degrade performance by 29.83%

⚡ 1 improved benchmark
❌ 15 regressed benchmarks
✅ 1493 untouched benchmarks
⏩ 83 skipped benchmarks [1]

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | slice_empty_vortex | 368.3 ns | 2,628.6 ns | -85.99% |
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 20.3 µs | 35.8 µs | -43.32% |
| Simulation | slice_vortex_buffer[1024] | 871.4 ns | 1,335 ns | -34.73% |
| Simulation | slice_vortex_buffer[16384] | 871.4 ns | 1,335 ns | -34.73% |
| Simulation | slice_vortex_buffer[2048] | 871.4 ns | 1,335 ns | -34.73% |
| Simulation | slice_vortex_buffer[128] | 871.4 ns | 1,335 ns | -34.73% |
| Simulation | slice_vortex_buffer[65536] | 871.4 ns | 1,335 ns | -34.73% |
| Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 162.2 µs | 198.2 µs | -18.15% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 177.7 µs | 214.2 µs | -17.04% |
| Simulation | search_index_below_min_chunked | 1.3 ms | 1.5 ms | -13.61% |
| Simulation | search_index_mixed_out_of_range_chunked | 1.3 ms | 1.5 ms | -13.32% |
| Simulation | count_i32_clustered_nulls | 47 µs | 54 µs | -12.97% |
| Simulation | search_index_full_range_random_chunked | 1.4 ms | 1.6 ms | -12.08% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 273.2 µs | 308.1 µs | -11.33% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 330.8 µs | 367.7 µs | -10.02% |
| Simulation | chunked_varbinview_opt_into_canonical[(1000, 10)] | 229.1 µs | 193.6 µs | +18.32% |
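The Efficiency figures appear consistent with BASE / HEAD - 1, expressed as a percentage. This is an inference from the reported numbers, not a documented CodSpeed formula, and `efficiency_percent` below is a hypothetical helper for checking rows by hand.

```rust
/// Relative change as it appears to be reported: (BASE / HEAD - 1) * 100.
/// Negative means HEAD is slower than BASE. This formula is inferred from
/// the reported rows and is an assumption, not a documented definition.
fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

For slice_empty_vortex, 368.3 / 2628.6 - 1 is approximately -0.8599, in line with the reported -85.99%. Rows shown with heavily rounded times (for example 1.3 ms against 1.5 ms) will not reproduce the reported percentage exactly, since the report was presumably computed from unrounded measurements.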

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing claude/epic-hawking-3ixvms-phase1 (803e530) with develop (85aad72) [2]

Footnotes

  1. 83 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on develop (0ed06b3) during the generation of this report, so 85aad72 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
