perf(cd2pd): thread the structured cell->point interpolation loop (byte-exact)#149

Queued
akaszynski wants to merge 1 commit into main from perf/cd2pd-image-threading
@akaszynski

Member

Summary

Threads the per-output-point interpolation loop in
vtkCellDataToPointData::InterpolatePointData(input, output). A plain
vtkImageData (and other structured datasets with no blanking) routes here from
RequestData, and the ptId loop was still serial even though the rest of
fvtk threads via the default STDThread SMP backend.

The loop is now a vtkSMPTools::For wrapped in fvtk::RunSafeFilterParallel
(the established bit-exact-safe opt-in, with the usual GetSingleThread()-guarded
UpdateProgress/CheckAbort and re-entrancy guard). cellIds and the weights[]
buffer are thread-local (vtkSMPThreadLocalObject<vtkIdList> + a per-thread stack
buffer).

The crux of thread-safety: every output point-data array is pre-sized to
numberOfPoints tuples up front. InterpolateAllocate() only reserves capacity
(MaxId == -1); after the presize, each InterpolateTuple(ptId,…) /
InsertTuple(ptId,…) / NullData(ptId) is a pure store into an already-existing
tuple: no realloc, no MaxId bump on any thread. NullData() inserts into every
array in the output (not just the interpolated ones), so the presize also covers
the pass-through arrays copied from the input point data; they already hold
exactly numberOfPoints tuples, so resizing them is a no-op.
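
As a language-neutral illustration of the presize-then-store pattern described above, here is a minimal Python sketch. It is not the VTK code: `cell_to_point_1d` is a hypothetical helper on a 1-D grid (each point touches at most 2 cells instead of 8), a plain list stands in for the output array, and `threading.local` stands in for the per-thread `cellIds` scratch. It assumes at least one cell.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def cell_to_point_1d(cell_values, n_threads=4):
    """Average the (at most 2) cells touching each point of a 1-D grid.

    Hypothetical stand-in for the structured cell->point loop.
    """
    n_cells = len(cell_values)
    n_points = n_cells + 1
    # Presize: every output slot exists before any worker starts.
    out = [0.0] * n_points
    scratch = threading.local()  # per-thread scratch, like the thread-local cellIds

    def work(begin, end):
        if not hasattr(scratch, "cell_ids"):
            scratch.cell_ids = []
        cell_ids = scratch.cell_ids
        for pt in range(begin, end):
            cell_ids.clear()
            if pt > 0:
                cell_ids.append(pt - 1)
            if pt < n_cells:
                cell_ids.append(pt)
            weight = 1.0 / len(cell_ids)
            acc = 0.0
            for c in cell_ids:  # fixed index order for every point
                acc += weight * cell_values[c]
            out[pt] = acc  # pure store into an existing slot, no append/resize

    # Disjoint, contiguous point ranges: no two workers write the same index.
    chunk = -(-n_points // n_threads)  # ceiling division
    ranges = [(b, min(b + chunk, n_points)) for b in range(0, n_points, chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for future in [pool.submit(work, b, e) for b, e in ranges]:
            future.result()  # re-raise any worker exception
    return out


serial = cell_to_point_1d([1.0, 2.0, 4.0], n_threads=1)
threaded = cell_to_point_1d([1.0, 2.0, 4.0], n_threads=4)
```

Because the output length is fixed before the workers run and each worker only assigns to indices in its own range, the result does not depend on how many workers there are or on the order in which they finish.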

Parity bucket: byte-exact, default-on

This is bucket 1 — byte-for-byte identical to stock VTK 9.6.2 (maxULP = 0,
same values AND same order), so it ships on by default.

Byte-exactness argument:

  • The output is index-addressed by ptId. Threads get disjoint ptId
    sub-ranges, so they write to disjoint, pre-sized output tuples — zero write
    conflict, and emission order is preserved exactly.
  • The per-point average sums the same (≤ 8) terms in the same index order
    regardless of how the range is partitioned across threads, so there is no
    floating-point reassociation across iterations. InterpolatePoint /
    InterpolateTuple iterate the same cellIds list (produced identically by the
    existing pure StructuredGetPointCells) in the same order.
  • Reads are from the input cell data (processedCellData), a distinct object from
    the output; the per-thread scratch (cellIds, weights) is the only mutable
    state and it is thread-local.
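
The reassociation point can be checked in isolation. The sketch below is plain Python, not fvtk code; `point_average` and `run` are hypothetical names. It first shows that regrouping a floating-point sum changes its bits, then shows that chunking the outer per-point loop does not, because each point's inner sum keeps its term order.

```python
import struct


def bits(x):
    """Raw IEEE-754 bytes of a float64, for byte-exact comparison."""
    return struct.pack("<d", x)


# Regrouping the same three terms changes the result bits.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)


def point_average(values):
    """Weighted sum in fixed index order (mirrors the per-point average)."""
    weight = 1.0 / len(values)
    acc = 0.0
    for v in values:
        acc += weight * v
    return acc


def run(points, chunk):
    """Process points in chunks of size `chunk`; each point is independent."""
    out = [0.0] * len(points)
    for begin in range(0, len(points), chunk):
        for i in range(begin, min(begin + chunk, len(points))):
            out[i] = point_average(points[i])
    return out


points = [[0.1, 0.2, 0.3], [1e16, 1.0, -1e16], [0.7], [0.3, 0.6]]
whole = run(points, chunk=len(points))
split = run(points, chunk=1)
```

Here `left` and `right` differ in their last bit, while `whole` and `split` are byte-identical: partitioning only changes which worker handles a point, never the order of the terms inside that point's sum.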

The structured inputs that reach this path take the pure StructuredGetPointCells
traversal (no shared state). For the rare non-structured fallback, any lazy
incident-cell structure is primed once on the main thread before the parallel
region so the first GetPointCells() cannot race.
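
The priming step for the fallback path can be sketched as follows. `LazyPointCells` is a hypothetical stand-in for a dataset that builds its point-to-cell links on first query; it is not a VTK class, and the real lazy structure and its build call are not shown in this PR description.

```python
from concurrent.futures import ThreadPoolExecutor


class LazyPointCells:
    """Hypothetical dataset that builds point->cell links on first use."""

    def __init__(self, cells):
        self._cells = cells
        self._links = None
        self.build_count = 0

    def _build(self):
        links = {}
        for cell_id, point_ids in enumerate(self._cells):
            for p in point_ids:
                links.setdefault(p, []).append(cell_id)
        self._links = links
        self.build_count += 1

    def get_point_cells(self, pt):
        # Unsynchronized lazy init: racy if the first call happens concurrently.
        if self._links is None:
            self._build()
        return self._links.get(pt, [])


def point_cells_parallel(ds, n_points, n_threads=4):
    # Prime once on the calling thread, before the parallel region,
    # so workers only ever read the already-built links.
    ds.get_point_cells(0)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(ds.get_point_cells, range(n_points)))


ds = LazyPointCells([(0, 1), (1, 2)])  # two line cells sharing point 1
result = point_cells_parallel(ds, n_points=3)
```

After priming, the parallel region performs reads only, which is why no lock is needed inside `get_point_cells` in this sketch.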

Expected win

2–6× on large vtkImageData cell-data → point-data conversions (capped at the
fvtk default of 4 threads), scaling with point count.

Validation gate

  • tests/bitexact/ops.py::op_cell2point drives vtkCellDataToPointData on a
    vtkImageData with cell-data scalars (the exact modified image path) and is in
    the modified gate group (float32/float64, sizes 20/32) — covered at maxULP = 0
    against stock VTK 9.6.2.
  • tests/bitexact/test_smp_determinism.py: added cell2point to THREADED_OPS,
    asserting byte-identical output at 1 / 4 / 8 threads (which holds by
    construction — disjoint index writes).
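
For readers unfamiliar with the maxULP metric, here is one common way to compute it for float64 values. This is a sketch under assumptions: `ulp_distance` and `max_ulp` are hypothetical helpers, not the actual tests/bitexact implementation (which is not shown here), and NaN handling is omitted.

```python
import struct
import sys


def _ordered(x):
    """Map a float64 to an integer whose ordering matches the float ordering."""
    (i,) = struct.unpack("<q", struct.pack("<d", x))
    # Negative floats are sign-magnitude; fold them onto the negative integers.
    return i if i >= 0 else -(i & 0x7FFFFFFFFFFFFFFF)


def ulp_distance(a, b):
    """Number of representable float64 values between a and b (NaN not handled)."""
    return abs(_ordered(a) - _ordered(b))


def max_ulp(xs, ys):
    """Largest element-wise ULP distance between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("length mismatch")
    return max((ulp_distance(x, y) for x, y in zip(xs, ys)), default=0)


reference = [1.0, 1.5, 3.0, 4.0]
candidate = [1.0, 1.5, 3.0, 4.0]
gate_passes = max_ulp(reference, candidate) == 0
```

Note that this distance treats +0.0 and -0.0 as equal, so a raw byte comparison is slightly stricter than maxULP = 0 as computed here.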

No local build was run (disk/time constrained); relying on CI, which installs the
built wheel and runs tests/bitexact at maxULP = 0.

@akaszynski added this pull request to the merge queue seven times on Jun 23, 2026.
Any commits made after this event will not be merged.
@github-merge-queue (bot) removed this pull request from the merge queue six times on Jun 23, 2026, each time due to no response for status checks.