DistABLPLoader: vectorized label remap + optional AnchorLabels edge-list output #681

Draft
kmontemayor2-sc wants to merge 18 commits into main from kmonte/ablp-vectorized-labels-and-list-output

kmontemayor2-sc (Collaborator) commented Jun 25, 2026

Summary

Reworks how DistABLPLoader remaps Anchor-Based Link Prediction (ABLP) labels
from global node ids to subgraph-local indices, and adds an optional edge-list
output format for those labels.

Internally, we see small but

  • Vectorized label remap. Label remapping now runs as a single
    sorted-membership join (searchsorted over the node map) instead of a
    per-anchor Python loop. The result is loss-equivalent to the previous output
    (same (anchor, label) pairs; within-anchor order is unspecified, and the
    ABLP contrastive loss is order-invariant). We observe meaningful sampling/
    collation performance improvements at production scale.

  • AnchorLabels edge-list container + use_list_output flag.
    DistABLPLoader(..., use_list_output=True) returns labels as an
    AnchorLabels edge list (two co-indexed [E] tensors,
    anchor_index / label_index) that the loss can index directly with no
    padding or per-anchor Python loop. use_list_output=False (default) preserves
    the existing ragged dict[int, torch.Tensor] output, so this is backward
    compatible. AnchorLabels.to_dict() recovers the dict form.

  • Public API. AnchorLabels is exported from gigl.distributed.

  • Examples. The link-prediction training examples (homogeneous +
    heterogeneous, colocated + graph-store) are updated to consume the
    AnchorLabels edge-list output.
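
The sorted-membership join from the first bullet can be sketched in plain Python, with `bisect_left` standing in for `torch.searchsorted`. This is an illustration under assumptions, not the PR's kernel: the function name `membership_remap_sketch`, the `PADDING = -1` sentinel, and the list-based inputs are all hypothetical. The real kernel is vectorized; the explicit loops here only make the steps visible.

```python
from bisect import bisect_left

PADDING = -1  # assumed padding sentinel for this sketch only


def membership_remap_sketch(node_map, padded_labels):
    """Map global label ids to subgraph-local indices.

    node_map[i] is the global id of local node i (ids must be unique).
    padded_labels is an [N_anchors][M] list of global ids padded with PADDING.
    Returns two co-indexed lists: (anchor_index, label_index).
    """
    # Sort the node map once, remembering which local index each id came from.
    order = sorted(range(len(node_map)), key=node_map.__getitem__)
    sorted_ids = [node_map[i] for i in order]

    anchor_index, label_index = [], []
    for anchor, row in enumerate(padded_labels):  # row-major visit
        for global_id in row:
            if global_id == PADDING:
                continue
            pos = bisect_left(sorted_ids, global_id)
            # Keep exact matches only: the label must exist in the subgraph.
            if pos < len(sorted_ids) and sorted_ids[pos] == global_id:
                anchor_index.append(anchor)
                label_index.append(order[pos])
    return anchor_index, label_index


node_map = [30, 10, 20]  # local 0 -> 30, local 1 -> 10, local 2 -> 20
padded_labels = [[20, 10, PADDING], [99, 30, 30]]
anchor_index, label_index = membership_remap_sketch(node_map, padded_labels)
# anchor_index == [0, 0, 1, 1], label_index == [2, 1, 0, 0]
```

Global id 99 is dropped because it is absent from the node map, and the duplicate 30 is kept twice, so multiplicity survives the remap in this sketch.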

Tests

  • Equivalence of the vectorized remap output against constructed expected values,
    covering empty / fully-padded / duplicate-label / multi-edge-type cases.
  • use_list_output=True and use_list_output=False produce equivalent labels.
  • CUDA device-placement regression test for the remap kernel.
  • Guard: a non-unique node→global map raises ValueError.

Notes

  • Labels are loss-equivalent to the prior output, not order-identical.
  • Documentation-only follow-ups: worked examples for both output formats in the
    DistABLPLoader docstring and AnchorLabels shape docs.

kmontemayor and others added 12 commits June 25, 2026 14:57
…racle

Factor the inline per-anchor label-remap loop in DistABLPLoader._set_labels
into a module-level function _loop_set_labels. The new function is a
behavior-preserving extraction: _set_labels delegates to it, producing
identical output. _loop_set_labels will serve as the equivalence oracle
for the vectorized kernel added in the next task.

Also imports PADDING_NODE from gigl.utils.data_splitters (used by the
vectorized kernel in the next task) and adds the contract test file
tests/unit/distributed/vectorized_set_labels_test.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…oop oracle

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ls kernel

Add frozen dataclass AnchorLabels (anchor_index, label_index, num_anchors)
with to_dict() bridge; thin wrapper _remap_one_label_tensor_edge_list over
the shared _membership_remap; and edge_list_set_labels driver. Proves
to_dict() reproduces _loop_set_labels bit-for-bit via parametrized equivalence
tests (11 new tests, all classes pass).
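
A rough picture of the container this commit describes, using plain lists instead of tensors. The class name `AnchorLabelsSketch` is hypothetical, and returning an empty list for anchors without labels is an assumption of this sketch; the real `AnchorLabels.to_dict()` returns `dict[int, torch.Tensor]` and may differ in such details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchorLabelsSketch:
    """Hypothetical list-based stand-in for AnchorLabels."""

    anchor_index: list  # [E] anchor position of each (anchor, label) pair
    label_index: list   # [E] local label index of each pair
    num_anchors: int

    def to_dict(self):
        """Group label indices by anchor, keeping the pair-stream order."""
        grouped = {anchor: [] for anchor in range(self.num_anchors)}
        for anchor, label in zip(self.anchor_index, self.label_index):
            grouped[anchor].append(label)
        return grouped


labels = AnchorLabelsSketch(
    anchor_index=[0, 0, 2], label_index=[5, 7, 1], num_anchors=3
)
as_dict = labels.to_dict()
# as_dict == {0: [5, 7], 1: [], 2: [1]}
```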

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eous training

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eneous training

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…test

Line-wrapping only (ruff format) for the edge-list label reads in the four
link-prediction training examples and the loader equivalence test. No logic
change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kmontemayor2-sc force-pushed the kmonte/ablp-vectorized-labels-and-list-output branch from 292f4d4 to bd97d03 on June 25, 2026 14:57
kmontemayor and others added 2 commits June 25, 2026 16:18
…ctors

**Behavioral change:** Drop the composite-key stable argsort from
`_membership_remap` (the `composite_key = anchor_kept * (num_nodes + 1) +
local_idx` / `torch.argsort(composite_key, stable=True)` block). The pair
stream is now emitted in column-visit (row-major masked flatten) order rather
than ascending-local-index (torch.nonzero) order.

**Why this is safe:** `RetrievalLoss` (`gigl/nn/loss.py`) is
`CrossEntropyLoss(reduction="sum")` over a diagonal-targeted score matrix that
masks collisions by id VALUE, not position. It is invariant to any joint
permutation of `(anchor_index, label_index)` pairs. The only constraints are
co-indexing (pair k stays intact) and per-anchor grouping, both of which are
preserved. Within-anchor label order was an implementation artifact of the loop
oracle -- never a loss requirement.
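
The permutation-invariance argument can be checked on a toy version of such a loss. This is a simplified sketch, not `RetrievalLoss` itself: `toy_score` is an arbitrary stand-in for the model's anchor/label score, and the collision mask (drop off-diagonal columns whose label id equals the positive's id) is an assumed reading of "masks collisions by id VALUE". It only illustrates the loss side of the argument; the grouping needed by the dict path is a separate matter.

```python
import math


def toy_score(anchor, label):
    # Arbitrary deterministic score standing in for an embedding dot product.
    return math.sin(1.3 * anchor + 0.7 * label)


def summed_retrieval_loss(anchor_index, label_index):
    """Sum-reduced cross entropy where row k targets column k (the diagonal)."""
    total = 0.0
    num_pairs = len(anchor_index)
    for k in range(num_pairs):
        logits = []
        for j in range(num_pairs):
            # Mask off-diagonal columns that carry the same label id as the positive.
            if j != k and label_index[j] == label_index[k]:
                continue
            logits.append(toy_score(anchor_index[k], label_index[j]))
        log_norm = math.log(sum(math.exp(x) for x in logits))
        total += log_norm - toy_score(anchor_index[k], label_index[k])
    return total


anchor_index = [0, 0, 1, 2, 2]
label_index = [4, 7, 4, 9, 7]

# Apply one joint permutation to both lists so every pair stays intact.
perm = [3, 0, 4, 2, 1]
shuffled_anchor = [anchor_index[p] for p in perm]
shuffled_label = [label_index[p] for p in perm]

loss_original = summed_retrieval_loss(anchor_index, label_index)
loss_shuffled = summed_retrieval_loss(shuffled_anchor, shuffled_label)
```

Each row sees the same multiset of unmasked logits before and after the shuffle, so the two sums agree up to floating-point rounding.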

**Why the dict path is unaffected:** `anchor_of_entry` is built as
`arange(N).repeat_interleave(M)` (row-major), so the masked flatten is already
non-decreasing in anchor_index. `bincount`/`split` requires only contiguous
grouping by anchor, which row-major flatten guarantees without any argsort.
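
A small list-based illustration of why the row-major masked flatten is already grouped by anchor, and how a count-then-split step recovers the per-anchor dict. The `PADDING = -1` sentinel and the variable names are assumptions of this sketch; the PR performs these steps on tensors with `repeat_interleave`, `bincount`, and `split`.

```python
PADDING = -1  # assumed padding sentinel for this sketch only

num_anchors, width = 3, 2
padded = [[8, PADDING], [PADDING, PADDING], [5, 6]]

# Row-major anchor id per entry: each anchor repeated `width` times.
anchor_of_entry = [a for a in range(num_anchors) for _ in range(width)]
flat = [value for row in padded for value in row]

# Masking preserves row-major order, so anchor ids stay non-decreasing.
kept = [(a, v) for a, v in zip(anchor_of_entry, flat) if v != PADDING]
anchor_index = [a for a, _ in kept]
values = [v for _, v in kept]

# Count entries per anchor (the bincount step), then split the value stream.
counts = [0] * num_anchors
for a in anchor_index:
    counts[a] += 1

per_anchor, start = {}, 0
for a, count in enumerate(counts):
    per_anchor[a] = values[start:start + count]
    start += count
# per_anchor == {0: [8], 1: [], 2: [5, 6]}
```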

**Readability refactors in `_membership_remap`:**
- Rename terse tensors: `valid` -> `is_present`, `found` -> `is_exact_match`,
  `positions` -> `sorted_positions`, `local_idx` -> `local_index`,
  `anchor_kept` -> `anchor_of_matched`
- Add numbered step comments explaining the searchsorted membership lookup

**Readability refactors in outer kernels:**
- Add `_remap_group` helper to collapse the ~40-line duplicated positive/
  negative per-edge-type loops in both `vectorized_set_labels` and
  `edge_list_set_labels` (uses TypeVar for generic return type)
- `_sorted_for` closure pattern preserved, inlined per-kernel (memoizes
  torch.sort across pos/neg edge types of the same node type)

**Other improvements:**
- Add doctest to `AnchorLabels` showing a 2-anchor case + `to_dict()` round-trip
- Update `AnchorLabels` docstring: document column-visit order and
  loss-permutation-invariance rationale
- Update all docstrings to drop "bit-for-bit"/"torch.nonzero order" language

**Test contract relaxation (tests remain non-vacuous):**
- `vectorized_set_labels_test.py` / `edge_list_set_labels_test.py`:
  `_assert_label_dicts_equal` -> `_assert_label_dicts_set_equal` (uses
  `sorted()` per anchor; still catches membership errors + multiplicity)
- `test_unsorted_node_map_exact_order` -> `test_unsorted_node_map_correct_membership`:
  asserts SET {0, 2} instead of sequence [0, 2]; docstring explains column
  vs ascending-local order difference
- `RemapOneEdgeListTest.test_unsorted_node_map_nontrivial_sort_perm` ->
  `test_unsorted_node_map_correct_membership`: pins column order [2, 0] and
  explains it is SET-equal to [0, 2]; verifies sort_perm mapping is correct
- `dist_ablp_neighborloader_test.py`: `_ordered_global_pairs` ->
  `_global_pair_set` (sorts within anchor); `_collect_homogeneous_labels`
  docstring updated; the in-process exact-tensor assertion
  (`label_index == cat(dict values)`) is preserved -- both paths draw from the
  same `_membership_remap` pair stream so they remain identical
- CUDA device test docstring: remove "stable-argsort tie-break" reference;
  keep the duplicate [15,15] row (still tests duplicate handling + device placement)
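
The relaxed contract can be pictured with a list-based helper. The name `assert_label_dicts_set_equal_sketch` is hypothetical and the real `_assert_label_dicts_set_equal` presumably compares tensors; the point is only that sorting per anchor ignores order while still detecting a wrong member or a missing duplicate.

```python
def assert_label_dicts_set_equal_sketch(actual, expected):
    """Compare per-anchor labels, ignoring order but not multiplicity."""
    if actual.keys() != expected.keys():
        raise AssertionError("anchor keys differ")
    for anchor in expected:
        if sorted(actual[anchor]) != sorted(expected[anchor]):
            raise AssertionError(f"labels differ for anchor {anchor}")


# Column-visit order [2, 0] passes against ascending order [0, 2].
assert_label_dicts_set_equal_sketch({0: [2, 0]}, {0: [0, 2]})

# A dropped duplicate is still caught, because sorting keeps multiplicity.
try:
    assert_label_dicts_set_equal_sketch({0: [15]}, {0: [15, 15]})
    caught_multiplicity_error = False
except AssertionError:
    caught_multiplicity_error = True
```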

Device fix preserved: `anchor_of_entry` is still built on `label_tensor.device`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er-identical

After dropping the order-reproduction argsort, per-anchor label order is
column-visit order, not the loop's nonzero order. The (query, label) pairs are
unchanged and the contrastive loss is order-invariant, so the read is equivalent
for training; the comments now say so instead of implying byte-identical order.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kmontemayor2-sc force-pushed the kmonte/ablp-vectorized-labels-and-list-output branch from 486e670 to d42d0df on June 26, 2026 19:24
GiGL main resolved ABLP labels with a per-anchor Python loop. This replaces it
with one vectorized kernel and a single output type, plus a thin dict view for
backward compatibility.

- Resolve labels with a sorted-membership join (_membership_remap: sort the node
  map once, searchsorted the label ids, keep exact matches) instead of the
  O(N_anchors * M * N_nodes) per-anchor loop. Delete _loop_set_labels (no
  production callers) and the redundant dict-producing kernel.
- Collapse the remap to two functions, _membership_remap and edge_list_set_labels;
  the dict path is just AnchorLabels.to_dict(), selected by use_list_output on
  DistABLPLoader (default False keeps the ragged dict). Drop the _remap_group
  callback indirection, the _LabelT TypeVar, and the vestigial
  supervision_edge_types parity param.
- AnchorLabels stores labels as two parallel (anchor_index, label_index) tensors
  so the loss can index them directly; within-anchor order is unspecified because
  the ABLP contrastive loss is order-invariant over the pairs.
- Make the __debug__ unique-node-map check a cheap adjacent-difference test on the
  already-sorted map (it ran torch.unique every batch, and GiGL is not launched
  with -O).
- Tests assert against constructed expected values and per-anchor label SETS, not
  within-anchor order. Examples consume the edge-list directly.

Behavior-preserving: per-anchor label sets are unchanged.
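
The adjacent-difference uniqueness test mentioned above might look like the following in plain Python. The function name `has_duplicate_sorted` is hypothetical; the idea is that once the map is sorted, any duplicate must sit next to its twin, so one linear pass over neighbours replaces a separate deduplication call.

```python
def has_duplicate_sorted(sorted_ids):
    """Return True if an ascending list has two equal neighbours."""
    # zip over an empty or single-element list yields nothing, so both are safe.
    return any(cur - prev == 0 for prev, cur in zip(sorted_ids, sorted_ids[1:]))


duplicate_found = has_duplicate_sorted(sorted([30, 10, 20, 10]))
unique_ok = not has_duplicate_sorted(sorted([30, 10, 20]))
empty_ok = not has_duplicate_sorted([])
```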

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kmontemayor2-sc force-pushed the kmonte/ablp-vectorized-labels-and-list-output branch from d42d0df to 641b24e on June 26, 2026 21:13
kmontemayor and others added 3 commits June 26, 2026 21:47
…ions

Comments and docstrings only -- no behavior change (verified: the AST with
docstrings stripped is identical to the prior commit for every touched file).

- Docstrings lead with why over what; the membership-remap algorithm is shown as
  a small worked example (node map + [N_anchors, M] padded label tensor -> pairs)
  with labeled steps instead of dense prose.
- Inline comments reference those docstring steps rather than floating free.
- Tensor dimensions annotated throughout (N_anchors / M / N_nodes / K / E),
  including the K (non-padding candidates) vs E (matched pairs) distinction.
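
One way to run the kind of check this commit cites (identical AST once docstrings are stripped) is sketched below with the standard `ast` module. How the author actually ran the verification is not stated; the helper names `strip_docstrings` and `same_code_ignoring_docs` are hypothetical.

```python
import ast

_DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def strip_docstrings(tree):
    """Drop the leading docstring statement from every module, class and function."""
    for node in ast.walk(tree):
        if isinstance(node, _DOC_OWNERS):
            if ast.get_docstring(node, clean=False) is not None:
                # Keep bodies non-empty so the tree stays structurally valid.
                node.body = node.body[1:] or [ast.Pass()]
    return tree


def same_code_ignoring_docs(source_a, source_b):
    """Compare two sources by AST dump; comments never reach the AST at all."""
    dump_a = ast.dump(strip_docstrings(ast.parse(source_a)))
    dump_b = ast.dump(strip_docstrings(ast.parse(source_b)))
    return dump_a == dump_b


before = 'def f(x):\n    """Old doc."""\n    return x + 1  # old comment\n'
after = 'def f(x):\n    """New, longer doc."""\n    # new comment\n    return x + 1\n'
changed = 'def f(x):\n    """New doc."""\n    return x + 2\n'

docs_only = same_code_ignoring_docs(before, after)
logic_changed = not same_code_ignoring_docs(before, changed)
```

`ast.dump` omits line and column attributes by default, so re-wrapped comments and longer docstrings do not affect the comparison.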

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uniqueness guard

- Add a use_list_output=True worked example to DistABLPLoader.__init__ that
  mirrors the use_list_output=False example (same graph, AnchorLabels values).
- Document AnchorLabels.to_dict()'s non-decreasing anchor_index precondition.
- Promote the _membership_remap duplicate-node-map guard from a __debug__
  assert to an always-on ValueError so it still fires under `python -O`; update
  the test to expect ValueError.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comments should describe the code as it is, not as a delta from a prior or
never-committed version. Reword/remove comments that only parse if the reader
knows the code's development history:

- Drop the "Always-on (not __debug__) ... python -O" framing on the node-map
  uniqueness guard; keep the precondition + empty-slice safety rationale.
- Fix a stale _membership_remap docstring line still describing a __debug__
  assertion (it is now an always-on ValueError).
- Remove "with no argsort" / "no argsort needed" (presupposed the dropped
  order-reproduction argsort); state the grouped-by-anchor property directly.
- Reword "a sorted-membership join rather than a per-anchor scan ... trades a
  broadcast-compare for a search" to describe the algorithm and its complexity
  as-is.
- Drop "rather than maintaining a second kernel" in _set_labels.
- Same scrub in edge_list_set_labels_test.py (module docstring + guard test).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
kmontemayor2-sc changed the title from "Kmonte/ablp vectorized labels and list output" to "DistABLPLoader: vectorized label remap + optional AnchorLabels edge-list output" on Jun 29, 2026