expressions: mixed T x ToT products in arbitrary expression trees (Phase F) by evaleev · Pull Request #564 · ValeevGroup/tiledarray

evaleev · 2026-06-12T02:04:32Z

Stacked on #563 (which stacks on #562). Completes mixed plain-tensor x tensor-of-tensors support in the expression layer for arbitrary expression trees, plus native support for no-external general products.

What

ScalMultEngine general products + tree deduction: the Phase E child-demand down-pass moves from MultEngine into BinaryEngine::init_children_indices (shared), and ScalMultEngine adopts it together with the full general-product routing (init_struct_general, init_distribution_general, make_trange_general, make_dist_eval_general, inner-product classification) — replacing its "use einsum() instead" exception. w("b,i,k;x") = 2.0 * (a("b,i,j") * c("b,j,k;x")) now evaluates.
Identity-tolerant inner-perm gate: the general-product ToT gate fired on a non-null but identity inner permutation (the bipartite perm is constructed whole when only outer modes are re-permuted by #563's streaming wrapper); it now requires a genuinely non-identity inner perm. This unblocks mixed T x ToT general products at inner tree nodes, e.g. w("i,j;x") = (g("b,i") * c("b,j;x")) * h("b").
Scalar prefactor in inner-Scale ops: the mixed T x ToT element ops never carried the expression-level scalar factor — invisible while only MultEngine (factor == 1) reached them. The fallback op now absorbs factor_; the factor-free fused arena ops are gated to factor == 1 (scaled products take the fallback).
Native no-external general products: a general product whose every outer index is fused or contracted (e.g. C("i,j;a,b") = A("x,i,j;a") * B("x,i,j;b")) folds to a GEMM with no free modes, i.e. rank-0 tensors, which the tile kernels do not support (this shape segfaulted through wild stride reads). It is now evaluated with a SYNTHETIC UNIT left-external mode: the folded product becomes (1,K) x (K) -> (1), the exact shape of the already-supported one-sided neB == 0 case. The unit mode lives only in the tile op's GemmHelper; tranges, shapes and tiles carry the true (external-free) ranks, and BatchedContractReduce / SparseShape::gemm_batched detect the synthetic mode from the one-rank mismatch and pad their folded views with a unit extent.
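
The unit-mode trick can be illustrated on the plain dense Hadamard-reduction shape C("i") = A("i,j") * B("i,j"). The following is a minimal standalone sketch, not TiledArray code: gemv and hadamard_reduce are hypothetical names, and the flat row-major layout is an assumption made for illustration. Each batch slice has no free modes, so the left operand is given a unit row dimension and the slice is evaluated as (1,K) x (K) -> (1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain matrix-vector kernel: y(m) = sum_k A(m,k) * x(k), A row-major (M x K).
// Stands in for a folded product whose right operand has no external modes.
std::vector<double> gemv(const std::vector<double>& A,
                         const std::vector<double>& x,
                         std::size_t M, std::size_t K) {
  assert(A.size() == M * K && x.size() == K);
  std::vector<double> y(M, 0.0);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t k = 0; k < K; ++k) y[m] += A[m * K + k] * x[k];
  return y;
}

// Hadamard-reduction shape C("i") = A("i,j") * B("i,j"):
// i is fused (batch), j is contracted, and no index is external.
// Each batch slice is evaluated as (1,K) x (K) -> (1) by treating the
// left slice as a matrix with a synthetic unit row dimension (M == 1).
std::vector<double> hadamard_reduce(const std::vector<double>& A,
                                    const std::vector<double>& B,
                                    std::size_t nbatch, std::size_t K) {
  assert(A.size() == nbatch * K && B.size() == nbatch * K);
  std::vector<double> C(nbatch);
  for (std::size_t i = 0; i < nbatch; ++i) {
    const std::vector<double> a(A.begin() + i * K, A.begin() + (i + 1) * K);
    const std::vector<double> b(B.begin() + i * K, B.begin() + (i + 1) * K);
    C[i] = gemv(a, b, /*M=*/1, K)[0];  // unit mode exists only in this call
  }
  return C;
}
```

As in the description above, the unit extent appears only at the kernel call; the result C keeps its true rank (one entry per fused index value).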

Notable non-findings

Mixed T/ToT contraction chains at depth ≥ 2 — (s("i,j") * t("j,m")) * c("m,k;x") and s("i,j") * (t("j,m") * c("m,k;x")) — already worked unchanged through the Phase E deduction (the empty-inner-demand convention for plain subtrees composes correctly).
Sums nested under products — f("i,j") = a("x,i") * (b("x,k") * c("x,k,j") + d("x,j")), with a general product as a summand — work by construction: an Add's available_indices() is the leaf-union of its summands and the parent's demand intersection prunes summand-internal contraction indices automatically.
Block expressions compose: block operands in general products, block leaves under inner general nodes, and general products (including re-permuted, non-canonical-target ones) assigned into block views of the result.
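
The pruning argument for sums under products can be sketched with plain index sets. This is a simplified model under stated assumptions, not the engine's implementation: IndexSet, unite, intersect, subtract, child_demand and contracted are hypothetical helpers, and the demand rule (a child is asked for those of its available indices that the parent's result or the sibling needs) is a paraphrase of the description above.

```cpp
#include <cassert>
#include <set>
#include <string>

using IndexSet = std::set<std::string>;

IndexSet unite(const IndexSet& a, const IndexSet& b) {
  IndexSet r = a;
  r.insert(b.begin(), b.end());
  return r;
}

IndexSet intersect(const IndexSet& a, const IndexSet& b) {
  IndexSet r;
  for (const auto& x : a)
    if (b.count(x)) r.insert(x);
  return r;
}

IndexSet subtract(const IndexSet& a, const IndexSet& b) {
  IndexSet r;
  for (const auto& x : a)
    if (!b.count(x)) r.insert(x);
  return r;
}

// Assumed rule: a product demands from a child every index the child can
// provide that is needed either by the product's own result or by the sibling.
IndexSet child_demand(const IndexSet& parent_demand,
                      const IndexSet& sibling_available,
                      const IndexSet& child_available) {
  return intersect(child_available, unite(parent_demand, sibling_available));
}

// Indices shared by both factors but absent from the demanded result
// are contracted inside the product.
IndexSet contracted(const IndexSet& left, const IndexSet& right,
                    const IndexSet& demand) {
  return subtract(intersect(left, right), demand);
}

// f("i,j") = a("x,i") * (b("x,k") * c("x,k,j") + d("x,j"))
IndexSet example_sum_demand() {
  const IndexSet target{"i", "j"}, a{"x", "i"};
  const IndexSet b{"x", "k"}, c{"x", "k", "j"}, d{"x", "j"};
  // The sum's available indices are the leaf-union of its summands.
  const IndexSet sum_available = unite(unite(b, c), d);
  return child_demand(target, a, sum_available);
}
```

In this model the sum is asked for {j, x} only, so k never leaves the b * c summand, where it is classified as contracted.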

Tests

Mixed: expression_mixed_t_tot_depth2_chains (both nesting orders), expression_mixed_t_tot_inner_general, expression_mixed_t_tot_scaled.
Composition: expression_general_sum_under_product; expression_general_kitchen_sink — w("i,j,m;a,b") = 2.0 * ((g("x,i") * cv("x,j;a")) * dv("x,i,m;b")), combining a THC-like batching index, a mixed T x ToT general product, a ToT x ToT general product with an inner outer-product, and a ScalMult prefactor.
Blocks: expression_general_product_block_operands, expression_general_product_into_block, expression_general_product_block_in_tree, expression_general_product_repermute_into_block.
No-external: dense ToT (incl. the no-external root fed by a general T x ToT inner node), plain dense (the Hadamard-reduction shape C("i") = A("i,j") * B("i,j")), and block-sparse (exercising the gemm_batched unit handling), all differential-tested against legacy einsum.
Full regression: general_product, einsum_*, sparse_shape, expressions{,_sparse} (modulo the two pre-existing assign_subblock_block_base1 failures), tot suites — green.
mpqc C6H14/cc-pVDZ PNO-CCSD energy is unchanged (difference of 3e-11, within run-to-run noise).

Notes / still out of scope

einsum() is NOT cut over for no-external products: its !e regime ("hadamard-reduction-local", the arena kernel) handles them before the generalized-contraction dispatch and remains the right tool for distributed workloads — the engine's no-external path uses a degenerate 1x1 process grid (all result tiles on one rank), so it is correctness-first; unifying the einsum regime under the engine remains gated on a perf/distribution comparison (see the design doc's open decisions).
Inner-index (nested-dim) General products remain gated (also in ScalMultEngine, with a matching message).
ToT*ToT -> T inner reductions (DeNest) stay on the einsum path.

…e-deduction down-pass The Phase E child-demand deduction moves from MultEngine into BinaryEngine::init_children_indices and ScalMultEngine adopts it, along with the full MultEngine routing for general products (inner_product_type_ classification + inner-General gate, init_struct_general, init_distribution_general, make_trange_general, make_dist_eval_general), replacing its use-einsum-instead exception.
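
For reference, the element-wise meaning of a scaled mixed product such as w("b,i,k;x") = 2.0 * (a("b,i,j") * c("b,j,k;x")) can be written as explicit loops. This is a standalone sketch of the intended semantics only, not the engine's evaluation path: scaled_mixed_product and the flat row-major storage are assumptions, and all inner tensors contributing to one result element are assumed to have the same extent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Inner = std::vector<double>;  // inner tensor over mode x

// a: plain tensor, row-major (B x I x J)
// c: tensor of tensors, row-major outer modes (B x J x K), elements are Inner
// Result w (B x I x K) of Inner with
//   w[b,i,k][x] = factor * sum_j a[b,i,j] * c[b,j,k][x]
// b is fused (batch), j is contracted, i and k are external.
std::vector<Inner> scaled_mixed_product(const std::vector<double>& a,
                                        const std::vector<Inner>& c,
                                        std::size_t B, std::size_t I,
                                        std::size_t J, std::size_t K,
                                        double factor) {
  assert(a.size() == B * I * J && c.size() == B * J * K);
  std::vector<Inner> w(B * I * K);
  for (std::size_t b = 0; b < B; ++b)
    for (std::size_t i = 0; i < I; ++i)
      for (std::size_t k = 0; k < K; ++k) {
        Inner& out = w[(b * I + i) * K + k];
        for (std::size_t j = 0; j < J; ++j) {
          // The scalar prefactor is folded into the inner-scale coefficient.
          const double s = factor * a[(b * I + i) * J + j];
          const Inner& in = c[(b * J + j) * K + k];
          if (out.empty()) out.assign(in.size(), 0.0);
          assert(in.size() == out.size());
          for (std::size_t x = 0; x < in.size(); ++x) out[x] += s * in[x];
        }
      }
  return w;
}
```

With factor == 1 this reduces to the unscaled MultEngine product; a reference loop of this kind is one way to cross-check a scaled result on small inputs.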

…factor in inner-Scale ops The general-product ToT gate fired on a non-null but IDENTITY inner permutation (the bipartite perm is constructed whole when only the outer modes are re-permuted by the streaming wrapper); require a genuinely non-identity inner perm. The inner-Scale element ops (mixed T x ToT) never carried the expression-level scalar prefactor -- invisible while only MultEngine (factor == 1) reached them; the fallback op now absorbs factor_ and the factor-free fused arena ops are gated to factor == 1.
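
The difference between the old and new gate conditions can be sketched as follows. This is a hypothetical model, not the TiledArray permutation type: Perm is a plain index vector in which an empty vector plays the role of a null permutation, and gate_old / gate_new are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inner permutation: p[i] is the image of i,
// and an empty vector models a null (absent) permutation.
using Perm = std::vector<std::size_t>;

// True for a null permutation and for any permutation mapping every i to i.
bool is_identity(const Perm& p) {
  for (std::size_t i = 0; i < p.size(); ++i)
    if (p[i] != i) return false;
  return true;
}

// Old condition: fires as soon as an inner permutation object exists,
// even when it does not move anything.
bool gate_old(const Perm& inner) { return !inner.empty(); }

// New condition: fires only when the inner permutation actually reorders modes.
bool gate_new(const Perm& inner) { return !is_identity(inner); }
```

A permutation such as {0, 1}, which is what a whole-constructed bipartite perm carries when only outer modes move, trips gate_old but passes gate_new.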

…al, scaled)

…nment into a block view)

…der products, kitchen-sink, blocks in trees A ToT x ToT general product with no external (free) outer indices -- every outer index fused or contracted -- segfaulted in the folded GEMM; gate it with an informative error (einsum() evaluates this shape natively via its no-external regime). New tests: a SUM nested under a product with a general summand (the down-pass prunes summand-internal contraction indices from the sum's demand by construction); the kitchen-sink expression combining a THC-like batching index, a mixed T x ToT general product, a ToT x ToT general product with an inner outer-product, and a ScalMult prefactor; a block leaf under an inner general node; a re-permuted general product assigned into a block view; the no-external gate.

… left-external mode A general product whose every outer index is fused or contracted (e.g. C("i,j;a,b") = A("x,i,j;a") * B("x,i,j;b")) folds to a GEMM with no free modes, i.e. rank-0 tensors, which the tile kernels do not support (this shape used to segfault through wild stride reads). Evaluate it with a synthetic unit left-external mode instead: the folded product becomes (1,K) x (K) -> (1), the exact shape of the already-supported one-sided neB == 0 case. The unit mode lives only in the tile op's GemmHelper; tranges, shapes and tiles carry the true (external-free) ranks, and BatchedContractReduce / SparseShape::gemm_batched detect the synthetic mode from the one-rank mismatch and pad their folded views with a unit extent. Replaces the interim gate. Tests: dense ToT (incl. the no-external root fed by a general T x ToT inner node), plain dense (the Hadamard-reduction shape), and block-sparse (exercising the gemm_batched unit handling), all differential-tested against legacy einsum.

evaleev added 6 commits June 11, 2026 21:51

tests: mixed T x ToT at inner tree nodes (depth-2 chains, inner general, scaled)

c04c83c

tests: general products with block expressions (block operands; assignment into a block view)

1d086cb

evaleev wants to merge 6 commits into evaleev/feature/general-product-tree-deduction from evaleev/feature/mixed-t-tot-trees
