WIP: Feature Quantization Scaffolding #682

Draft
jchmura-sc wants to merge 25 commits into main from jchmura/feature_quantization_gigl

@jchmura-sc jchmura-sc commented Jun 29, 2026

TODO

  • migrate preprocessing to Beam (away from tft)
  • serialize as raw bytes and avoid default int64 construction
  • figure out the best API boundary
  • add tests
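
To illustrate the "serialize as raw bytes" item, here is a minimal sketch of why packing matters for 1-bit codes. It uses NumPy rather than anything from this PR, and the helper names `pack_codes` and `unpack_codes` are hypothetical; the actual serialization format is still an open TODO.

```python
import numpy as np


def pack_codes(bits: np.ndarray) -> bytes:
    """Pack a 1-D array of 0/1 codes into raw bytes, 8 codes per byte."""
    return np.packbits(bits.astype(np.uint8)).tobytes()


def unpack_codes(raw: bytes, num_codes: int) -> np.ndarray:
    """Inverse of pack_codes; num_codes trims the zero padding in the last byte."""
    return np.unpackbits(np.frombuffer(raw, dtype=np.uint8))[:num_codes]


# Ten 1-bit codes (hypothetical data).
bits = np.array([1, 0, 0, 1, 1, 1, 0, 1, 1, 0], dtype=np.uint8)
raw = pack_codes(bits)

# 2 bytes packed vs. 80 bytes if every code were stored as an int64.
print(len(raw), bits.astype(np.int64).nbytes)
```

The reader has to know the number of codes up front, since the final byte is zero-padded.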

@jchmura-sc jchmura-sc self-assigned this Jun 29, 2026
# node ids in the produced batch when reading serialized tfrecords.
if entity_key not in feature_spec_dict:
    logger.info(
        f"Injecting entity key {entity_key} into feature spec dictionary with value `tf.io.FixedLenFeature(shape=[], dtype=tf.int64)`"

Double check why we inject tf.int64

@jchmura-sc

/unit_test

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 20:23:16UTC : 🔄 C++ Unit Test started.

@ 20:25:02UTC : ✅ Workflow completed successfully.

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 20:23:18UTC : 🔄 Python Unit Test started.

@ 20:28:19UTC : ❌ Workflow failed.
Please check the logs for more details.

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 20:23:19UTC : 🔄 Scala Unit Test started.

@ 20:33:56UTC : ✅ Workflow completed successfully.

"Cannot materialize quantized features with "
f"{dequantized.size(0)} rows into existing x with {x.size(0)} rows."
)
node_store.x = torch.cat([x, dequantized], dim=1)

This is where we'd need to scatter-write the raw and quantized node features if preserving the original feature order is a requirement. As written, torch.cat appends the dequantized columns after the existing ones.
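
A rough sketch of what such a scatter write could look like, using NumPy arrays as a stand-in for the torch tensors. The function name `merge_in_original_order` and the `raw_cols` / `quantized_cols` index lists are assumptions for illustration; the PR does not yet record where each feature sat in the original layout.

```python
import numpy as np


def merge_in_original_order(
    raw: np.ndarray,
    dequantized: np.ndarray,
    raw_cols: list[int],
    quantized_cols: list[int],
) -> np.ndarray:
    """Write raw and dequantized columns back to their original positions."""
    num_rows = raw.shape[0]
    if dequantized.shape[0] != num_rows:
        raise ValueError(
            f"Row mismatch: {dequantized.shape[0]} dequantized rows "
            f"vs {num_rows} raw rows."
        )
    out = np.empty((num_rows, len(raw_cols) + len(quantized_cols)), dtype=raw.dtype)
    out[:, raw_cols] = raw  # raw features go back to their original columns
    out[:, quantized_cols] = dequantized  # dequantized features fill the rest
    return out


# Hypothetical layout: columns 0 and 3 were kept raw, 1 and 2 were quantized.
raw = np.array([[10, 13], [20, 23]], dtype=np.float32)
dequantized = np.array([[11, 12], [21, 22]], dtype=np.float32)
merged = merge_in_original_order(raw, dequantized, raw_cols=[0, 3], quantized_cols=[1, 2])
print(merged)
```

The same indexed assignment should carry over to torch tensors, at the cost of allocating the full output up front instead of concatenating.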

@jchmura-sc

/unit_test

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 22:21:08UTC : 🔄 Python Unit Test started.

@ 23:37:31UTC : ✅ Workflow completed successfully.

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 22:21:08UTC : 🔄 Scala Unit Test started.

@ 22:29:50UTC : ✅ Workflow completed successfully.

github-actions Bot commented Jun 30, 2026

GiGL Automation

@ 22:21:08UTC : 🔄 C++ Unit Test started.

@ 22:22:54UTC : ✅ Workflow completed successfully.

"Computed 1-bit quantization stats: "
f"neg_mean={q.neg_mean}, pos_mean={q.pos_mean}"
)
else:

Alternatively, we could simplify the proto model and compute all stats regardless of bit width, but I suspect avoiding tft.quantiles in the 1-bit case is worth doing.
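
For context on why the 1-bit case can skip quantiles, here is a minimal NumPy sketch. It assumes codes are assigned by thresholding at zero and reconstructed from two means; the PR only shows the names `neg_mean` and `pos_mean`, so the threshold, the `OneBitStats` class, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class OneBitStats:
    neg_mean: float  # reconstruction value for code 0
    pos_mean: float  # reconstruction value for code 1


def compute_one_bit_stats(values: np.ndarray) -> OneBitStats:
    """Two conditional means are enough; no quantile pass is needed."""
    neg = values[values < 0]
    pos = values[values >= 0]
    return OneBitStats(
        neg_mean=float(neg.mean()) if neg.size else 0.0,
        pos_mean=float(pos.mean()) if pos.size else 0.0,
    )


def quantize_one_bit(values: np.ndarray) -> np.ndarray:
    """Code 1 for non-negative values, code 0 otherwise (assumed threshold)."""
    return (values >= 0).astype(np.uint8)


def dequantize_one_bit(bits: np.ndarray, stats: OneBitStats) -> np.ndarray:
    """Map each code back to the mean of the side it came from."""
    return np.where(bits == 1, stats.pos_mean, stats.neg_mean).astype(np.float32)


values = np.array([-3.0, -1.0, 2.0, 4.0], dtype=np.float32)
stats = compute_one_bit_stats(values)
codes = quantize_one_bit(values)
restored = dequantize_one_bit(codes, stats)
print(stats, codes, restored)
```

Wider bit widths would need bucket boundaries, which is presumably where the quantile computation comes in.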
