fep(sig-framework): add PyTorch-Plugin-FL v0.1.0 CUDA backend dispatch proposal#25
Hchnr wants to merge 6 commits into
Thanks for the proposal. From a reviewer's perspective, the current FEP provides a good architectural overview, but it does not yet contain sufficient implementation and validation details for reproducible verification. In particular, the proposal does not currently specify:
Without these details, it is difficult for reviewers to reproduce the proposed workflow or assess the completeness of the implementation plan. Could you consider adding a dedicated "Implementation and Validation Plan" section covering environment setup, test procedures, sample commands, expected outputs, and acceptance criteria?
…ification plan

Major updates to the PyTorch-Plugin-FL v0.1.0 FEP:
- Rename from "CUDA Backend" to "Multi-Backend Operator Dispatch"
- Add Ascend (Huawei) native kernel support alongside CUDA
- Introduce Dispatcher<FnPtr> template-based routing mechanism
- Support three dispatch paths: native CUDA, native Ascend, FlagGems Triton (C++ and Python)
- Define 32 first-phase operators with cross-platform implementations
- Add detailed architecture diagrams and registration flow
- Provide complete testing strategy with per-operator and end-to-end tests (Qwen3-0.6B)
- Document full verification environments for both CUDA (A800) and Ascend (910B) platforms
- Include step-by-step installation, test procedures, and expected outputs
- Add CI/CD integration and regression testing guidelines
Update: expand CUDA dispatch to multi-backend architecture with full verification plan

Major updates to the PyTorch-Plugin-FL v0.1.0 FEP:

Scope: Expands from a CUDA-only prototype to a production-ready multi-backend framework with a comprehensive validation plan.
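
For readers who want a concrete picture of backend-keyed operator routing, here is a minimal Python sketch. It is not the FEP's `Dispatcher<FnPtr>` template, which is a C++ mechanism whose interface is not shown in this thread. The class, the backend labels, and the fall-back-to-FlagGems behavior are all hypothetical assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical backend labels; the FEP's real identifiers may differ.
NATIVE_CUDA = "cuda"
NATIVE_ASCEND = "ascend"
FLAGGEMS_TRITON = "flaggems_triton"


class Dispatcher:
    """Maps (operator name, backend) to an implementation."""

    def __init__(self, fallback: str = FLAGGEMS_TRITON) -> None:
        self._table: Dict[Tuple[str, str], Callable] = {}
        # Assumption: a missing native kernel falls back to this backend.
        self._fallback = fallback

    def register(self, op: str, backend: str) -> Callable[[Callable], Callable]:
        """Decorator that records fn as the implementation of op on backend."""
        def decorator(fn: Callable) -> Callable:
            self._table[(op, backend)] = fn
            return fn
        return decorator

    def resolve(self, op: str, backend: str) -> Callable:
        """Return the native implementation if present, else the fallback."""
        for key in ((op, backend), (op, self._fallback)):
            if key in self._table:
                return self._table[key]
        raise KeyError(f"no implementation registered for {op!r} on {backend!r}")

    def call(self, op: str, backend: str, *args, **kwargs):
        return self.resolve(op, backend)(*args, **kwargs)


dispatcher = Dispatcher()


@dispatcher.register("add", NATIVE_CUDA)
def add_cuda(a, b):
    # Stand-in for a native CUDA kernel; tags the result with its path.
    return ("cuda", a + b)


@dispatcher.register("add", FLAGGEMS_TRITON)
def add_triton(a, b):
    # Stand-in for a FlagGems Triton kernel.
    return ("flaggems_triton", a + b)
```

In this sketch, `dispatcher.call("add", NATIVE_CUDA, 1, 2)` takes the native path, while the same call with `NATIVE_ASCEND` resolves to the Triton stand-in because no Ascend kernel was registered for `add`.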
| GPU | NVIDIA A800-SXM4-80GB |
| Driver | 535.154.05 |
| CUDA Toolkit | 12.8 |
| Conda Env | `pytorch` (Python 3.12.13) |
Could you share the full Docker image pytorch2.11.0_cuda12.8_triton3.6.0_flaggems5.0.2?
docker pull harbor.baai.ac.cn/flagscale/cuda12.8.1-cudnn9.15.1-python3.12-torch2.7.1-train:2512031616
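
Once the image is pulled, one way to confirm that a container matches the documented environment is to compare the versions it reports against the values in the environment table. The sketch below uses only the standard library. The dictionary keys and the `find_mismatches` helper are hypothetical, and the driver and CUDA values would have to be collected separately, since how they are collected is not specified in this thread.

```python
import platform

# Expected values copied from the environment table above.
# The key names are hypothetical labels chosen for this sketch.
EXPECTED = {
    "driver": "535.154.05",
    "cuda_toolkit": "12.8",
    "python": "3.12.13",
}


def find_mismatches(expected, actual):
    """Return {key: (expected, actual)} for each differing or missing key."""
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


# Only the Python version is probed automatically here; the other
# entries must be filled in by whoever runs the check.
observed = {"python": platform.python_version()}
mismatches = find_mismatches(EXPECTED, observed)
for key, (want, got) in sorted(mismatches.items()):
    print(f"{key}: expected {want}, found {got}")
```

An empty result from `find_mismatches` means every listed component matches; any remaining entry names the component to investigate.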