fep(sig-framework): Add megatron-lm-fl/te-fl v0.2.0 new features#26
Open
zhaoyinglia wants to merge 10 commits into
This was referenced May 29, 2026
JosephNew
reviewed
Jun 4, 2026
JosephNew
left a comment
Contributor
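Image issue
In the Megatron-LM-FL Test Plan, the Image Acquisition section says:
Base image: FlagOS 2.1 training image (CUDA variant or MetaX variant as applicable).
Source: Internal container registry or `docker pull` from FlagOS CI.
This is not a concrete, pullable image name, so the SVT test team cannot reproduce the environment from it. By contrast, the TE-FL section of the same FEP specifies nvcr.io/nvidia/pytorch:24.07-py3, which can be pulled directly.
Please add a specific image address (e.g. nvcr.io/nvidia/pytorch:24.07-py3, or a specific Harbor path + tag); otherwise the test environment for this FEP is not reproducible.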
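To illustrate the distinction drawn in this comment, here is a minimal sketch of a check that separates a concrete image reference (registry/repository plus a tag or digest) from a free-text description. The helper name `is_concrete_image_ref` and the regular expression are assumptions made for illustration only; they are not part of the FEP or of any FlagOS tooling, and the pattern is a rough approximation of image reference syntax rather than a full parser.

```python
import re

# Hypothetical helper, not part of the FEP: a rough check that a string looks
# like "[registry[:port]/]repository:tag" or "...repository@sha256:<digest>".
_IMAGE_REF = re.compile(
    r"^(?:[a-z0-9.-]+(?::\d+)?/)?"                    # optional registry host[:port]/
    r"[a-z0-9._/-]+"                                  # repository path
    r"(?::\w[\w.-]{0,127}|@sha256:[a-f0-9]{64})$"     # required tag or digest
)


def is_concrete_image_ref(text: str) -> bool:
    """Return True if `text` looks like a pullable image reference with a tag or digest."""
    return bool(_IMAGE_REF.match(text.strip()))


if __name__ == "__main__":
    candidates = [
        "nvcr.io/nvidia/pytorch:24.07-py3",
        "FlagOS 2.1 training image (CUDA variant or MetaX variant as applicable)",
    ]
    for candidate in candidates:
        print(candidate, "->", is_concrete_image_ref(candidate))
```

Under these assumptions, the TE-FL image string passes the check while the Megatron-LM-FL description does not, which is the gap the comment asks the author to close.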
zckzck
reviewed
Jun 4, 2026
|--------|-------------|-----------------|
| All features | `python transformer_engine/plugin/tests/run_all_tests.py` | All tests pass |

### Performance Verification
Please provide relevant commands and operation procedures for performance testing.
zckzck
reviewed
Jun 9, 2026
| Platform | Base Image | Source |
|----------|-----------|--------|
| CUDA (NVIDIA) | `nvcr.io/nvidia/pytorch:24.07-py3` | NVIDIA NGC |
| MetaX MACA | FlagCICD MetaX runner (pre-configured) | FlagCICD platform |
Please provide the specific link for the MetaX image.
Author
Added installation command for flash-attn package.
Added installation of wandb and tensorboard to the setup instructions.
Added instructions for METAX setup and clarified the process for CUDA and METAX.
No description provided.