Integrate TransformerEngine #1098

Needed for FP8 training, and it also adds some nice FP16/BF16 optimizations for Ampere and newer architectures that we can make use of regardless.

https://github.com/EleutherAI/TransformerEngine
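For reference (this sketch is not part of the original issue), the basic usage pattern of TransformerEngine's PyTorch API looks roughly like the upstream TE quickstart below: swap in a TE module and wrap the forward pass in `fp8_autocast`. Module and argument names come from TE's public API, but defaults and exact signatures vary between TE releases.

```python
import torch
import transformer_engine.pytorch as te

# te.Linear is a drop-in replacement for torch.nn.Linear; parameters stay in
# higher precision and the GEMM runs in FP8 inside the autocast region.
layer = te.Linear(768, 3072, bias=True)
inp = torch.randn(2048, 768, device="cuda")

# FP8 GEMMs need Hopper/Ada-class GPUs; on Ampere the same TE modules can be
# used with FP8 disabled to pick up the fused fp16/bf16 kernels.
with te.fp8_autocast(enabled=True):
    out = layer(inp)

# Backward runs outside the autocast context, as in the TE examples.
out.sum().backward()
```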
Comments
There is a fairly mature implementation at https://github.com/NVIDIA/Megatron-LM

As discussed on Discord - if you need some extra dev manpower, I'll happily take this one off your hands.

Thank you!

Hi, I am curious about the state of this effort and don't see a related branch. I read on Discord that FP8 was working but there were struggles with convergence. @Quentin-Anthony IIUC you spent some time on this as well - could you tell me more? :)

Hi - just commenting to say that I'm afraid I got distracted by other projects and didn't make any significant progress on this. Removing myself as assignee, as agreed with Quentin, so that I don't block anyone else from picking it up - something I should have done much sooner.

There are a few things to unpack here. I had a look at the difference between the GPT-NeoX Megatron code and upstream Megatron-LM, which has a mature implementation as @Quentin-Anthony said. Here's a draft PR with some thoughts on the diff: #1185 - let's discuss there :)
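On the convergence struggles mentioned above: in TransformerEngine, FP8 numerics are controlled by the scaling recipe passed to `fp8_autocast`. As a hedged sketch (not from this thread), these are the `DelayedScaling` knobs that are commonly adjusted when FP8 runs diverge; argument names are from TE's public API, and defaults differ across TE versions.

```python
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID keeps activations/weights in E4M3 but gradients in E5M2, whose wider
# dynamic range is usually gentler on convergence than pure E4M3.
fp8_recipe = recipe.DelayedScaling(
    margin=0,                 # extra headroom (in powers of two) on the scaling factor
    fp8_format=recipe.Format.HYBRID,
    amax_history_len=1024,    # longer amax history -> smoother, more conservative scales
    amax_compute_algo="max",
)

# The recipe is then supplied to every FP8 region:
# with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
#     ...
```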