feat: modify rope for llama-3 and support llama-3.2 #131

Merged: rheasukthanker merged 7 commits into main from fix_llama on Oct 10, 2024

Conversation

@rheasukthanker (Collaborator) commented Oct 8, 2024

Reference Issues/PRs

Closes #130 and #129

What does this implement/fix? Explain your changes.

Fixes RoPE for Llama-3 models. Using the whittle API in eval-harness, we can now match the Llama-3.1-8B performance obtained through the Hugging Face API.
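
For background, Llama-3.1/3.2 checkpoints rescale the RoPE inverse frequencies with a piecewise scheme controlled by the config fields factor, low_freq_factor, high_freq_factor, and original_max_position_embeddings. The sketch below illustrates that adjustment under the default Llama-3.1 parameters; the function name is hypothetical and the actual whittle implementation in this PR may be structured differently.

```python
import math

import torch


def llama3_scale_inv_freq(
    inv_freq: torch.Tensor,
    factor: float = 8.0,
    low_freq_factor: float = 1.0,
    high_freq_factor: float = 4.0,
    original_max_position_embeddings: int = 8192,
) -> torch.Tensor:
    """Rescale RoPE inverse frequencies in the Llama-3.1/3.2 style.

    Short wavelengths (high frequencies) are left untouched, long wavelengths
    (low frequencies) are divided by `factor`, and the band in between is
    linearly interpolated between the two regimes.
    """
    low_freq_wavelen = original_max_position_embeddings / low_freq_factor
    high_freq_wavelen = original_max_position_embeddings / high_freq_factor

    wavelen = 2 * math.pi / inv_freq
    # Interpolation weight for the medium-frequency band (0 at the
    # low-frequency boundary, 1 at the high-frequency boundary).
    smooth = (original_max_position_embeddings / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    medium = (1 - smooth) * inv_freq / factor + smooth * inv_freq

    return torch.where(
        wavelen > low_freq_wavelen,  # low frequency: scale down
        inv_freq / factor,
        torch.where(wavelen < high_freq_wavelen, inv_freq, medium),
    )
```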

Minimal Example / How should this PR be tested?

Added a test for Llama 3.2.
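
As a rough illustration of the sort of sanity check such a test can perform (hypothetical; it reuses the llama3_scale_inv_freq sketch above and a rope_theta of 500000 as used by Llama-3 style configs, and is not the test added by this PR):

```python
import math

import torch

head_dim = 64           # e.g. a Llama-3.2-style head dimension (illustrative)
rope_theta = 500_000.0  # RoPE base used by Llama-3 style configs

inv_freq = 1.0 / (rope_theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
scaled = llama3_scale_inv_freq(inv_freq)  # sketch from above

wavelen = 2 * math.pi / inv_freq
high_freq_wavelen = 8192 / 4.0  # original_max_position_embeddings / high_freq_factor
low_freq_wavelen = 8192 / 1.0   # original_max_position_embeddings / low_freq_factor

# High-frequency components should pass through unchanged ...
assert torch.allclose(scaled[wavelen < high_freq_wavelen],
                      inv_freq[wavelen < high_freq_wavelen])
# ... and low-frequency components should be divided by the scaling factor.
assert torch.allclose(scaled[wavelen > low_freq_wavelen],
                      inv_freq[wavelen > low_freq_wavelen] / 8.0)
```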

@rheasukthanker rheasukthanker marked this pull request as draft October 8, 2024 14:26
@rheasukthanker rheasukthanker marked this pull request as ready for review October 8, 2024 14:46
@rheasukthanker rheasukthanker marked this pull request as draft October 8, 2024 14:54
@rheasukthanker rheasukthanker marked this pull request as ready for review October 8, 2024 15:16
Review comment on the following diff hunk:

from whittle.models.gpt.blocks import Block
from whittle.modules.embedding import Embedding
from whittle.modules.layernorm import LayerNorm
from whittle.modules.linear import Linear
from whittle.modules.rmsnorm import RMSNorm


- class GPT(nn.Module):
+ class GPT(torch.nn.Module):
Collaborator:
we import torch.nn above, so we can use nn.Module here

Collaborator Author (rheasukthanker):
Fixed this

@rheasukthanker merged commit 4e54c6e into main on Oct 10, 2024
7 checks passed
@rheasukthanker deleted the fix_llama branch on October 10, 2024 12:10
Successfully merging this pull request may close these issues: Support Meta-llama-3.2