RuntimeError: [enforce fail at alloc_cpu.cpp:114] data. DefaultCPUAllocator: not enough memory: you tried to allocate 33507323568 bytes. #21

Open
Godly-GM opened this issue Nov 2, 2024 · 1 comment


Godly-GM commented Nov 2, 2024

I tried to pass the context from a 19-page PDF to the model, but I encountered this error:
RuntimeError: [enforce fail at alloc_cpu.cpp:114] data. DefaultCPUAllocator: not enough memory: you tried to allocate 33507323568 bytes.

Here, input_text is the content of the PDF.
[Screenshot: 2024-11-02 200912]
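
For a sense of scale, the requested 33507323568 bytes is roughly 31.2 GiB. The figure equals 12 × 26421² × 4, so it is consistent with a single float32 attention-score tensor for about 26,421 tokens across 12 attention heads. The head count and dtype are assumptions that happen to fit the number, not values read from the model configuration, and the helper name below is hypothetical. The sketch shows how quickly that term grows with sequence length:

```python
def attention_scores_bytes(seq_len: int, num_heads: int = 12, bytes_per_value: int = 4) -> int:
    """Bytes needed for one (num_heads, seq_len, seq_len) attention-score tensor.

    num_heads=12 and bytes_per_value=4 (float32) are assumptions, not values
    taken from the model's configuration.
    """
    return num_heads * seq_len * seq_len * bytes_per_value


requested = 33507323568  # byte count quoted in the error message

print(f"{requested / 2**30:.1f} GiB requested")
print(attention_scores_bytes(26421) == requested)  # True under the assumptions above
print(f"{attention_scores_bytes(8192) / 2**30:.1f} GiB for 8192 tokens")
```

Because the cost is quadratic in the sequence length, encoding the text in shorter windows reduces peak memory far more than proportionally.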

@guenthermi
Member

It looks like your machine doesn't have enough memory to encode very long sequences of text. You could use the long late chunking method, which is implemented in the _embed_with_overlap method (called as model_outputs = self._embed_with_overlap(model, model_inputs)) in our evaluation code, together with a lower number of tokens per pass (the long_late_chunking_embed_size property used in that function), to circumvent this issue.
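
The idea behind embedding with overlap can be sketched as follows. This is a minimal illustration, not the repository's _embed_with_overlap: the function name, the embed_window callback, and the toy embedder are all hypothetical, and a real model call would replace toy_embed. The token sequence is split into windows of at most embed_size tokens that share overlap tokens with the previous window, and the vectors for the shared tokens are dropped from every window after the first so that each token ends up with exactly one vector.

```python
from typing import Callable, List, Sequence


def embed_with_overlap_sketch(
    token_ids: Sequence[int],
    embed_window: Callable[[Sequence[int]], List[List[float]]],
    embed_size: int,
    overlap: int,
) -> List[List[float]]:
    """Embed a long token sequence window by window (hypothetical helper).

    embed_window stands in for a model call that returns one vector per token.
    """
    if not 0 <= overlap < embed_size:
        raise ValueError("overlap must satisfy 0 <= overlap < embed_size")

    outputs: List[List[float]] = []
    start = 0
    while True:
        end = min(start + embed_size, len(token_ids))
        window_vectors = embed_window(token_ids[start:end])
        # Later windows begin with `overlap` tokens that were already embedded
        # in the previous window; keep those tokens only as context.
        skip = 0 if start == 0 else overlap
        outputs.extend(window_vectors[skip:])
        if end == len(token_ids):
            break
        start = end - overlap
    return outputs


def toy_embed(window: Sequence[int]) -> List[List[float]]:
    # Stand-in for a real model: one 1-dimensional "vector" per token.
    return [[float(t)] for t in window]


vectors = embed_with_overlap_sketch(list(range(10)), toy_embed, embed_size=4, overlap=1)
print(len(vectors))  # 10, one vector per input token
```

Peak memory is then bounded by the window size rather than the full document length, which is why lowering the number of tokens per pass helps. The resulting per-token vectors can be pooled per chunk afterwards, as in regular late chunking.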
