While implementing disaggregated prefill, we found an error when loading weights from safetensors files. We have filed a JIRA ticket (HS-3164), as we believe this is a SynapseAI bug.
```python
import torch
import habana_frameworks.torch as htorch
from safetensors import safe_open
from safetensors.torch import save_file

if __name__ == "__main__":
    safetensor_file_path = "tmp.safetensors"
    # create safetensors file
    save_file({"foo": torch.randn(10)}, safetensor_file_path)
    # load safetensors file
    with torch.device("hpu"):
        f = safe_open(safetensor_file_path, framework="pt")
        foo = f.get_tensor("foo")
        # the following line yields an error in HPU lazy mode:
        print(foo)  # RuntimeError: Reshape doesnt support change in number of elements: [40] Size of output: [10]
```
This is a minimal reproducer for the `torch.device` context + `safetensors.safe_open` runtime error, in case you don't have access to the JIRA ticket.
Anything you want to discuss about vllm.
However, we found that the code in vllm-fork currently does the same thing: it loads safetensors files under a `torch.device("hpu")` context, without hitting any errors.
We would be very glad to know what makes this possible; please let us know if we are missing something.
FYI, we are using an IDC node with SynapseAI version 1.17.0.