I am trying to extract the hidden-layer outputs from all of the layers in the model. According to the documentation, `output_all_encoded_layers` is a "boolean which controls the content of the `encoded_layers` output as described below. Default: `True`." However, line 586 (https://huggingface.co/zhihan1996/DNABERT-2-117M/blob/main/bert_layers.py#L586) sets it to `False`, which is consistent with what I observed (only the last layer is returned in the output) but contradicts the documented default. When I set it to `True`, inference fails. The traceback is as follows:
the inference fails. The traceback is as follows:RuntimeError Traceback (most recent call last)
Cell In[60], line 1
----> 1 output = model(**b, output_all_encoded_layers=True)
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /lustre/scratch124/casm/team113/users/pg20/data/supporting/huggingface_models/modules/transformers_modules/zhihan1996/DNABERT-2-117M/81ac6a98387cf94bc283553260f3fa6b88cef2fa/bert_layers.py:616, in BertModel.forward(self, input_ids, token_type_ids, attention_mask, position_ids, output_all_encoded_layers, masked_tokens_mask, **kwargs)
614 if masked_tokens_mask is None:
615 sequence_output = encoder_outputs[-1]
--> 616 pooled_output = self.pooler(
617 sequence_output) if self.pooler is not None else None
618 else:
619 # TD [2022-03-01]: the indexing here is very tricky.
620 attention_mask_bool = attention_mask.bool()
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /lustre/scratch124/casm/team113/users/pg20/data/supporting/huggingface_models/modules/transformers_modules/zhihan1996/DNABERT-2-117M/81ac6a98387cf94bc283553260f3fa6b88cef2fa/bert_layers.py:501, in BertPooler.forward(self, hidden_states, pool)
495 def forward(self,
496 hidden_states: torch.Tensor,
497 pool: Optional[bool] = True) -> torch.Tensor:
498 # We "pool" the model by simply taking the hidden state corresponding
499 # to the first token.
500 first_token_tensor = hidden_states[:, 0] if pool else hidden_states
--> 501 pooled_output = self.dense(first_token_tensor)
502 pooled_output = self.activation(pooled_output)
503 return pooled_output
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1518, in Module._wrapped_call_impl(self, *args, **kwargs)
1516 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1517 else:
-> 1518 return self._call_impl(*args, **kwargs)
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/module.py:1527, in Module._call_impl(self, *args, **kwargs)
1522 # If we don't have any hooks, we want to skip the rest of the logic in
1523 # this function, and just call forward.
1524 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1525 or _global_backward_pre_hooks or _global_backward_hooks
1526 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527 return forward_call(*args, **kwargs)
1529 try:
1530 result = None
File /lustre/scratch124/casm/team113/users/pg20/venvs/huggingface/lib/python3.10/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 768x768)
```
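For what it's worth, the shapes in the error look consistent with the pooler receiving a 2-D hidden-state tensor of shape `(num_tokens, hidden)` rather than the padded 3-D `(batch, seq_len, hidden)`: indexing `[:, 0]` then selects a length-5 vector instead of the first token's embedding. A minimal sketch of that reading (the shapes are inferred from the error message, not taken from the DNABERT-2 source):

```python
import torch

# Hypothetical reconstruction of the failing shapes. If the encoder hands the
# pooler a 2-D tensor (num_tokens, hidden) instead of (batch, seq_len, hidden),
# then hidden_states[:, 0] yields the first *feature* of each token -- a
# length-5 vector -- which cannot be multiplied with the (768, 768) weight.
hidden_states = torch.randn(5, 768)   # 5 tokens for the tokenized 'ATCG' input
first_token = hidden_states[:, 0]     # shape (5,), not (1, 768) as intended
dense = torch.nn.Linear(768, 768)
dense(first_token)                    # RuntimeError: mat1 and mat2 shapes
                                      # cannot be multiplied (1x5 and 768x768)
```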
Steps to reproduce the error:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)

b = tokenizer('ATCG', return_tensors='pt', return_attention_mask=True)
output = model(**b, output_all_encoded_layers=True)
```
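In case it helps, here is a possible workaround sketch: capture the per-layer outputs with forward hooks instead of passing `output_all_encoded_layers=True`, so the pooler only ever sees the default final-layer output. The attribute path `model.encoder.layer` is an assumption taken from standard BERT-style implementations and should be verified against `bert_layers.py`; with this model's unpadding logic the hooked tensors may also come back in unpadded form.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)

# Collect each encoder layer's output as it is computed, leaving
# output_all_encoded_layers at its default of False so the pooler still
# receives the re-padded final-layer tensor.
captured = []

def save_hidden(module, inputs, output):
    # A layer may return a tensor or a tuple whose first element is the
    # hidden states; keep whichever applies.
    captured.append(output[0] if isinstance(output, tuple) else output)

# ASSUMPTION: the encoder layers are exposed as model.encoder.layer, as in
# standard BERT-style code; check this against bert_layers.py.
handles = [layer.register_forward_hook(save_hidden) for layer in model.encoder.layer]

b = tokenizer('ATCG', return_tensors='pt', return_attention_mask=True)
with torch.no_grad():
    output = model(**b)

for h in handles:
    h.remove()

# NOTE: the hooked tensors may be in unpadded (total_tokens, hidden) form
# rather than (batch, seq_len, hidden), depending on the model's internals.
print(len(captured), captured[0].shape)
```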
P.S. I am not using Triton, since it was failing in another step.