
How to get all hidden layers' output of pre-trained BERTurk model in HuggingFace Transformers library? #19

katirasole opened this issue Aug 26, 2020 · 2 comments


katirasole commented Aug 26, 2020

Hi Stefan,
I am having trouble getting the output of all hidden layers of BERTurk. I tried the following:

import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("dbmdz/bert-base-turkish-uncased")

# Convert inputs (length 20) to PyTorch tensors
# (indexed_tokens and segments_ids were produced by the tokenizer, not shown here)
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

model.eval()

with torch.no_grad():
    # in transformers the second positional argument is attention_mask,
    # so the segment ids are passed as token_type_ids
    outputs = model(tokens_tensor, token_type_ids=segments_tensors)

The outputs contain two tensors:

print(outputs[0])
print(len(outputs[0][0]))     # 20 entries, one per token of the sentence
print(outputs[0][0][0])       # outputs[0][0][i] is the vector for token i; index 0 is [CLS]
print(len(outputs[0][0][0]))  # 768, the embedding size

I am not sure whether outputs[0] is the final hidden state or not.

And outputs[1] looks like this:
print(outputs[1][0])
print(len(outputs[1][0]))  # 768 entries
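
Printing the tensor shapes directly makes the structure easier to see (a small sketch using the same outputs as above):

print(outputs[0].shape)  # torch.Size([1, 20, 768]) - one 768-dim vector per token
print(outputs[1].shape)  # torch.Size([1, 768])     - a single 768-dim vector for the whole sequence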

I also tried what is described in https://huggingface.co/transformers/model_doc/bert.html#tfbertmodel, but I got an error when I set output_hidden_states = True.

@ozcangundes

Maybe I can help you with this issue. Here is my sample code for using the hidden states of every Transformer layer. You should set the output_hidden_states attribute in the config you pass to the AutoModel.

from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("dbmdz/bert-base-turkish-128k-cased", output_hidden_states=True)
model = AutoModel.from_pretrained("dbmdz/bert-base-turkish-128k-cased", config=config)

with torch.no_grad():
    outputs = model(inputs, attention_mask=masks)
    final_hidden_states = outputs[0]  # last layer, shape (batch, seq_len, 768)
    all_hidden_states = outputs[2]    # tuple: embedding output + one entry per Transformer layer

all_hidden_states is a tuple of length 13 (1 entry for the embedding layer plus 12 for the Transformer layers).
For example, the [CLS] output of the 10th layer is all_hidden_states[-3][:, 0, :].
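
To make the indexing concrete, here is a small sketch (the names layer_10 and cls_last4 are just illustrative, and averaging the last four layers is a common trick rather than something required here):

# all_hidden_states[0]  -> embedding layer output
# all_hidden_states[i]  -> output of Transformer layer i (1 <= i <= 12)
layer_10 = all_hidden_states[10]   # same tensor as all_hidden_states[-3]
cls_layer_10 = layer_10[:, 0, :]   # [CLS] vector of layer 10, shape (batch, 768)

# e.g. average the [CLS] vectors of the last four layers as a sentence representation
cls_last4 = torch.stack([h[:, 0, :] for h in all_hidden_states[-4:]]).mean(dim=0)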

I hope it helps.

@katirasole (Author)

Thank you so much @ozcangundes, I will try it and let you know whether it works for me.
