-
Hello, I am currently exploring your project and I can't find the version of Whisper used for the following model: whisper-vq-stoks-v2.model. Thank you for your time and effort in developing this project.
-
Hey, sorry for not replying earlier, but I did not remember and had to check myself. This model was based on Whisper base.en. It has 512 semantic codes and downsamples the Whisper encoder by 2 (the encoder runs at 50 frames per second, so that gives 25 tokens/s). If you want to use the semantic tokens to build something, I would strongly recommend whisper-vq-stoks-medium-en+pl.model instead. That one is based on Whisper medium and has the same token parameters (512 codes, 25 toks/s), but it was a lot better in every way I tested it.

Btw, you can torch.load all my .model files to check their configuration:

> torch.load('../../hub/whisper-vq-stoks-medium-en+pl.model')
{'config': {'codebook_dim': 64,
  'vq_codes': 512,
  'q_depth': 1,
  'n_head': 16,
  'head_width': 64,
  'ffn_mult': 4,
  'depth': 1,
  'use_cosine_sim': True,
  'downsample': 2,
  'whisper_model_name': 'medium'},
 ...
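For reference, a minimal sketch of that inspection step (the local path and the explicit torch.load keyword arguments here are illustrative assumptions, not part of the original output):

import torch

# Path to wherever you downloaded the .model file; adjust as needed.
ckpt_path = "whisper-vq-stoks-medium-en+pl.model"

# The checkpoint is read with plain torch.load. On newer PyTorch releases you may
# need weights_only=False, since the file can contain more than bare tensors.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

cfg = ckpt["config"]
print(cfg["whisper_model_name"])  # which Whisper checkpoint the VQ model was trained on, e.g. 'medium'
print(cfg["vq_codes"])            # number of semantic codes (512)
print(cfg["downsample"])          # encoder downsampling factor (2 -> 25 tokens/s)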