The definition of WHISPER_AHEADS_LARGE_V3_TURBO is missing from the whisper_alignment_heads_preset enum #2462

Open
ppcfan opened this issue Oct 7, 2024 · 4 comments

Comments


ppcfan commented Oct 7, 2024

whisper.h currently defines only the following presets; there is no entry for the large-v3-turbo model:

enum whisper_alignment_heads_preset {
    WHISPER_AHEADS_NONE,
    WHISPER_AHEADS_N_TOP_MOST, // All heads from the N-top-most text-layers
    WHISPER_AHEADS_CUSTOM,
    WHISPER_AHEADS_TINY_EN,
    WHISPER_AHEADS_TINY,
    WHISPER_AHEADS_BASE_EN,
    WHISPER_AHEADS_BASE,
    WHISPER_AHEADS_SMALL_EN,
    WHISPER_AHEADS_SMALL,
    WHISPER_AHEADS_MEDIUM_EN,
    WHISPER_AHEADS_MEDIUM,
    WHISPER_AHEADS_LARGE_V1,
    WHISPER_AHEADS_LARGE_V2,
    WHISPER_AHEADS_LARGE_V3,
};
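
For reference, a minimal sketch of how the enum could be extended; the name WHISPER_AHEADS_LARGE_V3_TURBO follows the existing naming convention but is not yet part of the upstream header:

enum whisper_alignment_heads_preset {
    ...
    WHISPER_AHEADS_LARGE_V3,
    WHISPER_AHEADS_LARGE_V3_TURBO, // proposed addition for the large-v3-turbo model
};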

lithium0003 commented:

The needed values are:

static const whisper_ahead g_aheads_large_v3_turbo[] = { {2, 4}, {2, 11}, {3, 3}, {3, 6}, {3, 11}, {3, 14} };

static const std::map<whisper_alignment_heads_preset, whisper_aheads> g_aheads {
    ...
    { WHISPER_AHEADS_LARGE_V3_TURBO, { 6, g_aheads_large_v3_turbo } },
};
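
For context, the preset is selected when the whisper context is created. A rough usage sketch follows, assuming the dtw_token_timestamps and dtw_aheads_preset fields of whisper_context_params from recent whisper.h; the model path is only a placeholder:

struct whisper_context_params cparams = whisper_context_default_params();
cparams.dtw_token_timestamps = true;                          // enable DTW token-level timestamps
cparams.dtw_aheads_preset    = WHISPER_AHEADS_LARGE_V3_TURBO; // proposed preset
struct whisper_context * ctx =
    whisper_init_from_file_with_params("ggml-large-v3-turbo.bin", cparams);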

ppcfan (Author) commented Oct 7, 2024

@lithium0003 Thank you so much! Could you explain how you got these values?

lithium0003 commented:

The original source (openai/whisper) says:
https://github.com/openai/whisper/blob/25639fc17ddc013d56c594bfbf7644f2185fad84/whisper/__init__.py#L49

    "large-v3-turbo": b"ABzY8j^C+e0{>%RARaKHP%t(lGR*)0g!tONPyhe`",

https://github.com/openai/whisper/blob/25639fc17ddc013d56c594bfbf7644f2185fad84/whisper/model.py#L278

        array = np.frombuffer(
            gzip.decompress(base64.b85decode(dump)), dtype=bool
        ).copy()

So it can be decoded like this:

>>> import gzip, base64
>>> import numpy as np
>>> dump = b"ABzY8j^C+e0{>%RARaKHP%t(lGR*)0g!tONPyhe`"
>>> array = np.frombuffer(gzip.decompress(base64.b85decode(dump)), dtype=bool)
>>> idx = np.where(array)[0]   # flat indices of the active alignment heads
>>> n_text_head = 20           # heads per text layer in this model
>>> idx_pair = np.array(list(zip(idx // n_text_head, idx % n_text_head)))
>>> idx
array([44, 51, 63, 66, 71, 74])
>>> idx_pair
array([[ 2,  4],
       [ 2, 11],
       [ 3,  3],
       [ 3,  6],
       [ 3, 11],
       [ 3, 14]])
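
That is, each flat index encodes a (text_layer, head) pair as index = text_layer * n_text_head + head; for example, 44 maps to (44 // 20, 44 % 20) = (2, 4), which is exactly the first {2, 4} entry in the snippet above.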


chnbr commented Oct 10, 2024

Is it sufficient to put these lines into the code? I mean, how can we be sure that they are actually used?
So far I have been able to load the v3-turbo model without these lines and it worked; the question is what it really did, though. I think the changes should be incorporated into the repository.
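
As far as I can tell, the alignment heads are only consulted for DTW token-level timestamps, so plain loading and transcription work without the preset. One way to check that it is actually applied is sketched below, under the assumption that recent whisper.h exposes an int64_t t_dtw field in whisper_token_data (it should remain -1 when no DTW timestamp was produced); ctx is a context created with dtw_token_timestamps enabled as in the sketch further above:

// after whisper_full(ctx, ...) has run
for (int i = 0; i < whisper_full_n_segments(ctx); ++i) {
    for (int j = 0; j < whisper_full_n_tokens(ctx, i); ++j) {
        const whisper_token_data td = whisper_full_get_token_data(ctx, i, j);
        printf("%s -> t_dtw = %lld\n", whisper_full_get_token_text(ctx, i, j), (long long) td.t_dtw);
    }
}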
