Incorrect timestamps #2271
Comments
Fixes ggerganov#2271
- Adds consecutive timestamps after the end of the last segment as the new starting timestamp
- Adds these timestamps to the output when "print-special" is enabled
- Fixes fflush usage in live reporting
I was not able to test this with the special "token_timestamps" option.
@thewh1teagle How did you generate the |
@SimpleVictor |
I found another case of wrong timestamps when word timestamps are enabled.
sam_altman.mp4
Open the details and search for transcript.json
[
{
"start": 0,
"stop": 19,
"text": ""
},
{
"start": 19,
"stop": 34,
"text": " What"
},
{
"start": 34,
"stop": 48,
"text": " do"
},
{
"start": 48,
"stop": 72,
"text": " you"
},
{
"start": 72,
"stop": 112,
"text": " think"
},
{
"start": 112,
"stop": 151,
"text": " about"
},
{
"start": 151,
"stop": 191,
"text": " like"
},
{
"start": 191,
"stop": 216,
"text": " when"
},
{
"start": 216,
"stop": 248,
"text": " Elon"
},
{
"start": 248,
"stop": 272,
"text": " was"
},
{
"start": 272,
"stop": 336,
"text": " causing"
},
{
"start": 336,
"stop": 384,
"text": " calling"
},
{
"start": 384,
"stop": 408,
"text": " for"
},
{
"start": 408,
"stop": 416,
"text": " a"
},
{
"start": 416,
"stop": 456,
"text": " pause"
},
{
"start": 456,
"stop": 474,
"text": " on"
},
{
"start": 474,
"stop": 494,
"text": " AI"
},
{
"start": 764,
"stop": 514,
"text": " He"
},
{
"start": 514,
"stop": 530,
"text": " was"
},
{
"start": 530,
"stop": 587,
"text": " like"
},
{
"start": 587,
"stop": 670,
"text": " starting"
},
{
"start": 670,
"stop": 711,
"text": " then"
},
{
"start": 711,
"stop": 721,
"text": " a"
},
{
"start": 721,
"stop": 762,
"text": " GI"
},
{
"start": 762,
"stop": 815,
"text": " company"
},
{
"start": 815,
"stop": 867,
"text": " while"
},
{
"start": 867,
"stop": 888,
"text": " he"
},
{
"start": 888,
"stop": 919,
"text": " was"
},
{
"start": 919,
"stop": 971,
"text": " doing"
},
{
"start": 971,
"stop": 1018,
"text": " that"
},
{
"start": 1104,
"stop": 1257,
"text": " Yeah,"
},
{
"start": 1257,
"stop": 1272,
"text": " so"
},
{
"start": 1272,
"stop": 1310,
"text": " didn't"
},
{
"start": 1310,
"stop": 1323,
"text": " he"
},
{
"start": 1323,
"stop": 1357,
"text": " start"
},
{
"start": 1357,
"stop": 1367,
"text": " it"
},
{
"start": 1367,
"stop": 1414,
"text": " like"
},
{
"start": 1414,
"stop": 1431,
"text": " after"
},
{
"start": 1431,
"stop": 1446,
"text": " he"
},
{
"start": 1446,
"stop": 1464,
"text": " was"
},
{
"start": 1464,
"stop": 1512,
"text": " calling"
},
{
"start": 1512,
"stop": 1532,
"text": " for"
},
{
"start": 1532,
"stop": 1553,
"text": " the"
},
{
"start": 1553,
"stop": 1605,
"text": " pause."
},
{
"start": 1605,
"stop": 1628,
"text": " I"
},
{
"start": 1694,
"stop": 1658,
"text": " Think"
},
{
"start": 1658,
"stop": 1694,
"text": " before"
},
{
"start": 1694,
"stop": 1712,
"text": " but"
},
{
"start": 1712,
"stop": 1718,
"text": " I"
},
{
"start": 1718,
"stop": 1748,
"text": " don't"
},
{
"start": 1748,
"stop": 1772,
"text": " know"
},
{
"start": 1772,
"stop": 1784,
"text": " in"
},
{
"start": 1784,
"stop": 1803,
"text": " any"
},
{
"start": 1803,
"stop": 1832,
"text": " cases"
},
{
"start": 1832,
"stop": 1850,
"text": " one"
},
{
"start": 1850,
"stop": 1866,
"text": " of"
},
{
"start": 1866,
"stop": 1892,
"text": " those"
},
{
"start": 1892,
"stop": 1910,
"text": " you"
},
{
"start": 1910,
"stop": 1940,
"text": " can't"
},
{
"start": 1940,
"stop": 1964,
"text": " beat"
},
{
"start": 1964,
"stop": 1981,
"text": " him"
},
{
"start": 1981,
"stop": 2006,
"text": " join"
},
{
"start": 2006,
"stop": 2030,
"text": " them"
},
{
"start": 2030,
"stop": 2084,
"text": " things."
},
{
"start": 2084,
"stop": 2108,
"text": " Um,"
},
{
"start": 2108,
"stop": 2126,
"text": " I"
},
{
"start": 2410,
"stop": 2185,
"text": " Think"
},
{
"start": 2185,
"stop": 2220,
"text": " the"
},
{
"start": 2220,
"stop": 2315,
"text": " instinct"
},
{
"start": 2315,
"stop": 2338,
"text": " of"
},
{
"start": 2338,
"stop": 2430,
"text": " saying"
},
{
"start": 2430,
"stop": 2461,
"text": " like"
},
{
"start": 2461,
"stop": 2518,
"text": " we've"
},
{
"start": 2518,
"stop": 2585,
"text": " really"
},
{
"start": 2585,
"stop": 2620,
"text": " got"
},
{
"start": 2620,
"stop": 2643,
"text": " to"
},
{
"start": 2643,
"stop": 2714,
"text": " figure"
},
{
"start": 2714,
"stop": 2756,
"text": " out"
},
{
"start": 2756,
"stop": 2784,
"text": " how"
},
{
"start": 2784,
"stop": 2820,
"text": " to"
},
{
"start": 2840,
"stop": 2872,
"text": " Make"
},
{
"start": 2872,
"stop": 2920,
"text": " this"
},
{
"start": 2920,
"stop": 2970,
"text": " safe"
},
{
"start": 2970,
"stop": 3008,
"text": " and"
},
{
"start": 3008,
"stop": 3058,
"text": " good"
},
{
"start": 3058,
"stop": 3096,
"text": " and"
},
{
"start": 3096,
"stop": 3164,
"text": " like"
},
{
"start": 3164,
"stop": 3222,
"text": " widely"
},
{
"start": 3222,
"stop": 3272,
"text": " good"
},
{
"start": 3272,
"stop": 3306,
"text": " is"
},
{
"start": 3306,
"stop": 3454,
"text": " really"
},
{
"start": 3454,
"stop": 3486,
"text": " important"
},
{
"start": 3486,
"stop": 3524,
"text": " but"
},
{
"start": 3524,
"stop": 3535,
"text": " I"
},
{
"start": 3535,
"stop": 3606,
"text": " think"
},
{
"start": 3816,
"stop": 3845,
"text": " Calling"
},
{
"start": 3845,
"stop": 3977,
"text": " for"
},
{
"start": 3977,
"stop": 4016,
"text": " a"
},
{
"start": 4108,
"stop": 4078,
"text": " Pause"
},
{
"start": 4078,
"stop": 4091,
"text": " is"
},
{
"start": 4091,
"stop": 4133,
"text": " like"
},
{
"start": 4133,
"stop": 4188,
"text": " naive"
},
{
"start": 4188,
"stop": 4209,
"text": " it"
},
{
"start": 4209,
"stop": 4230,
"text": " at"
},
{
"start": 4230,
"stop": 4273,
"text": " best"
},
{
"start": 4273,
"stop": 4337,
"text": " for"
},
{
"start": 4337,
"stop": 4337,
"text": " the"
},
{
"start": 4337,
"stop": 4402,
"text": " latest"
},
{
"start": 4402,
"stop": 4446,
"text": " tech"
},
{
"start": 4446,
"stop": 4543,
"text": " insights"
},
{
"start": 4543,
"stop": 4586,
"text": " visit"
},
{
"start": 4586,
"stop": 4693,
"text": " em"
},
{
"start": 4693,
"stop": 4704,
"text": " 360"
},
{
"start": 4704,
"stop": 4750,
"text": " tech"
},
{
"start": 4750,
"stop": 4800,
"text": " calm"
},
{
"start": 4800,
"stop": 4800,
"text": ""
},
{
"start": 4800,
"stop": 4843,
"text": " visit"
},
{
"start": 4843,
"stop": 5054,
"text": " EM360tech.com."
},
{
"start": 5054,
"stop": 5054,
"text": ""
},
{
"start": 5054,
"stop": 6054,
"text": " [BLANK_AUDIO]"
}
]
Is there a way we can 'tell' whisper the segments instead of letting it do the segmentation itself? The diarization is actually pretty simple, and once I find an approach to use it along with whisper.cpp I can add it to whisper.cpp or implement it in Rust. https://github.com/thewh1teagle/ort-diarize/blob/main/main.py
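Several entries in the JSON above have a "start" that is greater than their "stop" (for example " He", with start 764 and stop 514). A minimal Python sketch for locating such entries mechanically could look like the following. The function name `find_timestamp_anomalies` is made up for illustration and is not part of whisper.cpp; the sample holds three consecutive entries copied from the transcript above.

```python
import json


def find_timestamp_anomalies(words):
    """Return (index, reason) pairs for entries that break monotonic ordering."""
    anomalies = []
    prev_stop = None
    for i, w in enumerate(words):
        if w["start"] > w["stop"]:
            # The word appears to end before it begins.
            anomalies.append((i, "start after stop"))
        elif prev_stop is not None and w["start"] < prev_stop:
            # The word begins before the previous word has ended.
            anomalies.append((i, "starts before previous stop"))
        prev_stop = w["stop"]
    return anomalies


# Three consecutive entries copied from the transcript above.
sample = json.loads("""
[
  {"start": 474, "stop": 494, "text": " AI"},
  {"start": 764, "stop": 514, "text": " He"},
  {"start": 514, "stop": 530, "text": " was"}
]
""")

for index, reason in find_timestamp_anomalies(sample):
    print(index, repr(sample[index]["text"]), reason)
```

To check a whole file, load it with `json.load` and pass the resulting list to the same function.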
When transcribing the following file, the timestamps are incorrect.
As you can see, the start timestamp of the second segment is the same as the end timestamp of the previous one, although there is a gap of a few seconds between them.
never.give.you.up.mp4
transcript.srt
transcript.json
word_timestamps.json
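One rough way to find candidate segments to compare against the audio is to list every SRT cue whose start time equals the previous cue's end time. The sketch below assumes the usual SubRip timing line (`HH:MM:SS,mmm --> HH:MM:SS,mmm`). The helper names and the sample cues are made up for illustration and are not taken from the attached transcript.srt. A back-to-back pair is not necessarily wrong; it only marks a place worth checking by ear.

```python
import re

# Matches a SubRip timing line such as "00:00:04,500 --> 00:00:09,000".
TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(h, m, s, ms):
    """Convert hour, minute, second, millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def cue_times(srt_text):
    """Return (start_ms, end_ms) for every cue timing line in an SRT document."""
    times = []
    for match in TIME.finditer(srt_text):
        g = match.groups()
        times.append((to_ms(*g[:4]), to_ms(*g[4:])))
    return times


def back_to_back(times):
    """Indices of cues whose start equals the previous cue's end."""
    return [i for i in range(1, len(times)) if times[i][0] == times[i - 1][1]]


# Made-up cues for illustration only; not the contents of transcript.srt.
sample = """1
00:00:00,000 --> 00:00:04,500
first line

2
00:00:04,500 --> 00:00:09,000
second line

3
00:00:12,000 --> 00:00:15,000
third line
"""

times = cue_times(sample)
print(back_to_back(times))
```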