
sample audio infer error _lzma.LZMAError: Corrupt input data #834

Open
cncbec opened this issue Dec 5, 2024 · 2 comments
Labels
documentation Improvements or additions to documentation

Comments


cncbec commented Dec 5, 2024

My code:

from tools.audio import load_audio
from tools.normalizer import normalizer_en_nemo_text, normalizer_zh_tn

import ChatTTS
import scipy
import numpy as np
import torch
from typing import Optional

chat = ChatTTS.Chat()
is_load = chat.load(source='local')

# Define the function
def on_upload_sample(sample_audio_input: Optional[str]) -> str:
    if sample_audio_input is None:
        return ""
    sample_audio = torch.tensor(load_audio(sample_audio_input, 24000)).to('cuda')
    spk_smp = chat.sample_audio_speaker(sample_audio)
    del sample_audio
    return spk_smp
    
spk = on_upload_sample(r"./origin.WAV")
# spk = torch.load('青年女生.pth')

reftext = chat.infer("嗨怪怪星,这是我刚配的隐形眼镜,终于可以摆脱沉重的眼镜,以后,以后再也不会有人。", refine_text_only=True)
print(reftext[0])
params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_emb=spk,         # add sampled speaker
    txt_smp=reftext[0],
    temperature=0.3,     # custom temperature
    top_P=0.7,           # top-P decoding
    top_K=20,            # top-K decoding
)

params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_1][laugh_2][break_6]',
)

tmp_text = "频泛化功能是指数字人能够自动适应多种视频内容,从中提取20元的关键信息并进行处理的能力"  # Arabic numerals converted to Chinese
wavs = chat.infer(tmp_text, params_refine_text=params_refine_text, params_infer_code=params_infer_code, use_decoder=True)
scipy.io.wavfile.write(filename="./4.wav", rate=24000, data=wavs[0].T)

but the output is:

嗨 [uv_break] 怪 怪 星 [uv_break] , 这 是 我 刚 配 的 隐 形 眼 镜 [uv_break] , 终 于 可 以 摆 脱 沉 重 的 眼 镜 [uv_break] , 以 后 , 以  
后 再 也 不 会 有 人 [uv_break] 。
found invalid characters: {'2', '0'}
text:  12%|████████████▏                                                                                    | 48/384(max) [00:01, 44.37it/s]
Traceback (most recent call last):
  File "D:\python_project\chattts\ChatTTS\clone.py", line 50, in <module>
    wavs = chat.infer(tmp_text, params_refine_text=params_refine_text, params_infer_code=params_infer_code, use_decoder=True)
  File "D:\python_project\chattts\ChatTTS\ChatTTS\core.py", line 220, in infer
    return next(res_gen)
  File "D:\python_project\chattts\ChatTTS\ChatTTS\core.py", line 390, in _infer
    for result in self._infer_code(
  File "F:\anaconda\envs\chattts\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\python_project\chattts\ChatTTS\ChatTTS\core.py", line 551, in _infer_code
    self.speaker.apply(
  File "F:\anaconda\envs\chattts\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "D:\python_project\chattts\ChatTTS\ChatTTS\model\speaker.py", line 32, in apply
    spk_emb_tensor = torch.from_numpy(self._decode(spk_emb))
  File "D:\python_project\chattts\ChatTTS\ChatTTS\model\speaker.py", line 148, in _decode
    lzma.decompress(
  File "F:\anaconda\envs\chattts\lib\lzma.py", line 342, in decompress
    res = decomp.decompress(data)
_lzma.LZMAError: Corrupt input data
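The last frames of the traceback show that `speaker.apply` hands the `spk_emb` string to `lzma.decompress` inside `speaker._decode`. Any string that is not a valid LZMA stream (for example, the sample string returned by `chat.sample_audio_speaker`, as in the code above) triggers an `LZMAError` at that point. A minimal, ChatTTS-independent sketch of the mechanism:

```python
import lzma

# Feeding lzma.decompress anything that is not valid LZMA data raises
# LZMAError, just like the final frame of the traceback above (the exact
# message depends on where decoding fails).
def try_decompress(data: bytes) -> str:
    try:
        lzma.decompress(data)
        return "ok"
    except lzma.LZMAError as exc:
        return f"LZMAError: {exc}"

print(try_decompress(b"definitely not an lzma stream"))
```

This is why the crash surfaces deep inside `speaker.py` rather than at the call site: the wrong value is only decoded when inference starts.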
Member

fumiama commented Dec 5, 2024

The argument is passed incorrectly; do not pass it to spk_emb.
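A sketch of the corrected wiring, assuming the ChatTTS API used in the snippet above: the string returned by `chat.sample_audio_speaker()` is a compressed sample, which belongs in `spk_smp` (paired with `txt_smp`), while `spk_emb` expects a speaker embedding such as the one from `chat.sample_random_speaker()`. The helper name below is hypothetical:

```python
# Hypothetical helper showing the corrected parameter wiring.
# `chattts_module` stands in for the imported ChatTTS package.
def build_infer_params(chattts_module, spk_smp: str, txt_smp: str):
    return chattts_module.Chat.InferCodeParams(
        spk_smp=spk_smp,   # sampled-speaker string, NOT spk_emb
        txt_smp=txt_smp,   # transcript of the sample audio
        temperature=0.3,
        top_P=0.7,
        top_K=20,
    )
```

With ChatTTS installed, the call in the issue would become `params_infer_code = build_infer_params(ChatTTS, spk, reftext[0])`.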

@fumiama fumiama added the documentation Improvements or additions to documentation label Dec 5, 2024
Author

cncbec commented Dec 5, 2024

Thank you.
