Other playable models-Text2Image #1

Wulx2050 · 2022-06-24T13:26:51Z

playable models

dalle-mini & craiyon
https://github.com/borisdayma/dalle-mini
CogView2
https://github.com/THUDM/CogView2

To be added

No pretrained models

imagen
https://github.com/lucidrains/imagen-pytorch
Wenxin ERNIE-ViLG
https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_vilg/

To be added

HighCWu · 2022-06-24T14:36:17Z

If we have enough time, we will try to migrate them. However, I hope Baidu will officially release an open-source text-to-image model on PaddlePaddle.
I also know of a popular model trained by Tsinghua University, although it is also a PyTorch version.
CogView2: https://github.com/THUDM/CogView2

Wulx2050 · 2022-06-25T00:22:03Z

> If we have enough time, we will try to migrate them. However, I hope Baidu will officially release an open-source text-to-image model on PaddlePaddle. I also know of a popular model trained by Tsinghua University, although it is also a PyTorch version. CogView2: https://github.com/THUDM/CogView2

I just looked it up: Wenxin ERNIE-ViLG's text-to-image ability was evaluated on MS-COCO, a public open-domain dataset. The metric is FID (lower is better). In both the zero-shot and fine-tuned settings, Wenxin ERNIE-ViLG achieved the best results, far ahead of models such as OpenAI's DALL-E. They provide a trial entry point for calling the ERNIE-ViLG API, so maybe you could contact the authors and ask them for the pretrained model?
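For readers unfamiliar with FID, here is a minimal sketch of the underlying Fréchet distance between two Gaussians. It is a simplified illustration, not the evaluation code used in the ERNIE-ViLG paper: it assumes diagonal covariances (each feature dimension treated independently), whereas the standard FID fits full covariance matrices to Inception-v3 features of real and generated images. The function name and the toy feature lists are hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def frechet_distance_diagonal(real_feats, gen_feats):
    """Squared Frechet distance between two Gaussians fitted per dimension.

    Assumption: diagonal covariance, so the distance reduces to
    sum_d (mean gap)^2 + (std gap)^2. Real FID uses full covariances.
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        real_col = [row[d] for row in real_feats]
        gen_col = [row[d] for row in gen_feats]
        # Gap between the fitted means in this dimension.
        mean_gap = fmean(real_col) - fmean(gen_col)
        # Gap between the fitted standard deviations in this dimension.
        std_gap = sqrt(pvariance(real_col)) - sqrt(pvariance(gen_col))
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Toy, made-up feature vectors (one list per image).
real = [[0.0, 1.0], [2.0, 3.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
score = frechet_distance_diagonal(real, generated)
```

Identical feature sets give a distance of zero, and the value grows as the generated statistics drift from the real ones, which is why a lower FID indicates better results.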

Wenxin ERNIE-ViLG
https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_vilg/
paper:
https://arxiv.org/pdf/2112.15283.pdf

Wulx2050 · 2022-06-25T00:43:42Z

Another project with code and models

ERNIE-SAT
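Category: Wenxin cross-modal large model
Applications: speech editing, speech generation, voice cloning, speech-to-speech translation with voice cloning

ERNIE-SAT is pretrained on Chinese and English datasets using joint speech-text training. This lets the model learn the alignment between speech and text, generate more accurate spectrograms, and synthesize higher-quality audio.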

https://wenxin.baidu.com/wenxin/modelbasedetail/ernie_sat/
