Replies: 3 comments
(venv) root@ip-10-0-1-86:xm_test# cat data/vocab_zh.txt |
/data/DeepSpeech/tools/venv/lib/python3.8/distutils/init.py:4: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
This dataset is far too small: val_loss is 374.868286 at epoch 5. You cannot train an ASR model on it.
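A quick way to check how small a dataset is would be to summarize its manifest. This is only a minimal sketch: it assumes the manifest is a JSON-lines file whose entries carry `duration` (seconds) and `text` fields, which is an assumption about the file format; `summarize_manifest` is a hypothetical helper, not part of DeepSpeech, so adjust the keys to match your files.

```python
import json


def summarize_manifest(lines):
    """Summarize a JSON-lines manifest (hypothetical helper).

    Assumes each non-empty line is a JSON object with a "duration"
    field in seconds and a "text" field holding the transcript.
    """
    utterances = 0
    total_seconds = 0.0
    distinct_chars = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        utterances += 1
        total_seconds += float(entry.get("duration", 0.0))
        distinct_chars.update(entry.get("text", ""))
    return {
        "utterances": utterances,
        "hours": total_seconds / 3600.0,
        "distinct_chars": len(distinct_chars),
    }


if __name__ == "__main__":
    # In-memory sample standing in for a real manifest file.
    sample = [
        '{"duration": 1.5, "text": "ab"}',
        '{"duration": 2.5, "text": "bc"}',
    ]
    print(summarize_manifest(sample))
```

To run it on a real file, pass an open file handle, e.g. `summarize_manifest(open("data/manifest_zh.train", encoding="utf-8"))`. A handful of utterances and a tiny fraction of an hour of audio would confirm that the data cannot support training.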
System: 18.04.1-Ubuntu
paddlepaddle: 2.0.2
DeepSpeech: develop
Python: 3.8.6
/data/DeepSpeech/tools/venv/lib/python3.8/distutils/init.py:4: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/data/DeepSpeech/tools/venv/lib/python3.8/site-packages/pkg_resources/_vendor/pyparsing.py:943: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
collections.MutableMapping.register(ParseResults)
/data/DeepSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
def convert_to_list(value, n, name, dtype=np.int):
----------- Configuration Arguments -----------
checkpoint_path: ckpt-baseline/checkpoints/step-0
config: conf/deepspeech2.yaml
device: cpu
dump_config: None
export_path: None
nprocs: 1
opts: []
output: None
data:
  augmentation_config: conf/augmentation.config
  batch_size: 64
  dev_manifest: data/manifest_zh.dev
  keep_transcription_text: False
  max_duration: 27.0
  max_freq: None
  mean_std_filepath: data/mean_std_zh.npz
  min_duration: 0.0
  n_fft: None
  num_workers: 0
  random_seed: 0
  shuffle_method: batch_shuffle
  sortagrad: True
  specgram_type: linear
  stride_ms: 10.0
  target_dB: -20
  target_sample_rate: 16000
  test_manifest: data/manifest_zh.test
  train_manifest: data/manifest_zh.train
  use_dB_normalization: True
  vocab_filepath: data/vocab_zh.txt
  window_ms: 20.0
decoding:
  alpha: 2.5
  batch_size: 128
  beam_size: 300
  beta: 5.0
  cutoff_prob: 0.99
  cutoff_top_n: 40
  decoding_method: ctc_beam_search
  error_rate_type: cer
  lang_model_path: data/lm/zh_giga.no_cna_cmn.prune01244.klm
  num_proc_bsearch: 10
model:
  num_conv_layers: 2
  num_rnn_layers: 3
  rnn_layer_size: 2048
  share_rnn_weights: False
  use_gru: True
training:
  global_grad_clip: 5.0
  lr: 0.002
  lr_decay: 0.83
  n_epoch: 5
  weight_decay: 1e-06
2021-04-21 02:50:21,344 - INFO - Setup test Dataloader!
2021-04-21 02:50:23,853 - INFO - Setup model!
-----------------ckpt-baseline/checkpoints/step-0.pdparams
2021-04-21 02:50:23,853 - INFO - [checkpoint] Rank 0: loaded model from ckpt-baseline/checkpoints/step-0.pdparams
2021-04-21 02:50:33,831 - INFO - Test Total Examples: 4
2021-04-21 02:50:33,855 - INFO - begin to initialize the external scorer for decoding
2021-04-21 02:50:33,944 - INFO - language model: is_character_based = 1, max_order = 5, dict_size = 0
2021-04-21 02:50:33,944 - INFO - end initializing scorer
/data/DeepSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.9 it will stop working
return (isinstance(seq, collections.Sequence) and
2021-04-21 02:50:45,719 - INFO -
Target Transcription: 猫
Output Transcription: 广州二手房市场二手房市场二手房市场二手房市场
2021-04-21 02:50:45,719 - INFO - Current error rate [cer] = 22.000000
2021-04-21 02:50:45,727 - INFO -
Target Transcription: 由改善型换房人士承购
Output Transcription: 广州二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场
2021-04-21 02:50:45,735 - INFO - Current error rate [cer] = 11.600000
2021-04-21 02:50:45,746 - INFO -
Target Transcription: 七月广州二手住宅市场当中
Output Transcription: 广州二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场
2021-04-21 02:50:45,758 - INFO - Current error rate [cer] = 11.500000
2021-04-21 02:50:45,777 - INFO -
Target Transcription: 六月仍有百分之七单价在三万元的房源
Output Transcription: 广州二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场二手房市场
2021-04-21 02:50:45,796 - INFO - Current error rate [cer] = 9.764706
2021-04-21 02:50:45,796 - INFO - Error rate [cer] (4/?) = 11.050000
2021-04-21 02:50:45,796 - INFO - Test: epoch: 0, step: 0, , Final error rate [cer] (4/4) = 11.050000
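For anyone puzzled by error rates far above 1 in the log: the numbers are consistent with CER computed as character-level edit distance divided by the reference length, so a hypothesis much longer than the reference is charged for every inserted character. The sketch below is a plain Levenshtein implementation written for illustration; `edit_distance` and `cer` are hypothetical names, not the functions DeepSpeech itself uses.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # First utterance from the log above: a 1-character target
    # against a 22-character output.
    target = "猫"
    output = "广州" + "二手房市场" * 4
    print(cer(target, output))  # 22.0
```

The same arithmetic accounts for the final figure if errors are pooled over all reference characters: the per-utterance rates imply 22 + 116 + 138 + 166 = 442 edits over 1 + 10 + 12 + 17 = 40 reference characters, and 442 / 40 = 11.05. The repeated output regardless of input suggests the model has not learned anything useful from the audio.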