How can I get the data? #27

Open
YeJinPaark opened this issue Aug 24, 2023 · 9 comments

Comments

@YeJinPaark

First, I downloaded 'Transformer-en' and renamed it to './model/syngec/english_transformer_baseline.pt'.
Then I downloaded the preprocessed data.

Next, I ran './pipeline_gopar.sh', but it failed with the following errors:

Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in <module>
input_sentences = load(sys.argv[1])
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load
with open(filename, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Loading resources...
Processing parallel files...
Traceback (most recent call last):
File "/opt/conda/bin/errant_parallel", line 8, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main
in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in <listcomp>
in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in <module>
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict'
/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects --local-rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 30676) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in <module>
main()
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main
launch(args)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch
run(args)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures:
[1]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 30677)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 30678)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 30679)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 30680)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 30681)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 6 (local_rank: 6)
exitcode : 1 (pid: 30682)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 7 (local_rank: 7)
exitcode : 1 (pid: 30683)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-08-24_08:10:36
host : 309e7fc0781e
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 30676)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'

How can I fix it? Please help me.
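
Separately from the missing data files, the repeated `No module named supar.cmds.biaffine_dep` lines in the log above mean that the interpreter started by the launcher (`/opt/conda/bin/python`) could not find that module. A minimal sketch for checking this from the same interpreter, using only the standard library; the helper name `module_available` is made up for illustration and is not part of SynGEC or supar:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by this Python interpreter."""
    try:
        # For dotted names, find_spec imports the parent package first,
        # so a missing parent raises ModuleNotFoundError (an ImportError).
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# The module name is the one reported in the log above.
print(module_available("supar.cmds.biaffine_dep"))
```

If this prints `False`, the package is not visible to that interpreter from the directory where the check was run; where it should come from is not stated in this thread.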

@HillZhang1999
Owner

I notice that the file paths in your error message look strange, for example '/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py'.
Please change into the directory that contains the bash script and then re-run it.
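
The working directory matters here because the scripts use relative paths such as `../../data/...`, which are resolved against the directory the command is run from. A small sketch of that resolution, assuming (from the traceback paths) that the scripts live in `bash/english_exp`; the helper name `resolve_relative` is hypothetical, and the function only normalizes the path string without touching the file system or following symlinks:

```python
import posixpath


def resolve_relative(cwd: str, relative: str) -> str:
    """Return the path that `relative` refers to when the process runs in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, relative))


repo = "/mnt/ssd_mnt/pyj/SynGEC"  # repository root taken from the traceback
rel = "../../data/wi_locness_train/tgt.txt"  # path from the error message

# Run from bash/english_exp: the path lands inside the repository's data directory.
print(resolve_relative(repo + "/bash/english_exp", rel))
# Run from the repository root: the same relative path points outside the repository.
print(resolve_relative(repo, rel))
```

Only the first call yields a path under `SynGEC/data`, which is where the preprocessed files are expected.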

@YeJinPaark
Author

OK, I'll try that soon.

I also wonder how I can get the data files that errors like this one refer to:

FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'

@HillZhang1999
Owner

You should download the preprocessed data, unzip them, and put them into https://github.com/HillZhang1999/SynGEC/tree/main/data
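
After unzipping, it may help to confirm that the files actually ended up where the scripts look for them before re-running the pipeline. A minimal sketch, assuming the layout implied by the error message (`data/wi_locness_train/tgt.txt`); the `EXPECTED` list and the helper name `missing_files` are illustrative only and are not the full set of files the scripts need:

```python
from pathlib import Path

# Paths relative to the data directory, copied from the FileNotFoundError above.
# Extend this list with any other files named in your own error messages.
EXPECTED = [
    "wi_locness_train/tgt.txt",
]


def missing_files(data_dir, expected=EXPECTED):
    """Return the entries of `expected` that are not regular files under `data_dir`."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling `missing_files("../../data")` from the script directory returns an empty list once every listed file is in place. An archive that unpacks into an extra top-level folder would show up here as every file being reported missing.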

@YeJinPaark
Author

Is that preprocessed data the same as the data at this link?
https://drive.google.com/file/d/1dIDfYhELrh3BEKgGpsPYAy5ehcobmMov/view

I downloaded that data and unzipped it into ./data/,
but I got errors like these:

Apply BPE...
./preprocess_syngec_transformer.sh: line 22: ../../data/clang8_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 23: ../../data/clang8_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 24: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 25: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/clang8_train/src.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/clang8_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/clang8_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/clang8_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar]
[--log-interval LOG_INTERVAL]
[--log-format LOG_FORMAT]
[--tensorboard-logdir TENSORBOARD_LOGDIR]
[--seed SEED] [--cpu] [--tpu] [--bf16]
[--memory-efficient-bf16] [--fp16]
[--memory-efficient-fp16]
[--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale MIN_LOSS_SCALE]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR]
[--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE]
[--model-parallel-size MODEL_PARALLEL_SIZE]
[--checkpoint-suffix CHECKPOINT_SUFFIX]
[--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
[--quantization-config-path QUANTIZATION_CONFIG_PATH]
[--profile]
[--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}]
[--tokenizer {space,moses,nltk}]
[--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}]
[--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}]
[--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}]
[--scoring {wer,sacrebleu,bleu,chrf}]
[--task TASK] [-s SRC] [-t TARGET]
[--source-lang-with-nt SRC] [--trainpref FP]
[--validpref FP] [--testpref FP]
[--align-suffix FP] [--conll-suffix FP [FP ...]]
[--dpd-suffix FP [FP ...]]
[--probs-suffix FP [FP ...]] [--swm-suffix FP]
[--destdir DIR] [--thresholdtgt N]
[--thresholdsrc N] [--tgtdict FP] [--srcdict FP]
[--labeldict FP [FP ...]] [--nwordstgt N]
[--nwordssrc N] [--alignfile ALIGN]
[--dataset-impl FORMAT] [--joined-dictionary]
[--only-source] [--padding-factor N]
[--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 112: ../../data/error_coded_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 113: ../../data/error_coded_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 114: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 115: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/error_coded_train/src.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/error_coded_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/error_coded_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/error_coded_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar]
[--log-interval LOG_INTERVAL]
[--log-format LOG_FORMAT]
[--tensorboard-logdir TENSORBOARD_LOGDIR]
[--seed SEED] [--cpu] [--tpu] [--bf16]
[--memory-efficient-bf16] [--fp16]
[--memory-efficient-fp16]
[--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale MIN_LOSS_SCALE]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR]
[--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE]
[--model-parallel-size MODEL_PARALLEL_SIZE]
[--checkpoint-suffix CHECKPOINT_SUFFIX]
[--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
[--quantization-config-path QUANTIZATION_CONFIG_PATH]
[--profile]
[--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}]
[--tokenizer {space,moses,nltk}]
[--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}]
[--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}]
[--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}]
[--scoring {wer,sacrebleu,bleu,chrf}]
[--task TASK] [-s SRC] [-t TARGET]
[--source-lang-with-nt SRC] [--trainpref FP]
[--validpref FP] [--testpref FP]
[--align-suffix FP] [--conll-suffix FP [FP ...]]
[--dpd-suffix FP [FP ...]]
[--probs-suffix FP [FP ...]] [--swm-suffix FP]
[--destdir DIR] [--thresholdtgt N]
[--thresholdsrc N] [--tgtdict FP] [--srcdict FP]
[--labeldict FP [FP ...]] [--nwordstgt N]
[--nwordssrc N] [--alignfile ALIGN]
[--dataset-impl FORMAT] [--joined-dictionary]
[--only-source] [--padding-factor N]
[--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 202: ../../data/wi_locness_train/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 203: ../../data/wi_locness_train/tgt.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 204: ../../data/bea19_dev/src.txt: No such file or directory
./preprocess_syngec_transformer.sh: line 205: ../../data/bea19_dev/tgt.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt'
cp: cannot stat '../../data/wi_locness_train/src.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt': No such file or directory
cp: cannot stat '../../data/wi_locness_train/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt': No such file or directory
cp: cannot stat '../../data/bea19_dev/tgt.txt.bpe': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.swm': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.swm'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/src.txt.conll_predict_gopar'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_dev/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/wi_locness_train/src.txt.conll_predict_gopar_np.probs': No such file or directory
cp: cannot stat '../../data/bea19_dev/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar]
[--log-interval LOG_INTERVAL]
[--log-format LOG_FORMAT]
[--tensorboard-logdir TENSORBOARD_LOGDIR]
[--seed SEED] [--cpu] [--tpu] [--bf16]
[--memory-efficient-bf16] [--fp16]
[--memory-efficient-fp16]
[--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale MIN_LOSS_SCALE]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR]
[--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE]
[--model-parallel-size MODEL_PARALLEL_SIZE]
[--checkpoint-suffix CHECKPOINT_SUFFIX]
[--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
[--quantization-config-path QUANTIZATION_CONFIG_PATH]
[--profile]
[--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}]
[--tokenizer {space,moses,nltk}]
[--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}]
[--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}]
[--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}]
[--scoring {wer,sacrebleu,bleu,chrf}]
[--task TASK] [-s SRC] [-t TARGET]
[--source-lang-with-nt SRC] [--trainpref FP]
[--validpref FP] [--testpref FP]
[--align-suffix FP] [--conll-suffix FP [FP ...]]
[--dpd-suffix FP [FP ...]]
[--probs-suffix FP [FP ...]] [--swm-suffix FP]
[--destdir DIR] [--thresholdtgt N]
[--thresholdsrc N] [--tgtdict FP] [--srcdict FP]
[--labeldict FP [FP ...]] [--nwordstgt N]
[--nwordssrc N] [--alignfile ALIGN]
[--dataset-impl FORMAT] [--joined-dictionary]
[--only-source] [--padding-factor N]
[--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 290: ../../data/conll14_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt'
cp: cannot stat '../../data/conll14_test/src.txt': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.swm'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/conll14_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/conll14_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar]
[--log-interval LOG_INTERVAL]
[--log-format LOG_FORMAT]
[--tensorboard-logdir TENSORBOARD_LOGDIR]
[--seed SEED] [--cpu] [--tpu] [--bf16]
[--memory-efficient-bf16] [--fp16]
[--memory-efficient-fp16]
[--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale MIN_LOSS_SCALE]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR]
[--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE]
[--model-parallel-size MODEL_PARALLEL_SIZE]
[--checkpoint-suffix CHECKPOINT_SUFFIX]
[--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
[--quantization-config-path QUANTIZATION_CONFIG_PATH]
[--profile]
[--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}]
[--tokenizer {space,moses,nltk}]
[--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}]
[--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}]
[--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}]
[--scoring {wer,sacrebleu,bleu,chrf}]
[--task TASK] [-s SRC] [-t TARGET]
[--source-lang-with-nt SRC] [--trainpref FP]
[--validpref FP] [--testpref FP]
[--align-suffix FP] [--conll-suffix FP [FP ...]]
[--dpd-suffix FP [FP ...]]
[--probs-suffix FP [FP ...]] [--swm-suffix FP]
[--destdir DIR] [--thresholdtgt N]
[--thresholdsrc N] [--tgtdict FP] [--srcdict FP]
[--labeldict FP [FP ...]] [--nwordstgt N]
[--nwordssrc N] [--alignfile ALIGN]
[--dataset-impl FORMAT] [--joined-dictionary]
[--only-source] [--padding-factor N]
[--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!
Apply BPE...
./preprocess_syngec_transformer.sh: line 358: ../../data/bea19_test/src.txt: No such file or directory
Align subwords and words...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/subword_align.py", line 41, in
with open(file_word, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt'
cp: cannot stat '../../data/bea19_test/src.txt': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.bpe': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.swm': No such file or directory
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/syntax_information_reprocess.py", line 37, in
swm_list = [[int(i) for i in line.rstrip("\n").split()] for line in open(swm_file, "r").readlines()]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.swm'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np': No such file or directory
Calculate dependency distance...
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../utils/calculate_dependency_distance.py", line 126, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/bea19_test/src.txt.conll_predict_gopar'
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.dpd': No such file or directory
cp: cannot stat '../../data/bea19_test/src.txt.conll_predict_gopar_np.probs': No such file or directory
Preprocess...
/opt/conda/lib/python3.10/site-packages/torch/cuda/init.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
usage: preprocess.py [-h] [--no-progress-bar]
[--log-interval LOG_INTERVAL]
[--log-format LOG_FORMAT]
[--tensorboard-logdir TENSORBOARD_LOGDIR]
[--seed SEED] [--cpu] [--tpu] [--bf16]
[--memory-efficient-bf16] [--fp16]
[--memory-efficient-fp16]
[--fp16-no-flatten-grads]
[--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW]
[--fp16-scale-tolerance FP16_SCALE_TOLERANCE]
[--min-loss-scale MIN_LOSS_SCALE]
[--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--user-dir USER_DIR]
[--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE]
[--model-parallel-size MODEL_PARALLEL_SIZE]
[--checkpoint-suffix CHECKPOINT_SUFFIX]
[--checkpoint-shard-count CHECKPOINT_SHARD_COUNT]
[--quantization-config-path QUANTIZATION_CONFIG_PATH]
[--profile]
[--criterion {label_smoothed_cross_entropy,composite_loss,ctc,masked_lm,legacy_masked_lm_loss,wav2vec,sentence_prediction,label_smoothed_cross_entropy_with_alignment,cross_entropy,nat_loss,adaptive_loss,sentence_ranking,vocab_parallel_cross_entropy}]
[--tokenizer {space,moses,nltk}]
[--bpe {characters,bert,subword_nmt,fastbpe,byte_bpe,bytes,sentencepiece,gpt2,hf_byte_bpe}]
[--optimizer {adafactor,nag,adam,adagrad,adamax,lamb,sgd,adadelta}]
[--lr-scheduler {cosine,triangular,inverse_sqrt,tri_stage,fixed,reduce_lr_on_plateau,polynomial_decay}]
[--scoring {wer,sacrebleu,bleu,chrf}]
[--task TASK] [-s SRC] [-t TARGET]
[--source-lang-with-nt SRC] [--trainpref FP]
[--validpref FP] [--testpref FP]
[--align-suffix FP] [--conll-suffix FP [FP ...]]
[--dpd-suffix FP [FP ...]]
[--probs-suffix FP [FP ...]] [--swm-suffix FP]
[--destdir DIR] [--thresholdtgt N]
[--thresholdsrc N] [--tgtdict FP] [--srcdict FP]
[--labeldict FP [FP ...]] [--nwordstgt N]
[--nwordssrc N] [--alignfile ALIGN]
[--dataset-impl FORMAT] [--joined-dictionary]
[--only-source] [--padding-factor N]
[--workers N]
preprocess.py: error: argument --user-dir: invalid Optional value: '../../src/src_syngec/syngec_model'
Finished!

@HillZhang1999
Owner

Please cd into the directory that contains this bash script, then run cat ../../data/clang8_train/src.txt to check whether the file actually exists. If it does not, please check how you extracted the data.
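The check suggested above can be extended to several files at once. The following is a minimal sketch, not part of the SynGEC repository: the helper name find_missing is hypothetical, and the relative paths are simply the ones that appear in the error messages in this thread.

```python
import os

# Paths copied from the error messages in this thread; they are relative
# to bash/english_exp/, where the pipeline scripts are run.
EXPECTED_FILES = [
    "../../data/clang8_train/src.txt",
    "../../data/wi_locness_train/tgt.txt",
    "../../data/conll14_test/src.txt",
    "../../data/bea19_test/src.txt",
]


def find_missing(base_dir, rel_paths):
    """Return the entries of rel_paths that are not regular files under base_dir.

    Hypothetical helper for illustration only.
    """
    return [p for p in rel_paths if not os.path.isfile(os.path.join(base_dir, p))]


if __name__ == "__main__":
    # Run this from bash/english_exp/ so the relative paths resolve as in the scripts.
    for path in find_missing(os.getcwd(), EXPECTED_FILES):
        print("missing:", path)
```

If every path is reported as missing, the raw text files were probably never placed under data/, which matches the FileNotFoundError messages above.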

@YeJinPaark
Author

First, I extracted the archive with "tar -zxvf syngec_preprocess.tar.gz".

The output was:
preprocess/
preprocess/chinese_hsk+lang8_with_syntax_transformer/
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.label.txt
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.src.txt
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/dict.tgt.txt
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/preprocess.log
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.src.idx
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin
preprocess/chinese_hsk+lang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx
preprocess/chinese_mucgec_with_syntax_transformer/
preprocess/chinese_mucgec_with_syntax_transformer/bin/
preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.label.txt
preprocess/chinese_mucgec_with_syntax_transformer/bin/dict.src.txt
preprocess/chinese_mucgec_with_syntax_transformer/bin/preprocess.log
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.bin
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.conll.src-tgt.src.idx
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.bin
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.dpd.src-tgt.src.idx
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.bin
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.probs.src-tgt.src.idx
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.bin
preprocess/chinese_mucgec_with_syntax_transformer/bin/test.src-tgt.src.idx
preprocess/english_bea19_with_syntax_bart/
preprocess/english_bea19_with_syntax_bart/bin/
preprocess/english_bea19_with_syntax_bart/bin/dict.label.txt
preprocess/english_bea19_with_syntax_bart/bin/dict.src.txt
preprocess/english_bea19_with_syntax_bart/bin/preprocess.log
preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.bin
preprocess/english_bea19_with_syntax_bart/bin/test.conll.src-tgt.src.idx
preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.bin
preprocess/english_bea19_with_syntax_bart/bin/test.dpd.src-tgt.src.idx
preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.bin
preprocess/english_bea19_with_syntax_bart/bin/test.probs.src-tgt.src.idx
preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.bin
preprocess/english_bea19_with_syntax_bart/bin/test.src-tgt.src.idx
preprocess/english_bea19_with_syntax_transformer/
preprocess/english_bea19_with_syntax_transformer/dict.label.txt
preprocess/english_bea19_with_syntax_transformer/dict.src.txt
preprocess/english_bea19_with_syntax_transformer/preprocess.log
preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.bin
preprocess/english_bea19_with_syntax_transformer/test.conll.src-tgt.src.idx
preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.bin
preprocess/english_bea19_with_syntax_transformer/test.dpd.src-tgt.src.idx
preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.bin
preprocess/english_bea19_with_syntax_transformer/test.probs.src-tgt.src.idx
preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.bin
preprocess/english_bea19_with_syntax_transformer/test.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/
preprocess/english_clang8_with_syntax_bart/bin/
preprocess/english_clang8_with_syntax_bart/bin/dict.label.txt
preprocess/english_clang8_with_syntax_bart/bin/dict.src.txt
preprocess/english_clang8_with_syntax_bart/bin/dict.tgt.txt
preprocess/english_clang8_with_syntax_bart/bin/preprocess.log
preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/train.conll.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/train.dpd.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/train.probs.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.bin
preprocess/english_clang8_with_syntax_bart/bin/train.src-tgt.tgt.idx
preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/valid.conll.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/valid.probs.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.bin
preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.src.idx
preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.bin
preprocess/english_clang8_with_syntax_bart/bin/valid.src-tgt.tgt.idx
preprocess/english_clang8_with_syntax_transformer/
preprocess/english_clang8_with_syntax_transformer/bin/
preprocess/english_clang8_with_syntax_transformer/bin/dict.label.txt
preprocess/english_clang8_with_syntax_transformer/bin/dict.src.txt
preprocess/english_clang8_with_syntax_transformer/bin/dict.tgt.txt
preprocess/english_clang8_with_syntax_transformer/bin/preprocess.log
preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/train.conll.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/train.probs.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.bin
preprocess/english_clang8_with_syntax_transformer/bin/train.src-tgt.tgt.idx
preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.bin
preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.src.idx
preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.bin
preprocess/english_clang8_with_syntax_transformer/bin/valid.src-tgt.tgt.idx
preprocess/english_conll14_with_syntax_bart/
preprocess/english_conll14_with_syntax_bart/bin/
preprocess/english_conll14_with_syntax_bart/bin/dict.label.txt
preprocess/english_conll14_with_syntax_bart/bin/dict.src.txt
preprocess/english_conll14_with_syntax_bart/bin/preprocess.log
preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.bin
preprocess/english_conll14_with_syntax_bart/bin/test.conll.src-tgt.src.idx
preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.bin
preprocess/english_conll14_with_syntax_bart/bin/test.dpd.src-tgt.src.idx
preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.bin
preprocess/english_conll14_with_syntax_bart/bin/test.probs.src-tgt.src.idx
preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.bin
preprocess/english_conll14_with_syntax_bart/bin/test.src-tgt.src.idx
preprocess/english_conll14_with_syntax_transformer/
preprocess/english_conll14_with_syntax_transformer/dict.label.txt
preprocess/english_conll14_with_syntax_transformer/dict.src.txt
preprocess/english_conll14_with_syntax_transformer/preprocess.log
preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.bin
preprocess/english_conll14_with_syntax_transformer/test.conll.src-tgt.src.idx
preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.bin
preprocess/english_conll14_with_syntax_transformer/test.dpd.src-tgt.src.idx
preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.bin
preprocess/english_conll14_with_syntax_transformer/test.probs.src-tgt.src.idx
preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.bin
preprocess/english_conll14_with_syntax_transformer/test.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/
preprocess/english_error_coded_with_syntax_bart/bin/
preprocess/english_error_coded_with_syntax_bart/bin/dict.label.txt
preprocess/english_error_coded_with_syntax_bart/bin/dict.src.txt
preprocess/english_error_coded_with_syntax_bart/bin/dict.tgt.txt
preprocess/english_error_coded_with_syntax_bart/bin/preprocess.log
preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/train.conll.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/train.dpd.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/train.probs.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.bin
preprocess/english_error_coded_with_syntax_bart/bin/train.src-tgt.tgt.idx
preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/valid.conll.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/valid.probs.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.bin
preprocess/english_error_coded_with_syntax_bart/bin/valid.src-tgt.tgt.idx
preprocess/english_error_coded_with_syntax_transformer/
preprocess/english_error_coded_with_syntax_transformer/bin/
preprocess/english_error_coded_with_syntax_transformer/bin/dict.label.txt
preprocess/english_error_coded_with_syntax_transformer/bin/dict.src.txt
preprocess/english_error_coded_with_syntax_transformer/bin/dict.tgt.txt
preprocess/english_error_coded_with_syntax_transformer/bin/preprocess.log
preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/train.conll.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/train.probs.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.bin
preprocess/english_error_coded_with_syntax_transformer/bin/train.src-tgt.tgt.idx
preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.bin
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.src.idx
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.bin
preprocess/english_error_coded_with_syntax_transformer/bin/valid.src-tgt.tgt.idx
preprocess/english_wi_locness_with_syntax_bart/
preprocess/english_wi_locness_with_syntax_bart/bin/
preprocess/english_wi_locness_with_syntax_bart/bin/dict.label.txt
preprocess/english_wi_locness_with_syntax_bart/bin/dict.src.txt
preprocess/english_wi_locness_with_syntax_bart/bin/dict.tgt.txt
preprocess/english_wi_locness_with_syntax_bart/bin/preprocess.log
preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/train.conll.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/train.dpd.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/train.probs.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.bin
preprocess/english_wi_locness_with_syntax_bart/bin/train.src-tgt.tgt.idx
preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/valid.conll.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/valid.dpd.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/valid.probs.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.bin
preprocess/english_wi_locness_with_syntax_bart/bin/valid.src-tgt.tgt.idx
preprocess/english_wi_locness_with_syntax_transformer/
preprocess/english_wi_locness_with_syntax_transformer/bin/
preprocess/english_wi_locness_with_syntax_transformer/bin/dict.label.txt
preprocess/english_wi_locness_with_syntax_transformer/bin/dict.src.txt
preprocess/english_wi_locness_with_syntax_transformer/bin/dict.tgt.txt
preprocess/english_wi_locness_with_syntax_transformer/bin/preprocess.log
preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/train.conll.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/train.dpd.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/train.probs.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/train.src-tgt.tgt.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.conll.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.dpd.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.probs.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.src.idx
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.bin
preprocess/english_wi_locness_with_syntax_transformer/bin/valid.src-tgt.tgt.idx

Then I ran the bash script:

root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/data# cd /mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/
root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ls
generate_syngec_bart.sh preprocess_syngec_bart.sh
generate_syngec_transformer.sh preprocess_syngec_transformer.sh
nohup.out train_syngec_bart.sh
pipeline_gopar.sh train_syngec_transformer.sh
root@309e7fc0781e:/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp# ./pipeline_gopar.sh
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 17, in
input_sentences = load(sys.argv[1])
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/parse.py", line 9, in load
with open(filename, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Loading resources...
Processing parallel files...
Traceback (most recent call last):
File "/opt/conda/bin/errant_parallel", line 8, in
sys.exit(main())
File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in main
in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
File "/opt/conda/lib/python3.10/site-packages/errant/commands/parallel_to_m2.py", line 16, in
in_files = [stack.enter_context(open(i)) for i in [args.orig]+args.cor]
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt'
Traceback (most recent call last):
File "/mnt/ssd_mnt/pyj/SynGEC/bash/english_exp/../../src/src_gopar/convert_gec_data_to_parsing_data_english.py", line 153, in
with open(conll_file, "r") as f1:
FileNotFoundError: [Errno 2] No such file or directory: '../../data/wi_locness_train/tgt.txt.conll_predict'
/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects --local-rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions

warnings.warn(
WARNING:torch.distributed.run:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
/opt/conda/bin/python: No module named supar.cmds.biaffine_dep
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 34107) of binary: /opt/conda/bin/python
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 196, in
main()
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 192, in main
launch(args)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launch.py", line 177, in launch
run(args)
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/opt/conda/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

supar.cmds.biaffine_dep FAILED

Failures:
[1]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 34108)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 34109)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 34110)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 34111)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 34112)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 6 (local_rank: 6)
exitcode : 1 (pid: 34113)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 7 (local_rank: 7)
exitcode : 1 (pid: 34114)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-08-25_12:30:09
host : 309e7fc0781e
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 34107)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'
nohup: appending output to 'nohup.out'

What is the problem, and how can I fix it?

@HillZhang1999
Owner

If you don't want to retrain the parser, you can skip the data preprocessing step entirely. The preprocessed files can be downloaded directly from our Google Drive.
If you do want to retrain the parser, you must download the required datasets from their official websites and put them into the corresponding directories (src.txt and tgt.txt, one sentence per line).
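Before rerunning the pipeline on self-prepared data, it may help to confirm that src.txt and tgt.txt line up. The sketch below assumes that the two files are parallel, so they should contain the same number of lines; the reply above only states "one sentence per line", so treat the equal-count requirement as an assumption. The function names are hypothetical.

```python
def count_lines(path):
    """Count the lines in a UTF-8 text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_parallel(src_path, tgt_path):
    """Return (n_src, n_tgt, ok), where ok is True when both files have the same line count.

    Hypothetical helper: assumes src.txt and tgt.txt are sentence-aligned.
    """
    n_src = count_lines(src_path)
    n_tgt = count_lines(tgt_path)
    return n_src, n_tgt, n_src == n_tgt
```

For example, check_parallel("../../data/wi_locness_train/src.txt", "../../data/wi_locness_train/tgt.txt") could be called from bash/english_exp/ once both files are in place.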

@hwlys

hwlys commented Jan 17, 2024

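Hello, have you solved this problem? After extracting the archive, I also don't see any src.txt or tgt.txt files. The extracted files look like this. What should I do?
Screenshot 2024-01-17 203001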

@HillZhang1999
Owner

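> Hello, have you solved this problem? After extracting the archive, I also don't see any src.txt or tgt.txt files. The extracted files look like this. What should I do? Screenshot 2024-01-17 203001

Due to copyright restrictions, we do not provide the text files, only the preprocessed binary files, which can be used directly for training.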
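Since only the binary files are distributed, one quick sanity check after extraction is to confirm that each .bin file has its .idx companion, as in the archive listing earlier in this thread. This is a minimal sketch with a hypothetical helper name; it says nothing about whether the file contents are valid.

```python
import os


def unpaired_bins(directory):
    """Return the sorted names of .bin files in directory that have no matching .idx file.

    Hypothetical helper: it only compares file names, not contents.
    """
    names = set(os.listdir(directory))
    return sorted(
        n for n in names
        if n.endswith(".bin") and n[: -len(".bin")] + ".idx" not in names
    )
```

An empty result for a directory such as preprocess/english_clang8_with_syntax_transformer/bin suggests the extraction was complete for that dataset.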
