Unofficial PyTorch implementation of MSRA's PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network.
This implementation may differ from the paper in some details, but it works reasonably well.
- install dependencies:
pip install -r requirements.txt
- download datasets
If you don't have WSJ0, you can use aishell-1 instead by following se-cldnn-torch.
One thing differs from se-cldnn-torch: the two lists used for training (tr.lst, cv.lst ...) need duration information, which se-cldnn-torch does not require (because the two dataset.py files are different).
So, in this repo, the train and cross-validation lists need to look like this:
/path/noisy1.wav /path/ref1.wav 3.0233
/path/noisy2.wav /path/ref2.wav 2.3213
/path/noisy3.wav /path/ref3.wav 8.8127
...
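As an illustration of the list format above, here is a minimal sketch of a parser for one list line. It is not taken from this repo's dataset.py; the names `ListEntry` and `parse_list_line` are hypothetical, and it assumes that paths contain no whitespace and that the third field is a duration in seconds.

```python
from typing import NamedTuple


class ListEntry(NamedTuple):
    noisy_path: str
    ref_path: str
    duration: float  # assumed to be in seconds


def parse_list_line(line: str) -> ListEntry:
    """Parse one 'noisy ref duration' line (hypothetical helper).

    Assumes fields are whitespace-separated and paths contain no spaces.
    """
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    noisy, ref, dur = fields
    return ListEntry(noisy, ref, float(dur))
```

A check like this can catch a list that is still missing its duration column before training starts.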
To add duration information, use tools/add_duration.py, for example:
python tools/add_duration.py data/tr_wsj0.lst
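For readers who want to see what adding a duration column involves, the following is a rough sketch using only the Python standard library. It is not the code of tools/add_duration.py, whose behavior (for example, whether it rewrites the list in place) is not described here. The function names are hypothetical, and the sketch assumes uncompressed PCM WAV files, durations in seconds measured on the noisy file, and input lines of the form "noisy ref".

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: number of frames divided by sample rate."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def add_duration(in_list: str, out_list: str) -> None:
    """Copy 'noisy ref' lines to out_list with a duration column appended.

    Hypothetical helper; writes to a separate file instead of editing in place.
    """
    with open(in_list) as fin, open(out_list, "w") as fout:
        for line in fin:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            duration = wav_duration_seconds(fields[0])
            fout.write(f"{' '.join(fields[:2])} {duration:.4f}\n")
```

The standard `wave` module only reads uncompressed PCM data, so other encodings would need a different reader.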
For the inference stage (decoding or evaluation), the list only needs the paths of the noisy files:
/path/noisy1.wav
/path/noisy2.wav
/path/noisy3.wav
...
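If you already have a list that contains noisy and reference paths, an inference list can be derived from it by keeping only the first column. The sketch below is one possible way to do that; `noisy_only_lines` is a hypothetical helper, not part of this repo, and it assumes whitespace-separated fields with no spaces inside paths.

```python
from typing import Iterable, Iterator


def noisy_only_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the first whitespace-separated field of each non-empty line."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]
```

For example, passing an open training list file to `noisy_only_lines` and writing each result on its own line gives a file in the inference format.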
- run
Before running, set the correct parameters in ./run_phasen.sh, then run:
bash run_phasen.sh
funcwj's voice-filter
wangkenpu's Conv-Tasnet
pseeth's torch-stft