Inquiry about Default Instrument Settings and Dual Loudness Augmentation in bs_roformer #39

EuiYeonKim · 2024-07-17T09:26:08Z

Hello, I am currently analyzing your code in an attempt to reproduce the performance reported in the bs_roformer paper. While examining the code, I came across a few points that I would like to ask about.

Upon reviewing the settings related to bs_roformer, I noticed that the configuration predominantly uses only vocals and other in the instruments setup. Generally, MSS datasets like MUSDB18 are composed of a 4-stem setup: [vocals, drums, bass, other], and the mixture is a combination of these four stems. However, your default setting is [vocals, other], and using this setting results in a mixture composed only of vocals + other. I am curious whether this configuration is an error or if there was a specific task intended for this particular setup.
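To make the first question concrete, here is a minimal sketch of what I mean, assuming the mixture is formed by summing only the stems named in the instruments list. The helper `build_mixture` and the toy sample values are hypothetical and are not taken from the repository.

```python
def build_mixture(stems, instruments):
    """Sum the listed stems sample by sample (hypothetical helper)."""
    length = len(stems[instruments[0]])
    mixture = [0.0] * length
    for name in instruments:
        for i, sample in enumerate(stems[name]):
            mixture[i] += sample
    return mixture


# Toy two-sample "audio" for each MUSDB18-style stem (made-up values).
stems = {
    "vocals": [0.1, 0.2],
    "drums": [0.3, 0.0],
    "bass": [0.0, 0.1],
    "other": [0.2, 0.2],
}

# With instruments = [vocals, other], drums and bass never enter the mixture.
two_stem = build_mixture(stems, ["vocals", "other"])

# With the full 4-stem list, the mixture contains every stem.
four_stem = build_mixture(stems, ["vocals", "drums", "bass", "other"])
```

Under that assumption, the two-stem mixture differs from the four-stem mixture by exactly the drums and bass content.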

Additionally, while examining the code, I noticed that loudness augmentation is applied once during the load_random_mix function when performing mixup, and again in getitem. I would like to clarify whether applying loudness augmentation twice is by design, a misunderstanding on my part, or a coding error.
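As a minimal sketch of why this matters, assuming each augmentation step multiplies the audio by an independently drawn random gain: two passes compose into a single gain equal to the product of the two draws, so the effective range is wider than a single pass would give. The function names and the 0.5 to 1.5 range below are hypothetical and are not taken from the repository.

```python
import random


def random_gain(rng, low=0.5, high=1.5):
    """Draw a loudness gain uniformly from [low, high] (assumed range)."""
    return rng.uniform(low, high)


def apply_gain(samples, gain):
    """Scale every sample by the same gain factor."""
    return [s * gain for s in samples]


rng = random.Random(0)
stem = [0.1, -0.2, 0.3]

g1 = random_gain(rng)  # first pass, e.g. while building the random mix
g2 = random_gain(rng)  # second pass, e.g. when the item is returned

once = apply_gain(stem, g1)
twice = apply_gain(once, g2)

# Two passes are equivalent to one pass with gain g1 * g2, which under the
# assumed range can fall anywhere in [0.25, 2.25] rather than [0.5, 1.5].
effective = g1 * g2
```

If the double application is intentional, the effective loudness range is the product of the two ranges; if not, one of the two passes could be removed.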

I would appreciate your response to these two questions.

Thank you.

EuiYeonKim · 2024-07-18T05:32:10Z

Additionally, I would like to reproduce the performance presented in the paper using only the MUSDB18-HQ dataset. Have you trained the model using only the MUSDB18-HQ dataset? Also, did the performance results align closely with those reported in the paper?

Thank you.

ZFTurbo · 2024-07-25T16:44:03Z

Hello. I didn't try to reproduce the paper's results using only the MUSDB18 dataset. I trained only vocals and bass models, using large datasets. Also, as I remember from the paper, the authors trained an independent model for each MUSDB stem.

You can use this config as a starting point:
https://github.com/ZFTurbo/Music-Source-Separation-Training/blob/main/configs/config_musdb18_bs_roformer.yaml

Also note that the authors probably have a more efficient implementation of the model, because they used larger batch sizes on the same GPUs.
