CLN: Rename/refactor BioSoundSegBench dataset -> CMACBench #776

Open
2 of 5 tasks
NickleDave opened this issue Sep 10, 2024 · 1 comment

Comments

@NickleDave
Collaborator

NickleDave commented Sep 10, 2024

  • rename, as per CMACBench#3 (CLN: Rename BioSoundSegBench -> CMACBench)
  • refactor into a sub-package with modules transforms, helper, and cmacbench, so we don't have a single 750-line module
  • add a classmethod from_config that we can call like so: CMACBench.from_config(**dataset_config)
  • move the logic for determining the labelmap into that classmethod
  • add an argument labelmap_path to the classmethod that lets us specify an alternate labelmap; if none is provided, use the default labelmap instead
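
A rough sketch of what the last three tasks could look like, not the actual vak implementation. Everything besides the names CMACBench, from_config, labelmap_path, and dataset_config is assumed for illustration: the dataset_path parameter, the DEFAULT_LABELMAP constant, and the toy constructor are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical default labelmap; the real default would ship with the dataset.
DEFAULT_LABELMAP = {"background": 0, "a": 1, "b": 2}


class CMACBench:
    """Toy stand-in for the dataset class, showing only the parts discussed here."""

    def __init__(self, dataset_path, labelmap):
        self.dataset_path = Path(dataset_path)
        self.labelmap = labelmap

    @classmethod
    def from_config(cls, dataset_path, labelmap_path=None):
        """Build the dataset from config values, resolving the labelmap here."""
        if labelmap_path is None:
            # No alternate labelmap specified: fall back to (a copy of) the default.
            labelmap = dict(DEFAULT_LABELMAP)
        else:
            # Alternate labelmap specified: load it from a JSON file.
            with Path(labelmap_path).open() as fp:
                labelmap = json.load(fp)
        return cls(dataset_path, labelmap)


# Assumed shape of the config dict; the real keys may differ.
dataset_config = {"dataset_path": "data/CMACBench"}
dataset = CMACBench.from_config(**dataset_config)
```

Keeping the labelmap logic inside the classmethod means __init__ only ever receives an already-resolved labelmap, which keeps the constructor simple to test.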
@NickleDave
Collaborator Author

NickleDave commented Sep 15, 2024

After thinking about it more, I'm going to replace the "splits_path" parameter with "metadata_path", as discussed in vocalpy/CMACBench#4

We already include splits as part of the metadata for datasets that are prepped by vak.prep for use with model-specific datapipes, so we should just extend this logic to the built-in dataset.

This also lets us remove the gross function in vak.datasets.cmacbench.helper that infers metadata from a naming scheme, and the dataclass that goes along with it. Instead we'll just convert the other dataclass, currently called vak.datasets.cmacbench.helper.SplitsJson, to Metadata, like we have for datasets that vak can prep. That way we declare programmatically what is required in the metadata, as far as vak is concerned: in this case, the frame duration, the path to the splits csv, the path to the bookkeeping vectors, and the path to the labelmap in json.

And it lets us avoid adding an even more convoluted classmethod.
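
To make the Metadata idea concrete, here is a minimal sketch of such a dataclass, assuming the metadata is stored as a flat JSON file. Only the four pieces of information come from the comment above; the field names (frame_dur, splits_csv_path, bookkeeping_path, labelmap_json_path) and the from_path / to_json helpers are hypothetical and not taken from vak.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class Metadata:
    """Declares what the metadata file must contain (field names are assumed)."""

    frame_dur: float          # duration of one frame
    splits_csv_path: str      # path to the csv that defines the splits
    bookkeeping_path: str     # path to the bookkeeping vectors
    labelmap_json_path: str   # path to the labelmap, saved as JSON

    @classmethod
    def from_path(cls, json_path):
        """Load metadata from JSON; a missing or unexpected key raises TypeError."""
        with Path(json_path).open() as fp:
            return cls(**json.load(fp))

    def to_json(self, json_path):
        """Save metadata to a JSON file."""
        with Path(json_path).open("w") as fp:
            json.dump(asdict(self), fp, indent=2)
```

Because the dataclass constructor rejects missing or extra keys, the class itself acts as the declaration of what is required, instead of a function that guesses from file names.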

NickleDave added a commit that referenced this issue Sep 16, 2024
- move class into sub-package with separate modules for helper functions and default transforms