Customize dataset and classes/label for training #1

Open
JanRodriguez opened this issue Sep 16, 2024 · 2 comments

@JanRodriguez

It would be great to be able to use a custom dataset for training, for instance one in CoNLL format (similar to this).

@richardjonker2000
Collaborator

Hi, thanks for your interest in the work.

We are currently reviewing the code and developing a more user-friendly way to use other datasets. If you have suggestions for other formats, let us know and we will implement them as well.

We will try to complete these changes by the end of next week. If you urgently need a solution in the meantime, the easiest way is to write a variation of load_train_val() or load_train_test() in data.py that loads your dataset in a similar format. As mentioned, though, we are working on a much simpler solution and will let you know when it is implemented.
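
A minimal sketch of the parsing half of that workaround, for anyone who needs it right away. It reads CoNLL-style text into per-sentence token and tag lists and collects the label set. The names `read_conll` and `label_set`, the column positions, and the output structure are assumptions made for illustration. The format that load_train_val() actually returns is not shown in this thread, so the result would still have to be adapted to match it.

```python
from typing import Iterable, List, Tuple

# One sentence = (tokens, tags). This structure is an assumption, not the
# repository's own format; adapt it to whatever data.py expects.
Sentence = Tuple[List[str], List[str]]


def read_conll(lines: Iterable[str], token_col: int = 0, tag_col: int = -1) -> List[Sentence]:
    """Parse whitespace-separated CoNLL lines into (tokens, tags) pairs.

    Blank lines and -DOCSTART- lines end the current sentence.
    """
    sentences: List[Sentence] = []
    tokens: List[str] = []
    tags: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("-DOCSTART-"):
            if tokens:  # flush the sentence collected so far
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        cols = line.split()
        tokens.append(cols[token_col])
        tags.append(cols[tag_col])
    if tokens:  # the file may not end with a blank line
        sentences.append((tokens, tags))
    return sentences


def label_set(sentences: List[Sentence]) -> List[str]:
    """Return the sorted list of distinct tags, i.e. the custom classes."""
    return sorted({tag for _, tags in sentences for tag in tags})


sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
""".splitlines()

parsed = read_conll(sample)
labels = label_set(parsed)
```

For a file on disk, pass the open file handle instead of `sample`, e.g. `read_conll(open(path, encoding="utf-8"))`. The token and tag columns are parameters because CoNLL variants place the tag in different columns.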

@T-Almeida
Collaborator

Hi @JanRodriguez,

Apologies for the delayed response - we've been focused on meeting several critical deadlines. @richardjonker2000 has implemented an enhanced API in PR #2 that provides better support for external datasets. The implementation details are documented in the README. Please let us know if you need any clarification.
We've also identified a minor bug (issue #3) in the inference module, and we'll be making some adjustments to the inference API itself in the next few days.
