Help with data set with multiple files #58

Open
megjhani opened this issue Oct 6, 2023 · 0 comments

megjhani commented Oct 6, 2023

Hi gzerveas,

Thank you for sharing the code. I successfully ran it on a regression dataset (tsra). I also explored two ways to run it on my custom dataset:

a) I created my own .ts file. As you mentioned in other discussions, though, this might not be necessary: I could instead create a custom class to load the dataset and simply update the 'readdata' function.

b) I developed a custom class that traverses the directory and loads all the pickle files (there are multiple files). However, I've noticed that it loads all the data regardless of the batch size. Since each file ranges from a few megabytes to several gigabytes, loading everything into memory at once becomes a bottleneck. Is there a workaround for this? I was wondering whether a DataLoader could handle it. If this is already implemented, or if you have explored it, I would appreciate your insights.
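
To make the question concrete, here is a minimal sketch of the kind of lazy loading I have in mind. It is not taken from this repository. The class name `LazyPickleDataset`, the `*.pkl` pattern and the `max_cached_files` parameter are hypothetical, and it assumes each pickle file holds a sequence of samples. It only implements `__len__` and `__getitem__`, which is the map-style protocol that `torch.utils.data.DataLoader` accepts.

```python
import bisect
import pickle
from collections import OrderedDict
from pathlib import Path


class LazyPickleDataset:
    """Map-style dataset that reads pickle files on demand (hypothetical helper)."""

    def __init__(self, root_dir, pattern="*.pkl", max_cached_files=2):
        self.paths = sorted(Path(root_dir).glob(pattern))
        self.max_cached_files = max_cached_files
        self._cache = OrderedDict()  # path -> loaded samples, least recently used first
        # Count samples per file in one pass; each file is read and then released.
        # Assumption: every pickle contains a sequence (e.g. a list) of samples.
        self._offsets = []  # cumulative sample counts, one entry per file
        total = 0
        for path in self.paths:
            total += len(self._read(path))
            self._offsets.append(total)

    @staticmethod
    def _read(path):
        with open(path, "rb") as f:
            return pickle.load(f)

    def _get_file(self, path):
        # Keep at most max_cached_files files in memory, evicting the oldest.
        if path in self._cache:
            self._cache.move_to_end(path)
        else:
            self._cache[path] = self._read(path)
            if len(self._cache) > self.max_cached_files:
                self._cache.popitem(last=False)
        return self._cache[path]

    def __len__(self):
        return self._offsets[-1] if self._offsets else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Find the file whose cumulative range contains this global index.
        file_idx = bisect.bisect_right(self._offsets, index)
        start = self._offsets[file_idx - 1] if file_idx > 0 else 0
        return self._get_file(self.paths[file_idx])[index - start]
```

With this layout only a few files sit in memory at a time, but the initial counting pass still reads every file once, and fully random sampling would reload files frequently. Sequential or per-file shuffled access would probably suit it better.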

Thanks,
