Reconsider TorchData, DataLoader2, etc #1382

Closed
albertz opened this issue Aug 22, 2023 · 0 comments

albertz commented Aug 22, 2023

There were multiple reasons why we adopted TorchData:

  • It seemed to be the future for datasets in PyTorch, and thus likely to deprecate the old way.
  • The overall design looked better, and in particular better suited to big datasets and iterable datasets.

However, it also had some downsides:

  • It was still in beta. I did not really encounter any errors, but the docs etc. could all be improved.
  • The internal design seemed a bit strange to me, specifically that it relied heavily on deepcopy.

And now it seems that the TorchData developers are also not happy with the overall design, and active development has been paused:

⚠️ As of July 2023, we have paused active development on TorchData and have paused new releases. We have learnt a lot from building it and hearing from users, but also believe we need to re-evaluate the technical design and approach given how much the industry has changed since we began the project. During the rest of 2023 we will be re-evaluating our plans in this space. Please reach out if you have suggestions or comments (please use pytorch/data#1196 for feedback).
