random access on remote slides #76

mdrio · 2023-09-06T10:19:57Z

mdrio
Sep 6, 2023

Hi, opening remote slides (for example on S3) sounds like a cool feature. An old discussion (openslide/openslide.github.io#11 (comment)) states that using S3 somehow limits reading slides to sequential access. However, S3 appears to support the Range header, which would allow random access to a file.

Does anyone have experience with randomly reading regions of slides stored on S3? What about the performance?
Thanks

ap-- · 2023-09-06T13:52:02Z

ap--
Sep 6, 2023
Maintainer

Hi @mdrio

This is fully supported by tiffslide and is not limited to sequential access, unlike the FUSE mounting mentioned in the OpenSlide thread.
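As a rough illustration, a minimal sketch of reading one region from a remote slide could look like the following. The helper name `read_remote_region`, the bucket path, and the profile name are hypothetical; the sketch assumes that `tiffslide` and an fsspec backend such as `s3fs` are installed, and that `TiffSlide` accepts an fsspec URL together with `storage_options`.

```python
def read_remote_region(urlpath, location, size, level=0, storage_options=None):
    """Read one region from a slide addressed by an fsspec URL.

    Hypothetical helper: `urlpath` is something like "s3://my-bucket/slide.svs",
    and `storage_options` is forwarded to the filesystem backend.
    """
    # Imported lazily so this sketch can be loaded without tiffslide installed.
    from tiffslide import TiffSlide

    slide = TiffSlide(urlpath, storage_options=storage_options)
    try:
        # Same argument order as OpenSlide: (x, y) location, level, (w, h) size.
        return slide.read_region(location, level, size)
    finally:
        slide.close()
```

A call might then look like `read_remote_region("s3://my-bucket/slide.svs", (0, 0), (512, 512), storage_options={"profile": "my-aws-profile"})`, where both the path and the profile are placeholders.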

Performance-wise, for random sampling it's best to sample along tile boundaries, in multiples of the file's internal tile size, to avoid loading a lot of data just to throw it away.
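The alignment idea can be sketched in plain Python. The function names below are hypothetical, the tile width and height are assumed to have been read from the file's metadata beforehand, and all coordinates are assumed to be in the pixel frame of the level being read.

```python
import random


def align_region_to_tiles(x, y, width, height, tile_w, tile_h):
    """Expand a region to the smallest enclosing tile-aligned region."""
    x0 = (x // tile_w) * tile_w                 # round left edge down
    y0 = (y // tile_h) * tile_h                 # round top edge down
    x1 = -(-(x + width) // tile_w) * tile_w     # round right edge up
    y1 = -(-(y + height) // tile_h) * tile_h    # round bottom edge up
    return x0, y0, x1 - x0, y1 - y0


def sample_aligned_origin(rng, slide_w, slide_h, tile_w, tile_h, n_tiles=1):
    """Pick a random tile-aligned origin for an n_tiles x n_tiles patch."""
    max_col = slide_w // tile_w - n_tiles
    max_row = slide_h // tile_h - n_tiles
    if max_col < 0 or max_row < 0:
        raise ValueError("patch does not fit inside the slide")
    # randint is inclusive on both ends.
    return rng.randint(0, max_col) * tile_w, rng.randint(0, max_row) * tile_h


if __name__ == "__main__":
    rng = random.Random(0)
    print(align_region_to_tiles(300, 500, 512, 512, 256, 256))
    print(sample_aligned_origin(rng, 100_000, 80_000, 256, 256, n_tiles=2))
```

A patch whose origin and size are both multiples of the tile size touches only the tiles it actually returns, whereas an unaligned patch of the same size can touch an extra row and column of tiles.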

Theoretically, throughput from S3 on EC2 should scale up to 100 Gbit/s. Whenever time allows, I'm working on setting up benchmarks for this explicitly. They might become available in the near future.

You might want to check out www.github.com/bayer-group/pado if you need a data loader for pathology image datasets that also works with cloud-native image locations.

Cheers,
Andreas 😊
