Add (S3) transfer settings #3622

Closed
wants to merge 1 commit into from

brainstorm

Motivation and Context

The motivation is to fix awslabs/aws-sdk-rust#968 and to be able to implement and test the large-file copy strategy for S3 Batch Operations outlined in this AWS blog post, but using a Rust runtime Lambda instead of Python/Boto3.

This feature is in turn needed for OrcaBus, our microservices-based, AWS EventBus-driven service, which is a work in progress at @umccr.

Description

I just learned about smithy-rs this afternoon by reading the docs and code and watching @Velfi's excellent Rust Linz YouTube intro talk... so this is very much an early draft and an attempt to understand where the code for this feature would fit (and possibly to inform an RFC if needed).

I'm learning as I go and trying to implement it by reading similar PRs. At present I'm following the names used in boto3's transfer.py... all in all, feedback, discussion, hints and guidance are very much welcome.
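
To make the naming concrete, here is a minimal sketch of what such a settings container could look like, written in plain Python for brevity. `TransferSettings` and `uses_multipart` are hypothetical names and are not part of smithy-rs, the Rust SDK, or boto3. The field names and default values follow boto3's `TransferConfig` as I understand them, and the `>=` comparison against the threshold is an assumption about how the setting would be applied.

```python
from dataclasses import dataclass
from typing import Optional

KB = 1024
MB = 1024 * KB


@dataclass(frozen=True)
class TransferSettings:
    """Hypothetical settings container; field names mirror boto3's TransferConfig."""

    multipart_threshold: int = 8 * MB      # object size at which multipart kicks in
    max_concurrency: int = 10              # maximum number of concurrent part requests
    multipart_chunksize: int = 8 * MB      # size of each part
    num_download_attempts: int = 5         # retries for a failed download
    max_io_queue: int = 100                # queued I/O chunks held in memory
    io_chunksize: int = 256 * KB           # size of each I/O read/write chunk
    use_threads: bool = True               # whether to parallelise at all
    max_bandwidth: Optional[int] = None    # bytes per second, None means unlimited

    def uses_multipart(self, object_size: int) -> bool:
        """Assumed rule: objects at or above the threshold are transferred in parts."""
        return object_size >= self.multipart_threshold
```

With the defaults, a 5 MB object would be transferred in a single request, while raising `multipart_threshold` (the setting requested in the linked issue) moves that cut-off.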

Testing

None yet.

Checklist

  • I have updated CHANGELOG.next.toml if I made changes to the smithy-rs codegen or runtime crates
  • I have updated CHANGELOG.next.toml if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

/cc @andrewpatto @mmalenic

@aajtodd
Contributor

aajtodd commented May 3, 2024

Thanks for taking the time to submit a PR and contribute!

What you're looking for is what we refer to as an S3 "transfer manager". It is actually a higher-level library built on top of the base S3 client. This is actively being discussed, but we have no ETA to share at this time. Unfortunately, we can't accept work related to this right now, as there is a fair bit of nuance to getting it right, as well as some (sadly) internal design specs that all languages need to adhere to when implementing this capability.


I did see your comment here about the need for this. I don't think you're necessarily limited to Rust here for performance though as boto3 has integrated the CRT based S3 client (a high performance C based S3 client). I can't comment on how to use it, but I'm sure the Python folks can help you out if you have trouble. It's just an option if you're waiting on Rust support for a transfer utility: while it's something we're actively discussing, we don't have firm plans for delivering it at the moment.

@aajtodd aajtodd closed this May 3, 2024
@brainstorm
Author

brainstorm commented May 3, 2024

Thanks for the explanation, @aajtodd! Yes, I was aware of the CRT-based Boto3 client, but since our FileManager component in OrcaBus is written in Rust, I was looking for better integration, not just performance.

Looking forward to a more public discussion about the transfer manager and seeing it on the public roadmap eventually!

@brainstorm
Author

brainstorm commented May 22, 2024

@aajtodd

I don't think you're necessarily limited to Rust here for performance though as boto3 has integrated the CRT based S3 client (a high performance C based S3 client).

Well, unfortunately (and surprisingly), there's no CRT-backed boto3 S3 copy() (bucket to bucket)... it falls back to the 'classic' Python transfer client; see this snippet:

https://github.com/boto/boto3/blob/fb608de3453155578fd68a3a627e27b39f44647f/boto3/s3/inject.py#L437-L439

    # copy is not supported in the CRT
    new_config = python_copy.copy(config)
    new_config.preferred_transfer_client = "classic"
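
For context on what a bucket-to-bucket copy of a large object involves: it is generally done as a multipart copy, where each part is an UploadPartCopy request carrying a CopySourceRange of the form `bytes=start-end` (zero-based, inclusive). Below is a minimal sketch of just the range calculation, in plain Python without boto3. `copy_part_ranges` is a hypothetical helper for illustration and is not taken from boto3 or s3transfer.

```python
from typing import Iterator, Tuple


def copy_part_ranges(object_size: int, part_size: int) -> Iterator[Tuple[int, str]]:
    """Yield (part_number, "bytes=start-end") pairs that cover object_size bytes.

    Part numbers start at 1 and byte ranges are inclusive, matching the
    CopySourceRange format. The last part may be shorter than part_size.
    """
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")

    part_number = 1
    start = 0
    while start < object_size:
        # Clamp the final part to the end of the object.
        end = min(start + part_size, object_size) - 1
        yield part_number, f"bytes={start}-{end}"
        start = end + 1
        part_number += 1
```

Each yielded pair would map to one part request, so the number of requests in flight is what a concurrency setting would bound.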

/cc @reisingerf

@brainstorm
Author

Tracking issue awslabs/aws-sdk-rust#1159, yay!

Successfully merging this pull request may close these issues.

Ability to pick S3 download multipart threshold, and other settings as well.