-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upload Post-Processing #190
Comments
Possible approaches:
|
Thanks for bringing this up. I've been meaning to answer this for a couple of weeks now but haven't had the time. It's been a long day, so please forgive any ramblings. If it's completely unclear, let me know and I'll explain better. I know we are in the tus repo but some of these issues also apply to RUFH. If we in the tus community agree on them I would be happy to open a similar issue on the IETF repo. Anyhow... While both adding an additional header and and additional link to a status endpoint is viable, I'm not sure that this would solve the issue completely. Either of these would be fairly specific to the domain handling the upload, i.e. a video processing app would need to return other data than say a file sharing app. So let's assume we go with the specific URL. This URL would need to be specified by the app running tus, i.e. in user code (for the previously mentioned reason). We could also chose to go with a well known URL, but then the actual endpoint would still need to be app specific. This would work in some cases but not all and I think it's a very complex task to get some kind of automatic behavior here. This would also not cover the case where the actual user code handling the upload fails for some reason. To try clarify, this is the flow that might (and probably) will happen. This issue exists in both tus and RUFH. I wouldn't really call it a protocol issue but every server implementation I've seen of both tus and RUFH have this issue. It is caused by separation of concern in the app, where the tus part is handling the upload and once the upload is completed, some kind of user code is executed, be it events/hooks or some other type of code (e.g. modelbinding in tusdotnet or the handler in SwiftNIO for RUFH).
This is the issue described in the tus-js-client issue linked in the first post. I somewhat agree with the OP that So questions from me:
|
Wow, that is an incredible idea you came up with. It is the epitheny of hackiness, but also extremely ingenious!
That's a good point. Just because the client received a 500 with the Tus-Resumable header, doesn't mean that something specific to handling uploads failed and is the cause for this 500 error.
This is actually the solution I am preferring now, also because of the feedback we got from the IETF regarding httpwg/http-extensions#2312. The server could include the Content-Location header in a response to POST, HEAD, and PATCH requests. Clients can then GET this URL to retrieve the post-processing result from the uploaded file. If the processing takes a longer time, the server can respond with a 503 status code and the Retry-After header to indicate that the client should retry later. A tus client could then be configured to other pass this URL to the application and let the application handle fetch the processing results. Alternatively, tus clients may include functionality to fetch the results on their own and then make the results directly available to the application.
I think using Content-Location would mean a separation between uploading and post-processing. The client will have one upload URL for performing the upload and will then get another URL for fetching the post-processing results. The client does not have to manually start the post-processing, but it will use another endpoint for querying the results. Does that make sense? What do you think about using Content-Location for this purpose? |
Thank you! That's one of the best compliments I've gotten in life so far ;)
After reading http-extensions#2312 I think this would be a good approach! Reading the RFC it seems like just the use case for it (see below):
Are we allowed to use content-location with a
I think passing the URL is probably a better thing as the processing of the file can take a long time. The response might need to be polled several times before the processing is done. To solve the issue with the processing failing the server could chose to include a "retry processing" url in the response in the content-location header, which the app specific code in the client could then request. |
From reading RFC 9110, I don't see anything speaking against using Content-Location in a 204 response. The RFC commonly refers to cases where the header field is included in 2XX responses, which also counts for 204. So we should be good to go. I am also open to being less restrictive about the response code in responses to PATCH requests. A 200 should also be fine.
The decision whether the post-processing has to be retried should probably be made by the server-side, for example by adding the job again to the queue. If an application wants to let the client decide this, they should provide an application-specific method for doing so. I don't think that this really fits the scope of upload handling :) |
I agree. It just seems odd to return a status code saying that there is no content and at the same time returning a URL to the content that does not exist :) I'm sure it's fine though, so let's go ahead with it.
We mean the same thing here :) The content of the response from the Content-Location is application specific so the server can return whatever data in whatever format they wish. My point was just that the client-retry-thing is solvable if we go with this solution which is good. I agree with you that a queue or similar is better but people don't always use it that way. See for instance the linked issue in tus-js-client where the OP wants to do post-processing in a hook. |
Great, then we are on the same page! Maybe we can revamp #159 and bring it across the finish line! |
Yes let's do that! :) |
After thinking some more on this, maybe we should wait for the RUFH spec to be finalized on this subject and then just backport that to tus? It might be that someone in the httpbis group comes up with something more sophisticated? |
We can wait with formalizing this into tus, yes. That being said, I would like to implement a prototype into tusd anyways to try it out and gather experience with it. The feedback can then be used for RUFH and tus together. Of course, maybe in the end we find out that Content-Location is not suitable at all. |
I updated my tusdotnet RUFH POC to include Content-Location and have written about it on the httpbis' github: httpwg/http-extensions#2312 (comment) For tus, there are some more specific things that we probably need to sort out:
|
That's great, thanks for working on this! I wasn't able to try it out in tusd yet.
I wonder whether Content-Location will every be part of tus v1 or if we should rather wait until the IETF work has completed and then publish tus v2 on top of RUFH with mentions of Content-Location.
That's up to the client in my opinion. Simple clients might simply expose the URL to the user, but more sophisticated clients could also fetch the content of the provided URL (with retries etc). There might be use cases where the user is interested in the content served at that URL, but they might also only care about the URL to pass it to another component for further processing. |
Been meaning to answer this for a while but it slipped my mind...
Yeah I think this is the way to go 👍 |
Most files will be processed after their upload in some form, for example:
While these post-processing steps are application-specific, tus does not have any special support for communication the post-processing or its results back to the client. Originally, the decision to exclude this was intentional because tus should focus on the uploading itself, not any of the subsequent steps. However, this also leads to issues such as tus/tus-js-client#557. Hence, I think tus should have some capability to communicate this post-processing back to the client.
The requirements of such an extension are:
The text was updated successfully, but these errors were encountered: