Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept query id and slug in QueuedStatement #23407

Merged
merged 1 commit into from
Aug 29, 2024

Conversation

prithvip
Copy link
Contributor

@prithvip prithvip commented Aug 8, 2024

Description

This change adds support for QueuedStatementResource to accept a pre-minted query id and slug.

Motivation and Context

This is required for RFC5 prestodb/rfcs#23

Impact

There is no user-facing or performance impact.

Test Plan

Added test in TestServer

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

General Changes
* Add support in QueuedStatement protocol to accept pre-minted query id and slug :pr:`23407`

@prithvip prithvip requested a review from a team as a code owner August 8, 2024 21:18
rschlussel
rschlussel previously approved these changes Aug 12, 2024
Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a release note about the protocol change.

@prithvip
Copy link
Contributor Author

Add a release note about the protocol change.

@rschlussel done

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add some documentation here in statement.rst.

String statement,
@PathParam("queryId") QueryId queryId,
@QueryParam("slug") String slug,
@DefaultValue("false") @QueryParam("binaryResults") boolean binaryResults,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added for fuzzer testing, and no client presently uses this. I'd remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that no client is using it, and I kept it here since the POST supports it. If we want to remove support for this, we should probably raise an issue first and remove it as part of a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked around and it looks like we are actually using this flag for our client, so we should keep it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share more details on the client that's using it? Is it an internal client?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amitkdutta do you have any details on this client?

tracerProviderManager.getTracerProvider(),
Optional.of(sessionPropertyManager));
Query query = new Query(statement, sessionContext, dispatchManager, queryResultsProvider, 0, queryId, slug);
queries.put(query.getQueryId(), query);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best thing to do, to preserve idempotency while also ensuring that retries work, would be to check if the query already exists. If it does, check if the slug matches. If it does, then ensure that the token should be 0, otherwise, fail.

If the query doesn't already exist, then proceed as here.

Copy link
Contributor

@tdcmeehan tdcmeehan Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just remembered that POST is not an idempotent HTTP verb. Idempotency is a nice attribute, and I'd recommend using PUT instead, which is. (The other endpoint doesn't care about idempotency, because you can call POST multiple times for the same query, but the only one that matters is the first one whose nextUri is followed. Not so here--creating multiple queries with the same identical query ID is impossible.)

I still recommend the improvement above, because I think the API should guard against adversarial usage, even if it's trivial or unlikely. For example, POSTing an already executing query here would at best cause an error, at worst could be a security issue.

* @return {@link javax.ws.rs.core.Response} HTTP response code
*/
@POST
@Path("/v1/statement/queued/{queryId}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Path("/v1/statement/queued/{queryId}")
@Path("/v1/statement/{queryId}")

Can we make this consistent with the POST? I don't think the queued signifies any additional information.

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to have this added to the OpenAPI spec we're working on getting into the docs:

https://github.com/prestodb/presto/blob/master/presto-openapi/src/main/resources/queued_statement.yaml

@prithvip
Copy link
Contributor Author

It would be great to have this added to the OpenAPI spec we're working on getting into the docs:

https://github.com/prestodb/presto/blob/master/presto-openapi/src/main/resources/queued_statement.yaml

@ZacBlanco Is there a README for how to add and test this?

@ZacBlanco
Copy link
Contributor

ZacBlanco commented Aug 16, 2024

We don't currently have a readme but it wouldn't hurt to have one. I'll try to submit a PR for that soon with instructions for adding new paths and components.

For the time being, I would look at the existing endpoints in the spec and try mimicking them for your new endpoint. You can refer the official spec for the full set of options: https://spec.openapis.org/oas/latest.html

Once you've added a new path to the YAML you can build the openapi module with ./mvnw install -pl presto-openapi and check the generated index.html in the target/ directory has your new endpoint that would be enough. We don't have tests but the build will error if your new additions to the YAML don't align with the OpenAPI spec.

@prithvip
Copy link
Contributor Author

@tdcmeehan Addressed all comments:

  1. Changed verb from POST to PUT
  2. Added idempotency checks and tests for same
  3. Added documentation to statement.rst

@ZacBlanco
Added docs to OpenAPI

steveburnett
steveburnett previously approved these changes Aug 22, 2024
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local docs build, looks good. Thanks!

Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good % one more comment

if (queries.containsKey(queryId)) {
Query query = queries.get(queryId);
if (!query.getSlug().equals(slug) || query.getLastToken() != 0) {
throw badRequest(CONFLICT, "Query already exists");
}
}

abortIfPrefixUrlInvalid(xPrestoPrefixUrl);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (queries.containsKey(queryId)) {
Query query = queries.get(queryId);
if (!query.getSlug().equals(slug) || query.getLastToken() != 0) {
throw badRequest(CONFLICT, "Query already exists");
}
}
abortIfPrefixUrlInvalid(xPrestoPrefixUrl);
abortIfPrefixUrlInvalid(xPrestoPrefixUrl);
if (queries.containsKey(queryId)) {
Query query = queries.get(queryId);
if (!query.getSlug().equals(slug) || query.getLastToken() != 0) {
throw badRequest(CONFLICT, "Query already exists");
}
Query query = queries.get(queryId);
return withCompressionConfiguration(Response.ok(query.getInitialQueryResults(uriInfo, xForwardedProto, xPrestoPrefixUrl, binaryResults)), compressionEnabled).build();
}

I'd recommend returning the same query ID instead of proceeding and creating a new one, to guard against a race condition where we've retried the PUT, but meanwhile a GET has been made on the old copy of the query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tdcmeehan Done, changed to return the same query ID.

Comment on lines 277 to 278
Query query = new Query(statement, sessionContext, dispatchManager, executingQueryResponseProvider, 0, queryId, slug);
queries.put(query.getQueryId(), query);
Copy link
Contributor

@tdcmeehan tdcmeehan Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing just to make it 100% foolproof:

Suggested change
Query query = new Query(statement, sessionContext, dispatchManager, executingQueryResponseProvider, 0, queryId, slug);
queries.put(query.getQueryId(), query);
Query query = queries.computeIfAbsent(queryId, unused -> new Query(statement, sessionContext, dispatchManager, executingQueryResponseProvider, 0, queryId, slug));

Copy link
Contributor

@tdcmeehan tdcmeehan Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could combine them into one check, and remove the check you have above.

Suggested change
Query query = new Query(statement, sessionContext, dispatchManager, executingQueryResponseProvider, 0, queryId, slug);
queries.put(query.getQueryId(), query);
Query attemptedQuery = new Query(statement, sessionContext, dispatchManager, executingQueryResponseProvider, 0, queryId, slug);
Query query = queries.putIfAbsent(query.getQueryId(), unused -> attemptedQuery);
if (attemptedQuery != query && !attemptedQuery.getSlug().equals(query.getSlug()) || query.getLastToken() != 0) {
throw badRequest(CONFLICT, "Query already exists");
}

tdcmeehan
tdcmeehan previously approved these changes Aug 23, 2024
@prithvip
Copy link
Contributor Author

@ZacBlanco could I get a stamp? I've added the requested OpenAPI documentation.

Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this! Sorry for the slow review. Was OOO last week.

Spec looks good, just need to shift the path component it is under.

@prithvip
Copy link
Contributor Author

@tdcmeehan Could I get another stamp?

@prithvip prithvip merged commit 72c07cc into prestodb:master Aug 29, 2024
57 checks passed
@jaystarshot jaystarshot mentioned this pull request Nov 1, 2024
25 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants