Add RFC for Multi-Cluster Resource Management #23

Open
wants to merge 1 commit into
base: main

@prithvip prithvip commented Aug 1, 2024

A proposal for multi-cluster resource management

prithvip commented Aug 1, 2024

6. Dispatching this query to the cluster
7. Redirecting the client to the picked cluster

The internal details of this queueing service are still under development, and this document focuses on the changes required to Presto for such a service to work. If there is interest from the community, we can provide an open-source reference implementation of this queueing service, once mature.

Member

+1, we at Uber will also be interested. What language will this be implemented in? Will it be similar to https://github.com/trinodb/trino-gateway?

Author

This will be implemented in Java. It is similar in some ways, but Trino Gateway is only a routing layer, whereas this service will also handle queueing and scheduling. We would likely expose an interface to plug in any custom routing logic.

Contributor

It seems like this is routing + queueing. We already have a module for the routing component, that's the presto-router. Could the reference implementation be added there--i.e., could the Presto Router be configured to also take on queueing?

Contributor

Also, I read this as: a separate internal implementation will be developed, and a reference implementation will be developed separately, i.e., there will be two implementations. Could we all work off of the reference implementation, with an SPI used to integrate it with internal components, rather than have two separate implementations?

Author

@tdcmeehan We have to see how the queueing implementation evolves, so this is TBD. We expect to use internal components, and it's not clear how easily we can configure these through an SPI while still keeping the reference implementation useful. Two integration points that could make sense are an SPI for ResourceGroupManager and one for the router. The reference implementation could use the presto-router. We can discuss these points when the design is more mature. For this RFC, I want to focus on the changes required to integrate with this service. For now, the work here is mostly straightforward refactoring.

### Diagram
![Presto Github Issue GRM](https://github.com/user-attachments/assets/7b2af8fd-5d82-411d-ac97-795e54e35164)

Member

You might want to add this diagram as a separate file like here

Because we want this to be transparent to Presto clients, the front-end of the queueing service will reuse the existing functionality of classes in the presto-dispatcher package such as QueuedStatementResource, DispatchManager, DispatchQuery, etc. The high-level work items are:

1. Allow the queueing service to forward all required query metadata to the cluster, such as query ID, authenticated identity, query state timings, etc. This will involve changes to QueuedStatementResource, QueryStateMachine, and DispatchManager, so that the coordinator can create a query with an already populated QueryId and AuthorizedIdentity.
2. De-couple the dispatching and execution phases of the query lifecycle. This will involve separating out and removing dependencies that are required for execution of the query but not for queueing, such as the Metadata class. We will create a new module, presto-dispatcher, that has all classes required for the queueing phase, such as QueuedStatementResource, DispatchManager, ResourceGroupManager, etc. The presto-main module will have a dependency on this new module. Interfaces such as ResourceGroupManager and DispatchQuery will be in presto-dispatcher, with implementations such as InternalResourceGroupManager and LocalDispatchQuery in presto-main. In addition, it could be necessary to move some classes into a common module if they are required by both presto-main and presto-dispatcher.

Member

> We will create a new module, presto-dispatcher

Will presto-dispatcher be used by the new service?

Author

Yes, the queueing service will have a dependency on presto-dispatcher. The queueing service "implementation" of the dispatcher is a standalone server, and the presto-main "implementation" of the dispatcher will be in-process with the coordinator. The code is already organized this way. For example, DispatchQuery is an interface that would live in presto-dispatcher, with LocalDispatchQuery being the implementation in presto-main and a hypothetical RemoteDispatchQuery that could be implemented by the queueing service.
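
To make the split concrete, here is a minimal sketch of the interface/implementation layout described above. It is heavily simplified and is not the actual Presto code: the real DispatchQuery interface has a much larger surface, the `dispatch()` method and the `DispatchSketch` class are invented for illustration, and RemoteDispatchQuery does not exist today.

```java
// Simplified stand-in for the interface that would live in presto-dispatcher.
// The single dispatch() method is hypothetical; the real interface differs.
interface DispatchQuery {
    String dispatch();
}

// In-process variant, which would stay in presto-main with the coordinator.
final class LocalDispatchQuery implements DispatchQuery {
    private final String queryId;

    LocalDispatchQuery(String queryId) {
        this.queryId = queryId;
    }

    @Override
    public String dispatch() {
        return queryId + " dispatched in-process";
    }
}

// Hypothetical variant that a standalone queueing service could provide.
final class RemoteDispatchQuery implements DispatchQuery {
    private final String queryId;
    private final String clusterUri;

    RemoteDispatchQuery(String queryId, String clusterUri) {
        this.queryId = queryId;
        this.clusterUri = clusterUri;
    }

    @Override
    public String dispatch() {
        return queryId + " forwarded to " + clusterUri;
    }
}

// Queueing-phase code depends only on the interface, so it does not need
// to know whether the query runs locally or on a remote cluster.
final class DispatchSketch {
    static String run(DispatchQuery query) {
        return query.dispatch();
    }

    public static void main(String[] args) {
        System.out.println(run(new LocalDispatchQuery("q1")));
        System.out.println(run(new RemoteDispatchQuery("q1", "http://cluster-a.example:8080")));
    }
}
```

The point of the layout is that code such as DispatchManager can be compiled against presto-dispatcher alone, with the choice of implementation left to whichever server hosts it.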

Contributor

@tdcmeehan tdcmeehan left a comment

Can you add some details on how authentication and authorization will work between the queueing service and the individual coordinators?

### Work Items
Because we want this to be transparent to Presto clients, the front-end of the queueing service will reuse the existing functionality of classes in the presto-dispatcher package such as QueuedStatementResource, DispatchManager, DispatchQuery, etc. The high-level work items are:

1. Allow the queueing service to forward all required query metadata to the cluster, such as query ID, authenticated identity, query state timings, etc. This will involve changes to QueuedStatementResource, QueryStateMachine, and DispatchManager, so that the coordinator can create a query with an already populated QueryId and AuthorizedIdentity.

Contributor

Can you go into more detail on what sort of changes would be needed in the client protocol (QueuedStatementResource)?

prithvip commented Aug 8, 2024

@tdcmeehan

> Can you add some details on how authentication and authorization will work between the queueing service and the individual coordinators?

The queueing service will authenticate the client and, as a result of the authentication, create an Identity object, the same way the coordinator does today. This Identity object will be serialized and passed as an HTTP header to the coordinator's QueuedStatementResource. The coordinator will have a trusted relationship with the queueing service, and will authenticate each request to make sure it comes from the queueing service (a configured trusted principal). This trusted relationship is similar to how Presto Proxy works with the coordinator today. There is no change to authorization, and authorization requests will still happen at the coordinator (at plan time).
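
As a rough sketch of that trust check on the coordinator side: the forwarded identity is accepted only when the authenticated caller is the configured trusted principal. Everything here is an assumption for illustration. The header name, the class and method names, and the Base64 encoding are invented, and the Identity object is reduced to a plain user name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;

final class ForwardedIdentitySketch {
    // Hypothetical header name; the RFC does not specify one.
    static final String IDENTITY_HEADER = "X-Presto-Forwarded-Identity";

    // Queueing-service side: serialize the identity (here just a user name).
    static String encodeIdentity(String user) {
        return Base64.getUrlEncoder().encodeToString(user.getBytes(StandardCharsets.UTF_8));
    }

    // Coordinator side: honor the forwarded identity only if the request
    // was authenticated as the configured trusted principal.
    static Optional<String> resolveUser(
            String authenticatedPrincipal,
            String trustedPrincipal,
            Map<String, String> headers) {
        if (!trustedPrincipal.equals(authenticatedPrincipal)) {
            return Optional.empty(); // caller is not the queueing service
        }
        String encoded = headers.get(IDENTITY_HEADER);
        if (encoded == null) {
            return Optional.empty(); // nothing was forwarded
        }
        byte[] decoded = Base64.getUrlDecoder().decode(encoded);
        return Optional.of(new String(decoded, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Map<String, String> headers = Map.of(IDENTITY_HEADER, encodeIdentity("alice"));
        System.out.println(resolveUser("queueing-service", "queueing-service", headers));
        System.out.println(resolveUser("someone-else", "queueing-service", headers));
    }
}
```

Authorization is untouched in this sketch: the resolved user would simply flow into the existing access-control checks at plan time.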

> Can you go into more detail on what sort of changes would be needed in the client protocol (QueuedStatementResource)?

There will be no changes required to any client. QueuedStatementResource will be extended with a new POST endpoint, "/v1/statement/{queryId}?slug=SLUG". Two headers will be added: one to forward the serialized Identity object, and one to forward query state timings such as waitingForPrerequisitesTime, queuedTime, etc. DispatchManager will require some minor modifications to accept an externally minted queryId, slug, and pre-existing state timings. This is the high-level overview; more detailed discussion can happen on the PR.
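
For illustration, the request the queueing service would send to the new endpoint could look roughly like the sketch below, built with the JDK's `java.net.http` API (Java 11+). Only the path shape "/v1/statement/{queryId}?slug=SLUG" comes from the description above. The header names, the timings format, the host, and the class name are placeholders, not a proposed wire format.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ForwardRequestSketch {
    // Hypothetical header names; the final names would be settled on the PR.
    static final String IDENTITY_HEADER = "X-Presto-Forwarded-Identity";
    static final String TIMINGS_HEADER = "X-Presto-Forwarded-Timings";

    // Builds (but does not send) the POST that hands an already queued
    // query, with its externally minted ID and slug, to a coordinator.
    static HttpRequest build(
            String coordinatorBaseUri,
            String queryId,
            String slug,
            String sql,
            String serializedIdentity,
            String serializedTimings) {
        URI uri = URI.create(coordinatorBaseUri + "/v1/statement/" + queryId + "?slug=" + slug);
        return HttpRequest.newBuilder(uri)
                .header(IDENTITY_HEADER, serializedIdentity)
                .header(TIMINGS_HEADER, serializedTimings)
                .POST(HttpRequest.BodyPublishers.ofString(sql))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                "http://coordinator.example:8080",
                "20240808_000000_00001_abcde",
                "xyz",
                "SELECT 1",
                "YWxpY2U=",
                "queuedTime=1500ms");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Because the client keeps talking to the queueing service's own QueuedStatementResource front-end until it is redirected, none of these headers are visible to existing clients.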

3 participants