Flink Kubernetes Support #2
base: release-1.7
Conversation
allocating (or sharing) resources
flink-kubernetes/README.md
Outdated
## Task Manager
Task manager is a temporary essence and is created (and deleted) by a job manager for a particular slot.
No deployments/jobs/services are created for a task manager only pods.
"for a task manager, only pods" comma missing?
Example:
```
kubectl create -f jobmanager-deployment.yaml
kubectl create -f jobmanager-service.yaml
```
jobmanager-exposer-deployment.yaml?
Also, a question comes up immediately: how exactly does it expose it?
That creates a deployment with one job manager and a service around it that exposes the job manager (ClusterIP/NodePort/LoadBalancer/ExternalName):
https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
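For illustration, a minimal jobmanager-service.yaml could look like the sketch below; the selector labels and the service type are assumptions, not taken from this PR, and the port numbers are Flink's defaults (6123 for RPC, 8081 for the web UI):
```
# Hypothetical sketch of jobmanager-service.yaml; names and labels are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
spec:
  type: NodePort            # could also be ClusterIP/LoadBalancer/ExternalName
  selector:
    app: flink              # assumed labels on the job manager pods
    component: jobmanager
  ports:
    - name: rpc
      port: 6123            # Flink's default job manager RPC port
    - name: ui
      port: 8081            # Flink's default web UI port
```
The chosen service type then determines how the job manager becomes reachable from outside the cluster, as described in the link above.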
flink-kubernetes/README.md
Outdated
TBD
## Kubernetes Resource Management
Resource management uses a default service account every pod contains. It should has admin privileges to be able
"should have"
package org.apache.flink.kubernetes.client;

/**
 * represent a endpoint.
I wonder what endpoint?
void terminateClusterPod(ResourceID resourceID) throws KubernetesClientException;

/**
 * stop cluster and clean up all resources, include services, auxiliary services and all running pods.
Some comments begin with a capital letter and some don't.
public Collection<ResourceProfile> startNewWorker(ResourceProfile resourceProfile) {
	LOG.info("Starting a new worker.");
	try {
		nodeManagerClient.createClusterPod(resourceProfile);
So at a higher level we provide a worker with only one slot; does that strategy have a downside?
For now, it is our baseline; we consciously do the same on Samza.
It's a reasonable solution because different slot threads will not compete for CPU and memory (the task manager doesn't isolate these resources), and recovery is easier. However, we will use the slot sharing feature and share slots between different Flink operators according to the pipeline logic, to avoid high network usage between task managers.
As for the downside you asked about, I would mention the absence of resource sharing: with low job utilization, a task manager will simply stand idle without much load.
Also, in this case there is no slot grouping, a feature that reduces network traffic by allocating slots on a single task manager. However, we will use slot sharing instead.
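In configuration terms, the one-slot-per-worker strategy described above corresponds to a standard Flink setting; a minimal flink-conf.yaml sketch (the memory value is an assumption, purely for illustration):
```
# One slot per task manager, so slot threads don't compete for CPU/memory.
taskmanager.numberOfTaskSlots: 1
# Hypothetical per-pod memory figure, just for illustration.
taskmanager.heap.size: 1024m
```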
The current implementation of Kubernetes support is made for a session cluster only.
For additional information, please see the README file.