
Scheduler defaults to 5 runs, so it goes into a CrashLoopBackoff when deployed #20

Open
jdavidheiser opened this issue Dec 8, 2017 · 6 comments

Comments

@jdavidheiser

jdavidheiser commented Dec 8, 2017

args: ["scheduler", "-n", "5"]

This causes the file read loop to happen five times, then the scheduler exits. It seems like a strange default setup.

I'm a bit confused about why it's set up this way - shouldn't the scheduler be looping indefinitely? I'm also seeing the scheduler fail to queue up tasks, same as #19, and I wonder whether this is the cause there, or something else.

@gsemet
Contributor

gsemet commented Dec 9, 2017

airflow is weird. The whole purpose of this setting is to let the scheduler kill itself periodically to reload DAGs. In Kubernetes this does not have a huge impact, since it will be restarted automatically, and while the whole kill/restart cycle can take a while, airflow does not do sub-second precision anyway.

-1 means you can never update your DAG, 1 means the scheduler kills itself at every task launch
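
For reference, the setting being discussed is the scheduler's -n / --num_runs flag in the Airflow 1.x CLI; a quick sketch of the two extremes described above:

# Exit after 5 scheduler loops; the supervisor (Kubernetes here) restarts
# the process, which is what forces the DAG reload described above.
airflow scheduler -n 5

# Run indefinitely; per the comment above, DAGs are then never reloaded.
airflow scheduler -n -1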

@jdavidheiser
Author

I feel like it would have less impact with plain Docker, but with Kube managing the pods it ends up putting the cluster in an unhappy state with backoffs, because the exiting script looks like a crash. Thanks for the heads up on the motivation for exiting after a few runs - I'm going to modify the start shell script in my version of the Docker container. I think it makes sense to run the scheduler in a while loop but break if it returns a bad exit code, so Kube can still treat those incidents as real crashes (see the sketch below).
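
A minimal sketch of that wrapper, assuming the container's entrypoint just shells out to the stock airflow CLI:

#!/bin/sh
# Restart the scheduler in place when it exits cleanly after its -n runs,
# but propagate any non-zero exit code so Kubernetes still sees real
# crashes and applies CrashLoopBackOff only to those.
while true; do
  airflow scheduler -n 5
  rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "scheduler exited with code $rc, treating as a crash" >&2
    exit "$rc"
  fi
done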

@gsemet
Contributor

gsemet commented Dec 11, 2017

Feel free to submit a pull request. I do have my scheduler restarting regularly, and I don't see problems, except that it takes a few minutes to power on (so delaying the next DAG start).

@ryan-riopelle

ryan-riopelle commented Oct 31, 2018

The issue that I had with Kubernetes is that it tracks the number of restarts, so if you run this application indefinitely you could see large restart counts over a long period of time, which would be a red flag to an administrator who runs "kubectl get pods" on the cluster, unless I am understanding it wrong.

As a solution, maybe this pod could be run as a Kubernetes CronJob or Job.
The change in YAML would be similar to the below, but I have not fully debugged it yet.

Would this break the way the scheduler works?

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: scheduler
  labels:
    app: airflow
    tier: scheduler
spec:
  schedule: "*/2 * * * *" # every 2 minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: scheduler
            image: <image-location>
            # volumes:
            #     - /localpath/to/dags:/usr/local/airflow/dags
            env:
            - name: AIRFLOW_HOME
              value: "/usr/local/airflow"
            args: ["scheduler", "-n", "5"]

@aditinabar

aditinabar commented Sep 6, 2019

@gsemet How/where did you change the config for the scheduler to restart automatically? I'm not seeing it in airflow.cfg.

@Lord-Y

Lord-Y commented Sep 30, 2019

@gsemet when the scheduler arg n != -1, it will restart and then go into CrashLoopBackOff later. You can see it in the Helm chart.
