Docker Airflow Setup

Download the repository

Clone the repository into your $HOME directory (e.g. for Mac: /Users/<user>/, for Windows: \Users\<user>\). We'll refer to this location during the workshop.

$ cd $HOME
$ git clone https://github.com/deliveryhero/pyconde2019-airflow-ml-workshop

Download the dockerized Airflow image:

$ docker pull puckel/docker-airflow

📌 For more information about running Airflow in a Docker container, see the puckel docker-airflow project.

Move into the directory ($HOME/pyconde2019-airflow-ml-workshop) where you cloned the repository:

$ cd pyconde2019-airflow-ml-workshop

Launch the Airflow Docker container:

$ docker run -p 127.0.0.1:8080:8080 \
    -e LOAD_EX=y \
    -e PYTHONPATH="/usr/local/airflow/pyconde2019-airflow-ml-workshop" \
    -v $HOME/pyconde2019-airflow-ml-workshop/requirements.txt:/requirements.txt \
    -v $HOME/pyconde2019-airflow-ml-workshop/:/usr/local/airflow/pyconde2019-airflow-ml-workshop:rw \
    -v $HOME/pyconde2019-airflow-ml-workshop/dags/:/usr/local/airflow/dags:rw \
    puckel/docker-airflow webserver

The above command specifies:

  • -p 127.0.0.1:8080:8080: publish the webserver port, so Airflow is reachable at localhost:8080
  • -e LOAD_EX=y: load the Airflow example DAGs
  • -v $HOME/pyconde2019-airflow-ml-workshop/requirements.txt:/requirements.txt: install the requirements for the workshop exercises
  • -v $HOME/pyconde2019-airflow-ml-workshop/:/usr/local/airflow/pyconde2019-airflow-ml-workshop:rw: mount the project repository volume
  • -v $HOME/pyconde2019-airflow-ml-workshop/dags/:/usr/local/airflow/dags:rw: mount the volume that contains the DAGs (the exercise workflows)
  • puckel/docker-airflow webserver: run the Airflow webserver with the SequentialExecutor
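Before launching, a small pre-flight check can catch a mistyped path: if a -v source path doesn't exist on the host, Docker will create it as an empty directory inside the container and the workshop files simply won't appear. A minimal sketch (the `check_mounts` helper is hypothetical, not part of the workshop repo):

```shell
# Verify the host-side sources of the -v mounts exist before `docker run`.
check_mounts() {
  base="$1"  # e.g. $HOME/pyconde2019-airflow-ml-workshop
  for path in "$base/requirements.txt" "$base/dags"; do
    if [ ! -e "$path" ]; then
      echo "missing mount source: $path" >&2
      return 1
    fi
  done
  echo "mount sources OK under $base"
}

# Usage before launching the container:
# check_mounts "$HOME/pyconde2019-airflow-ml-workshop" && docker run ...
```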

📌 In this tutorial we use Airflow with SQLite DB and the SequentialExecutor.

Executors are the mechanism by which task instances in a workflow get run. The SequentialExecutor runs one task instance at a time (this is not a production setup). Note also that SQLite doesn't support multiple connections.
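If you want to confirm which executor the container is actually using, you can read its airflow.cfg. A hedged sketch (the `get_executor` helper is hypothetical, and the config path /usr/local/airflow/airflow.cfg is an assumption about the puckel image):

```shell
# Extract the executor setting from airflow.cfg-style text on stdin.
get_executor() {
  sed -n 's/^executor *= *//p'
}

# Against the running container (image-specific path is an assumption):
# docker exec "$(docker ps -qf ancestor=puckel/docker-airflow)" \
#   cat /usr/local/airflow/airflow.cfg | get_executor
```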

🕚 Wait 1-2 minutes for Docker to bring up a fresh live Airflow instance! With this dockerized Airflow image, a single container spins up both the Scheduler and the Webserver.
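Rather than guessing when the webserver is up, you could poll it. A minimal sketch (the `wait_for_airflow` helper is hypothetical and assumes curl is installed):

```shell
# Poll a URL until it responds, or give up after a number of retries.
wait_for_airflow() {
  url="${1:-http://localhost:8080}"
  retries="${2:-30}"
  interval="${3:-5}"
  i=0
  while [ "$i" -lt "$retries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "Airflow is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# Usage once the container is launched:
# wait_for_airflow http://localhost:8080
```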

Go to http://localhost:8080 to see that Airflow is running. When it is ready, you should see a screen like this:

(screenshot: the Airflow web UI)

🏆 Great! Now everything is ready for starting the Exercises!

✅ Jump to the Airflow main concepts section to continue the tutorial.