This repository has been archived by the owner on Jun 15, 2024. It is now read-only.

Guide: Pipeline Structure Refactoring

Florian Sellmayr edited this page Dec 19, 2015 · 8 revisions

In this guide, we are going to walk through some simple refactorings to make your build pipelines more expressive, easier to read and easier to maintain.

While these steps are in a particular order and tell a story of iterating on the pipeline, don't assume that the final result is necessarily what you should aim for in every situation. Each of the styles and refactorings here has potential benefits and drawbacks in terms of complexity, maintainability and readability. Try them for yourself and mix and match what fits well in your context.

All the code and the refactorings made in this guide are available on GitHub:

https://github.com/flosell/lambdacd-pipeline-structure-refactoring-example

Intro: The initial setup

We start off with the kind of pipeline you might expect to find testing and deploying a typical web application:

(def pipeline-def
  `(
     (either
       wait-for-manual-trigger
       wait-for-commit)

     (with-repo
       run-unit-tests
       run-acceptance-tests
       build-artifact
       publish-artifact)

     check-preconditions-ci
     deploy-ci
     smoke-test-ci
     run-ci-tests

     check-preconditions-qa
     deploy-qa
     smoke-test-qa

     wait-for-manual-trigger

     check-preconditions-live
     deploy-live
     smoke-test-live

     report-live-deployment))

It waits for commits to a repository (or for someone to trigger it manually), runs a few tests and deploys to three environments: a CI environment where additional automated tests run, a QA environment for manual, exploratory testing and, after a manual signoff, production. Each deployment first checks whether the environment is ready to be deployed to, then deploys and runs a smoke test to make sure everything went well.

Step 1: Adding some structure

The above pipeline does the job and the code doesn't look too bad. But the only thing structuring it right now is the blank lines in the code. Remove them, and you'd probably be lost. And looking at the pipeline in the UI, you probably are lost and can't even fit a nice overview onto your screen:

So let's try adding a bit of structure to the code. The run control flow element is perfect for this: it gives things that belong together a common container but doesn't add any additional behavior.

Let's start with deployment:

(run
  check-preconditions-ci
  deploy-ci
  smoke-test-ci
  run-ci-tests)

OK, that looks better. But now the pipeline just looks like a set of run steps scattered across the UI:

So we need a way to rename those steps into something more helpful. That's what the alias control flow element gives you:

(alias "deploy to CI"
  (run
    check-preconditions-ci
    deploy-ci
    smoke-test-ci
    run-ci-tests))

We can also rename some of LambdaCD's built-in steps to better fit the context they are used in:

(alias "wait for signoff"
  wait-for-manual-trigger)

Combining alias and run, we end up with a much cleaner pipeline:

(def pipeline-def
  `(
     (alias "triggers"
            (either
              wait-for-manual-trigger
              wait-for-commit))

     (alias "test and build"
            (with-repo
              run-unit-tests
              run-acceptance-tests
              build-artifact
              publish-artifact))

     (alias "deploy to CI"
            (run
              check-preconditions-ci
              deploy-ci
              smoke-test-ci
              run-ci-tests))

     (alias "deploy to QA"
            (run
              check-preconditions-qa
              deploy-qa
              smoke-test-qa))

     (alias "wait for signoff"
            wait-for-manual-trigger)

     (alias "deploy to LIVE"
            (run
              check-preconditions-live
              deploy-live
              smoke-test-live
              report-live-deployment))))

Step 2: Removing Duplicate Steps

OK, so our pipeline now looks quite a bit cleaner. But did you notice that we are doing pretty much the same thing for all our deployments, with separate deployment steps for each environment? We should probably try to get rid of this duplication.

So instead of creating separate steps for every environment, we parameterize our steps by environment. What we are trying to achieve is this:

(def pipeline-def
  `(
     (alias "triggers"
            (either
              wait-for-manual-trigger
              wait-for-commit))

     (alias "test and build"
            (with-repo
              run-unit-tests
              run-acceptance-tests
              build-artifact
              publish-artifact))

     (alias "deploy to CI"
            (run
              (check-preconditions :ci)
              (deploy :ci)
              (smoke-test :ci)
              run-ci-tests))

     (alias "deploy to QA"
            (run
              (check-preconditions :qa)
              (deploy :qa)
              (smoke-test :qa)))

     (alias "wait for signoff"
            wait-for-manual-trigger)

     (alias "deploy to LIVE"
            (run
              (check-preconditions :live)
              (deploy :live)
              (smoke-test :live)
              report-live-deployment))))

For this, we need to refactor our deployment steps: get rid of the duplicates and put in parameterized versions instead (obviously, the steps don't do anything in this example, but you get the point):

;; step-support is assumed to be an alias for lambdacd.steps.support
;; println already separates its arguments with spaces
(defn check-preconditions [environment]
  (fn [args ctx]
    (step-support/capture-output ctx
      (println "checking preconditions for deployment to" environment "environment...")
      {:status :success})))

(defn deploy [environment]
  (fn [args ctx]
    (step-support/capture-output ctx
      (println "deploying to" environment "environment...")
      {:status :success})))

(defn smoke-test [environment]
  (fn [args ctx]
    (step-support/capture-output ctx
      (println "running smoke tests against" environment "environment...")
      {:status :success})))
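
The pattern behind these parameterized steps is an ordinary closure: a function that takes the environment and returns a step function of args and ctx. The following is a minimal sketch in plain Clojure with no LambdaCD dependency. The :out key is only an illustrative stand-in for the captured output, and the empty maps passed as args and ctx are placeholders.

```clojure
;; Simplified stand-in for a parameterized build step: `deploy` is a factory
;; that closes over `environment` and returns a step function of [args ctx].
;; The :out key is an illustrative substitute for captured console output.
(defn deploy [environment]
  (fn [args ctx]
    {:status :success
     :out    (str "deploying to " (name environment) " environment...")}))

;; Each call to the factory yields an independent step function.
(def deploy-ci (deploy :ci))
(def deploy-qa (deploy :qa))

(deploy-ci {} {})
;; => {:status :success, :out "deploying to ci environment..."}
```

Seen this way, a form such as (deploy :qa) in the pipeline is just a call that produces the step function for that environment.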

In the previous step, we made big improvements on the UI side. This time, the pipeline in the UI hardly changes at all:

Step 3: Removing Structural Duplication

In Step 2, we got rid of our duplicate steps by parameterizing them. However, our pipeline still has some fragments that repeat for each environment: each deployment checks preconditions, deploys and runs smoke tests. So let's try getting rid of this duplication as well.

Note: This step contains some more advanced Clojure magic. If any of those strange symbols bother you, read up on Clojure macros, (syntax) quoting and unquoting. For example, Leonardo Borges wrote up a nice cheatsheet.

So let's first create a function that generates the pipeline fragment for deployment:

(defn deploy-steps [environment]
  `((check-preconditions ~environment)
    (deploy ~environment)
    (smoke-test ~environment)))

This is a normal Clojure function that returns a syntax-quoted list. Inside the syntax quote, we need to unquote the environment parameter with ~ so that the real value is filled in instead of the symbol environment.
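
To see that such a function returns plain data rather than executing any steps, it can help to inspect the result at a REPL. The sketch below is self-contained: the declare line only creates placeholder vars so the symbols resolve, and ci-steps is a name introduced here for illustration.

```clojure
;; Placeholder vars so the symbols resolve; the real steps are defined elsewhere.
(declare check-preconditions deploy smoke-test)

(defn deploy-steps [environment]
  `((check-preconditions ~environment)
    (deploy ~environment)
    (smoke-test ~environment)))

;; The result is a sequence of three unevaluated forms, not the result of running them.
(def ci-steps (deploy-steps :ci))

(count ci-steps)              ;; => 3
(map second ci-steps)         ;; => (:ci :ci :ci), the unquoted value was filled in
(name (ffirst ci-steps))      ;; => "check-preconditions"
(namespace (ffirst ci-steps)) ;; the current namespace: syntax-quote qualifies symbols
```

Because syntax-quote namespace-qualifies the step symbols, the generated fragment refers to the same vars no matter which namespace later consumes it.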

Now we just need to call it in our pipeline using the unquote-splicing operator ~@ to get and unpack the result:

(def pipeline-def
  `(
     (alias "triggers"
            (either
              wait-for-manual-trigger
              wait-for-commit))

     (alias "test and build"
            (with-repo
              run-unit-tests
              run-acceptance-tests
              build-artifact
              publish-artifact))

     (alias "deploy to CI"
            (run
              ~@(deploy-steps :ci)
              run-ci-tests))

     (alias "deploy to QA"
            (run
              ~@(deploy-steps :qa)))

     (alias "wait for signoff"
            wait-for-manual-trigger)

     (alias "deploy to LIVE"
            (run
              ~@(deploy-steps :live)
              report-live-deployment))))

This results in exactly the same pipeline structure (I even wrote a test to prove it) but with one more piece of duplication removed from our code.
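
A test along those lines could look roughly like the following sketch. It is not the test from the example repository: it compares a single hand-written fragment with its generated counterpart, uses placeholder vars instead of real steps, and the names explicit-fragment, generated-fragment and nested-fragment are made up for illustration. It also shows why ~@ is needed rather than plain ~.

```clojure
(require '[clojure.test :refer [deftest is]])

;; Placeholder vars so the symbols resolve; no real steps are needed
;; to compare pipeline structure.
(declare run check-preconditions deploy smoke-test run-ci-tests)

(defn deploy-steps [environment]
  `((check-preconditions ~environment)
    (deploy ~environment)
    (smoke-test ~environment)))

;; Fragment written out by hand, as in Step 2.
(def explicit-fragment
  `(run
     (check-preconditions :ci)
     (deploy :ci)
     (smoke-test :ci)
     run-ci-tests))

;; Same fragment generated with unquote-splicing, as in Step 3.
(def generated-fragment
  `(run
     ~@(deploy-steps :ci)
     run-ci-tests))

;; With plain unquote, the three steps stay wrapped in one nested list.
(def nested-fragment
  `(run
     ~(deploy-steps :ci)
     run-ci-tests))

(deftest generated-fragment-matches-explicit-fragment
  (is (= explicit-fragment generated-fragment)))

(count generated-fragment) ;; => 5
(count nested-fragment)    ;; => 3
```

Since the pipeline definition is only data at this point, structural equality with = is enough and no step has to be executed.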

Alternative: Compacting Steps

In the previous steps, we have seen a couple of techniques to refactor pipelines with many detailed steps into something that's readable and maintainable. However, not all pipelines need that level of detailed structure visible to the user.

So instead of having all the sub-steps of the deployment in the pipeline structure, we can use the chaining macro to aggregate the sub-steps into a single deployment step:

(defn complete-ci-deployment [args ctx]
  (chaining args ctx
            (check-preconditions-ci injected-args injected-ctx)
            (deploy-ci injected-args injected-ctx)
            (smoke-test-ci injected-args injected-ctx)))

The behavior is still the same, but now the details of the deployment are hidden from the user of the UI and are only visible in the output of the complete deployment step.
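
One way to picture what aggregating sub-steps into a single step means is the plain-Clojure sketch below. It is a simplified stand-in for the idea, not LambdaCD's implementation of chaining: chain-steps and the three sub-steps are hypothetical, and the rules it follows (merge each result into the next step's args, stop after the first step that is not successful) are assumptions made for the illustration.

```clojure
;; Simplified stand-in for the idea behind chaining; not LambdaCD's implementation.
(defn chain-steps
  "Returns a single step that runs `steps` in order, merging their results
  and stopping after the first step that does not report :success."
  [& steps]
  (fn [args ctx]
    (reduce (fn [result step]
              (let [step-result (step (merge args result) ctx)
                    merged      (merge result step-result)]
                (if (= :success (:status step-result))
                  merged
                  (reduced merged))))
            {:status :success}
            steps)))

;; Hypothetical sub-steps; the second one fails on purpose.
(defn check [args ctx] {:status :success :checked true})
(defn deploy [args ctx] {:status :failure :deployed false})
(defn smoke [args ctx] {:status :success :smoke-tested true})

(def complete-deployment (chain-steps check deploy smoke))

(complete-deployment {} {})
;; => {:status :failure, :checked true, :deployed false}
```

From the outside, complete-deployment is a single step with one combined result, which mirrors the trade-off described above: less structure in the pipeline view, with the detail moved into the step's own result.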