Improvements to Container CD #385

tylertitsworth · 2024-07-09T16:30:52Z

Description

Update CD to follow DevOps best practices.

How it works

Given an example(s) (AudioQnA), this workflow creates jobs that perform container CI on that example, and allow other event-based workflows to curate a matrix of these examples.

These are the inputs for the composites:

Build

# .github/workflows/composite/docker-build/action.yml
inputs:
  example_dir:
    description: Directory with docker-compose.yaml to build
    required: true
    type: string
  env_overrides:
    description: Bash Env Variable Overrides in `KEY=VAL && KEY2=VAL2` format
    required: false
    type: string
  registry:
    description: Container Registry URL
    required: false
    default: 'opea-project'
    type: string

Scan

# .github/workflows/composite/scan/action.yml
inputs:
  image-ref:
    description: 'image reference(for backward compatibility)'
    required: true
  output:
    description: 'writes results to a file with the specified file name'
    required: true

This is the inputs for the reusable workflow:

Container CI

# .github/workflows/container.ci.yaml
inputs:
  example_dir:
    required: true
    type: string
  scan:
    default: true
    required: false
    type: boolean
  test:
    default: true
    required: false
    type: boolean
  publish:
    default: false
    required: false
    type: boolean

These are the Events that this PR supports:

# .github/workflows/example-weekly-ci.yaml
on:
  # weekly
  schedule:
  - cron: "0 0 * * 0"
  # run all examples manually
  workflow_dispatch: null
  push:
    tags: ['**'] # whenever a tag is created/modified

# ./github/workflows/example-manual-ci.yaml
on:
  workflow_dispatch:
    inputs:
      examples:
        default: 'example1, example2'
        description: 'List of comma-separated examples to test'
        required: true
        type: string
      scan:
        default: true
        description: 'Scan all examples with Trivy'
        required: false
        type: boolean
      test:
        default: true
        description: 'Test Examples'
        required: false
        type: boolean
      publish:
        default: false
        description: 'Publish Images to Dockerhub'
        required: false
        type: boolean

Changes

Add StepSecurity network traffic monitoring during workflows via harden-runner
Tests labels are determined based on the test workflow filename:

# .github/workflows/container-ci.yaml
- name: Get Tests
  id: test-matrix
  run: |
    echo "matrix=$(find ${{ inputs.example_dir }}/tests -type f -name 'test_*.sh' -print | \
    sed -n 's|.*/\(test_[^_]*_on_\([^\.]*\)\).sh|\{"test": "\1", "hardware_label": "\2"}|p' | \
    jq -sc .)" >> $GITHUB_OUTPUT
  # ex: [{"test":"test_audioqna_on_xeon","hardware_label":"xeon"},{"test":"test_audioqna_on_gaudi","hardware_label":"gaudi"}]
...
runs-on: ${{ matrix.tests.hardware_label }}
...
- name: Run Test
  run: bash ${{ matrix.tests.test }}.sh
  env:
    REGISTRY: ${{ secrets.REGISTRY }}

Simplify developer workflow:

# Before
docker build -t container1
docker build -t container2
docker build -t container3
...
# After
docker compose build

Test only the latest version of that example's container
- We pull the last known good version of the comps, and test the changes made in the example only.
- Modifying the test file or docker-compose.yaml file for a given test can allow for testing specific versions of comp containers. No need to modify CI.
status-check flag in pull_request events that verifies if any job in example-ci.yaml failed.

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds new functionality)
Breaking change (fix or feature that would break existing design and interface)

Dependencies

n/a

Tests

I don't have the runners required to do the validation portion, however when previously testing with ubuntu-latest I was able to launch the test script from the correct directory on that runner before it ran out of space.
https://github.com/tylertitsworth/GenAIExamples/actions/runs/9896085740?pr=3

Translation/docker/gaudi/README.md

claynerobison

All I see is the minor doc change. other than that, seems like an elegant architecture improvement.

chensuyue · 2024-07-10T03:06:31Z

How does this new structure combined with current compose and k8s functionality test? Our test machine are not able to access an internal docker registry, so we have 2 local registry for CI test. We need to sync about the design.

ashahba

Thanks @tylertitsworth for the PR.
My change requests are mostly minor except for introducing dependency on ai-containers/.github@main which needs to be reviewed by the community.

.github/workflows/container-ci.yml

ashahba · 2024-07-10T03:42:58Z

.github/workflows/container-ci.yml

+        # Get diff array filtered by specific filetypes
+        DIFF=$(git diff --diff-filter=d \
+            --name-only ${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }} \
+            -- '*/*Dockerfile' '*.py' '*.yaml' '*.yml' '*.sh' '*/*requirements.txt' '*.json' '*.ts' '*.js' | \


there may be a few more file types and extensions:
*.dockerfile, *Dockerfile*, *.cjs, *.svelte and etc

.github/workflows/reuse-container-ci.yaml

SearchQnA/docker/xeon/README.md

SearchQnA/docker-compose.yaml

Translation/docker/gaudi/README.md

Translation/docker/xeon/README.md

tylertitsworth · 2024-07-10T05:23:08Z

How does this new structure combined with current compose and k8s functionality test? Our test machine are not able to access an internal docker registry, so we have 2 local registry for CI test. We need to sync about the design.

@chensuyue the current compose structure could be merged with this design. It's all about having a consistent file system and making sure your dependencies are in subfolders so you can accurately track changes to individual containers.

Regarding the local CI, we can change the runner labels and just make sure to use the correct secrets.

Regarding k8s, I believe we previously spoke about how to best test helm charts. We have a composite action for this. However I see a lot of manifests so it might be better to utilize the baremetal test runner mode to call your existing test scripts to validate those manifests. If I were smarter I would've made a branch rather than a fork so I could actually test these things, it's hard to integrate GitHub CI without maintainer access.

tylertitsworth · 2024-07-10T05:26:35Z

My change requests are mostly minor except for introducing dependency on ai-containers/.github@main which needs to be reviewed by the community.

Thanks @ashahba, it's common practice to re-use GitHub actions from other locations. I hope that the community will accept another intel repository as a trusted and maintained source similar to how they trust GitHub, docker, helm, pre-commit.ci, and other open source tools already in use.

If the fear of a dependency failing is important. These actions can be pinned at a specific SHA to avoid failures like I did with many other actions introduced by this workflow.

Signed-off-by: tylertitsworth <[email protected]> add chatqna support Signed-off-by: tylertitsworth <[email protected]> update tests and repo path Signed-off-by: tylertitsworth <[email protected]> add actions.json file for ci configuration Signed-off-by: tylertitsworth <[email protected]> add codegen Signed-off-by: tylertitsworth <[email protected]> add weekly test Signed-off-by: tylertitsworth <[email protected]> remove ref Signed-off-by: tylertitsworth <[email protected]> change runner label Signed-off-by: tylertitsworth <[email protected]> do the rest of them Signed-off-by: tylertitsworth <[email protected]> copy workflow call into repo due to gha bug #2877542 Signed-off-by: tylertitsworth <[email protected]> configure no start Signed-off-by: tylertitsworth <[email protected]> patch scan and test wfs Signed-off-by: tylertitsworth <[email protected]> change to more descriptive name for reusable workflow Signed-off-by: tylertitsworth <[email protected]> additional bugfix Signed-off-by: tylertitsworth <[email protected]> update scan workflow Signed-off-by: tylertitsworth <[email protected]> see help menus Signed-off-by: tylertitsworth <[email protected]> add missing env Signed-off-by: tylertitsworth <[email protected]> use downcase method Signed-off-by: tylertitsworth <[email protected]> remove test Signed-off-by: tylertitsworth <[email protected]> use valid group dir for rest of scan workflow Signed-off-by: tylertitsworth <[email protected]> return import step Signed-off-by: tylertitsworth <[email protected]> add and validate smoke tests Signed-off-by: tylertitsworth <[email protected]>

Signed-off-by: tylertitsworth <[email protected]>

chensuyue · 2024-07-11T04:01:52Z

.github/workflows/reuse-container-ci.yaml

+    - uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
+    - uses: docker/login-action@0d4c9c5ea7693da7b068278f7b52bda2a190a446 # v3.2.0
+      with:
+        registry: ${{ secrets.REGISTRY }}


How to use a common private registry on IDC machine? Now we have 2 local registry separately for xeon and gaudi cluster.

If you can use something like harbor that can be accessed by both clusters or a cloud registry that would be ideal.

If your runners have pre-authenticated access to these local registries, then we could remove the login commands/secrets and use environment variables that are present in the runner environment.

Signed-off-by: tylertitsworth <[email protected]>

Signed-off-by: Tyler Titsworth <[email protected]>

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth added the WIP label Jul 9, 2024

tylertitsworth self-assigned this Jul 9, 2024

tylertitsworth force-pushed the main branch from 62d03f0 to 385aaa3 Compare July 9, 2024 18:17

claynerobison reviewed Jul 9, 2024

View reviewed changes

Translation/docker/gaudi/README.md Outdated Show resolved Hide resolved

claynerobison suggested changes Jul 9, 2024

View reviewed changes

tylertitsworth force-pushed the main branch 2 times, most recently from d1f3947 to e02e651 Compare July 9, 2024 22:39

ashahba requested changes Jul 10, 2024

View reviewed changes

tylertitsworth added 2 commits July 10, 2024 12:21

scope down pr to just audioqna and build+scan

108cb34

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth force-pushed the main branch from 147dab8 to 108cb34 Compare July 10, 2024 19:22

remove excess changes

2f26d16

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth force-pushed the main branch from 94022ab to 2f26d16 Compare July 10, 2024 19:26

tylertitsworth added 5 commits July 10, 2024 12:28

remove no_start input

1c8f287

Signed-off-by: tylertitsworth <[email protected]>

add simple tesst execution job

0def4a4

Signed-off-by: tylertitsworth <[email protected]>

remove matrix strategy from build

cddd538

Signed-off-by: tylertitsworth <[email protected]>

remove rogue if statement

70ebb93

Signed-off-by: tylertitsworth <[email protected]>

pull comps rather than build them

014f98d

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth marked this pull request as ready for review July 10, 2024 22:04

tylertitsworth added enhancement New feature or request and removed WIP labels Jul 10, 2024

chensuyue reviewed Jul 11, 2024

View reviewed changes

tylertitsworth added 5 commits July 11, 2024 09:26

finish cd

7337a55

Signed-off-by: tylertitsworth <[email protected]>

add hw label to test case

63a64ca

Signed-off-by: tylertitsworth <[email protected]>

update name change to reusable wf

daf424b

Signed-off-by: tylertitsworth <[email protected]>

downstream to composite

8af9746

Signed-off-by: tylertitsworth <[email protected]>

debug matrix

c4e1873

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth added 3 commits July 11, 2024 10:44

re-fix repo name casing in scan wf

fc6e5f7

Signed-off-by: tylertitsworth <[email protected]>

Update README.md

9b695e3

Signed-off-by: Tyler Titsworth <[email protected]>

restore ci

87e79cd

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth changed the title ~~Improvements to Container CI~~ Improvements to Container CD Jul 12, 2024

tylertitsworth added 2 commits July 12, 2024 08:39

merge main

398bae0

Signed-off-by: tylertitsworth <[email protected]>

add k8s envs

d02bc88

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth mentioned this pull request Jul 12, 2024

Improvements to Container CD opea-project/GenAIComps#301

Closed

3 tasks

tylertitsworth added 2 commits July 12, 2024 11:15

correct source name

b4b4ee5

Signed-off-by: tylertitsworth <[email protected]>

merge main

54c6712

Signed-off-by: tylertitsworth <[email protected]>

tylertitsworth requested review from chensuyue, claynerobison and ashahba July 22, 2024 22:11

tylertitsworth closed this Jul 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improvements to Container CD #385

Improvements to Container CD #385

tylertitsworth commented Jul 9, 2024 •

edited

Loading

claynerobison left a comment

chensuyue commented Jul 10, 2024 •

edited

Loading

ashahba left a comment

ashahba Jul 10, 2024

tylertitsworth commented Jul 10, 2024 •

edited

Loading

tylertitsworth commented Jul 10, 2024

chensuyue Jul 11, 2024 •

edited

Loading

tylertitsworth Jul 11, 2024

Improvements to Container CD #385

Improvements to Container CD #385

Conversation

tylertitsworth commented Jul 9, 2024 • edited Loading

Description

How it works

Build

Scan

Container CI

Changes

Issues

Type of change

Dependencies

Tests

claynerobison left a comment

Choose a reason for hiding this comment

chensuyue commented Jul 10, 2024 • edited Loading

ashahba left a comment

Choose a reason for hiding this comment

ashahba Jul 10, 2024

Choose a reason for hiding this comment

tylertitsworth commented Jul 10, 2024 • edited Loading

tylertitsworth commented Jul 10, 2024

chensuyue Jul 11, 2024 • edited Loading

Choose a reason for hiding this comment

tylertitsworth Jul 11, 2024

Choose a reason for hiding this comment

tylertitsworth commented Jul 9, 2024 •

edited

Loading

chensuyue commented Jul 10, 2024 •

edited

Loading

tylertitsworth commented Jul 10, 2024 •

edited

Loading

chensuyue Jul 11, 2024 •

edited

Loading