-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ODH release automation #988
ODH release automation #988
Conversation
Skipping CI for Draft Pull Request. |
The idea here is to create a draft PR so that we can start the review process and also find any misfits in the code. |
uses: peter-evans/create-pull-request@v6 | ||
with: | ||
path: ./community-operators-prod | ||
token: <PAT> # We need a token with repo rights |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, for some reason the token generated by github app does not work. I created an issue in the library: peter-evans/create-pull-request#2848 .
Also I created a shell script to mimic the pr creation process, but unfortunately even that flow works with a PAT and not with a token generated from GH app. Any suggestions/help here is much appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PAT to be used needs the repo
scope.
48d338e
to
26c8e11
Compare
Happy to review it once I'm back in a week. Assuming it can wait. |
@AjayJagan Feel free to also add example tracker here for testing |
|
.github/scripts/wait-for-checks.sh
Outdated
|
||
while $(gh pr checks "$1" | grep -q -v 'tide' | grep -q 'pending'); do | ||
printf ":stopwatch: PR checks still pending, retrying in 10 seconds...\n" | ||
sleep 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to wait for e2e test in openshift-ci done normally take more than 1hr, so we really want to have a 10s sleep to print 400+ lines of retrying?
also i think search for 'pending' as keyword to identify if it is under test, is not accurate. if you use any real PR from opendatahub-io (not your mock PR) you will see even the green PR has pending returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true. We could increase the interval to 10 minutes once or so.
I checked the gh pr checks
for #991
and here is what I get
for gh pr checks 991 -R opendatahub-io/opendatahub-operator | grep pending
I get
tide pending 0 https://prow.ci.openshift.org/pr?query=is%3Apr+repo%3Aopendatahub-io%2Fopendatahub-operator+author%3Azdtsw+head%3Ajira%2F443_3 Not mergeable. Needs approved, lgtm labels.
so I though we could do an inverse of tide(which will remove tide from the checks) and grep for pending... if present we could sleep and repeat again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking pr #992 which is currently running checks
for gh pr checks 992 -R opendatahub-io/opendatahub-operator | grep -v tide | grep pending
I get
ci/prow/opendatahub-operator-e2e pending 0 https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/opendatahub-io_opendatahub-operator/992/pull-ci-opendatahub-io-opendatahub-operator-incubation-opendatahub-operator-e2e/1785239301773070336 Job triggered. BaseSHA:ec07da31c18b26885daa006c964ae3ea450a0e6f```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other way we could handle this is, manually wait for the checks to complete and a pr close with a particular comment should trigger it?
@@ -0,0 +1,31 @@ | |||
name: "Release: Create release branch" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this workflow basically is
triggered by a closed PR from "dry run" and create one branch then new PR
why not combine it with the next one release-gh-publish into one?
plus: the name for the workflow is not accurate to what it really does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea maybe we could use a different naming 😅
But the idea is after creating a new branch with the changes, we wait for every checks to finish manually. So I did this because, if in case there is a failure, then we can stop at this point and prevent a release in github
pat-token: ${{ steps.generate-token.outputs.token }} | ||
- name: Create and push version tags | ||
run: | | ||
git config --global user.email 41898282+github-actions[bot]@users.noreply.github.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does "gh label create -d " works in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so after taking a look. here we only tag right? Or do we create labels as well. AFAIK there is no gh cli way to create tags. Let me know if I am wrong here.
- name: Create test release pr | ||
uses: ./.github/actions/create-release-pr | ||
with: | ||
pr-branch: "odh-release/e2e-test" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this pr-branch map to a odh-relase-${{ env.VERSION }}/e2e-test ?
otherwise just hardcoded no need pass down to action
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the create-release-pr
is a reusable action used in 2 places. That is why it passes the pr-branch name.
@@ -236,7 +236,7 @@ KUSTOMIZE_INSTALL_SCRIPT ?= "https://raw.githubusercontent.com/kubernetes-sigs/k | |||
.PHONY: kustomize | |||
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary. | |||
$(KUSTOMIZE): $(LOCALBIN) | |||
test -s $(KUSTOMIZE) || { curl -s $(KUSTOMIZE_INSTALL_SCRIPT) | sh -s -- $(subst v,,$(KUSTOMIZE_VERSION)) $(LOCALBIN); } | |||
test -s $(KUSTOMIZE) || { curl -s $(KUSTOMIZE_INSTALL_SCRIPT) | bash -s -- $(subst v,,$(KUSTOMIZE_VERSION)) $(LOCALBIN); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason to change this? dont think we even use make in the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm now I dont remember why it broke and I had to change it. Let me check this once
sed -i -e "s|createdAt.*|createdAt: \"$(date +"%Y-%-m-%dT00:00:00Z")\"|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|name: opendatahub-operator.v.*|name: opendatahub-operator.v$NEW_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|version: $CURRENT_VERSION.*|version: $NEW_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|replaces.*|replaces: opendatahub-operator.v$CURRENT_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont think we are still using replaces.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove it 👍
sed -i -e "s|name: opendatahub-operator.v.*|name: opendatahub-operator.v$NEW_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|version: $CURRENT_VERSION.*|version: $NEW_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|replaces.*|replaces: opendatahub-operator.v$CURRENT_VERSION|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml | ||
sed -i -e "s|olm.skipRange:.*|olm.skipRange: \'>=$CURRENT_VERSION <$NEW_VERSION\'|g" config/manifests/bases/opendatahub-operator.clusterserviceversion.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are following
olm.skipRange: '>=1.0.0 <$NEW_VERSION'
we wont need get CURRENT_VERSION at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed 👍
@@ -236,7 +236,7 @@ KUSTOMIZE_INSTALL_SCRIPT ?= "https://raw.githubusercontent.com/kubernetes-sigs/k | |||
.PHONY: kustomize | |||
kustomize: $(KUSTOMIZE) ## Download kustomize locally if necessary. | |||
$(KUSTOMIZE): $(LOCALBIN) | |||
test -s $(KUSTOMIZE) || { curl -s $(KUSTOMIZE_INSTALL_SCRIPT) | sh -s -- $(subst v,,$(KUSTOMIZE_VERSION)) $(LOCALBIN); } | |||
test -s $(KUSTOMIZE) || { curl -s $(KUSTOMIZE_INSTALL_SCRIPT) | bash -s -- $(subst v,,$(KUSTOMIZE_VERSION)) $(LOCALBIN); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that because ubuntu latest does ont have sh but only bash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that is the reason
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the error is
test -s /home/runner/work/opendatahub-operator/opendatahub-operator/bin/kustomize || { curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | sh -s -- 3.8.7 /home/runner/work/opendatahub-operator/opendatahub-operator/bin; }
sh: 33: Syntax error: "(" unexpected (expecting "then")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is actually ok. The script has bash's shebang so bash should be used to execute it. Debians by default use dash as the system shell which is pretty much posix
@@ -0,0 +1,88 @@ | |||
name: "Release: GH and operatorhub publish" | |||
on: | |||
pull_request: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe i missed something, does previous "release-branch" runs any test?
or how it gets closed ? what if unit-test or linter failed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so when a pr is closed(after we wait for the checks), and it has a particular title ie.
if: github.event.pull_request.merged && startsWith(github.event.pull_request.title, 'ODH') && endsWith(github.event.pull_request.title, 'Release')
Then this pr is triggered
@@ -0,0 +1,40 @@ | |||
name: "Set Shared env vars" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the whole idea to have this is to get the version and track url among all jobs?
but then what happens: we have a 2.10.0 release, due to certain reason it failed in e2e test for days
and we have to make a 2.10.1 to fix some bugs. => 2 release at the same time running workflow.
after this PATCH from 2.10.1 runs, it set the VERSION to 2.10.1 + new tracker URI
when workflow for 2.10.0 runs, it reads out 2.10.1 ?
before going to update all the small detail implementation comments, I would like to have an overall understanding for the solution.
|
So if I understand this correct, there can be 1 workflow with 3 jobs. I did not think in this angle #988 (comment). So I think shared env is not the way to do it. I can create a comment with a particular message and read from it. That would make things much easier. Considering the above changes, can I start the work? let me know if this is okay |
5af74ff
to
1a20447
Compare
Hey @zdtsw , I have changed to workflows and reduced them to 2 workflows with multiple jobs. Also I have made some changes in the code. Let me know if it looks good or it can be made better. Thanks :) |
set -euo pipefail | ||
|
||
update_tags(){ | ||
MANIFEST_STR=$(cat get_all_manifests.sh | grep $1 | sed 's/ //g') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the function do
sed -i -r "/$1/s|([^:]*):([^:]*):[^:]*:(.*)|\1:\2:$2:\3|" get_all_manifests.sh
?
} | ||
|
||
declare -A COMPONENT_VERSION_MAP=( | ||
["\"codeflare\""]=$1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't just [codelare] work for you?
["\"odh-model-controller\""]=$11 | ||
["\"kserve\""]=$12 | ||
["\"modelregistry\""]=$13 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we will have to sync get_all_manifests.sh with the script for new components. I do not have nice solution in my mind at the moment, will think about that.
ec0484a
to
7018a10
Compare
7018a10
to
a935c61
Compare
/retest |
|
||
printf "!!An unknown error occurred!!\n" | ||
pr_has_status $1 fail && { echo "!!PR checks failed!!"; exit 1; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering, since gh pr checks
has exit status 0 (success) when there are no failed tests and non-0 otherwise, if it's sufficient here to just run it without extra logic? Probably, "unknown error" or "failed checks" will be clear from the log?
a935c61
to
28b971c
Compare
I think we should not block this for operatorhub PR automation. |
@VaishnaviHire, with the secrets set for quay and gh app, we should be good to test this :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: VaishnaviHire The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
660e824
into
opendatahub-io:incubation
* ODH release automation * improve shell scripts * disable community operators pr creation * move version update to separate script
A set of github actions/ scripts to automate the entire release process in github.
Description
This change enables us to automate the odh release process through a set of github actions and scripts.
A detailed explanation can be found here.