Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto upgrade from n-1|n-2 releases #1934

Closed
wants to merge 1 commit into from

Conversation

dparmar18
Copy link

No description provided.

@dparmar18
Copy link
Author

@batrick the idea here is something like this, let's say we have this in our fs upgrade suite yaml:

tasks:
- install:
    branch: n-2

When teuthology-suite -k distro --machine-type smithi --priority 75 --suite fs:upgrade --ceph squid is ran, n-1 and n-2 server as placeholders which we try to read at teuthology side and replace it with the required branch therefore for the yaml above, what would mean is - start from quincy and upgrade to squid

@dparmar18
Copy link
Author

@batrick the idea here is something like this, let's say we have this in our fs upgrade suite yaml:

tasks:
- install:
    branch: n-2

When teuthology-suite -k distro --machine-type smithi --priority 75 --suite fs:upgrade --ceph squid is ran, n-1 and n-2 server as placeholders which we try to read at teuthology side and replace it with the required branch therefore for the yaml above, what would mean is - start from quincy and upgrade to squid

The chacra API returns a list like this:

dparmar:~$ curl -s  https://chacra.ceph.com/repos/ceph-release/ | jq "keys"
[
  "firefly",
  "hammer",
  "infernalis",
  "jewel",
  "kraken",
  "luminous",
  "mimic",
  "nautilus",
  "octopus",
  "pacific",
  "quincy",
  "reef",
  "test"
]

we have test in the returned list so it needs to be handled correctly since it cannot be used for upgrades (right?), let us run through the possibilities here:

  1. if --ceph <non-main-branch> and test exists in list:
  • for n-1 go two steps back
  • for n-2 go three steps back
  1. if --ceph main and test exists in list:
  • main and test are the same because if we have this, it means there is a branch currently in testing, therefore the main must be branched out. for instance currently we have squid which is un-released, the squid and main code are the same and so is with the test. Therefore:
    • for n-1 go two steps back
    • for n-2 go three steps back
  1. if --ceph main and test does not exist in list
  • for n-1 go one step back
  • for n-2 go two steps back
  1. if --ceph <non-main-branch> and test does not exists in list:
  • for n-1 go one step back from non-main-branch
  • for n-2 go two steps back from non-main-branch

@dparmar18
Copy link
Author

waiting for @batrick's ack before moving ahead

import json
import subprocess
curlurl = 'curl -s https://chacra.ceph.com/repos/ceph-release/ | jq "keys"'
status, output = subprocess.getstatusoutput(curlurl)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not (or does not need to) be a dynamic part of the teuthology install task. We can construct the list of releases to upgrade from when creating the job matrix. Then it can be an explicit part of the job config / install task.

I recommend we table this for now until I can spend some time on the teuthology code to dynamically add dimensions to the matrix.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can construct the list of releases to upgrade from when creating the job matrix

then would incur manual efforts which completely kills the purpose, the chacra API maintains the list, why not use it instead of constructing an additional one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not manual effort. We're moving the dynamic release lookup to the matrix generation. The advantage there is we can construct a matrix that upgrades for multiple minor releases of n-1 or n-2, not just the current HEAD.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I remember from the talk post-standup a few weeks back, so the idea is to move even one step forward than this and not just to limit to few major releases but rigorously carry out tests on multiple minor releases, makes sense but the code you highlighted to be not part of the teuthology code:

Then it can be an explicit part of the job config / install task.

would mean to get this into the YAMLs which would complicate the process, if there is some sort of abstraction i.e. to only have some placeholders in places in configs/install tasks like n_1_minor_releases would mean to fetch all the reef releases(just for instance keeping squid as ref point) and create the dynamic matrix you're referring to. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I remember from the talk post-standup a few weeks back, so the idea is to move even one step forward than this and not just to limit to few major releases but rigorously carry out tests on multiple minor releases, makes sense but the code you highlighted to be not part of the teuthology code:

Then it can be an explicit part of the job config / install task.

would mean to get this into the YAMLs which would complicate the process, if there is some sort of abstraction i.e. to only have some placeholders in places in configs/install tasks like n_1_minor_releases would mean to fetch all the reef releases(just for instance keeping squid as ref point) and create the dynamic matrix you're referring to. What do you think?

You don't need placeholders. Just use an override for the install: task config.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an override would have to simple like the ones we have right now like for instance:

overrides:
  ceph:
    conf:

doesn't that means we'd still have to get the code to get the dynamic thing done in the teuthology code itself? Otherwise how we code it explicitly? Lua in YAMLs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't that means we'd still have to get the code to get the dynamic thing done in the teuthology code itself?

teuthology-suite would adjust a directory like

upgrade/tasks/0-from/{install, {base/{squid,reef}}}

where install.yaml is something vanilla like

install: {}

and reef.yaml would have specially interpreted Lua code that expand to v18.2.0, v18.2.1, v18.2.2, reef.

Once it reaches the premerge/postmerge step, these dynamically created fragments already exist (and that's why we can't use the existing premerge/postmerge framework).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't that means we'd still have to get the code to get the dynamic thing done in the teuthology code itself?

teuthology-suite would adjust a directory like

upgrade/tasks/0-from/{install, {base/{squid,reef}}}

so we'd still need to maintain releases manually for this and this is what I'm aiming to remove, i understand your idea about dynamic matrix for testing minor releases but if you want upgrade/tasks/0-from/{install, {base/{n-1,n-2}}} I think there needs to be some code in teuthology that can replace the placeholders with respective releases. Flow:

have generic upgrade YAMLs with install.upgrade: n-1 -> this gets passed to the teuthology code, it figures out what n-1 is -> passes the release to the minor-release-matrix builder code

and for the first step making use of the chacra API is highly useful

where install.yaml is something vanilla like

install: {}

and reef.yaml would have specially interpreted Lua code that expand to v18.2.0, v18.2.1, v18.2.2, reef.

Once it reaches the premerge/postmerge step, these dynamically created fragments already exist (and that's why we can't use the existing premerge/postmerge framework).

@dparmar18
Copy link
Author

The consensus in cephfs standup came to the decision that putting efforts for automating something that requires minor YAML changes once a year is not worth it.

@dparmar18 dparmar18 closed this Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants