Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Squeak: Always update images for builds #567

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

LinqLover
Copy link
Collaborator

Older Squeak images too can (and sometimes will) receive updates. Resolves #560.

Older Squeak images too can (and sometimes will) receive updates. Resolves hpi-swa#560.
@LinqLover
Copy link
Collaborator Author

Performance report

Comparing the build times from the self-tests before/after this patch:

$ paste <(echo OLD ; gh run view 3027630365 | grep Squeak | sort) <(echo NEW ; gh run view 3032776315 | grep Squeak | sort) | sed 's/ in /\t\t/g'
OLD     NEW
✓ Squeak32-4.5 on ubuntu-latest         46s (ID 8283957652)     ✓ Squeak32-4.5 on ubuntu-latest         40s (ID 8294178819)
✓ Squeak32-6.0 on ubuntu-latest         40s (ID 8283957607)     ✓ Squeak32-6.0 on ubuntu-latest         46s (ID 8294178773)
✓ Squeak32-trunk on ubuntu-latest               1m9s (ID 8283957563)    ✓ Squeak32-trunk on ubuntu-latest               1m43s (ID 8294178743)
✓ Squeak64-5.1 on macos-latest          38s (ID 8283958265)     ✓ Squeak64-5.1 on macos-latest          40s (ID 8294179455)
✓ Squeak64-5.1 on ubuntu-latest         25s (ID 8283957521)     ✓ Squeak64-5.1 on ubuntu-latest         32s (ID 8294178716)
✓ Squeak64-5.2 on macos-latest          2m9s (ID 8283958232)    ✓ Squeak64-5.2 on macos-latest          1m44s (ID 8294179417)
✓ Squeak64-5.2 on ubuntu-latest         28s (ID 8283957490)     ✓ Squeak64-5.2 on ubuntu-latest         51s (ID 8294178701)
✓ Squeak64-5.3 on macos-latest          41s (ID 8283958189)     ✓ Squeak64-5.3 on macos-latest          1m3s (ID 8294179391)
✓ Squeak64-5.3 on ubuntu-latest         42s (ID 8283957455)     ✓ Squeak64-5.3 on ubuntu-latest         51s (ID 8294178676)
✓ Squeak64-6.0 on macos-latest          44s (ID 8283958155)     ✓ Squeak64-6.0 on macos-latest          42s (ID 8294179368)
✓ Squeak64-6.0 on ubuntu-latest         25s (ID 8283957421)     ✓ Squeak64-6.0 on ubuntu-latest         28s (ID 8294178656)
✓ Squeak64-trunk on macos-latest                4m54s (ID 8283958108)   ✓ Squeak64-trunk on macos-latest                5m30s (ID 8294179334)
✓ Squeak64-trunk on ubuntu-latest               2m22s (ID 8283957384)   ✓ Squeak64-trunk on ubuntu-latest               3m16s (ID 8294178627)
Job OLD NEW
Squeak32-4.5 on ubuntu-latest 46s 40s
Squeak32-6.0 on ubuntu-latest 40s 46s
Squeak32-trunk on ubuntu-latest 1m9s 1m43s
Squeak64-5.1 on macos-latest 38s 40s
Squeak64-5.1 on ubuntu-latest 25s 32s
Squeak64-5.2 on macos-latest 2m9s 1m44s
Squeak64-5.2 on ubuntu-latest 28s 51s
Squeak64-5.3 on macos-latest 41s 1m3s
Squeak64-5.3 on ubuntu-latest 42s 51s
Squeak64-6.0 on macos-latest 44s 42s
Squeak64-6.0 on ubuntu-latest 25s 28s
Squeak64-trunk on macos-latest 4m54s 5m30s
Squeak64-trunk on ubuntu-latest 2m22s 3m16s

Overall, I don't really see a significant increase in build times. Note that some build times even were reduced, and that the trunk build times (which may serve as a rough reference point) increased by up to ~40%. Of course, the build times vary greatly on GitHub Actions Cloud.

Given that the added update step actually adds value (= removes bugs, see hpi-swa/Squot#349 for a motivating example), my proposal would be to accept this possible increase in build time. Looking forward to your opinion @fniephaus! /cc @j4yk

@LinqLover LinqLover marked this pull request as ready for review September 11, 2022 19:49
@fniephaus
Copy link
Member

I get what problem this is trying to solve but I'd say it's an edge case. More importantly, it's changing semantics and adds another source of failures (if your project doesn't need access to source.squeak.org, it's not needed for non-trunk builds).

Having said that, what do you think about adding an options for this? Something like --force-image-update?

@fniephaus fniephaus self-assigned this Oct 9, 2022
@LinqLover
Copy link
Collaborator Author

Hmm, from my naive perspective of a CI user running a build on smalltalkCI for Squeak x.y should always have the same effect as downloading the latest image for Squeak x.y from squeak.org ... shouldn't it? So I'm not sure whether this is really only an edge case. But I also see your point and an --force-image-update would be definitely better than nothing (even though personally, I would probably always enable it for all of my repositories).

@fniephaus
Copy link
Member

I doubt that naive CI users care about random updates for old Squeak releases. Also, we don't even update the website on backport releases. If only we could base such a decision on some user feedback... maybe @ekrebs5, @marceltaeumel, or @j4yk have an opinion on this?

@LinqLover
Copy link
Collaborator Author

@marceltaeumel's position was to make updates for release images opt-in, also considering that every update comes with an (even though low) risk of breaking things (e.g., a broken update stream or any conflicting change with smalltalkCI).

Yes, most CI users will probably not even expect any updates for a release image. On the other hand, if I download a Squeak 5.3 release image today and set it up (by following the recommendations of the welcome wizard) and then later find out that the CI behaves differently for the same version of Squeak, I will be somewhat confused. In the end, smalltalkCI is only maintaining a cache of files.squeak.org that should be invisible to users - or should it?

(I'm actually surprised that https://squeak.org/downloads/ is not updated for backports. @marceltaeumel Is this just a trade-off for the maintainability of the website or an explicit statement of "what is Squeak 5.3"?)

So in the end, while understanding the arguments for opt-in, my personal opinion would still be to enforce image updates (or at least make them opt-out). If @j4yk or @ekrebs5 don't have to add anything to this, it's probably on you @fniephaus to make a final decision. :-)

@fniephaus
Copy link
Member

I'm still not convinced it's a good idea to update release images by default. You're most likely an expert user, core dev, or even the author of the backported fix that you want in the release image. I don't mind an update option that is opt-in and we can bump the base images on request. One good thing about bumping manually is that we can always easily roll back.

@fniephaus
Copy link
Member

Can we just make this an option, which users can enable if needed?

@LinqLover
Copy link
Collaborator Author

Sounds fair. Is this urgent for anyone? If not, I will do this soon (tm). If urgent, I don't bother if you or anyone else wants to proceed on this PR instead of me.

@fniephaus
Copy link
Member

Nope, it's not urgent.

@LinqLover
Copy link
Collaborator Author

@fniephaus I'm wondering what would be the best way to configure this behavior:

  • a command line option (e.g., --force-image-update)
  • a suffix to the Smalltalk version (e.g., -s Squeak64-6.0-latest vs -s Squeak64-6.0)
  • an entry in the SmalltalkCISpec (e.g., `SmalltalkCISpec { #forceImageUpdate : true, ... }``

While a command line option would be the simplest to implement, I'm afraid that it unnecessarily increases the usage complexity for CI users who previously only had to maintain a config file and an image version. For instance, #useLatestMetacello is also provided in the SCIMetacelloLoadSpec. On the other hand, you can specify a custom image or VM through the command line.

Looking forward to your opinion! :-)

@fniephaus
Copy link
Member

what would be the best way to configure this behavior

Updating needs to happen within Smalltalk anyway, so there's no need to write any Bash code to make this work.
Maybe we could introduce something like this:

SmalltalkCISpec {
  // #loading, #testing, ...
  #options : {
     #forceImageUpdate : true
  }
}

What do you think?

@LinqLover
Copy link
Collaborator Author

Updating needs to happen within Smalltalk anyway, so there's no need to write any Bash code to make this work.

My first take would have been to add a check here:

if is_trunk_build || image_is_user_provided; then

Which would actually make a bash argument simpler to implement ... on the other hand, a ston option might be more convenient ... I'm not sure ...

Maybe a command line argument could actually be more convenient because users can set it conditionally, which is not possible with ease for ston files? In that case we would still have to decide --force-image-update vs -s Squeak64-6.0-latest. :-)

@fniephaus
Copy link
Member

Maybe a command line argument could actually be more convenient because users can set it conditionally.

You could also use different .ston config files and use them conditionally. Besides, I'm not sure users would ever want to test both with and without, so not sure this is something we need to optimize for.

@LinqLover
Copy link
Collaborator Author

Maybe a command line argument could actually be more convenient because users can set it conditionally.

You could also use different .ston config files and use them conditionally. Besides, I'm not sure users would ever want to test both with and without, so not sure this is something we need to optimize for.

To clarify that: A common use case that I would have in mind for the requested functionality would be a CI setup like that:

  • run CI on Squeak64-Trunk (where updates are already installed)
  • run CI on Squeak64-6.0 (where no additional updates are required)
  • run CI on Squeak64-5.3 (where the latest backports are required)

Yes, I could always install the latest updates, but this would slow down the other builds in this example. I would consider something like smalltalk: [Squeak64-Trunk, Squeak64-6.0, Squeak64-5.3-latest] or smalltalkci -s Squeak64-${{ matrix.smalltalk }} ${{matrix.smalltalk == 'Squeak64-5.3' && '--install-latest-updates'} more intuitive and concise here than duplicating the ston file (which is unfortunately not programmable). Hm, many other tools would probably support both ways, but this would further increase the complexity ...

@fniephaus
Copy link
Member

Hm, many other tools would probably support both ways, but this would further increase the complexity ...

We could also support both ways, but the more we spread this, the more users will ask for support for other dialects, which you probably don't plan to implement?

more intuitive and concise here than duplicating the ston file (which is unfortunately not programmable)

What you could also do today is use a simple postLoading script to force update iff it's running on Squeak 5.3.

@LinqLover
Copy link
Collaborator Author

Hm, many other tools would probably support both ways, but this would further increase the complexity ...

We could also support both ways, but the more we spread this, the more users will ask for support for other dialects, which you probably don't plan to implement?

Fair ... But if we just support and document Squeak<bitness>-<version>-latest, nobody should expect that this would work for Pharo et al. as well, should they?

more intuitive and concise here than duplicating the ston file (which is unfortunately not programmable)

What you could also do today is use a simple postLoading script to force update iff it's running on Squeak 5.3.

Or a preLoading script ... yes, that's what I'm currently doing, I just thought that a general flag for this could be desirable. At least, @j4yk had the same issue.

Thinking aloud ... (too much text) At the moment, the "image updating" logic resists in https://github.com/hpi-swa/smalltalkCI/blob/6d6c79b14f0fc577f7bdcf27fee9b39509cf532a/squeak/prepare.st which would make a bash-side implementation simpler and help us avoid duplication. I just realize that the bash script already does some kind of ston parsing:

smalltalkCI/helpers.sh

Lines 261 to 263 in 854de5a

ston_includes_loading() {
grep -Fq "#loading" "${config_ston}"
}

This is most likely all but not idiomatic? Just imagine someone had a #path : 'scripts/tests#loading.st' in their #preTesting section but no #loading section. :-)

I think it is crucial that image updating should happen before anything else because class definitions, preambles, etc. might depend on new behavior. Thus, if we wanted to handle a SmalltalkCISpec.options.forceImageUpdate option in Smalltalk, we would need to do so before loading/testing:

smalltalkCI/squeak/run.sh

Lines 381 to 388 in 854de5a

if is_trunk_build || image_is_user_provided; then
squeak::prepare_image
fi
if ston_includes_loading; then
squeak::load_project
check_and_consume_build_status_file
fi
squeak::test_project

Since loading is optional, this would require an extra invocation of the VM ... thus implementing updates inside squeak::prepare_image seems most convenient to me. I would see two options:

  1. Parse ston option via grep in bash, similar to ston_includes_loading (sounds very hacky and brittle to me)

  2. Parse ston option in Smalltalk:

    run_build() {
       ...
    -   if is_trunk_build || image_is_user_provided; then
    -    squeak::prepare_image
    -  fi
    +  squeak::prepare_image $((is_trunk_build || image_is_user_provided) && echo true || echo false)
       ...
     }

    To squeak::prepare_image, add a required argument. Already install SmalltalkCI there similar to squeak::load_project and squeak::test_project and pass on the required argument to prepare.st. In prepare.st, check the config file for forceImageUpdate and do not install updates if neither the command line argument nor the flag is set.

  3. Parse command line option in bash:

    This would be the simplest change:

    run_build() {
       ...
    -   if is_trunk_build || image_is_user_provided; then
    +   if is_trunk_build || image_is_user_provided | should_force_image_update; then
        squeak::prepare_image
      fi
       ...
     }

    And add a command line argument to that. Alternatively, should_force_image_update searches for a -latest prefix in the smalltalk version, this would also avoid false expectations for other Smalltalk distributions.

    With -latest, update configuration would be as close to version specification as possible.

tl;dr: After a first look at the code, my preference would be Squeak64-5.3-latest etc. (see reasoning above), provided that you don't see any other risks in changing the Smalltalk versions. What do you think? :)

@fniephaus
Copy link
Member

After a first look at the code, my preference would be Squeak64-5.3-latest etc.

That's another solution for the problem, but I'm not sure we want to encode an option (run image update) with a label (latest) in the image selection. That opens up a new dimension for configuration and we need to communicate this somehow to users.

What if we add support for fully-qualified Squeak versions, something like Squeak64-5.3-19431? If that's used, smalltalkCI will download the image from files.squeak.org and install Metacello etc? Or does installing Metacello take significantly longer than updating a Squeak64-5.3 release image? It's more of an edge use case anyway, right?

@LinqLover
Copy link
Collaborator Author

After a first look at the code, my preference would be Squeak64-5.3-latest etc.

That's another solution for the problem, but I'm not sure we want to encode an option (run image update) with a label (latest) in the image selection. That opens up a new dimension for configuration and we need to communicate this somehow to users.

Yes, I see that. Also, this could be a compatibility issue, I don't know how many places check that smalltalk_version else - just as one example, SqueakByExample's CI does.

What if we add support for fully-qualified Squeak versions, something like Squeak64-5.3-19431? If that's used, smalltalkCI will download the image from files.squeak.org and install Metacello etc? Or does installing Metacello take significantly longer than updating a Squeak64-5.3 release image? It's more of an edge use case anyway, right?

Wow, this reminds me a lot of #506 and #471. :-) Some notes:

  1. There might be scenarios where users want to explicitly retrieve all the latest updates, which would not be possible if they had to specify an explicit version number.
  2. Afaol the images on files.squeak.org have a limited and undocumented lifetime (the trunk builds are pruned from time to time), so I think we could not rely on that. On the other hand, explicitly updating to a certain version as attempted in Squeak: Specify exact Trunk image version #506 seems not to be possible (see https://lists.squeakfoundation.org/archives/list/[email protected]/thread/EBG2CPJLZTSMOPOUXSFPMFCHYMK5VBKY/ - maybe I should close that PR, I personally do not need such a feature any longer).
  3. Metacello bootstrapping took about 50 seconds for me via Wifi (see https://lists.squeakfoundation.org/archives/list/[email protected]/thread/RH2K5QFUYDBISBW52JRRPPFMNW5NQOAY/), I don't have numbers for the image update.
  4. In general, as proposed earlier in Cache prepared testing images for Squeak-Trunk? #471, I have been wondering for a longer time why the current update/release process for SCI @ Squeak Trunk (and with regard to the intention of this issue, possibly all other Squeak versions as well) is as it is. :-) Why is the process of creating new releases not automated? If we did this, we could periodically create a new SCI image once a day/week with the latest Squeak updates based on the files.squeak.org build for the relevant version, avoid manual labor for that, and reduce any update delay on the CI users' side to a few seconds for most of the time. The relevant build would be the latest for Trunk and the original one for all other versions. Optionally, we could also provide an SCI image for the latest build of each Squeak version in such a store but that would probably be out of scope.

In summary, personally, I do not have any use case for full-qualified versions. Given your fair criticism against Squeak-6.0-latest, I would be in favor of a command line argument (--force-image-update).

@fniephaus
Copy link
Member

I would be in favor of a command line argument --force-image-update).

Again, this requires Bash code for a feature that can purely be written and maintained in Smalltalk, simply because you want to conveniently create build matrices with GitHub Actions. Also, update mechanisms depends on the selected image, so I don't see how it will work with an user-provided image.

I don't want to needlessly block you on this, but why can't it be an option to auto-update in the config? Can't you just turn on auto-updates for all your images, even if only one really depends on backports? Updating other images shouldn't take long and ensures that other backports don't break your app. I just find the mix of trunk, updated, and stock images a bit questionable.

@fniephaus
Copy link
Member

Why is the process of creating new releases not automated?

Simply because we never worked out a good way to automate the hosting part: creating and updating GitHub releases is a bit tricky, and so is finding the latest 5.3 image. Right now, it's just very simple and hard coded. Suggestions and ideas are welcome.

@fniephaus
Copy link
Member

fniephaus commented Jul 9, 2023

Maybe we could even find an (existing?) or smart way to allow for dynamic option values, as in instead of accepting true or false, the option could also accept an expression that is evaluated and the result is used for the option. That way, you could write an inline check to control auto updates in your ston config.

@LinqLover
Copy link
Collaborator Author

I would be in favor of a command line argument --force-image-update).

Again, this requires Bash code for a feature that can purely be written and maintained in Smalltalk, simply because you want to conveniently create build matrices with GitHub Actions.

I am hesitating to reimplement a feature in Smalltalk that already exists in bash (as described in the details section of #567 (comment)). However, if you deem it more important not to increase the bash logic further, yes, we can also do it on the image side.

Can't you just turn on auto-updates for all your images, even if only one really depends on backports? Updating other images shouldn't take long and ensures that other backports don't break your app. I just find the mix of trunk, updated, and stock images a bit questionable.

Fair point, agreed. :-) Maybe "dynamic configurations" should become an own issue - similar to a .config.js instead of a .json, smalltalkCI might also support a .smalltalk.st that returns a config as an alternative to the .smalltalk.ston file. But maybe we don't even need that. :-)

@fniephaus
Copy link
Member

fniephaus commented Jul 9, 2023

I am hesitating to reimplement a feature in Smalltalk that already exists in bash

You are referring to the prepare.st, which really only is for trunk builds. Yes, we could share some logic, but I rather see that prepare.st in the smalltalkCI Smalltalk code base at some point. 😉 Should be fairly simple as "trunk" would imply "auto-updates on".

Maybe "dynamic configurations" should become an own issue

SGTM, feel free to create another feature request in case you think this could be useful.

@LinqLover
Copy link
Collaborator Author

Why is the process of creating new releases not automated?

Simply because we never worked out a good way to automate the hosting part: creating and updating GitHub releases is a bit tricky, and so is finding the latest 5.3 image. Right now, it's just very simple and hard coded. Suggestions and ideas are welcome.

Creating and updating GitHub releases: There seem to be plenty GitHub actions for this. I have used softprops/action-gh-release successfully in a couple of projects to create releases, I hope that updating them later should not be more complicated.

Finding the latest image: This is what I did in create-image:

https://github.com/LinqLover/create-image/blob/276883b1fdd97fd4af9a9b75d6d5b32953137b2b/src/create_image.sh#L17-L23

@LinqLover
Copy link
Collaborator Author

Alright, will try to implement a #forceImageUpdate option in the config when I have some time. Thank you!

@fniephaus
Copy link
Member

fniephaus commented Jul 9, 2023

Finding the latest image: This is what I did in create-image:

Thing is this does not work for GitHub releases. You can use the GitHub API, but this would add another request, and another dependency (GitHub API). And the GitHub API can be rate-limited.

@LinqLover
Copy link
Collaborator Author

Finding the latest image: This is what I did in create-image:

Thing is this does not work for GitHub releases. You can use the GitHub API, but this would add another request, and another dependency (GitHub API). And the GitHub API can be rate-limited.

To which step are you referring?

Inside smalltalk CI's release job (cron):

On the CI user's side, nothing changes, smalltalkCI still refers to a hard-coded release asset URL.

Alternatively, we could create a new GitHub release every time and delete the previous one. Might be more elegant but indeed increase complexity on the user side for finding the latest release. As further alternatives, we could store the artifacts on files.squeak.org or in a separate repo/branch (deleting the ancestry and force-pushing with each commit to avoid large storage consumption) and download it from there.

So, with the first option (edit the latest existing release), the complexity (runtime/dependencies) on the CI user side does not increase. Same if we use a separate repo/branch. For files.squeak.org, we would indeed introduce a new dependency, yes.

On the other hand, complexity would indeed be increased in our new release job (because it previously did not exist). I have been using the GitHub API for automating releases through a GitHub action for almost two years with many commits per day and never hit any specific outages or rate limits. For this use case, we might run the script once a day/once a week (depends on how we weigh our costs against the costs of CI users for manual updates), so I would not be afraid of rate limits. Also, if anything in the automated release breaks, we would still remain at the status quo, right?

@fniephaus
Copy link
Member

edit the latest existing release of smalltalkCI and update the image

I don't think this is a good idea: what if there's an issue and you need a particular image, but now it's overwritten? I believe we'd need to publish binaries and include the update level. This means that you need some sort of lookup. Right now, this is part of smalltalkCI's bash scripts. Maybe we could do this somehow with a moving tag that identifies the latest and gives us a stable URL, but I haven't tried this yet. I don't think this is impossible to solve. A proper solution just needs exploration and experimentation for which I don't have the time at the moment.

Also, if anything in the automated release breaks, we would still remain at the status quo, right?

Right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Squeak 5.3 image to version 19481
2 participants