grub: core.img: Reboot if configfile fails #585

Merged
iliana merged 2 commits into develop from grub-debug
Dec 11, 2019

@iliana iliana commented Dec 11, 2019

If GRUB attempts to load a config file from a boot partition that is empty or otherwise broken, it will sit forever at a rescue console, which is not particularly helpful in EC2.

gptprio.next decrements tries_left on partitions not yet marked successful, so rebooting will usually result in the correct behavior of rolling back.
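
As a rough illustration of why a reboot tends to roll back, here is a minimal Python model of the selection rule described above. It is a sketch under stated assumptions, not the gptprio source: the `Partition` fields, the helper names `gptprio_next` and `simulate_boots`, and the specific priority and tries_left values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Partition:
    name: str
    priority: int
    tries_left: int
    successful: bool


def gptprio_next(parts: List[Partition]) -> Optional[Partition]:
    """Pick the partition to boot, modelled on the behaviour described above."""
    # Assumption: a partition is eligible if it is marked successful
    # or still has boot attempts remaining.
    candidates = [p for p in parts if p.successful or p.tries_left > 0]
    if not candidates:
        return None
    chosen = max(candidates, key=lambda p: p.priority)
    # Only partitions not yet marked successful are charged an attempt.
    if not chosen.successful:
        chosen.tries_left -= 1
    return chosen


def simulate_boots(parts: List[Partition], broken: Set[str], max_boots: int = 5) -> List[str]:
    """Return the names of the partitions tried, stopping at the first good boot."""
    attempts = []
    for _ in range(max_boots):
        chosen = gptprio_next(parts)
        if chosen is None:
            break
        attempts.append(chosen.name)
        if chosen.name not in broken:
            break  # configfile booted an entry; control never comes back
        # Otherwise configfile failed: print diagnostics, sleep, reboot.
    return attempts


# Hypothetical state after an upgrade to an inactive set whose boot
# partition was never written: B is preferred but unproven, A is known good.
set_a = Partition("A", priority=1, tries_left=0, successful=True)
set_b = Partition("B", priority=2, tries_left=1, successful=False)
history = simulate_boots([set_a, set_b], broken={"B"})
print(history)  # ['B', 'A']
```

In this model the first boot picks B and uses up its only attempt, so after the reboot B is no longer eligible and A is selected. A broken partition that is already marked successful is never charged an attempt here, so it would be retried on every reboot; that may be one reason the rollback is described as "usually" rather than always happening.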

Some diagnostic output is printed before sleeping for 30 seconds to aid future debugging. No additional output is printed if the configfile boots successfully.

This also drops search_fs_uuid from being built into core.img, as we haven't used that since af72caf.

x86_64's core.img size went from 129659 bytes to 130203 bytes with this change (an increase of 544 bytes).

Tested with QEMU on x86_64 with shell enabled by booting Thar, running signpost upgrade-to-inactive without writing anything to the disk, and rebooting. GRUB printed the diagnostic message, slept 30 seconds, and rebooted, which booted into set A. I haven't tested aarch64 yet.

There's still an open question about whether we want to upgrade (and downgrade?) the GRUB core.img on older AMIs by (ab)using migrations.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@iliana iliana requested review from bcressey and sam-aws December 11, 2019 01:25
Comment on lines 5 to 9
configfile /grub/grub.cfg
echo "boot failed (device ($boot_dev), uuid $boot_uuid)"
echo "trying again in 30 seconds..."
sleep 30
reboot
Contributor

The GRUB manual does describe configfile returning. Are these new commands executed after a successful boot (i.e. after selecting a healthy entry), or at shutdown/reboot time?

Contributor Author

The documentation I see reads:

> Load file as a configuration file. If file defines any menu entries, then show a menu containing them immediately. Any environment variable changes made by the commands in file will not be preserved after configfile returns.

The config file we load defines a menu entry with a default and a timeout of 0, so I believe GRUB boots that entry immediately and configfile never returns. (In practice that appears to be the case; systemctl halt does not reboot the host in QEMU.)

packages/grub/core.cfg (outdated review thread, resolved)
If GRUB attempts to load a config file from a boot partition that is
empty or otherwise broken, it will sit forever at a rescue console,
which is not particularly helpful in EC2.

gptprio.next decrements tries_left on partitions not yet marked
successful, so rebooting will usually result in the correct behavior of
rolling back.

Some diagnostic output is printed before sleeping for 30 seconds to help
aid future debugging. No additional output is printed if the configfile
boots successfully.

Signed-off-by: iliana destroyer of worlds <[email protected]>

We haven't been using this module since af72caf.

Signed-off-by: iliana destroyer of worlds <[email protected]>
iliana commented Dec 11, 2019

(I brought up in the aisle that including a short URL that redirects to some documentation about why GRUB might have failed could be useful, once we know what this project is actually going to be called and we have a domain name.)

@iliana iliana requested review from tjkirch and jhaynes December 11, 2019 19:06
@iliana iliana merged commit a8e88af into develop Dec 11, 2019
@tjkirch tjkirch deleted the grub-debug branch December 17, 2019 22:43