Bug 1455763
| Summary: | Support setting next boot value to currently booted OS so when triggering a reboot or crash, system doesn't hang in GRUB menu | | |
|---|---|---|---|
| Product: | [Retired] Restraint | Reporter: | Qiao Zhao <qzhao> |
| Component: | general | Assignee: | Carol Bouchard <cbouchar> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | tools-bugs <tools-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | unspecified | CC: | asavkov, bpeck, breilly, cbeer, cbouchar, jbastian, lilu, mastyk, mjia, zsun |
| Target Milestone: | 0.1.43 | Keywords: | TestBlocker, Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-01-13 18:26:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
This is the expected behaviour on EFI firmware.

If you want to reboot the system as part of a Beaker recipe, make sure your task uses the rhts-reboot command. This sets BootNext in the firmware to ensure the machine boots back into the OS. Otherwise it will just boot to this GRUB menu.

If you power up the system when it's not running any recipe, it will similarly sit at the GRUB menu until you pick an OS to install.

There are more details about how Beaker manipulates EFI boot order in the docs:
https://beaker-project.org/docs/architecture-guide/provisioning-process.html

If you think Beaker is doing something wrong, please provide the complete set of steps you are going through to reach the GRUB menu and what you think it should do instead.

(In reply to Dan Callaghan from comment #1)
> This is the expected behaviour on EFI firmware.
>
> If you want to reboot the system as part of a Beaker recipe, make sure your
> task uses the rhts-reboot command. This sets BootNext in the firmware to
> ensure the machine boots back into the OS. Otherwise it will just boot to
> this GRUB menu.

Yes, like you said, I also talk with our other QE members, we think beaker-lib developer should fix this problem. Because more and more customer/partner reported *regression* problem on EFI environment.
We can avoid those challenge if our test job can automation run by beaker.

For example, *Kdump* can make a system reboot (not rhts-reboot), it always drop to GRUB menu.

So, we hope developer can fix this problem (make it like legacy bios).

--
Thanks,
Qiao

> If you power up the system when it's not running any recipe, it will
> similarly sit at the GRUB menu until you pick an OS to install.
>
> There are more details about how Beaker manipulates EFI boot order in the
> docs:
>
> https://beaker-project.org/docs/architecture-guide/provisioning-process.html
>
> If you think Beaker is doing something wrong, please provide the complete
> set of steps you are going through to reach the GRUB menu and what you think
> it should do instead.

(In reply to Qiao Zhao from comment #2)
> (In reply to Dan Callaghan from comment #1)
> > This is the expected behaviour on EFI firmware.
> >
> > If you want to reboot the system as part of a Beaker recipe, make sure your
> > task uses the rhts-reboot command. This sets BootNext in the firmware to
> > ensure the machine boots back into the OS. Otherwise it will just boot to
> > this GRUB menu.
>
> Yes, like you said, I also talk with our other QE members, we think
> beaker-lib developer should fix this problem. Because more and more
> customer/partner reported *regression* problem on EFI environment.
> We can avoid those challenge if our test job can automation run by beaker.
>
> For example, *Kdump* can make a system reboot (not rhts-reboot), it always
> drop to GRUB menu.
> ...

I think the concern here is for system crash events (OS panic or intentional crash like during kdump testing) they will unfortunately hang the system at GRUB menu and block following tasks to be executed.

This has become a business critical problem that prevents QE from testing those features like kdump on EFI machines in Beaker (instead, manually on local workstations).
And some related bugs have found later on RH partner/customer's EFI servers - which is not good.

Is there a workaround here? Maybe like:
- another command (like rhts-reboot) which only sets BootNext to boot locally (for expected system crash coming next) but don't reboot the system itself?
- or, a way to disable the "post-installation script supplied by Beaker" which "changes the boot order back so that netboot is at the top"?
- or something we can add on netboot menu?

Thanks!

(In reply to Linqing Lu from comment #3)
> (In reply to Qiao Zhao from comment #2)
> > (In reply to Dan Callaghan from comment #1)
> > > This is the expected behaviour on EFI firmware.
> > >
> > > If you want to reboot the system as part of a Beaker recipe, make sure your
> > > task uses the rhts-reboot command. This sets BootNext in the firmware to
> > > ensure the machine boots back into the OS. Otherwise it will just boot to
> > > this GRUB menu.
> >
> > Yes, like you said, I also talk with our other QE members, we think
> > beaker-lib developer should fix this problem. Because more and more
> > customer/partner reported *regression* problem on EFI environment.
> > We can avoid those challenge if our test job can automation run by beaker.
> >
> > For example, *Kdump* can make a system reboot (not rhts-reboot), it always
> > drop to GRUB menu.
> > ...
>
> I think the concern here is for system crash events (OS panic or intentional
> crash like during kdump testing) they will unfortunately hang the system at
> GRUB menu and block following tasks to be executed.
>
> This has become a business critical problem that prevents QE from testing
> those features like kdump on EFI machines in Beaker (instead, manually on
> local workstations).
> And some related bugs have found later on RH partner/customer's EFI servers
> - which is not good.
>
> Is there a workaround here? Maybe like:
> - another command (like rhts-reboot) which only sets BootNext to boot
> locally (for expected system crash coming next) but don't reboot the system
> itself?
> - or, a way to disable the "post-installation script supplied by Beaker"
> which "changes the boot order back so that netboot is at the top"?
> - or something we can add on netboot menu?
>
> Thanks!

Hi Dan Callaghan,

Any update for this problem?

--
Thanks,
Qiao

(In reply to Linqing Lu from comment #3)
> Is there a workaround here? Maybe like:
> - another command (like rhts-reboot) which only sets BootNext to boot
> locally (for expected system crash coming next) but don't reboot the system
> itself?

This sounds like the best approach. And it should be quite easy to implement.

This patch adds a new command rhts-prepare-reboot:
https://gerrit.beaker-project.org/5912

Corresponding command for restraint:
https://gerrit.beaker-project.org/5913

And Beaker docs update:
https://gerrit.beaker-project.org/5914

I should add a self-test similar to the reboot tests, and then put that into workflow-selftest as well.

There does not seem to be consensus that having yet another command to mess with EFI boot order is the best approach. Bumping this to 26.0 so that we can get 25.0 out.

Patch on gerrit introduces new rstrnt-prepare-reboot command that will fix this issue. There will also be an announcement in beaker-user-list so teams are aware of this feature.

rstrnt-prepare-reboot
---------------------

Prepare the system for rebooting. Similar to rstrnt-reboot,
but does not actually trigger the reboot.

If the machine is UEFI and has efibootmgr installed, sets BootNext to
BootCurrent and uses :envvar:`NEXTBOOT_VALID_TIME` to determine
how long (in seconds) this value is valid. After the specified time,
BootOrder is reset to its previous state. The default value for
:envvar:`NEXTBOOT_VALID_TIME` is 180 seconds.

Tasks can run this command before triggering a crash or rebooting
through some other non-standard means. For example::

    rstrnt-prepare-reboot
    echo c > /proc/sysrq-trigger
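
To illustrate the "sets BootNext to BootCurrent" step described above, here is a minimal shell sketch. It is not the actual rstrnt-prepare-reboot implementation. The function names `parse_boot_current` and `prepare_next_boot` are hypothetical, and the sketch assumes that `efibootmgr` prints a `BootCurrent: XXXX` line and accepts `-n XXXX` to set BootNext. It leaves out the :envvar:`NEXTBOOT_VALID_TIME` handling entirely.

```shell
#!/bin/sh
# Hypothetical helper (not part of restraint): read `efibootmgr` output on
# stdin and print the four-hex-digit BootCurrent entry number, if present.
parse_boot_current() {
    sed -n 's/^BootCurrent: *\([0-9A-Fa-f]\{4\}\).*$/\1/p' | head -n 1
}

# Hypothetical helper (not part of restraint): point BootNext at the entry
# the system is currently booted from, so the next boot returns to this OS.
# Silently succeeds on non-UEFI systems or when efibootmgr is not installed.
prepare_next_boot() {
    [ -d /sys/firmware/efi ] || return 0
    command -v efibootmgr >/dev/null 2>&1 || return 0
    current=$(efibootmgr | parse_boot_current)
    [ -n "$current" ] || return 1
    # Assumption: `efibootmgr -n XXXX` sets BootNext to entry XXXX.
    efibootmgr -n "$current" >/dev/null
}
```

Under these assumptions, a task would call `prepare_next_boot` as root immediately before triggering the crash. Keeping the parsing in its own function means it can be checked against captured `efibootmgr` output without touching the firmware.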
Description of problem:
I hit this problem when I try to reboot the system. It hangs at the GRUB stage, like:

GNU GRUB version 0.97 (252K lower / 2010844K upper memory)

+-------------------------------------------------------------------------+
| RHEL5-Server-U9 x86_64                                                  |
| RHEL5-Server-U9 i386                                                    |
| RHEL5-Server-U8 x86_64                                                  |
| RHEL5-Server-U8 i386                                                    |
| RHEL5-Server-U7 x86_64                                                  |
| RHEL5-Server-U7 i386                                                    |
| RHEL5-Server-U7-RC-1 x86_64                                             |
| RHEL5-Server-U7-RC-1 i386                                               |
| RHEL5-Server-U6 x86_64                                                  |
| RHEL5-Server-U6 i386                                                    |
| RHEL5-Server-U6-RC-1 x86_64                                             |
| RHEL5-Server-U6-RC-1 i386                                               | v
+-------------------------------------------------------------------------+

Use the ^ and v keys to select which entry is highlighted.
Press enter to boot the selected OS, 'e' to edit the
commands before booting, 'a' to modify the kernel arguments
before booting, or 'c' for a command-line.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
A UEFI-only machine hangs at the efigrub stage when the system reboots.

Expected results:
A UEFI-only machine can reboot without any action at the efigrub stage.

Additional info:
https://beaker.engineering.redhat.com/view/intel-purley-lr-02.khw.lab.eng.bos.redhat.com#details
https://beaker.engineering.redhat.com/view/nec-ex5800-01.rhts.eng.pek2.redhat.com#details (admin help set STAT as first boot device)