Bug 1392569
| Summary: | decouple the SeaBIOS build from iasl | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Laszlo Ersek <lersek> |
| Component: | seabios | Assignee: | Gerd Hoffmann <kraxel> |
| Status: | CLOSED ERRATA | QA Contact: | FuXiangChun <xfu> |
| Severity: | high | Docs Contact: | Jiri Herrmann <jherrman> |
| Priority: | high | | |
| Version: | 7.3 | CC: | chayang, juzhang, knoel, kraxel, mrezanin, mst, mtessun, sbognann, snagar, virt-maint, xuzhang, yafu |
| Target Milestone: | rc | Keywords: | Regression, ZStream |
| Target Release: | 7.4 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | seabios-1.10.1-1.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1400102 | Environment: | |
| Last Closed: | 2017-08-01 17:44:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1395265, 1400102, 1401400 | | |
> (2) Otherwise, if CONFIG_ACPI is set, then SeaBIOS's builtin tables are
> installed, along with any additional (non-critical) ACPI tables coming

> In RHEL-7.3 downstream, even with qemu-kvm (that is, not just
> qemu-kvm-rhev), branches (2) and (3) are never taken in SeaBIOS, because
> those QEMU versions always provide ACPI via "etc/table-loader".

I think this is only true for the RHEL-7 machine types.

> We should make this official, and set CONFIG_ACPI=n -- otherwise it defaults
> to "y", see "src/Kconfig" -- in all of the redhat/config.* files.

Given that we have different seabios binaries with different sizes for RHEL-6 (128k) and RHEL-7 (256k) machine types, we can turn CONFIG_ACPI off for the 256k bios.

> - by the fact that upstream ACPICA / iASL developers advised us to avoid
> problems like bug 1377087 by sticking to old iASL releases (they wouldn't
> introduce compatibility flags for old AML interpreters).

We can just drop the iasl build dependency from seabios after rebasing to 1.10, thanks to this one:

commit 4373afaef34a76733638edc4e98f355b13c52f10
Author: Kevin O'Connor <kevin>
Date:   Tue Nov 17 18:45:41 2015 -0500

    acpi: Don't build SSDT files on every build; store them in git

    The SSDT files are rarely modified - recent QEMU versions don't use
    them at all and adding features to them in SeaBIOS has been
    deprecated. It no longer makes sense to generate them on every
    build. The content will remain (for use on old machine types in
    QEMU) in static files committed to the SeaBIOS git repo. If the
    contents do need to be generated a new build target (make iasl) is
    available.

    Signed-off-by: Kevin O'Connor <kevin>

(In reply to Gerd Hoffmann from comment #1)
> > (2) Otherwise, if CONFIG_ACPI is set, then SeaBIOS's builtin tables are
> > installed, along with any additional (non-critical) ACPI tables coming
> >
> > In RHEL-7.3 downstream, even with qemu-kvm (that is, not just
> > qemu-kvm-rhev), branches (2) and (3) are never taken in SeaBIOS, because
> > those QEMU versions always provide ACPI via "etc/table-loader".
>
> I think this is only true for the RHEL-7 machine types.

Ouch!!! You are right, of course!

In this case however, we'll have to release a 7.3.z update for SeaBIOS as well. The fix for qemu-kvm z-stream bug 1392027 solves the issue for RHEL-5 guests that run in RHEL-7 machine types on RHEL-7 hosts, but it doesn't help RHEL-5 guests that run in RHEL-6 machine types on RHEL-7 hosts.

If you check the build logs for the latest SeaBIOS build (see the direct links in the next, private comment), namely seabios-1.9.1-5.el7, "build.log" confirms that "iasl" is invoked several times during the build, and "root.log" confirms that the "acpica-tools" package that gets installed in the build root has version "20160527-1.el7.x86_64". This is exactly the problematic acpica-tools package: refer to qemu-kvm y-stream bug 1377087 comment 17.

In fact I can trigger this error. I just flipped the machine type of my "seabios.rhel5" VM -- which I had created while working on bug 1377087 / bug 1392027 -- from "pc-i440fx-rhel7.0.0" to "rhel6.6.0", with the emulator being qemu-kvm-rhev-2.6.0-28.el7.x86_64. This change triggers the bug: the guest dmesg is chock-full of AML parsing errors, and "poweroff" in the guest only halts the system, it doesn't power it down!

So yeah, we need a Z-Stream clone for this bug as well -- we have to replicate the qemu-kvm bugfix to RHEL-7.3.Z SeaBIOS. Namely, generate the *.hex files statically with the "iasl" tool from the known-good acpica-tools package (20150619-3.el7.x86_64), check them into git, and use those in the build. I guess this would practically mean a backport of upstream 4373afaef34a to RHEL-7.3.z SeaBIOS.

Setting the Regression and ZStream keywords -- for RHEL-5 guests running in rhel6.?.0 machine types on RHEL-7 hosts, this issue is a regression.

In addition, for 7.4 y-stream, we can't just blindly rebase to upstream SeaBIOS 1.10. We have to manually regenerate the *.hex files with iASL from the known good acpica-tools-20150619-3.el7.x86_64 package. Otherwise RHEL-5 guests running in rhel6.?.0 machine types on RHEL-7.4 hosts will break.
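The two acpica-tools builds named in this bug can be told apart mechanically before a build root is trusted. The following is only a minimal sketch, not part of any RHEL or SeaBIOS tooling: the function name classify_acpica_tools is hypothetical, and it recognizes nothing beyond the two version-release strings quoted in the comments above.

```shell
#!/bin/sh
# Hypothetical helper: classify an acpica-tools version-release string
# using only the two builds mentioned in this bug report.
classify_acpica_tools() {
    case "$1" in
        20150619-3.el7*) echo known-good ;;   # build the comments call known-good
        20160527-1.el7*) echo problematic ;;  # build seen in root.log for 1.9.1-5.el7
        *)               echo unknown ;;      # anything else needs manual review
    esac
}

# Demonstration with the literal strings from the comments above.
classify_acpica_tools "20150619-3.el7.x86_64"
classify_acpica_tools "20160527-1.el7.x86_64"

# On an RPM-based build host the installed package could be checked with:
#   classify_acpica_tools "$(rpm -q --qf '%{VERSION}-%{RELEASE}' acpica-tools)"
```

An "unknown" result is deliberately not treated as safe, since the bug shows that a newer iASL can silently change the generated AML.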
I don't know what iASL version Kevin used for upstream commit 4373afaef34a, so we should check first...
Standing at commit d7adf6044a4c ("docs: Note v1.10.0 release") in the upstream tree, and using "redhat/config.base" from the RHEL-7 tree at 9b5b1bcf4943 ("Update to seabios-1.9.1-5.el7") as ".config", I've now rebuilt the upstream binary:
...:~/src/upstream/seabios$ cat ~/src/rhel7/seabios-rhel7/redhat/config.base \
>.config
...:~/src/upstream/seabios$ make oldnoconfig
...:~/src/upstream/seabios$ make out/bios.bin
With the /domain/os/loader element in my domain XML pointing to this binary, the bug does *not* reproduce. The RHEL-5 guest (in the rhel6.6.0 machine type) shuts down fine, and there are no ACPI errors in the dmesg. Backporting 4373afaef34a to 7.3.z, and rebasing 7.4 to upstream 1.10 (which includes it), both look safe.
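When comparing rebuilt binaries like this, it can help to confirm which ACPI-related options actually ended up in ".config". The sketch below assumes the usual Kconfig output format ("CONFIG_FOO=y" for enabled options, "# CONFIG_FOO is not set" for disabled ones); the function name kconfig_state is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report the state of one option in a Kconfig-style file.
#   $1 = option name (e.g. CONFIG_ACPI), $2 = path to the config file
# Prints "y", "n", or "absent".
kconfig_state() {
    if grep -q "^$1=y\$" "$2"; then
        echo y
    elif grep -q -e "^# $1 is not set\$" -e "^$1=n\$" "$2"; then
        echo n
    else
        echo absent
    fi
}

# Only inspect .config if one exists in the current directory.
if [ -f .config ]; then
    for opt in CONFIG_ACPI CONFIG_ACPI_DSDT CONFIG_FW_ROMFILE_LOAD; do
        echo "$opt: $(kconfig_state "$opt" .config)"
    done
fi
```

The anchored patterns keep CONFIG_ACPI from matching CONFIG_ACPI_DSDT, which matters here because the two options are discussed separately in this bug.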
scratch build: http://people.redhat.com/ghoffman/bz1392569/

(In reply to Laszlo Ersek from comment #5)
> I don't know what iASL version Kevin used for upstream commit 4373afaef34a,
> so we should check first...

IIRC it is a known-good one, reason for this change actually was the ever-changing iasl (together with the expectation that the tables will not need updates any more due to being used for old machine types only).

> Pointing the /domain/os/loader element in my domain XML to this binary, the
> bug does *not* reproduce. The RHEL-5 guests (in the rhel6.6.0 machine type)
> shuts down fine, and there are no ACPI errors in the dmesg.

Even newer guests seem to be affected by this, the scratch build seems to fix bug 1394469 in my testing (not fully sure yet as this one is a bit random).

(In reply to Gerd Hoffmann from comment #7)
> (In reply to Laszlo Ersek from comment #5)
> > I don't know what iASL version Kevin used for upstream commit 4373afaef34a,
> > so we should check first...
>
> IIRC it is a known-good one, reason for this change actually was the
> ever-changing iasl (together with the expectation that the tables will not
> need updates any more due to being used for old machine types only).
>
> > Pointing the /domain/os/loader element in my domain XML to this binary, the
> > bug does *not* reproduce. The RHEL-5 guests (in the rhel6.6.0 machine type)
> > shuts down fine, and there are no ACPI errors in the dmesg.
>
> Even newer guests seem to be affected by this, the scratch build seems to
> fix bug 1394469 in my testing (not fully sure yet as this one is a bit
> random).

Newer guests are affected when started with machine type "rhel6.5.0", as bug 1394469 mentions, but after updating SeaBIOS to the version from Gerd's comment 6, the problem no longer occurs.
Bug reproduction info:

Test versions:
Host kernel version: 3.10.0-521.el7.x86_64
Guest kernel version: 3.10.0-521.el7.x86_64
qemu-kvm-rhev version: 2.6.0-27.el7.x86_64
seabios version: 1.9.1-5.el7.noarch

Test cmds:
/usr/libexec/qemu-kvm -name rhel7.3-1 \
    -machine rhel6.5.0,accel=kvm,usb=off,vmport=off \
    -cpu SandyBridge \
    -m 4096 \
    -realtime mlock=off \
    -smp 8,sockets=8,cores=1,threads=1 \
    -uuid 1534fa42-4818-4493-9f67-eee5ba758385 \
    -no-user-config -nodefaults \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
    -mon chardev=qmp_id_catch_monitor,id=monitor,mode=control \
    -rtc base=utc,driftfix=slew \
    -global kvm-pit.lost_tick_policy=discard -no-hpet \
    -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
    -boot menu=on,splash-time=1200 \
    -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 \
    -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 \
    -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 \
    -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 \
    -drive file=/home/73test/img/se_test.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none,aio=native,snapshot=off \
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
    -drive file=/home/kvm_autotest_root/iso/linux/RHEL7.3-Server-x86_64.iso,if=none,id=drive-ide0-0-0,readonly=on,format=raw \
    -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=50:54:00:49:b2:5f,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev spicevmc,id=charchannel1,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 \
    -device usb-tablet,id=input0 \
    -vnc 0:2 \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 \
    -chardev spicevmc,id=charredir0,name=usbredir \
    -device usb-redir,chardev=charredir0,id=redir0 \
    -chardev spicevmc,id=charredir1,name=usbredir \
    -device usb-redir,chardev=charredir1,id=redir1 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
    -msg timestamp=on \
    -monitor stdio

Siddharth, this is a regression introduced in RHEL-7.3 that is being hit by customers and QE in the field, see bug 1394469 and bug 1394403. The fix is well understood and has been confirmed to fix the problem. I was waiting for some test results to request the PMApproved keyword, I'm sorry it took so long.

*** Bug 1394469 has been marked as a duplicate of this bug. ***

Reproduced this bug with seabios-1.9.1-5.el7.x86_64.

Steps:
1. /usr/libexec/qemu-kvm -name rhel7.3-1 \
    -machine rhel6.5.0,accel=kvm,usb=off,vmport=off \
    -cpu Opteron_G5 \
    -m 4096 \
    -realtime mlock=off \
    -smp 8,sockets=8,cores=1,threads=1 \
    -uuid 1534fa42-4818-4493-9f67-eee5ba758385 \
    -no-user-config -nodefaults \
    -chardev socket,id=qmp_id_catch_monitor,path=/var/tmp/test1,server,nowait \
    -mon chardev=qmp_id_catch_monitor,id=monitor,mode=control \
    -rtc base=utc,driftfix=slew \
    -global kvm-pit.lost_tick_policy=discard -no-hpet \
    -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
    -boot menu=on,splash-time=1200 \
    -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 \
    -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 \
    -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 \
    -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 \
    -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x5 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 \
    -drive file=/home/kvm_autotest_root/images/rhel73-64-virtio.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none,aio=native,snapshot=off \
    -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=50:54:00:49:b2:5f,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev spicevmc,id=charchannel1,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 \
    -device usb-tablet,id=input0 \
    -vnc 0:2 \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 \
    -chardev spicevmc,id=charredir0,name=usbredir \
    -device usb-redir,chardev=charredir0,id=redir0 \
    -chardev spicevmc,id=charredir1,name=usbredir \
    -device usb-redir,chardev=charredir1,id=redir1 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 \
    -msg timestamp=on \
    -monitor stdio \
    -vnc :1
2. (qemu) system_reset
   (qemu) system_reset

Result: the guest hangs.

Verified this bug with seabios-1.10.1-2.el7.x86_64 & qemu-kvm-rhev-2.8.0-4.el7.x86_64. Executed the restart command (system_reset) from the monitor many times; the guest works well. So this bug is fixed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:1855
ACPI table installation by SeaBIOS occurs as follows:

(1) if CONFIG_FW_ROMFILE_LOAD is set, and QEMU provides ACPI tables via the "etc/table-loader" fw_cfg file and friends, then those tables are installed. See qemu_platform_setup() in "src/fw/paravirt.c".

(2) Otherwise, if CONFIG_ACPI is set, then SeaBIOS's builtin tables are installed, along with any additional (non-critical) ACPI tables coming from QEMU, via the legacy table-wise ACPI interface. (See seabios commit 188d9945a0e5eb59f1a6711e83e79e66562532eb for a reminder.) These ACPI tables are supposed to originate from the user (passed on the QEMU command line). If CONFIG_ACPI_DSDT is also set, and QEMU / the user didn't provide a custom DSDT, then SeaBIOS's internal DSDT is installed as well. See the acpi_setup() call at the end of qemu_platform_setup(), and the CONFIG_ACPI / CONFIG_ACPI_DSDT checks in acpi_setup() [src/fw/acpi.c].

(3) Otherwise, no ACPI tables are installed.

In RHEL-7.3 downstream, even with qemu-kvm (that is, not just qemu-kvm-rhev), branches (2) and (3) are never taken in SeaBIOS, because those QEMU versions always provide ACPI via "etc/table-loader".

We should make this official, and set CONFIG_ACPI=n -- otherwise it defaults to "y", see "src/Kconfig" -- in all of the redhat/config.* files. This is motivated mainly:
- by the lesson learned from bug 1377087, and
- by the fact that upstream ACPICA / iASL developers advised us to avoid problems like bug 1377087 by sticking to old iASL releases (they wouldn't introduce compatibility flags for old AML interpreters).

Since iASL is part of the build process of SeaBIOS, SeaBIOS's builtin tables and QEMU's own ACPI payload could diverge without bounds -- generally, but specifically wrt. ACPI compatibility too --, as QEMU development progresses *and* the Brew buildroot sees asynchronous upgrades to the acpica-tools package.
Because SeaBIOS's built-in tables are never exposed to the guest using RHEL-7 downstream qemu-kvm or qemu-kvm-rhev -- see above --, the change should present no observable change in behavior. However, it would make things safer and official: the currently built-in (and obsolete) tables wouldn't even be present in the SeaBIOS binary. If, for any reason, the SeaBIOS ACPI payload were exposed to the guest, it wouldn't be bogus but apparently valid ACPI stuff -- it would be *no* ACPI stuff at all, which is much-much easier to diagnose.
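The three-branch selection described in this report can be modeled in a few lines. This is a toy sketch for reasoning about the cases only, not SeaBIOS code: the real logic lives in qemu_platform_setup() and acpi_setup(), the function name acpi_table_source and its "y"/"n" arguments are hypothetical, and the CONFIG_ACPI_DSDT sub-case of branch (2) is not modeled.

```shell
#!/bin/sh
# Toy model of branches (1)-(3); NOT SeaBIOS code.
# Arguments (hypothetical, each "y" or "n"):
#   $1  CONFIG_FW_ROMFILE_LOAD is set
#   $2  QEMU provides "etc/table-loader" via fw_cfg
#   $3  CONFIG_ACPI is set
acpi_table_source() {
    if [ "$1" = y ] && [ "$2" = y ]; then
        echo "qemu-table-loader"   # branch (1): tables come from QEMU
    elif [ "$3" = y ]; then
        echo "seabios-builtin"     # branch (2): iASL-generated builtin tables
    else
        echo "none"                # branch (3): no ACPI tables at all
    fi
}

# QEMU provides the tables: CONFIG_ACPI makes no difference.
acpi_table_source y y y    # prints "qemu-table-loader"
acpi_table_source y y n    # prints "qemu-table-loader"

# QEMU does not provide the tables: CONFIG_ACPI decides the outcome.
acpi_table_source y n y    # prints "seabios-builtin"
acpi_table_source y n n    # prints "none"
```

The last two calls are the cases where the quality of the iASL-generated builtin tables becomes visible to the guest.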