Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
DescriptionYedidyah Bar David
2022-06-28 11:14:19 UTC
Created attachment 1893165[details]
HostedEngine-console.log
Description of problem:
Some time ago, oVirt's automated testing [1] started failing, timing out on a VM's start. At some point we added a console log to a file of the VM's console, which revealed that it was booted, started the grub menu, selected an entry, then got stuck. Later we tried manually running linux/initrd commands from the console, and saw that some times they are very slow.
Today I spent some time on this, pushed a patch [2] to add a progress indicator (using 'insmod progress') to grub, and see that it's indeed slow - often much slower than expected, around 100KiB/s - but not always - sometimes it gets faster, up to around 10MiB/s (which is, I guess, expected/normal).
I then tried manually to eliminate regressions, by downgrading relevant components to older versions - qemu-kvm, grub, which didn't help, and then seabios-bin, which did. Thus opening the bug on this component.
We are using CentOS Stream 8, and the current/broken version is seabios-bin-1.16.0-1.module_el8.7.0+1140+ff0772f9.noarch .
With seabios-bin-1.15.0-1.module_el8.6.0+1087+b42c8331.noarch , in all my tests I got better throughput - minimum was ~ 400KiB/s, often 1-2MiB/s, sometimes more than 20MiB/s.
[1] https://github.com/oVirt/ovirt-system-tests/
[2] https://github.com/oVirt/ovirt-system-tests/pull/203
Version-Release number of selected component (if applicable):
seabios-bin-1.16.0-1.module_el8.7.0+1140+ff0772f9.noarch
How reproducible:
Often, but not sure - not always
Steps to Reproduce:
1. Boot a VM with a large (the default) kernel+initrd
2.
3.
Actual results:
Sometimes it's very slow
Expected results:
Should always be as fast as the platform allows
Additional info:
Attaching a console log which is a result of [2] above. You can view it comfortably with:
$ cat HostedEngine-console.log | tr '\r' '\n' | less -R +/'\[ vml|\[ ini'
What exactly is the boot device?
Can you capture a firmware log?
<qemu:commandline>
<qemu:arg value='-chardev'/>
<qemu:arg value='file,id=firmware,path=/tmp/edk2-devel-q35.log'/>
<qemu:arg value='-device'/>
<qemu:arg value='isa-debugcon,iobase=0x402,chardev=firmware'/>
</qemu:commandline>
Comment 2Yedidyah Bar David
2022-06-29 10:04:27 UTC
(In reply to Gerd Hoffmann from comment #1)
> What exactly is the boot device?
Attaching the libvirt xml (as copied from a typical successful run of our CI, should be mostly identical to current case).
> Can you capture a firmware log?
>
> <qemu:commandline>
> <qemu:arg value='-chardev'/>
> <qemu:arg value='file,id=firmware,path=/tmp/edk2-devel-q35.log'/>
> <qemu:arg value='-device'/>
> <qemu:arg value='isa-debugcon,iobase=0x402,chardev=firmware'/>
> </qemu:commandline>
I'll try.
three changes for virtio-blk in 1.15 ... 1.16:
a05af290bac5 virtio-blk: split large IO according to size_max
815d7498655b virtio-blk: abstract a function named virtio_blk_op_one_segment to handle r/w request
27b573d4f5a5 virtio-blk: add feature VIRTIO_BLK_F_SIZE_MAX and VIRTIO_BLK_F_SEG_MAX
They make seabios query the host for the maximum request size and if io requests
are larger than that seabios will split them into smaller ones. Should essentially
be a no-op, but maybe there is a bug in there that seabios splits requests even
though it is not needed, thereby slowing down the boot.
The firmware log (grep for lines with virtio-blk) should help figuring this.
Comment 8Yedidyah Bar David
2022-06-30 06:12:18 UTC
Attached some console and firmware logs.
Files named *-03 are from a plain unattended boot.
Files named *-04* are from a run where I entered the grub prompt and ran several 'initrd' commands to load the default initramfs image (~ 67 MB). During some of them, I imposed cpu load on the host (plain 'while true; do :; done &'). Tried running more than one endless loop in parallel but this caused no difference - apparently we (vdsm/libvirt, no idea) somehow allocate a cpu to the qemu process for itself (the host (actually a nested-kvm VM) has 2 cpus). While cpu was loaded, throughput was ~ 100KB/s. When not, it was faster, but changing significantly.
Files named *-05* are from a similar flow as 04, but after downgrading seabios-bin to 1.15. Throughput during cpu load is ~ 500 KB/s, without cpu load it's more than 1MB/s (in previous cases it also went much higher, not sure what changed. Perhaps the various tests cause memory fragmentation or whatever that makes it rather slow even unloaded, just a wild guess).
(In reply to Yedidyah Bar David from comment #8)
> Attached some console and firmware logs.
Seems to be corrupted or incomplete. I get a decompression error + only HostedEngine-console.log-02 unpacked, the other files are missing.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (Low: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:7472