Created attachment 1136350
kernel-panic-1.log
Description of problem:
After migrating a guest under high load (iozone, dd, and stress running at the same time) 9 times:
1. kernel panic: 3 times;
2. xfs error, with the guest unresponsive over ssh and console but still answering external ping: 3 times;
3. systemd error, with the guest unresponsive over ssh and console but still answering external ping: 1 time;
4. successful migration: 2 times.
Version-Release number of selected component (if applicable):
Host & Guest kernel:
3.10.0-364.rt56.241.el7.x86_64
Host qemu-kvm:
qemu-kvm-rhev-2.5.0-2.el7.x86_64
How reproducible:
75%
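For reference, the rate can be recomputed from the outcome counts listed in the description. This is only an arithmetic sketch; the variable and function names are made up for illustration, and the result is roughly in line with the 75% quoted here.

```shell
# Outcome counts from the description: panic, xfs error, systemd error, clean.
panic=3; xfs=3; systemd=1; ok=2

failure_rate() {
    # Args: failed runs, total runs. Prints the percentage with one decimal.
    awk -v f="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * f / t }'
}

failed=$((panic + xfs + systemd))
total=$((failed + ok))
echo "$failed of $total runs failed ($(failure_rate "$failed" "$total")%)"
```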
Steps to Reproduce:
1. boot guest in src host
/usr/libexec/qemu-kvm -name rhel7.2-rt-355 -machine pc-i440fx-rhel7.2.0 -cpu IvyBridge -m 4096 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 \
-drive file=/home/rhel7.2-rt-355.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,snapshot=off -device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0 \
-netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:a1:d0:5f \
-monitor stdio -device qxl-vga,id=video0 -serial unix:/tmp/console,server,nowait -vnc :1 -spice port=5900,disable-ticketing
2. set migration parameters in src host hmp
1). migrate_set_speed 2G
2). migrate_set_capability xbzrle on
3. run high stress in guest
1). for((;;)); do iozone -a; done
2). for((;;)); do dd if=/dev/zero of=/home/test bs=1M count=50; done
3). stress --cpu 4 --vm-bytes 2048M --timeout 300s
4. boot guest in dst host with "-incoming tcp:0:4444"
5. migrate
1). migrate -d tcp:10.73.64.233:4444
2). migrate_set_downtime 20
6. observe output of "nc -U /tmp/console" on dst host
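A possible way to start the three guest workloads from step 3 in one shell, so that they really run at the same time, is sketched below. It assumes iozone and stress are installed in the guest and that the dd loop is meant to write to /home/test via of=. The function names start_guest_load and stop_guest_load are hypothetical helpers, not part of any tool.

```shell
# Guest-side load from step 3, started as background jobs.
SCRATCH_FILE=${SCRATCH_FILE:-/home/test}   # scratch path taken from the report
load_pids=""

start_guest_load() {
    # 1) filesystem benchmark, repeated until the loop is killed
    ( while :; do iozone -a; done ) >/dev/null 2>&1 &
    load_pids="$load_pids $!"
    # 2) repeated 50 MiB write of zeroes to the scratch file
    ( while :; do dd if=/dev/zero of="$SCRATCH_FILE" bs=1M count=50; done ) >/dev/null 2>&1 &
    load_pids="$load_pids $!"
    # 3) stress run with the same flags as in the report
    stress --cpu 4 --vm-bytes 2048M --timeout 300s &
    load_pids="$load_pids $!"
}

stop_guest_load() {
    # Signal the three background jobs; an iozone or dd run that is
    # already in flight may still finish its current pass.
    # shellcheck disable=SC2086
    kill $load_pids 2>/dev/null
    load_pids=""
}
```

Call start_guest_load in the guest before step 4, and stop_guest_load once the migration attempt is over.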
Actual results:
Four kinds of results were observed.
1. kernel panic, 3 times
Only the console logs were kept; I lost track of the core file. Once I get the file I will upload it.
The two log files I kept are attached as kernel-panic-1.log and kernel-panic-2.log.
2. xfs error and the guest on the dst host does not respond, 3 times
Log file is xfs-error.log
3. systemd segfault and the guest on the dst host does not respond, 1 time
Log file is systemd-segfault.log
4. guest migrated normally, 2 times
Expected results:
Guest migrates normally every time
Additional info:
Tested this case on a non-rt kernel; no error occurred.
Hello xiywang,
Can you please test with kernel >= kernel-rt-3.10.0-548.rt56.456.el7 and check whether the issue still persists? If it does, can you please provide a guest core dump so I can look at what is going on inside the guest kernel?
Best regards,
Pankaj
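One way to check whether a host or guest already runs a new enough kernel before retesting is sketched below. It assumes GNU sort with version ordering (-V) is available and an x86_64 machine; version_ge is a hypothetical helper name.

```shell
# Minimum kernel release requested in the comment above.
MIN_KERNEL="3.10.0-548.rt56.456.el7"

version_ge() {
    # True when release $1 is the same as or newer than release $2,
    # using the version ordering of sort -V.
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

current=$(uname -r)
current=${current%.x86_64}   # drop the architecture suffix before comparing
if version_ge "$current" "$MIN_KERNEL"; then
    echo "kernel $current is new enough"
else
    echo "kernel $current is older than $MIN_KERNEL"
fi
```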
(In reply to pagupta from comment #14)
> Hello xiywang,
>
> Can you please test with kernel >= kernel-rt-3.10.0-548.rt56.456.el7 and
> check if issue still persists. If it persists Can you please provide me
> guest core dump to look at what's going on inside guest kernel.
>
> Best regards,
> Pankaj
Tested again. The issue remains.
Host & Guest kernel:
# uname -r
3.10.0-576.rt56.486.el7.x86_64
Host qemu-kvm-rhev:
# rpm -qa | grep qemu-kvm-rhev
qemu-kvm-rhev-2.8.0-5.el7.x86_64
After migration, the network in the guest is down. However, I can still manage the guest through remote-viewer or the console. I got the vmcore file and uploaded it, although there is no error message in "dmesg" or on the qemu-kvm command line.
Besides, I'm not sure whether this helps the diagnosis:
After manually triggering a core dump in the guest, I saw these lines in the guest's dmesg:
[ 46.666934] blk_update_request: I/O error, dev fd0, sector 0
[ 48.478112] nr_pdflush_threads exported in /proc is scheduled for removal
--
Celia
(In reply to pagupta from comment #17)
> Hi Pei, Xiywang,
>
> Is this bug present with the latest version of realtime KVM.
>
> Thanks,
> Pankaj
Hi Pankaj,
With the latest versions, this issue is gone.
Versions:
3.10.0-843.rt56.784.el7.x86_64
qemu-kvm-rhev-2.10.0-19.el7.x86_64
tuned-2.9.0-1.el7.noarch
Steps:
Followed the steps in the Description. All 10 migration runs worked well, with no errors on host or guest.
Best Regards,
Pei