Bug 1820279

Summary: RHEL 6 KVM VMs hang and coredumps
Product: Red Hat Enterprise Linux 6
Reporter: dyuen
Component: qemu-kvm
Assignee: Amnon Ilan <ailan>
Status: CLOSED WORKSFORME
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 6.10
CC: ehabkost, jen, jinzhao, juzhang, knoel, mkenneth, pbonzini, tvainio, virt-maint, xfu
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-09-25 11:50:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description dyuen 2020-04-02 16:26:54 UTC
Description of problem:
VMs on a host are seen to hang, and the host is generating segfaults for QEMU-KVM.


Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux Server release 6.10 (Santiago) - 2.6.32-754.18.2.el6.x86_64

gpxe-roms-qemu-0.9.7-6.16.el6.noarch                        Tue Jun 18 19:20:53 2019
qemu-img-0.12.1.2-2.506.el6_10.4.x86_64                     Tue Oct 22 06:45:19 2019
qemu-kvm-0.12.1.2-2.506.el6_10.4.x86_64                     Tue Oct 22 06:45:32 2019

How reproducible:
Based on customer-provided output, the VMs are generating core dumps:
# ll -ha /ericsson/enm/dumps
total 24G
drwxrwxr-x. 17     308 jboss   8.0K Apr  1 15:35 .
drwxr-xr-x.  4 root    root    4.0K Jun 18  2019 ..
-rw-------.  1 root    root    240K Apr  1 12:35 core.dlgenmsvc3.date.pid23177.usr0.sig11.tim1585724702
-rw-------.  1 root    root    240K Apr  1 00:35 core.dlgenmsvc3.date.pid34048.usr0.sig11.tim1585681503
-rw-------.  1 root    root    362M Mar 31 02:14 core.dlgenmsvc3.java.pid28119.usr0.sig6.tim1585601042
-rw-------.  1 root    root    361M Mar 31 09:44 core.dlgenmsvc3.java.pid37069.usr0.sig6.tim1585628043
-rw-------.  1 root    root    135M Apr  1 10:33 core.svc-3-cmserv.java.pid26347.usr0.sig6.tim1585717382
-rw-------.  1 root    root    121M Mar 30 23:09 core.svc-3-cmserv.java.pid9985.usr0.sig6.tim1585589970
-rw-------.  1 root    root     59M Apr  1 10:09 core.svc-3-comecimpolicy.java.pid2717.usr0.sig6.tim1585715955
-rw-------.  1 root    root     76M Mar 30 23:10 core.svc-3-eventbasedclient.java.pid15862.usr0.sig6.tim1585590030
-rw-------.  1 root    root     43M Apr  1 01:05 core.svc-3-eventbasedclient.jps.pid24737.usr0.sig6.tim1585683303
-rw-------.  1 root    root     85M Apr  1 02:57 core.svc-3-flowautomation.java.pid10998.usr0.sig6.tim1585690042
-rw-------.  1 root    root     87M Mar 31 06:43 core.svc-3-flowautomation.java.pid23073.usr0.sig6.tim1585617233
-rw-------.  1 root    root    191M Mar 31 02:37 core.svc-3-fmalarmprocessing.java.pid30555.usr0.sig6.tim1585602420
-rw-------.  1 root    root    175M Mar 31 09:36 core.svc-3-fmhistory.java.pid19464.usr0.sig6.tim1585627562
-rw-------.  1 root    root    151M Mar 30 23:49 core.svc-3-fmhistory.java.pid23896.usr0.sig6.tim1585592342
-rw-------.  1 root    root     81M Mar 30 21:06 core.svc-3-fmhistory.java.pid28357.usr0.sig6.tim1585582561
-rw-------.  1 root    root    127M Apr  1 09:05 core.svc-3-fmhistory.jps.pid31902.usr0.sig6.tim1585712103
-rw-------.  1 root    root    131M Mar 31 07:53 core.svc-3-fmx.java.pid11702.usr0.sig6.tim1585621382
-rw-------.  1 root    root    371M Mar 31 17:33 core.svc-3-fmx.java.pid13922.usr0.sig6.tim1585656183
-rw-------.  1 root    root    236M Mar 31 11:35 core.svc-3-fmx.java.pid16759.usr0.sig6.tim1585634703
-rw-------.  1 root    root    209M Mar 31 06:12 core.svc-3-fmx.java.pid18052.usr0.sig6.tim1585615322
-rw-------.  1 root    root    139M Apr  1 05:43 core.svc-3-fmx.java.pid18303.usr0.sig6.tim1585699982
-rw-------.  1 root    root    225M Mar 31 23:01 core.svc-3-fmx.java.pid19208.usr0.sig6.tim1585675861
-rw-------.  1 root    root    411M Mar 31 05:59 core.svc-3-fmx.java.pid20005.usr0.sig6.tim1585614542
-rw-------.  1 root    root    210M Mar 31 17:43 core.svc-3-fmx.java.pid22596.usr0.sig6.tim1585656781
-rw-------.  1 root    root    125M Mar 31 19:56 core.svc-3-fmx.java.pid24257.usr0.sig6.tim1585664762
-rw-------.  1 root    root    115M Mar 31 07:37 core.svc-3-fmx.java.pid24384.usr0.sig6.tim1585620421
-rw-------.  1 root    root    320M Mar 31 09:19 core.svc-3-fmx.java.pid25951.usr0.sig6.tim1585626542
-rw-------.  1 root    root    235M Apr  1 07:35 core.svc-3-fmx.java.pid28354.usr0.sig6.tim1585706703
-rw-------.  1 root    root    226M Apr  1 02:08 core.svc-3-fmx.java.pid29728.usr0.sig6.tim1585687082
-rw-------.  1 root    root    208M Mar 30 21:36 core.svc-3-fmx.java.pid30704.usr0.sig6.tim1585584362
-rw-------.  1 root    root    211M Mar 31 13:01 core.svc-3-fmx.java.pid31247.usr0.sig6.tim1585639862
-rw-------.  1 root    root    119M Apr  1 12:10 core.svc-3-fmx.java.pid4138.usr0.sig6.tim1585723202
-rw-------.  1 root    root    221M Mar 30 22:30 core.svc-3-fmx.java.pid4239.usr0.sig6.tim1585587602
-rw-------.  1 root    root    408M Apr  1 09:31 core.svc-3-fmx.java.pid4440.usr0.sig6.tim1585713663
-rw-------.  1 root    root    212M Mar 31 04:55 core.svc-3-fmx.java.pid5238.usr0.sig6.tim1585610702
-rw-------.  1 root    root    226M Mar 31 04:11 core.svc-3-fmx.java.pid7340.usr0.sig6.tim1585608062
-rw-------.  1 root    root    238M Apr  1 00:50 core.svc-3-fmx.java.pid8815.usr0.sig6.tim1585682404
-rw-------.  1 root    root    173M Mar 30 23:46 core.svc-3-httpd.java.pid24623.usr0.sig6.tim1585592163
-rw-------.  1 root    root    184M Mar 31 15:46 core.svc-3-httpd.java.pid30252.usr0.sig6.tim1585649764
-rw-------.  1 root    root    151M Mar 31 16:51 core.svc-3-impexpserv.java.pid19998.usr0.sig6.tim1585653717
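To see at a glance which programs dumped core and on which signal, the listing can be tallied with a short script. This is only a sketch: the field layout (host, executable, pid, uid, signal, epoch time) is an assumption inferred from the file names above, not taken from the customer's core_pattern setting, and the names `CORE_RE` and `tally` are made up for illustration. On Linux x86_64, signal 6 is SIGABRT and signal 11 is SIGSEGV.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the file names in the listing:
# core.<host>.<executable>.pid<pid>.usr<uid>.sig<signal>.tim<epoch seconds>
CORE_RE = re.compile(
    r"core\.(?P<host>.+)\.(?P<exe>[^.]+)\.pid(?P<pid>\d+)"
    r"\.usr(?P<uid>\d+)\.sig(?P<sig>\d+)\.tim(?P<tim>\d+)$"
)

def tally(names):
    """Count core files per (executable, signal number).

    Accepts bare file names or whole `ls -l` lines; lines that do not
    contain a core file name in the assumed layout are skipped.
    """
    counts = Counter()
    for name in names:
        m = CORE_RE.search(name.strip())
        if m:
            counts[(m.group("exe"), int(m.group("sig")))] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "core.dlgenmsvc3.date.pid23177.usr0.sig11.tim1585724702",
        "core.svc-3-fmx.java.pid4138.usr0.sig6.tim1585723202",
        "core.svc-3-fmhistory.jps.pid31902.usr0.sig6.tim1585712103",
    ]
    for (exe, sig), n in sorted(tally(sample).items()):
        print(exe, sig, n)
```

Run against the full listing, this would separate the two `date` cores with sig11 from the `java` and `jps` cores with sig6.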


Steps to Reproduce:
1.
2.
3.

Actual results:
Host generating segfaults for QEMU-KVM

Mar 15 01:02:00 dlgenmsvc3 kernel: python[62788]: segfault at 46e49b6c ip 00007f238d700018 sp 00007fff73db3810 error 6 in libpython2.6.so.1.0[7f238d670000+15d000]
Mar 15 20:45:03 dlgenmsvc3 kernel: lvs[23917]: segfault at 7f17bf3afff8 ip 00007f17bf3afff8 sp 00007fff5a7317e0 error 14 in libdevmapper.so.1.02[7f17bf349000+200000]
Mar 16 03:35:31 dlgenmsvc3 kernel: lvs[17742]: segfault at 7fe3931afff8 ip 00007fe3931afff8 sp 00007ffd9f0818a0 error 14 in libdevmapper.so.1.02[7fe393149000+200000]
Mar 17 16:35:30 dlgenmsvc3 kernel: lvs[11925]: segfault at 7f447988fff8 ip 00007f447988fff8 sp 00007ffea255ccb0 error 14 in libdevmapper.so.1.02[7f4479829000+200000]
Mar 18 09:07:55 dlgenmsvc3 kernel: python[31356]: segfault at 46c87b6c ip 00007fdf6bb00018 sp 00007ffd6b7f5eb0 error 6 in libpython2.6.so.1.0[7fdf6ba70000+15d000]
Mar 19 03:31:37 dlgenmsvc3 kernel: python[51767]: segfault at 46f38efc ip 00007f2a61600018 sp 00007ffc23d38a60 error 6 in libpython2.6.so.1.0[7f2a61570000+15d000]
Mar 19 09:00:02 dlgenmsvc3 kernel: vgs[8819]: segfault at 7f1d3abcfff8 ip 00007f1d3abcfff8 sp 00007ffc066ef8f0 error 14 in libdevmapper.so.1.02[7f1d3ab69000+200000]
Mar 23 03:13:28 dlgenmsvc3 kernel: python[61689]: segfault at 470d5b6c ip 00007fe2a3400018 sp 00007ffdb173e740 error 6 in libpython2.6.so.1.0[7fe2a3370000+15d000]
Mar 23 04:50:03 dlgenmsvc3 kernel: sh[51015]: segfault at 7f417c100018 ip 00007f417c100018 sp 00007fffffc04bc0 error 14 in ld-2.12.so[7f417c17d000+20000]
Mar 25 18:42:14 dlgenmsvc3 kernel: qemu-kvm[16414]: segfault at 7f6dab000183 ip 00007f6dab000183 sp 00007fff25b160e0 error 14 in libaio.so.1.0.1[7f6daae3c000+1ff000]
Mar 30 19:40:13 dlgenmsvc3 kernel: python[60068]: segfault at 0 ip 00007ff078301018 sp 00007ffefebfa690 error 6 in libpython2.6.so.1.0[7ff0782e5000+15d000]



Expected results:
No segfault messages



Additional info:
Customer provided a sosreport and a core file to case 02621916:
 - core.svc-3-eventbasedclient.jps.pid17442.usr0.sig6.tim1585050611
 - sosreport-dlgenmsvc3.3677085-20200331070513.tar.xz
No system outage, but throughput is degraded

Comment 2 FuXiangChun 2020-04-03 06:07:10 UTC
Hi David,

I used virt-manager to boot a RHEL 6 guest and the guest works well, so I can't reproduce this bug. The detailed qemu-kvm command line I used is below. Could you provide the qemu command line, or the layered product tool used to start the guests?

/usr/libexec/qemu-kvm -name rhel6 -S -M rhel6.6.0 -cpu Opteron_G5 -enable-kvm -m 4096 -realtime mlock=off -smp 8,sockets=8,cores=1,threads=1 -uuid 9ad6e91e-b072-da76-4039-0f58559c59ad -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/rhel6.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot order=c,menu=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -drive file=/home/rhel6.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0 -drive file=/home/RHEL6.10-Server-x86_64.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:bd:07:7e,bus=pci.0,addr=0x3 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0 -vnc 127.0.0.1:0 -vga cirrus -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on

 
# uname -r
2.6.32-754.el6.x86_64
# rpm -qa|grep qemu
qemu-guest-agent-0.12.1.2-2.506.el6_10.7.x86_64
qemu-kvm-0.12.1.2-2.506.el6_10.7.x86_64
qemu-img-0.12.1.2-2.506.el6_10.7.x86_64
qemu-kvm-tools-0.12.1.2-2.506.el6_10.7.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.506.el6_10.7.x86_64
gpxe-roms-qemu-0.9.7-6.16.el6.noarch

Comment 7 Eduardo Habkost 2020-06-11 20:45:40 UTC
In most of the segfaults shown in the logs, the faulting address is at the instruction pointer:

Mar 15 20:45:03 dlgenmsvc3 kernel: lvs[23917]: segfault at 7f17bf3afff8 ip 00007f17bf3afff8 sp 00007fff5a7317e0 error 14 in libdevmapper.so.1.02[7f17bf349000+200000]
Mar 16 03:35:31 dlgenmsvc3 kernel: lvs[17742]: segfault at 7fe3931afff8 ip 00007fe3931afff8 sp 00007ffd9f0818a0 error 14 in libdevmapper.so.1.02[7fe393149000+200000]
Mar 17 16:35:30 dlgenmsvc3 kernel: lvs[11925]: segfault at 7f447988fff8 ip 00007f447988fff8 sp 00007ffea255ccb0 error 14 in libdevmapper.so.1.02[7f4479829000+200000]
Mar 19 09:00:02 dlgenmsvc3 kernel: vgs[8819]: segfault at 7f1d3abcfff8 ip 00007f1d3abcfff8 sp 00007ffc066ef8f0 error 14 in libdevmapper.so.1.02[7f1d3ab69000+200000]
Mar 23 04:50:03 dlgenmsvc3 kernel: sh[51015]: segfault at 7f417c100018 ip 00007f417c100018 sp 00007fffffc04bc0 error 14 in ld-2.12.so[7f417c17d000+20000]
Mar 25 18:42:14 dlgenmsvc3 kernel: qemu-kvm[16414]: segfault at 7f6dab000183 ip 00007f6dab000183 sp 00007fff25b160e0 error 14 in libaio.so.1.0.1[7f6daae3c000+1ff000]

In all of them except the sh[51015] crash, the IP value is inside the shared library address range.  I don't know what could be causing those segfaults, but I suspect the cause is not related to KVM at all.
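The two checks above (faulting address equal to the instruction pointer, and IP inside the mapped library range) can be repeated mechanically for every log line. The sketch below assumes the x86 "segfault at ... ip ... sp ... error ... in lib[base+size]" message format shown in the logs, and assumes the error field is printed in hexadecimal, with the usual x86 page-fault bits (0x02 write, 0x04 user mode, 0x10 instruction fetch). `SEGFAULT_RE` and `classify` are illustrative names, not an existing tool.

```python
import re

# Pattern for the kernel "segfault at ..." line as it appears in the logs above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def classify(line):
    """Describe one segfault log line, or return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    err = int(m.group("err"), 16)  # assumption: error code is logged in hex
    ip_in_library = None  # stays None when the line names no mapping
    if m.group("base") is not None:
        base = int(m.group("base"), 16)
        size = int(m.group("size"), 16)
        ip_in_library = base <= ip < base + size
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_at_ip": addr == ip,
        "ip_in_library": ip_in_library,
        "instruction_fetch": bool(err & 0x10),
        "write": bool(err & 0x02),
        "user_mode": bool(err & 0x04),
    }

if __name__ == "__main__":
    line = ("Mar 23 04:50:03 dlgenmsvc3 kernel: sh[51015]: segfault at "
            "7f417c100018 ip 00007f417c100018 sp 00007fffffc04bc0 error 14 "
            "in ld-2.12.so[7f417c17d000+20000]")
    print(classify(line))
```

Under those assumptions, "error 14" decodes as a user-mode instruction fetch from an unmapped page, which is consistent with the faulting address matching the IP, while the "error 6" python crashes decode as user-mode writes to an unmapped page.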