Red Hat Bugzilla – Bug 1171124
libvirtd occasionally crashes at the end of migration
Last modified: 2015-01-05 15:30:15 EST
This bug has been copied from bug #1162208 and has been proposed to be backported to 7.0 z-stream (EUS).
Reproduce:
version:
libvirt-1.1.1-29.el7_0.3.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11
kernel-3.10.0-123.17.1.el7.x86_64

According to comment 4, the Doc Text is:
Cause: Libvirt did not properly check whether a DAC security label is non-NULL before trying to parse user/group ownership from it.
Consequence: When the virDomainGetBlockInfo API is called on a transient domain that has just finished migration to another host, its DAC security label may already be NULL, which crashes libvirtd.
The title of the patch is: Fix crash when saving a domain with type none dac label.

Reproduce the issue with the following steps:

1> try to migrate

1. Create a transient domain whose seclabel has type 'none' and model 'dac':
# virsh dumpxml r7 | grep seclabel
  <seclabel model='selinux' labelskip='yes'/>
  <seclabel type='none' model='dac'/>

2. Do the migration.
Migrate to the destination host:
# virsh migrate r7 qemu+ssh://$ip/system
root@$ip's password:
# virsh list --all
 Id    Name                           State
----------------------------------------------------

Migrate back to the source host:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 9     r7                             running

Check domblkinfo; libvirtd crashes:
# virsh domblklist r7
Target     Source
------------------------------------------------
vda        /tmp/zp/r7.img

# virsh domblkinfo r7 vda
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

2> try to save / restore
# virsh list
 Id    Name                           State
----------------------------------------------------
 17    r7                             running

# virsh save r7 r7.save
error: Failed to save domain r7 to r7.save
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

# service libvirtd status
Redirecting to /bin/systemctl status libvirtd.service
libvirtd.service - Virtualization daemon
   Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
   Active: failed (Result: signal) since Thu 2014-12-11 18:10:06 CST; 15s ago
  Process: 11911 ExecStart=/usr/sbin/libvirtd $LIBVIRTD_ARGS (code=killed, signal=SEGV)
 Main PID: 11911 (code=killed, signal=SEGV)

Verify:
version:
libvirt-1.1.1-29.el7_0.4.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11
kernel-3.10.0-123.17.1.el7.x86_64

Steps:

1> try to migrate

1. Create a transient domain whose seclabel has type 'none' and model 'dac':
# virsh dumpxml r7 | grep seclabel
  <seclabel model='selinux' labelskip='yes'/>
  <seclabel type='none' model='dac'/>

2. Do the migration.
Migrate to the destination host:
# virsh migrate r7 qemu+ssh://$ip/system
root@$ip's password:
# virsh list --all
 Id    Name                           State
----------------------------------------------------

Migrate back to the source host:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 11    r7                             running

Check domblkinfo; libvirtd works well and the domain block info is returned successfully:
# virsh domblklist r7
Target     Source
------------------------------------------------
vda        /tmp/zp/r7.img

# virsh domblkinfo r7 vda
Capacity:       8589934592
Allocation:     8589938688
Physical:       8589938688

2> try to save / restore; the domain can be saved and restored successfully
# virsh list
 Id    Name                           State
----------------------------------------------------
 16    r7                             running

# virsh save r7 r7.save
Domain r7 saved to r7.save

# virsh restore r7.save
Domain restored from r7.save

# virsh list
 Id    Name                           State
----------------------------------------------------
 17    r7                             running
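For reference, a minimal sketch of the kind of guard the Doc Text describes, modelled on the qemuOpenFile -> virParseOwnershipIds call chain shown in the backtrace in the next scenario. This is illustrative only, not the literal patch from bug #1162208: the helper name qemuGetDacOwner and the "keep the caller's default uid/gid on a NULL label" behaviour are assumptions, and the snippet is written against libvirt's internal headers of that era.

#include "domain_conf.h"   /* virDomainDefGetSecurityLabelDef */
#include "virutil.h"       /* virParseOwnershipIds */

/* Hypothetical helper: fill in uid/gid from the domain's DAC label,
 * leaving the caller's defaults untouched when no label string exists. */
static int
qemuGetDacOwner(virDomainObjPtr vm, uid_t *uid, gid_t *gid)
{
    virSecurityLabelDefPtr seclabel =
        virDomainDefGetSecurityLabelDef(vm->def, "dac");

    /* Before the fix, the label was parsed unconditionally, roughly
     *     virParseOwnershipIds(seclabel->label, uid, gid);
     * which ends up in strchr() on a NULL pointer and SIGSEGVs when the
     * transient domain's DAC label has already been cleared after
     * migration (frame #1: virParseOwnershipIds (label=0x0, ...)). */
    if (seclabel && seclabel->label &&
        virParseOwnershipIds(seclabel->label, uid, gid) < 0)
        return -1;

    return 0;
}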
Test in another scenario:

Reproduce:
version:
libvirt-1.1.1-29.el7_0.3.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11
kernel-3.10.0-123.17.1.el7.x86_64

Steps to reproduce:

1. Get libvirt-1.1.1-29.el7_0.3.src.rpm, rebuild libvirt and add a patch (the patch from bug #1162208).

2. Install the new libvirt rpm packages produced in step 1, then restart libvirtd:
# service libvirtd restart

3. Prepare a domain XML and create a transient domain:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     r7                             running

# virsh dumpxml r7 | grep seclabel
  <seclabel model='selinux' relabel='yes'/>
  <seclabel type='none' model='dac'/>

4. In one terminal, do a --p2p live migration of the domain from the source to the destination host:
# virsh migrate r7 --live --p2p qemu+ssh://$ip/system

5. In another terminal, watch the libvirt debug log for the sleep flag:
# tailf libvirt.log | grep SLEEPING
2014-12-17 07:29:56.488+0000: 30884: debug : doPeer2PeerMigrate:4070 : SLEEPING

Then run domblkinfo:
# virsh domblkinfo r7 vda
error: End of file while reading data: Input/output error
error: One or more references were leaked after disconnect from the hypervisor
error: Failed to reconnect to the hypervisor

6. Use gdb to get the crash backtrace and check the two threads involved in this crash (compare with bug #1162208 comment 5):
......
(gdb) bt
#0  0x00007f189311f158 in __strchr_sse42 () from /lib64/libc.so.6
#1  0x00007f1895f194d0 in virParseOwnershipIds (label=0x0, uidPtr=uidPtr@entry=0x7f1886b8a818, gidPtr=gidPtr@entry=0x7f1886b8a81c) at util/virutil.c:2072
#2  0x00007f187f39129e in qemuOpenFile (driver=driver@entry=0x7f18781567e0, vm=vm@entry=0x7f186c000e90, path=path@entry=0x7f186c00c7e0 "/tmp/zp/r7raw.img", oflags=oflags@entry=0, needUnlink=needUnlink@entry=0x0, bypassSecurityDriver=bypassSecurityDriver@entry=0x0) at qemu/qemu_driver.c:2780
#3  0x00007f187f39a29e in qemuDomainGetBlockInfo (dom=0x7f18781fbba0, path=0x7f186c00c7e0 "/tmp/zp/r7raw.img", info=0x7f1886b8ab40, flags=<optimized out>) at qemu/qemu_driver.c:10124
#4  0x00007f1895f99734 in virDomainGetBlockInfo (domain=domain@entry=0x7f18781fbba0, disk=0x7f18781fe5e0 "vda", info=info@entry=0x7f1886b8ab40, flags=0) at libvirt.c:9110
#5  0x00007f1896992b04 in remoteDispatchDomainGetBlockInfo (server=<optimized out>, msg=<optimized out>, ret=0x7f18781fbf10, args=0x7f18781fbf30, rerr=0x7f1886b8ac80, client=<optimized out>) at remote_dispatch.h:3487
#6  remoteDispatchDomainGetBlockInfoHelper (server=<optimized out>, client=<optimized out>, msg=<optimized out>, rerr=0x7f1886b8ac80, args=0x7f18781fbf30, ret=0x7f18781fbf10) at remote_dispatch.h:3463
#7  0x00007f1895ff21ba in virNetServerProgramDispatchCall (msg=0x7f1898781700, client=0x7f1898787500, server=0x7f18987723d0, prog=0x7f189877e150) at rpc/virnetserverprogram.c:435
#8  virNetServerProgramDispatch (prog=0x7f189877e150, server=server@entry=0x7f18987723d0, client=0x7f1898787500, msg=0x7f1898781700) at rpc/virnetserverprogram.c:305
#9  0x00007f1895fecd28 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7f18987723d0) at rpc/virnetserver.c:166
#10 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f18987723d0) at rpc/virnetserver.c:187
#11 0x00007f1895f115e5 in virThreadPoolWorker (opaque=opaque@entry=0x7f18987568e0) at util/virthreadpool.c:144
#12 0x00007f1895f10f7e in virThreadHelper (data=<optimized out>) at util/virthreadpthread.c:194
#13 0x00007f18937bcdf3 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f18930e33dd in clone () from /lib64/libc.so.6

(gdb) info thread
  Id   Target Id         Frame
  11   Thread 0x7f188738c700 (LWP 30882) "libvirtd" 0x00007f18937c0705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 10   Thread 0x7f1886b8b700 (LWP 30883) "libvirtd" 0x00007f189311f158 in __strchr_sse42 () from /lib64/libc.so.6
  9    Thread 0x7f188638a700 (LWP 30884) "libvirtd" 0x00007f18930aa8ad in nanosleep () from /lib64/libc.so.6

(gdb) thread 9
[Switching to thread 9 (Thread 0x7f188638a700 (LWP 30884))]
#0  0x00007f18930aa8ad in nanosleep () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f18930aa8ad in nanosleep () from /lib64/libc.so.6
#1  0x00007f18930aa744 in sleep () from /lib64/libc.so.6
#2  0x00007f187f3710f1 in doPeer2PeerMigrate (v3proto=<synthetic pointer>, resource=0, dname=<optimized out>, flags=<optimized out>, listenAddress=<optimized out>, graphicsuri=<optimized out>, uri=<optimized out>, dconnuri=<optimized out>, xmlin=<optimized out>, vm=0x7f186c000e90, sconn=0x7f18781fed00, driver=0x7f18781567e0) at qemu/qemu_migration.c:4071
#3  qemuMigrationPerformJob (driver=driver@entry=0x7f18781567e0, conn=0x7f18781fed00, conn@entry=0x0, vm=vm@entry=0x7f186c000e90, xmlin=xmlin@entry=0x0, dconnuri=<optimized out>, uri=<optimized out>, graphicsuri=0x0, listenAddress=0x0, cookiein=0x0, cookieinlen=0, cookieout=0x7f1886389b58, cookieoutlen=0x7f1886389b54, flags=3, dname=0x0, resource=0, v3proto=true) at qemu/qemu_migration.c:4129
#4  0x00007f187f3723d9 in qemuMigrationPerform (driver=driver@entry=0x7f18781567e0, conn=0x0, vm=vm@entry=0x7f186c000e90, xmlin=0x0, dconnuri=dconnuri@entry=0x7f1858000d80 "qemu+ssh://$ip/system", uri=0x7f18580008c0 "\220\037", graphicsuri=0x0, listenAddress=0x0, cookiein=cookiein@entry=0x0, cookieinlen=cookieinlen@entry=0, cookieout=cookieout@entry=0x7f1886389b58, cookieoutlen=cookieoutlen@entry=0x7f1886389b54, flags=flags@entry=3, dname=0x0, resource=0, v3proto=v3proto@entry=true) at qemu/qemu_migration.c:4313
#5  0x00007f187f393b0d in qemuDomainMigratePerform3Params (dom=0x7f1858000d40, dconnuri=0x7f1858000d80 "qemu+ssh://$ip/system", params=<optimized out>, nparams=0, cookiein=0x0, cookieinlen=0, cookieout=0x7f1886389b58, cookieoutlen=0x7f1886389b54, flags=3) at qemu/qemu_driver.c:10910
#6  0x00007f1895f94bdf in virDomainMigratePerform3Params (domain=domain@entry=0x7f1858000d40, dconnuri=0x7f1858000d80 "qemu+ssh://$ip/system", params=params@entry=0x7f1858000ee0, nparams=0, cookiein=0x0, cookieinlen=0, cookieout=cookieout@entry=0x7f1886389b58, cookieoutlen=cookieoutlen@entry=0x7f1886389b54, flags=3) at libvirt.c:7401
#7  0x00007f18969866ef in remoteDispatchDomainMigratePerform3Params (server=<optimized out>, msg=<optimized out>, ret=0x7f1858000cc0, args=0x7f1858000ce0, rerr=0x7f1886389c80, client=<optimized out>) at remote.c:4978
#8  remoteDispatchDomainMigratePerform3ParamsHelper (server=<optimized out>, client=<optimized out>, msg=<optimized out>, rerr=0x7f1886389c80, args=0x7f1858000ce0, ret=0x7f1858000cc0) at remote_dispatch.h:5631
#9  0x00007f1895ff21ba in virNetServerProgramDispatchCall (msg=0x7f18987734b0, client=0x7f1898781e30, server=0x7f18987723d0, prog=0x7f189877e150) at rpc/virnetserverprogram.c:435
#10 virNetServerProgramDispatch (prog=0x7f189877e150, server=server@entry=0x7f18987723d0, client=0x7f1898781e30, msg=0x7f18987734b0) at rpc/virnetserverprogram.c:305
#11 0x00007f1895fecd28 in virNetServerProcessMsg (msg=<optimized out>, prog=<optimized out>, client=<optimized out>, srv=0x7f18987723d0) at rpc/virnetserver.c:166
#12 virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f18987723d0) at rpc/virnetserver.c:187
#13 0x00007f1895f115e5 in virThreadPoolWorker (opaque=opaque@entry=0x7f1898756780) at util/virthreadpool.c:144
#14 0x00007f1895f10f7e in virThreadHelper (data=<optimized out>) at util/virthreadpthread.c:194
#15 0x00007f18937bcdf3 in start_thread () from /lib64/libpthread.so.0
#16 0x00007f18930e33dd in clone () from /lib64/libc.so.6

Verify:
version:
libvirt-1.1.1-29.el7_0.4.x86_64
qemu-kvm-rhev-1.5.3-60.el7_0.11
kernel-3.10.0-123.17.1.el7.x86_64

Verify steps:

1. Get libvirt-1.1.1-29.el7_0.4.src.rpm, rebuild libvirt and add a patch (the patch from bug #1162208).

2. Install the new libvirt rpm packages produced in step 1, then restart libvirtd:
# service libvirtd restart

3. Prepare a domain XML and create a transient domain:
# virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     r7                             running

# virsh dumpxml r7 | grep seclabel
  <seclabel model='selinux' relabel='yes'/>
  <seclabel type='none' model='dac'/>

4. In one terminal, do a --p2p live migration of the domain from the source to the destination host:
# virsh migrate r7 --live --p2p qemu+ssh://$ip/system

5. In another terminal, watch the libvirt debug log for the sleep flag:
# tailf libvirt.log | grep SLEEPING
2014-12-17 07:29:56.488+0000: 30884: debug : doPeer2PeerMigrate:4070 : SLEEPING

Then run domblkinfo:
# virsh domblkinfo r7 vda
Capacity:       8589934592
Allocation:     8589934592
Physical:       8589934592

6. Check on the destination host; the domain is running well.
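As a side note, frames #0 and #1 of the backtrace above explain why the crash lands in __strchr_sse42: virParseOwnershipIds() splits a "user:group" label string and is handed label=0x0. The following standalone program is only an illustration of that mechanism, not libvirt code; the parsing details are simplified assumptions, and only the NULL check mirrors the behaviour the fix adds.

/* Compile with: gcc -Wall demo.c -o demo */
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for a "user:group" ownership parser. */
static int
parse_ownership(const char *label, char *user, char *group, size_t len)
{
    const char *sep;

    if (!label)                      /* the guard the fix introduces */
        return -1;

    if (!(sep = strchr(label, ':'))) /* without the guard, label == NULL
                                      * would dereference NULL right here */
        return -1;

    snprintf(user, len, "%.*s", (int)(sep - label), label);
    snprintf(group, len, "%s", sep + 1);
    return 0;
}

int main(void)
{
    char user[64], group[64];

    if (parse_ownership("qemu:qemu", user, group, sizeof(user)) == 0)
        printf("user=%s group=%s\n", user, group);

    /* A NULL label -- as happens for a transient domain whose DAC label
     * was already cleared after migration -- is rejected instead of
     * segfaulting inside strchr(). */
    if (parse_ownership(NULL, user, group, sizeof(user)) < 0)
        printf("NULL label rejected\n");

    return 0;
}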
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0008.html