Red Hat Bugzilla – Bug 1163463
use after free in callers of virNetDevLinkDump
Last modified: 2015-03-05 02:47:30 EST
The coding error described below could lead to failures when setting up type='hostdev' or type='direct' (with macvtap passthrough mode) network interfaces in a guest; in particular, the symptom would be an error while trying to set the MAC address of the SRIOV virtual function.

Fix in libvirt upstream:

commit f9f9699f40729556238b905f67a7d6f68c084f6a
Author: Laine Stump <laine@laine.org>
Date:   Thu Oct 16 00:49:01 2014 +0200

    util: eliminate "use after free" in callers of virNetDevLinkDump

    virNetDevLinkDump() gets a message from netlink into "resp", then calls nlmsg_parse() to fill the table "tb" with pointers into resp. It then returns tb to its caller, but not before freeing the buffer at resp. That means that all the callers of virNetDevLinkDump() are examining memory that has already been freed. This can be verified by filling the buffer at resp with garbage prior to freeing it (or, I suppose, just running libvirtd under valgrind) then performing some operation that calls virNetDevLinkDump().

    The code has been like this ever since virNetDevLinkDump() was written - the original author didn't notice it, and neither did later additional users of the function. It has only been pure luck (or maybe a lack of heavy load, and/or maybe an allocation algorithm in malloc() that delays re-use of just-freed memory) that has kept this from causing errors, for example when configuring a PCI passthrough or macvtap passthrough network interface.

    The solution taken in this patch is the simplest - just return resp to the caller along with tb, then have the caller free it after they are finished using the data (pointers) in tb. I alternately could have made a cleaner interface by creating a new struct that put tb and resp together along with a vir*Free() function for it, but this function is only used in a couple places, and I'm not sure there will be additional new uses of virNetDevLinkDump(), so the value of adding a new type, extra APIs, etc. is dubious.
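For readers who want to see the ownership pattern in isolation, here is a minimal sketch in C of the approach the commit message describes: the dump function hands the response buffer back together with the pointer table, and the caller frees the buffer only after it has finished with the pointers. This is not the libvirt code. The names link_dump(), get_mac() and ATTR_MAX, and the fake reply contents, are hypothetical stand-ins, and no netlink calls are involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ATTR_MAX 2   /* hypothetical: number of attribute slots in tb[] */

/* Hypothetical stand-in for the fixed virNetDevLinkDump(): fills tb[]
 * with pointers INTO a freshly allocated buffer and returns that buffer
 * through *resp, so the caller decides when it is released. */
static int link_dump(const char *tb[ATTR_MAX], void **resp)
{
    /* Fake "reply": two NUL-terminated strings stored back to back. */
    static const char reply[] = "eth0\0" "52:54:00:6d:90:02";
    char *buf;

    assert(tb != NULL && resp != NULL);

    buf = malloc(sizeof(reply));
    if (!buf)
        return -1;
    memcpy(buf, reply, sizeof(reply));

    tb[0] = buf;                    /* interface name */
    tb[1] = buf + strlen(buf) + 1;  /* MAC address */

    /* The buggy shape would call free(buf) here, leaving every entry
     * of tb[] pointing at freed memory. */
    *resp = buf;
    return 0;
}

/* Caller side: copy what is needed out of tb[], then free resp. */
static int get_mac(char *out, size_t outlen)
{
    const char *tb[ATTR_MAX] = { NULL };
    void *resp = NULL;
    int ret = -1;

    if (link_dump(tb, &resp) < 0)
        return -1;

    if (strlen(tb[1]) < outlen) {
        strcpy(out, tb[1]);   /* tb[1] is still valid: resp not freed yet */
        ret = 0;
    }

    free(resp);               /* release only after the last use of tb[] */
    return ret;
}
```

A caller would use it as `char mac[32]; get_mac(mac, sizeof(mac));`. The point of the sketch is only the lifetime rule: anything that points into resp must be consumed or copied before resp is freed.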
Further investigation to see if this bug affects RHEL6 (it doesn't) revealed that my upstream commit log is incorrect - the bug was actually introduced in upstream commit e95de74d, which was first in libvirt-1.0.5.
I cannot find a good way to verify this bug. Could you describe a detailed scenario for validating it? Thanks.
I can reproduce it using libvirt-1.2.8-4.el7.x86_64:

[root@sriov ~]# rpm -q libvirt

In the first terminal:
[root@sriov ~]# valgrind -v libvirtd

In the second terminal:
[root@sriov ~]# virsh dumpxml r7_gpt | grep \<interface -A7
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:6d:90:02'/>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>
[root@sriov ~]# virsh start r7_gpt
Domain r7_gpt started

After starting r7_gpt, valgrind reports errors; see the attachment (error_log_from_valgrind_in _libvirt-1.2.8-4.el7) for details.

...
==30777== ERROR SUMMARY: 72 errors from 15 contexts (suppressed: 2 from 2)
==30777==
==30777== 2 errors in context 1 of 15:
==30777== Invalid read of size 4
==30777==    at 0x52CF58F: UnknownInlinedFun (virnetdev.c:1630)
==30777==    by 0x52CF58F: UnknownInlinedFun (virnetdev.c:1658)
==30777==    by 0x52CF58F: virNetDevReplaceVfConfig (virnetdev.c:1677)
==30777==    by 0x52C0EE3: virHostdevNetConfigReplace (virhostdev.c:418)
==30777==    by 0x52C0EE3: virHostdevPreparePCIDevices (virhostdev.c:580)
==30777==    by 0x1C2682F8: qemuPrepareHostDevices (qemu_hostdev.c:295)
==30777==    by 0x1C27CF6C: qemuProcessStart (qemu_process.c:4095)
==30777==    by 0x1C2CF631: qemuDomainObjStart (qemu_driver.c:6378)
==30777==    by 0x1C2CFF41: qemuDomainCreateWithFlags (qemu_driver.c:6433)
==30777==    by 0x538995B: virDomainCreate (libvirt.c:9001)
==30777==    by 0x14278B: remoteDispatchDomainCreate (remote_dispatch.h:3129)
==30777==    by 0x14278B: remoteDispatchDomainCreateHelper (remote_dispatch.h:3107)
==30777==    by 0x53EC2E1: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==30777==    by 0x53EC2E1: virNetServerProgramDispatch (virnetserverprogram.c:307)
==30777==    by 0x1501FC: virNetServerProcessMsg (virnetserver.c:172)
==30777==    by 0x1501FC: virNetServerHandleJob (virnetserver.c:193)
==30777==    by 0x52F17D4: virThreadPoolWorker (virthreadpool.c:145)
==30777==    by 0x52F116D: virThreadHelper (virthread.c:197)
==30777==  Address 0x1eba1994 is 532 bytes inside a block of size 16,384 free'd
==30777==    at 0x4C2ACD7: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==30777==    by 0x529A489: virFree (viralloc.c:582)
==30777==    by 0x52CF20F: virNetDevLinkDump (virnetdev.c:1455)
==30777==    by 0x52CF39F: UnknownInlinedFun (virnetdev.c:1654)
==30777==    by 0x52CF39F: virNetDevReplaceVfConfig (virnetdev.c:1677)
==30777==    by 0x52C0EE3: virHostdevNetConfigReplace (virhostdev.c:418)
==30777==    by 0x52C0EE3: virHostdevPreparePCIDevices (virhostdev.c:580)
==30777==    by 0x1C2682F8: qemuPrepareHostDevices (qemu_hostdev.c:295)
==30777==    by 0x1C27CF6C: qemuProcessStart (qemu_process.c:4095)
==30777==    by 0x1C2CF631: qemuDomainObjStart (qemu_driver.c:6378)
==30777==    by 0x1C2CFF41: qemuDomainCreateWithFlags (qemu_driver.c:6433)
==30777==    by 0x538995B: virDomainCreate (libvirt.c:9001)
==30777==    by 0x14278B: remoteDispatchDomainCreate (remote_dispatch.h:3129)
==30777==    by 0x14278B: remoteDispatchDomainCreateHelper (remote_dispatch.h:3107)
==30777==    by 0x53EC2E1: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==30777==    by 0x53EC2E1: virNetServerProgramDispatch (virnetserverprogram.c:307)
==30777==
...

I cannot reproduce it on libvirt-1.2.8-10.el7.x86_64:

[root@sriov ~]# rpm -q libvirt qemu-kvm-rhev
libvirt-1.2.8-10.el7.x86_64
qemu-kvm-rhev-2.1.2-16.el7.x86_64
[root@sriov ~]# virsh dumpxml r7_gpt | grep \<interface -A7
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:6d:90:02'/>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x10' function='0x0'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
[root@sriov ~]# virsh start r7_gpt
Domain r7_gpt started

In another terminal:
[root@sriov ~]# valgrind -v libvirtd
...
==31448==
==31448== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
--31448--
--31448-- used_suppression: 2 glibc-2.5.x-on-SUSE-10.2-(PPC)-2a /usr/lib64/valgrind/default.supp:1296
==31448==
==31448== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

The bug has been fixed; moving to Verified. If my verification steps are inaccurate, please correct me.
Created attachment 968778: log from valgrind using libvirt-1.2.8-4.el7
That is the perfect way to test it!
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html