Bug 739704

Summary: Coverity found several mem leaks in libvirt
Product: Red Hat Enterprise Linux 6 Reporter: Eric Blake <eblake>
Component: libvirtAssignee: Eric Blake <eblake>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2CC: acathrow, ajia, gren, jdenemar, mzhan, rwu, veillard
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.9.4-18.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 11:32:03 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 743047, 748554    
Attachments:
Description Flags
CoverityScan
none
CoverityScan-13
none
CoverityScan-14
none
List of fixed libvirt defects
none
CoverityScan-16
none
CoverityScan-17
none
CoverityScan-18 none

Description Eric Blake 2011-09-19 19:57:42 UTC
Description of problem:
Alex Jia ran Coverity on upstream libvirt, and found several mem leaks; some of these leaks are present in the RHEL 6.2 build, and should be plugged to avoid hitting OOM situations.

Version-Release number of selected component (if applicable):
libvirt-0.9.4-11.el6

How reproducible:
The leak most likely to hit occurs during seamless migration, in remoteDomainBuildEventGraphics (client side); more dangerous is a leak in qemuParseCommandLine (server side) which can be exploited by repeated calls to virConnectDomainXMLFromNative with a bogus command line.  I'll use this BZ to collect as many leaks as possible before the next build, in case others surface as we continue to read Coverity results.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
See upstream thread:
https://www.redhat.com/archives/libvir-list/2011-September/msg00698.html
patches 1 and 5 are not relevant to RHEL, but 2-4 are all worth back-porting.

Comment 4 Alex Jia 2011-09-21 06:19:44 UTC
Using Coverity to run on libvirt-0.9.4-12.el6.src.rpm, I haven't found other resource leak except for 'virVMXParseConfig', which has been fixed on upstream, 
as Eric suggested, we needn't build vmx in rhel, coverity report can be treated as a false positive, so move the bug to VERIFIED status.

Please see attachment for Coverity result based on -12.

Alex

Comment 5 Alex Jia 2011-09-21 06:28:06 UTC
Created attachment 524139 [details]
CoverityScan

Comment 6 Eric Blake 2011-09-21 17:56:58 UTC
(In reply to comment #4)
> Using Coverity to run on libvirt-0.9.4-12.el6.src.rpm, I haven't found other
> resource leak except for 'virVMXParseConfig', which has been fixed on upstream, 
> as Eric suggested, we needn't build vmx in rhel, coverity report can be treated
> as a false positive, so move the bug to VERIFIED status.
> 
> Please see attachment for Coverity result based on -12.

That scan still shows to non-false-positive bugs that need to be fixed.  I'm moving this back to ASSIGNED to track those:

16 	FORWARD_NULL 	/root/rpmbuild/BUILD/libvirt-0.9.4/src/qemu/qemu_process.c 	qemuProcessReconnect 	UNINSPECTED

30 	REVERSE_INULL 	/root/rpmbuild/BUILD/libvirt-0.9.4/src/locking/lock_driver_sanlock.c 	virLockManagerSanlockSetupLockspace 	UNINSPECTED

Comment 7 Alex Jia 2011-09-21 19:05:48 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > Using Coverity to run on libvirt-0.9.4-12.el6.src.rpm, I haven't found other
> > resource leak except for 'virVMXParseConfig', which has been fixed on upstream, 
> > as Eric suggested, we needn't build vmx in rhel, coverity report can be treated
> > as a false positive, so move the bug to VERIFIED status.
> > 
> > Please see attachment for Coverity result based on -12.
> 
> That scan still shows to non-false-positive bugs that need to be fixed.  I'm
> moving this back to ASSIGNED to track those:
> 
> 16  FORWARD_NULL  /root/rpmbuild/BUILD/libvirt-0.9.4/src/qemu/qemu_process.c 
> qemuProcessReconnect  UNINSPECTED

https://www.redhat.com/archives/libvir-list/2011-September/msg00858.html

> 
> 30  REVERSE_INULL 
> /root/rpmbuild/BUILD/libvirt-0.9.4/src/locking/lock_driver_sanlock.c 
> virLockManagerSanlockSetupLockspace  UNINSPECTED

https://www.redhat.com/archives/libvir-list/2011-September/msg00857.html


Alex

Comment 8 Alex Jia 2011-09-22 01:17:53 UTC
(In reply to comment #6) 
> That scan still shows to non-false-positive bugs that need to be fixed.  I'm
> moving this back to ASSIGNED to track those:
> 
> 16  FORWARD_NULL  /root/rpmbuild/BUILD/libvirt-0.9.4/src/qemu/qemu_process.c 
> qemuProcessReconnect  UNINSPECTED

https://www.redhat.com/archives/libvir-list/2011-September/msg00858.html  (ACKed and pushed)

> 
> 30  REVERSE_INULL 
> /root/rpmbuild/BUILD/libvirt-0.9.4/src/locking/lock_driver_sanlock.c 
> virLockManagerSanlockSetupLockspace  UNINSPECTED

https://www.redhat.com/archives/libvir-list/2011-September/msg00876.html (Resend a new patch based on Eric's suggestion)


Thanks,
Alex

Comment 9 Alex Jia 2011-09-22 01:19:50 UTC
(In reply to comment #8)
> > 
> > 16  FORWARD_NULL  /root/rpmbuild/BUILD/libvirt-0.9.4/src/qemu/qemu_process.c 
> > qemuProcessReconnect  UNINSPECTED
> 
> https://www.redhat.com/archives/libvir-list/2011-September/msg00858.html 
> (ACKed and pushed)

The following is the same as above:
https://www.redhat.com/archives/libvir-list/2011-September/msg00859.html (ACKed and pushed)

Comment 11 Alex Jia 2011-09-26 08:54:07 UTC
Hi Eric,
The above issues have been resolved on libvirt-0.9.4-13. Another question, I think '17' item should be fixed from attachment web page, if 'vmdef' is NULL, but we do nothing for this, it will potentially result in other issue such as 'NULL pointer deference', IMHO, so it makes sense to fix this, and have committed a patch for upstream:

http://www.redhat.com/archives/libvir-list/2011-September/msg00979.html

Alex

Comment 12 Alex Jia 2011-09-26 08:55:26 UTC
Created attachment 524862 [details]
CoverityScan-13

Comment 13 Eric Blake 2011-09-27 12:07:44 UTC
(In reply to comment #11)
> Hi Eric,
> The above issues have been resolved on libvirt-0.9.4-13. Another question, I
> think '17' item should be fixed from attachment web page, if 'vmdef' is NULL,
> but we do nothing for this, it will potentially result in other issue such as
> 'NULL pointer deference', IMHO, so it makes sense to fix this, and have
> committed a patch for upstream:
> 
> http://www.redhat.com/archives/libvir-list/2011-September/msg00979.html

Concur that this thread is worth resolving (there has been some back and forth, so the patch is not quite final yet), so moving this bz back to ASSIGNED.

Comment 14 Eric Blake 2011-09-28 17:15:43 UTC
Back in POST for yet another fix.
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-September/msg00918.html

Comment 15 Alex Jia 2011-09-29 05:57:33 UTC
Hi Eric,
The previous issues have been resolved based on latest coverity result in attachment.

However, it seems we still need to fix some issues such as 'NULL_RETURNS' and 'REVERSE_INULL', please confirm it.

Thanks,
Alex

Comment 16 Alex Jia 2011-09-29 05:58:25 UTC
Created attachment 525443 [details]
CoverityScan-14

Comment 18 Michal Luscon 2011-10-07 12:31:42 UTC
Created attachment 526885 [details]
List of fixed libvirt defects

Comment 19 Eric Blake 2011-10-07 13:58:37 UTC
Agreed that the latest attachment still points out problems worth fixing.  Moving back to assigned for yet another round of upstream patching.

Comment 20 Alex Jia 2011-10-09 02:17:52 UTC
Created attachment 527058 [details]
CoverityScan-16

This is a latest coverity result on libvirt-0.9.4-16.el6.src.rpm.

Comment 21 Alex Jia 2011-10-12 03:30:24 UTC
Created attachment 527596 [details]
CoverityScan-17

This is a latest coverity result on libvirt-0.9.4-17.el6.src.rpm.

Comment 22 Eric Blake 2011-10-13 01:12:36 UTC
I've posted several relevant upstream patches, such as:
https://www.redhat.com/archives/libvir-list/2011-October/msg00533.html

I'm hoping to get these all approved and backported to RHEL by Friday.

My analysis of the -17 results:

Error: CHECKED_RETURN:
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_json.c:1437: check_return: Calling function "virJSONValueObjectGetBoolean" without checking return value (as is done elsewhere 4 out of 5 times).
False positive, but upstream will silence this.

Error: DEADCODE:
/builddir/build/BUILD/libvirt-0.9.4/gnulib/lib/strerror_r.c:134: dead_error_condition: On this path, the condition "msg" cannot be true.
Truthful on Linux, but inconsequential.  The gnulib code has to cater to other platforms, which does indeed introduce dead code on Linux.  Not worth fixing.

Error: DEADCODE:
/builddir/build/BUILD/libvirt-0.9.4/gnulib/lib/strerror.c:47: dead_error_condition: On this path, the condition "msg" cannot be true.
Same story.

Error: DEADCODE:
/builddir/build/BUILD/libvirt-0.9.4/tools/virsh.c:12627: dead_error_condition: On this path, the condition "parent" cannot be true.
I didn't see this one in my upstream scan; that may be related to the fact that the code has further changed due to 'snapshot-list --tree' which was not backported; so fixing this is probably RHEL-only.  Or we can safely ignore it.

Error: MISSING_RETURN:
/tmp/tmpG4rrjF.c:1: missing_return: Arriving at the end of a function without returning a value.
Not libvirt's fault.  This is caused by dprobe emitting a function with an int return type but no body.  We can safely ignore it.


Error: NULL_RETURNS:
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_json.c:1377: returned_null: Function "qemuMonitorJSONMakeCommand" returns null (checked 57 out of 58 times).
Real, and part of my upstream patches.


Error: RESOURCE_LEAK:
/builddir/build/BUILD/libvirt-0.9.4/src/vmx/vmx.c:1221: alloc_fn: Calling allocation function "virConfReadMem".
Real and upstream, but we don't build VMX for RHEL, so not worth worrying about.

Error: RETURN_LOCAL:
/builddir/build/BUILD/libvirt-0.9.4/gnulib/lib/careadlinkat.c:101: local_ptr_assign_local: Assigning: "buffer" = "stack_buf" (address of local variable "stack_buf").
False positive.  Coverity is mis-analyzing gnulib code.

Error: REVERSE_INULL:
/builddir/build/BUILD/libvirt-0.9.4/src/esx/esx_vi.c:1667: deref_ptr: Directly dereferencing pointer "objectSpec".
Upstream.  We build ESX for RHEL, but this one is harmless in effect (basically a sign of dead code, and not an actual logic bug).

Error: REVERSE_INULL:
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_text.c:738: deref_ptr: Directly dereferencing pointer "p".
Real and upstream, and in my patch set (several instances in one function).

Comment 23 Eric Blake 2011-10-13 20:29:56 UTC
(In reply to comment #22)
> I've posted several relevant upstream patches, such as:
> https://www.redhat.com/archives/libvir-list/2011-October/msg00533.html
> 
> I'm hoping to get these all approved and backported to RHEL by Friday.
> 
> My analysis of the -17 results:
> 
> Error: CHECKED_RETURN:
> /builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_json.c:1437:
> check_return: Calling function "virJSONValueObjectGetBoolean" without checking
> return value (as is done elsewhere 4 out of 5 times).
> False positive, but upstream will silence this.

Backport of commit 0654d27

> 
> Error: DEADCODE:
> /builddir/build/BUILD/libvirt-0.9.4/tools/virsh.c:12627: dead_error_condition:
> On this path, the condition "parent" cannot be true.
> I didn't see this one in my upstream scan; that may be related to the fact that
> the code has further changed due to 'snapshot-list --tree' which was not
> backported; so fixing this is probably RHEL-only.  Or we can safely ignore it.

Safely ignore.  Introduced by RHEL-specific conflict resolution when backporting commit d1be48f for bug 742410

> Error: NULL_RETURNS:
> /builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_json.c:1377:
> returned_null: Function "qemuMonitorJSONMakeCommand" returns null (checked 57
> out of 58 times).
> Real, and part of my upstream patches.

backport of commit ce521f2

> Error: REVERSE_INULL:
> /builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_monitor_text.c:738:
> deref_ptr: Directly dereferencing pointer "p".
> Real and upstream, and in my patch set (several instances in one function).

backport of commit 60be9e8

Plus I'm lumping in some other backports for bug fixes found by code analysis.

Comment 24 Eric Blake 2011-10-13 23:07:24 UTC
In POST again (maybe fourth time's the charm):
http://post-office.corp.redhat.com/archives/rhvirt-patches/2011-October/msg00441.html

Comment 25 Alex Jia 2011-10-18 06:22:35 UTC
Created attachment 528730 [details]
CoverityScan-18

This is a latest coverity result on libvirt-0.9.4-18.el6.src.rpm

The previous issues have been resolve based on latest result and Comment 22 & Comment 23.

The following is a new patch issue:

Error: NULL_RETURNS:
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_hostdev.c:230: returned_null: Function "pciDeviceListFind" returns null (checked 8 out of 9 times).
/builddir/build/BUILD/libvirt-0.9.4/src/util/pci.c:1536: return_null: Explicitly returning NULL.
......
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_hostdev.c:230: var_assigned: Assigning: "activeDev" = null return value from "pciDeviceListFind".
/builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_hostdev.c:232: dereference: Dereferencing a pointer that might be null "activeDev" when calling "pciDeviceSetUsedBy".
/builddir/build/BUILD/libvirt-0.9.4/src/util/pci.c:1400: deref_parm: Directly dereferencing parameter "dev".

And I have discussed this with Eric, which think we may consolidate loop 4 and loop 5 into one to silence Coverity complains, but I think it's not a necessary choose, and use 6 loops is a clear design, Eric, maybe, may we ignore this?

BTW, There are also some issues for libvirt-0.8.7-18.el6_1.2 and libvirt-0.8.2-15.el5_6.5, whether I need to file 2 bugs for them again?

Thanks,
Alex

Comment 26 Eric Blake 2011-10-18 11:57:33 UTC
(In reply to comment #25)
> Created attachment 528730 [details]
> CoverityScan-18
> 
> The following is a new patch issue:
> 
> Error: NULL_RETURNS:
> /builddir/build/BUILD/libvirt-0.9.4/src/qemu/qemu_hostdev.c:230: returned_null:
> Function "pciDeviceListFind" returns null (checked 8 out of 9 times).
> /builddir/build/BUILD/libvirt-0.9.4/src/util/pci.c:1536: return_null:
> Explicitly returning NULL.

We can safely ignore this one.  The code is structured so that loop 4 assigns the ref into the list, and loop 5 then queries the ref from that list.  Loop 5 cannot be reached unless loop 4 succeeded, so this will never fail.  Given that this was the only new finding against -18, I am confident that we can mark this bug as resolved.

> BTW, There are also some issues for libvirt-0.8.7-18.el6_1.2 and
> libvirt-0.8.2-15.el5_6.5, whether I need to file 2 bugs for them again?

File a separate BZ for separate build streams.

Comment 27 Alex Jia 2011-10-18 16:01:33 UTC
(In reply to comment #26)
> We can safely ignore this one.  The code is structured so that loop 4 assigns
> the ref into the list, and loop 5 then queries the ref from that list.  Loop 5
> cannot be reached unless loop 4 succeeded, so this will never fail.  Given that
> this was the only new finding against -18, I am confident that we can mark this
> bug as resolved.
Agree.
> 
> > BTW, There are also some issues for libvirt-0.8.7-18.el6_1.2 and
> > libvirt-0.8.2-15.el5_6.5, whether I need to file 2 bugs for them again?
> 
> File a separate BZ for separate build streams.
Okay, thanks Eric.

Comment 28 Alex Jia 2011-10-18 16:02:08 UTC
The bug has been resolved based on Comment 25 and Comment 26, so move the bug to VERIFIED status.

Comment 29 Jiri Denemark 2011-11-04 08:15:58 UTC
Ouch, one of the patches (conf: plug memory leak on error) for this BZ introduced a crasher: bug 751287

Comment 30 errata-xmlrpc 2011-12-06 11:32:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html