Bug 747629

Summary: blktrace leaks memory when you run with "-o -" option
Product: Red Hat Enterprise Linux 6 Reporter: Rajendra Kumar <rajendra.kumar.jat>
Component: blktraceAssignee: Eric Sandeen <esandeen>
Status: CLOSED NOTABUG QA Contact: Karel Srot <ksrot>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.0CC: mmalik
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-22 21:49:43 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Rajendra Kumar 2011-10-20 14:45:14 UTC
Description of problem:

If you run blktrace specifying "-o -" option then blktrace dumps the output to stdout of caller. and after few minutes your system goes out of memory. I am facing this problem only when you run blktrace with "-o -". 


Version-Release number of selected component (if applicable):

Red Hat Enterprise Linux Server release 6.0
kernel-2.6.32-71.el6.x86_64
blktrace-2.0.0

How reproducible:

I was experimenting with blkparse tool to get IO performance number. Todo this I redirected the output of blktrace to blkparse. after some time it gave me "invalid pointer to reallocate". Following was the command i was running.
# blktrace -d /dev/sda -o -|blkparse -t -i - -o temp_file

I thought that there might be a bug in blkparse then i just run simple blktrace command(ex. blktrace -d /dev/sda -o -) and i find out the same problem.

I used few tools(ex. valgrind) to figure out the memory leak but i was not lucky.

Steps to Reproduce:
1. # blktrace -d /dev/sda -o -
2. # badblocks /dev/sda
3. # top


Actual results:
Memory used by blktrace is keep increasing till you goes out of memory.

Expected results:
Memory taken by blktrace(with "-o -" option) should be same as you run blktrace with "-o file_name"

Additional info:

And you can look at the attached snapshot of top command output which is taking 37% memory of your system. Please look at the top line which is.

***********************************************************************
PID   USER      PR   NI  VIRT  RES  SHR  S %CPU %MEM    TIME+  COMMAND          
10013 root      20   0  14.9g 1.4g  652  S    2  37.0   0:12.66 blktrace
 2316 raj       20   0  1024m 233m  37m  S    0   6.1   8:43.31 firefox-bin 
 3742 raj       20   0   444m  103m  14m S    0   2.7   0:48.01  evince
 4139 raj       20   0   501m   91m  15m S    0   2.4   1:01.03  evince
***********************************************************************

Comment 2 Karel Srot 2011-10-21 07:46:33 UTC
Hi Rajendra,
I am a bit confused with your blktrace version. The latest blktrace available in RHEL is blktrace-1.0.1-6.el6. I am trying to reproduce the leak with this version but I didn't have luck yet.

Comment 3 Eric Sandeen 2011-10-21 16:22:51 UTC
(In reply to comment #2)
> Hi Rajendra,
> I am a bit confused with your blktrace version. The latest blktrace available
> in RHEL is blktrace-1.0.1-6.el6. I am trying to reproduce the leak with this
> version but I didn't have luck yet.

*nod* there is no blktrace-2.0.0 even upstream....

Comment 4 Rajendra Kumar 2011-10-22 17:08:14 UTC
Actually I putted the output of "blktrace -V", and that was "2.0.0". But Karel is right. I am having this problem with blktrace-1.0.1-3.1.el6. I have not tried with latest one. Karel, may be you are following the steps which i have mentioned but i would like describe them again.

Disk size: 1TB (/dev/sda)
 
Step 1.
~~~~~~~
Run blktrace command with "-o -" option.

# blktrace -d /dev/sda -o -

Step 2.
~~~~~~~
Go to other console and use badblocks command which will generate so many sequential IO requests.

# badblocks /dev/sda

Step 3.
~~~~~~~
Go to third console and use top command (please wait for atleast 5 min.) and you will see that memory is being increased.

# top

Note: use (shift + m) to short them in the order of memory.

I have been trying to resolve this problem looking at the code and if you choose piped output then no of threads(no of CPUs) trying to write to single fd(stdout). I do see after writing the data buffer is being freed. And still i am confused where is the problem.

I putting these comment after verifying this again from my end.Please let me know if you need more information. Or I can record(video) the issue and share with you.

Comment 5 Karel Srot 2011-10-24 07:38:15 UTC
I can confirm the high memory consumption. Unfortunately I was not able to identify any leaks with valgind, it says

==3983== Memcheck, a memory error detector
==3983== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==3983== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==3983== Command: blktrace -d /dev/vda -o -
==3983== Parent PID: 3917
==3983== 
==3983== Warning: noted but unhandled ioctl 0x1274 with no size/direction hints
==3983==    This could cause spurious value errors to appear.
==3983==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3983== Warning: noted but unhandled ioctl 0x1275 with no size/direction hints
==3983==    This could cause spurious value errors to appear.
==3983==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3983== Warning: noted but unhandled ioctl 0x1275 with no size/direction hints
==3983==    This could cause spurious value errors to appear.
==3983==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3983== 
==3983== HEAP SUMMARY:
==3983==     in use at exit: 0 bytes in 0 blocks
==3983==   total heap usage: 998 allocs, 998 frees, 509,659,998 bytes allocated
==3983== 
==3983== All heap blocks were freed -- no leaks are possible

Comment 6 RHEL Program Management 2011-10-28 05:47:35 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 7 Eric Sandeen 2011-11-02 22:44:24 UTC
The problem only seems to appear if the stdout is not consumed.

i.e. for:

# blktrace -d /dev/sdb -o outfile
# blktrace -d /dev/sdb -o - &>/dev/null
# blktrace -d /dev/sdb -o - &>outfile
# blktrace -d /dev/sdb -o - | blkparse -i -

none of these seem to show the problem.

Is this really a problem in practice?  What workflow exposes this problem?  

Thanks,
-Eric

Comment 8 Eric Sandeen 2011-12-22 21:49:43 UTC
Closing NOTABUG.  It makes no real sense to send binary blktrace data to the console; it will just get buffered by glibc until it is consumed by something.

> Expected results:
> Memory taken by blktrace(with "-o -" option) should be same as you run blktrace
> with "-o file_name"

I disagree with that; the data has to go -somewhere-

The bug has been in NEEDINFO for 2 months with no reply; if you have a real workload for which this matters, please contact your Red Hat support folks to open a ticket for the problem.  However, I cannot see the usefulness of sending blktrace data to stdout without a consumer.

Comment 9 Eric Sandeen 2011-12-22 22:11:34 UTC
Also, FWIW testing with blktrace-1.0.1-6.el6.x86_64, I do get output on the console, and memory use does not go up.

NB: The bug is filed against blktrace-2.0, which is not shipped with RHEL6; RHEL6 contains 1.0.1.

However, testing with blktrace built from the upstream git tree, I see the same behavior - binary data on console, no extra memory usage ...