Description of problem:
`gfs_tool df /home` and `gfs_tool gettune /home` fail with "Cannot allocate memory" on some of our systems.

Version-Release number of selected component (if applicable):
RHEL AS 4.5

How reproducible:
On systems that have little "free" memory but still have a lot of "cached" memory.

Steps to Reproduce:
1. free -mt
2. gfs_tool df /home
3. gfs_tool gettune /home

Actual results:
open("/home", O_RDONLY) = 3
ioctl(3, 0x4723, 0x7fbffcdf3c) = 0
ioctl(3, 0x472d, 0x7fbfffe300) = -1 ENOMEM (Cannot allocate memory)

Expected results:
open("/home", O_RDONLY) = 3
ioctl(3, 0x4723, 0x7fbffdf73c) = 0
ioctl(3, 0x472d, 0x7fbffff760) = 719
close(3) = 0

Additional info:
The gfs_tool does most of its work using ioctl calls to the gfs kernel module. Often, it allocates and passes in a huge buffer to make sure there is room for whatever the kernel needs to respond with. In some cases, it doesn't need to allocate such a big buffer. I fixed "gfs_tool counters" for a similar ENOMEM problem with bugzilla record 229461 about a year ago. (I don't know if that bug record is public or locked, so you may not be able to access it, which is out of my control, sorry.) I should probably go through all the other gfs_tool functions, including the two you mentioned, figure out their minimum memory requirements, and change the code so it doesn't ask for so much memory.
Reassigning to me. The "additional info" comments are mine from linux-cluster mailing list. Thanks, Robert.
Checking the code and doing some testing reveal that the "df" and "gettune" functions of gfs_tool should both work if given a 4K buffer rather than 64K. The fix should be as easy as changing the constant SIZE at the top of gfs_tool/df.c and gfs_tool/tune.c from (65536) to (4096). Now I just need to fit it in with everything else going on.
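For illustration only, here is a minimal sketch of the kind of change described above. It is not the actual gfs_tool source: `fake_ioctl_reply` and `get_reply` are hypothetical names, and the stand-in function only imitates a kernel call that fills a caller-supplied buffer. The point is that the reply buffer is sized by one constant, so lowering SIZE shrinks every allocation that uses it. If the 719 returned by the successful ioctl in the report is the reply length, that reply would fit easily in 4096 bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Reply buffer size. The fix described above lowers this from (65536). */
#define SIZE (4096)

/*
 * Hypothetical stand-in for the kernel side of the call (NOT the real
 * gfs interface): copies a reply into buf and returns its length, or
 * returns -1 with errno set to ENOMEM if the reply does not fit.
 */
static int fake_ioctl_reply(const char *reply, char *buf, size_t size)
{
    size_t len;

    assert(reply != NULL && buf != NULL);
    len = strlen(reply);
    if (len >= size) {
        errno = ENOMEM;
        return -1;
    }
    memcpy(buf, reply, len + 1); /* copy the terminating NUL as well */
    return (int)len;
}

/*
 * Fetch a reply into a SIZE-byte heap buffer. Returns the buffer
 * (caller frees it) or NULL if allocation or the call itself fails.
 */
static char *get_reply(const char *reply)
{
    char *buf = malloc(SIZE);

    if (buf == NULL)
        return NULL;
    if (fake_ioctl_reply(reply, buf, SIZE) < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

A caller would check for NULL and report the error rather than assume the reply fits. That only works if SIZE stays at or above the largest reply the function can receive, which is why each gfs_tool function needs its minimum requirement checked before the constant is lowered.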
Requesting flags so I can get this fix into RHEL4.7.
This request was evaluated by Red Hat Product Management for inclusion, but this component is not scheduled to be updated in the current Red Hat Enterprise Linux release. If you would like this request to be reviewed for the next minor release, ask your support representative to set the next rhel-x.y flag to "?".
Created attachment 299056
patch to fix the problem

This is the patch I'm planning to ship. I've tested it on one of my trin nodes and it works.
The patch was pushed to the cluster git repository so it should hopefully go into RHEL4.7 and similar. I've cloned this bug as bug #438762 for crosswriting the changes into RHEL5.2. I'll push it into the upstream branches (master, STABLE2) using that bz record, not this one. Since this is pushed, I'm changing the status to modified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0804.html