Bug 175372 - Reading in different-sized chunks from /proc/cluster/services gives different results
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: cman
Version: 4
Platform: All Linux
Priority: medium
Severity: medium
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Depends On:
Blocks: 175033
Reported: 2005-12-09 11:27 EST by Lon Hohberger
Modified: 2009-04-16 16:00 EDT

Fixed In Version: RHBA-2006-0559
Doc Type: Bug Fix
Last Closed: 2006-08-10 17:32:13 EDT

Attachments
Source code to little reader program. (667 bytes, text/plain)
2005-12-09 11:27 EST, Lon Hohberger
Description Lon Hohberger 2005-12-09 11:27:16 EST
Description of problem:

A customer of ours has a lot of GFS file systems mounted.  This causes
/proc/cluster/services to exceed a page in size, which has caused #175033 to appear.

The initial solution in #175033 was to issue a read and retry reading the whole
entry if we exceeded the buffer size.  Unfortunately, this does not work,
because /proc entries cannot be read in chunks larger than one page.

In investigating #175033 further, I found a general problem with
/proc/cluster/services read handling.  When you issue multiple read() calls to
/proc/cluster/services, you get different results depending on the service group
configuration and the read size.  Basically, if you are forced to issue multiple
reads, some of the service lines may be missing from the output.  I do not think
that reading in page size chunks will guarantee that this loss-of-output will
not occur.

Examples, on my 2-node cluster:

[root@blue ~]# ./reader 4 /proc/cluster/services print
Service          Name                              GID LID State     Code
DLM Lock Space:  "clvmd"                             2   3 run       -
[2 1]

DLM Lock Space:  "_mnt_gfs"                          5   6 run       -
[2 1]

User:            "usrm::manager"                     9   4 run       -
[2 1]

total = 308

--- versus ---

[root@blue ~]# ./reader 128 /proc/cluster/services print
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 1]

DLM Lock Space:  "Magma"                            10   5 run       -
[2 1]

DLM Lock Space:  "_mnt_gfs"                          5   6 run       -
[2 1]

User:            "usrm::manager"                     9   4 run       -
[2 1]

total = 386

--- versus ---

[root@blue ~]# ./reader 4096 /proc/cluster/services print
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 1]

DLM Lock Space:  "Magma"                            10   5 run       -
[2 1]

DLM Lock Space:  "_mnt_gfs"                          5   6 run       -
[2 1]

GFS Mount Group: "_mnt_gfs"                          6   7 run       -
[2 1]

User:            "usrm::manager"                     9   4 run       -
[2 1]

total = 542 

Version-Release number of selected component (if applicable): 1.0.2

How reproducible: 100%
Steps to Reproduce:
1. gcc -o reader reader.c 
2. ./reader 4 /proc/cluster/services print
3. ./reader 16 /proc/cluster/services print
4. ./reader 128 /proc/cluster/services print
5. ./reader 4096 /proc/cluster/services print
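
The attached reader program is not reproduced in this report; below is a
minimal sketch of what such a chunked reader looks like.  It is a hypothetical
reconstruction, not the actual 667-byte attachment, and the function name
`read_in_chunks` is mine:

```c
/* Hypothetical sketch of a chunked reader (not the actual attachment):
 * read `path` in chunks of `chunk` bytes, optionally echo each chunk,
 * and report the total number of bytes read. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns the total number of bytes read, or -1 on error. */
long read_in_chunks(const char *path, size_t chunk, int print)
{
    char *buf = malloc(chunk);
    if (!buf)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0) {
        if (print)
            fwrite(buf, 1, (size_t)n, stdout);
        total += n;
    }

    close(fd);
    free(buf);
    return n < 0 ? -1 : total;
}
```

A main() wrapping this as `./reader <chunk-size> <file> [print]` and printing
`total = %ld` would produce runs like the ones above; the bug is that for
/proc/cluster/services the total (and the set of lines) depends on the chunk
size.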
  
Actual results: Some service group lines missing from output.

Expected results: All service group lines, irrespective of the size of the
read call issued.

Additional info:

The header is always displayed, even with a read size of 1.  The reader program
works fine with other large /proc entries, like /proc/kallsyms.  There does not
seem to be a correlation with what types of service entries are missing, but
rerunning with the same read size always yields the same results.

[root@blue ~]# ./reader 1 /proc/kallsyms 
total = 736632
[root@blue ~]# ./reader 4 /proc/kallsyms 
total = 736632
[root@blue ~]# ./reader 4096 /proc/kallsyms 
total = 736632
Comment 1 Lon Hohberger 2005-12-09 11:27:17 EST
Created attachment 122078 [details]
Source code to little reader program.
Comment 2 Christine Caulfield 2005-12-14 04:27:20 EST
The pointer was not being initialised in sm_seq_start when reading was resumed
in the middle of the file. 

This checkin fixes it on -rSTABLE

Checking in sm_misc.c;
/cvs/cluster/cluster/cman-kernel/src/sm_misc.c,v  <--  sm_misc.c
new revision: 1.2.2.1.6.3; previous revision: 1.2.2.1.6.2
done
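
The diagnosis above can be illustrated with a userspace sketch: a
seq_file-style producer whose start() only initialises its iterator on the
first read will, on a resumed read, pick up a stale pointer that has already
moved past the record that overflowed the previous buffer, silently dropping
it.  Everything below (the record table, `good_start`, `buggy_start`, `run`)
is a hypothetical model of that failure mode, not the actual sm_misc.c code:

```c
/* Userspace model of the sm_seq_start bug: records are dropped when a
 * read resumes mid-file and start() trusts a cached pointer instead of
 * re-deriving its position from the resume offset. */
#include <assert.h>
#include <stddef.h>

#define NRECS   6   /* number of "service" records             */
#define RECSIZE 2   /* bytes each record occupies in the buffer */

static const int recs[NRECS] = {10, 11, 12, 13, 14, 15};

/* Iterator state that, like the buggy code, survives across read()s. */
static size_t cached;

/* Correct start(): derive the element from the resume position. */
static size_t good_start(size_t pos)
{
    return pos;
}

/* Buggy start(): only initialises the pointer on the first read; on a
 * resumed read the stale `cached` value may already be past `pos`. */
static size_t buggy_start(size_t pos)
{
    if (pos == 0)
        cached = 0;
    return cached;
}

/* Emulate repeated read() calls of at most `chunk` bytes each, until a
 * pass emits nothing.  Returns how many records were emitted in total. */
static size_t run(size_t (*start)(size_t), size_t chunk, int *out)
{
    size_t pos = 0, nout = 0;

    for (;;) {
        size_t i = start(pos);
        size_t room = chunk;
        size_t emitted = 0;

        while (i < NRECS) {
            cached = i + 1;      /* producer advances its pointer before
                                    we know whether the record fits     */
            if (room < RECSIZE)  /* record i overflows this buffer: it
                                    should be retried, so pos stays put */
                break;
            out[nout++] = recs[i];
            room -= RECSIZE;
            pos++;
            i++;
            emitted++;
        }
        if (emitted == 0)
            break;
    }
    return nout;
}
```

With a buffer large enough for one pass, both versions emit all six records,
which is why a single big read hides the bug; with small reads the buggy
start() skips exactly the record that failed to fit in each pass, matching
the chunk-size-dependent output seen in the description.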
Comment 3 Christine Caulfield 2005-12-20 07:00:53 EST
And on -rRHEL4

Checking in sm_misc.c;
/cvs/cluster/cluster/cman-kernel/src/sm_misc.c,v  <--  sm_misc.c
new revision: 1.2.2.4; previous revision: 1.2.2.3
done
Comment 6 Red Hat Bugzilla 2006-08-10 17:32:13 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0559.html
