Bug 129846 - Wrong attach counter for IPC shared memory
Summary: Wrong attach counter for IPC shared memory
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Dave Anderson
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2004-08-13 10:10 UTC by Fabrizio Muscarella
Modified: 2007-11-30 22:07 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-10-19 19:20:49 UTC
Target Upstream Version:
Embargoed:


Attachments
see bug description (2.33 KB, text/plain)
2004-08-13 10:11 UTC, Fabrizio Muscarella
Code altered in various functions in shm.c (7.12 KB, text/plain)
2005-04-02 05:53 UTC, Srividhya

Description Fabrizio Muscarella 2004-08-13 10:10:16 UTC
Description of problem:

I observe a problem in some programs that use 'mprotect' to protect
shared memory.  The shared memory in question is System V IPC shared memory.

Version-Release number of selected component (if applicable):
I tested on different versions (RHEL 3, RHEL 2.1, Fedora Core 2) and the
same error is reproducible on every system.


How reproducible:
To reproduce the error, just use the attached program.

 

Steps to Reproduce:
1. Compile it with 'gcc -o ipc_bug main.c'.
2. Start the program: ./ipc_bug
3. In a separate console, run 'ipcs -m' or 'cat /proc/sysvipc/shm'.
  
Actual results:
You will observe that the 'nattch' column equals the number of calls to
'mprotect' plus one, so if mprotect is called 20 times, the 'nattch'
value is 21 (1 for the actual attach).

Expected results:
Should be 1.

Additional info:
This value is important for some programs because they check it before
cleanup: if the value isn't 0, they don't remove the shared memory
segment.
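
For reference, here is a minimal sketch of a reproducer along the lines
described above (a hypothetical reconstruction; the actual attached
main.c may differ):

/* ipc_bug.c -- hypothetical reconstruction of the attached reproducer;
 * the original main.c may differ in detail. */
#include <stdio.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/mman.h>

#define NPAGES 100

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    int i, shmid;
    char *addr;

    shmid = shmget(IPC_PRIVATE, NPAGES * page, IPC_CREAT | 0600);
    if (shmid < 0) {
        perror("shmget");
        return 1;
    }
    addr = shmat(shmid, NULL, 0);
    if (addr == (char *)-1) {
        perror("shmat");
        return 1;
    }

    /* mprotect() the first 20 pages one at a time; each call splits off
     * another VMA at the start of the mapping (these shm VMAs are not
     * merged back), so 'ipcs -m' reports nattch == 21. */
    for (i = 0; i < 20; i++)
        if (mprotect(addr + i * page, page, PROT_READ) < 0)
            perror("mprotect");

    printf("run 'ipcs -m' now, then press Enter to exit\n");
    getchar();

    shmdt(addr);
    shmctl(shmid, IPC_RMID, NULL);
    return 0;
}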

Comment 1 Fabrizio Muscarella 2004-08-13 10:11:43 UTC
Created attachment 102691 [details]
see bug description

Comment 2 Srividhya 2005-04-02 05:53:33 UTC
Created attachment 112609 [details]
This attachment contains Code altered in various functions in shm.c

This attachment is a plain-text file which contains the code altered in
the shm_inc(), shmat(), and shm_close() routines.

The bug description says that whenever mprotect is used to protect the
shared memory, the nattch counter shows wrong values.

This is because whenever mprotect was called to protect the shared
memory, it would invoke the static shm_open() function, which in turn
would invoke the static shm_inc() subroutine.  The shm_inc() subroutine
increments the nattch counter.

In the shmat() subroutine, nattch is incremented whenever a process
attaches itself, but it is also decremented on the invalid path (a
decrement meant to compensate for the increment in shm_inc()).  By
default, control used to reach the invalid path.

nattch is decremented again in shm_close(), i.e. when a process detaches
from the shared memory.

So we have attempted to fix the bug by commenting out the nattch
increment statement in shm_inc() and commenting out the second nattch
decrement statement in shmat().

We also added a piece of code to shm_close() to prevent the nattch value
from decrementing below 0.

The inference is that whenever mprotect is used to protect the shared
memory and a change of protection is required (i.e. PROT_READ or
PROT_WRITE; by default it is PROT_READ|PROT_WRITE), mprotect invokes the
shm_open() function, which automatically invokes shm_inc(), which
increments the nattch value.

The bug is not reproducible with this patch.
We also found that nattch increments in a similar fashion with other
system calls such as madvise() and mlock(); this bug-fix patch
eliminates those problems as well.

Comment 3 Dave Anderson 2005-04-04 17:51:35 UTC
This is the first time I have taken a look at this case,
so I'm very sorry for the delay.  BTW, when submitting a patch,
please do a "diff -urNp old-file new-file".  The plain-text file
attached in the previous comment would be something like this
as a patch:

--- linux-2.4.21/ipc/shm.c.orig
+++ linux-2.4.21/ipc/shm.c
@@ -101,7 +101,7 @@ static inline void shm_inc (int id) {
                BUG();
        shp->shm_atim = CURRENT_TIME;
        shp->shm_lprid = current->tgid;
-       shp->shm_nattch++;
+//     shp->shm_nattch++;
        shm_unlock(id);
 }
 
@@ -148,7 +148,10 @@ static void shm_close (struct vm_area_st
                BUG();
        shp->shm_lprid = current->tgid;
        shp->shm_dtim = CURRENT_TIME;
-       shp->shm_nattch--;
+       if (shp->shm_nattch == 0)
+               shp->shm_nattch = 0;
+       else
+               shp->shm_nattch--;
        if(shp->shm_nattch == 0 &&
           shp->shm_flags & SHM_DEST)
                shm_destroy (shp);
@@ -670,7 +673,7 @@ invalid:
        down (&shm_ids.sem);
        if(!(shp = shm_lock(shmid)))
                BUG();
-       shp->shm_nattch--;
+//     shp->shm_nattch--;
        if(shp->shm_nattch == 0 &&
           shp->shm_flags & SHM_DEST)
                shm_destroy (shp);

In any case, this patch destroys the prime purpose of shm_inc(),
that being to increment the nattach reference count. But more
importantly, looking into this case, there really is not a bug
here.  The nattach count is properly counting what it is supposed
to, that being the number of virtual memory areas that reference
any part of the shared memory segment.

For example, if I walk through your ipc_bug program, stop it at various
key points, and check the process's virtual memory in conjunction with
"ipcs -m" statistics, you'll see something like the following.

The initial shmget() creates a shared memory area that can
be attached to 1 or more virtual memory areas of any process
that has the permission to attach to that shared memory area.
In ipc_bug, only 1 process both creates the shared memory
area and then attaches to it.

So, after the initial shmget() and shmat(), here is the "ipcs -m"
output for the new shared memory segment whose shmid is 917505,
and has an nattach of 1:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 262144     root      644        106496     2          dest
0x00000000 917505     anderson  600        409600     1
$

The process (pid 4117) that created the shared memory segment
has it mapped at a virtual memory area starting at b7568000
and ending at b75cc000:

$ cat /proc/4117/maps
00e80000-00fb0000 r-xp 00000000 03:03 4546297    /lib/tls/libc-2.3.2.so
00fb0000-00fb4000 rwxp 0012f000 03:03 4546297    /lib/tls/libc-2.3.2.so
00fb4000-00fb6000 rwxp 00000000 00:00 0
08048000-08049000 r-xp 00000000 00:0d 1949284    /tmp/bz129846/ipc_bug
08049000-0804a000 rwxp 00000000 00:0d 1949284    /tmp/bz129846/ipc_bug
b7568000-b75cc000 rwxs 00000000 00:04 917505     /SYSV00000000 (deleted)
b75cc000-b75cd000 rwxp 00000000 00:00 0
b75e5000-b75e8000 rwxp 00000000 00:00 0
b75e8000-b75fe000 r-xp 00000000 03:03 4530970    /lib/ld-2.3.2.so
b75fe000-b75ff000 rwxp 00015000 03:03 4530970    /lib/ld-2.3.2.so
bfffe000-c0000000 rwxp fffff000 00:00 0
$

After the first mprotect() operation on the 1st of the 100 4k pages, 
the nattach count goes up:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 262144     root      644        106496     2          dest
0x00000000 917505     anderson  600        409600     2
$

However, what's important here is that the virtual memory of the process
has changed:

$ cat /proc/4117/maps
00e80000-00fb0000 r-xp 00000000 03:03 4546297    /lib/tls/libc-2.3.2.so
00fb0000-00fb4000 rwxp 0012f000 03:03 4546297    /lib/tls/libc-2.3.2.so
00fb4000-00fb6000 rwxp 00000000 00:00 0
08048000-08049000 r-xp 00000000 00:0d 1949284    /tmp/bz129846/ipc_bug
08049000-0804a000 rwxp 00000000 00:0d 1949284    /tmp/bz129846/ipc_bug
b7568000-b7569000 r-xs 00000000 00:04 917505     /SYSV00000000 (deleted)
b7569000-b75cc000 rwxs 00001000 00:04 917505     /SYSV00000000 (deleted)
b75cc000-b75cd000 rwxp 00000000 00:00 0
b75e5000-b75e8000 rwxp 00000000 00:00 0
b75e8000-b75fe000 r-xp 00000000 03:03 4530970    /lib/ld-2.3.2.so
b75fe000-b75ff000 rwxp 00015000 03:03 4530970    /lib/ld-2.3.2.so
bfffe000-c0000000 rwxp fffff000 00:00 0
$

Note that because the protections of the 1st page have been changed,
the original virtual memory area from b7568000 to b75cc000 has
been split into two virtual memory areas, one for the first
page that was mprotect()'ed (b7568000-b7569000), and the
second comprising the remainder of the segment.  Both have
different permissions, requiring different virtual memory
areas.  With each additional mprotect(), a new virtual memory 
area will be created, and the nattach count goes up.  

It should be noted that the same thing would happen if the process
were to do another shmat() on the existing shared memory area
at a new virtual address.
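
As an illustrative sketch (not from the original report), a second
shmat() in the same process creates a second virtual memory area and
bumps nattch to 2:

/* Illustrative sketch, not part of the original report: two shmat()
 * calls in one process yield two VMAs, so 'ipcs -m' shows nattch == 2. */
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    void *a1 = shmat(shmid, NULL, 0);   /* first VMA  -> nattch == 1 */
    void *a2 = shmat(shmid, NULL, 0);   /* second VMA -> nattch == 2 */

    printf("attached at %p and %p; check 'ipcs -m', then press Enter\n",
           a1, a2);
    getchar();

    shmdt(a2);                          /* each detach drops nattch by 1 */
    shmdt(a1);
    shmctl(shmid, IPC_RMID, NULL);
    return 0;
}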

By definition, the "nattach" count is a count of individual virtual
memory areas of 1 or more processes that reference all or part of the 
shared memory area.  That being the case, the nattach counts are 
correct.  When the address space (the virtual memory areas) of the
process above is torn down as the process exits, the nattach counts
will be decremented.  If no virtual memory areas remain that reference
the shared memory segment, then the nattach count will be 0.  So if I
control-C the ipc_bug program, ipcs -m shows an nattach count of 0:

$ ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 262144     root      644        106496     2          dest
0x00000000 917505     anderson  600        409600     0

Dave Anderson



Comment 4 Dave Anderson 2005-04-04 18:13:23 UTC
Furthermore, regarding this statement in the problem description:

> This value is important for some program because they check this 
> value and if the value aren't 0 the don't remove the Shared Memory
> Segment.

You have not shown this with the example program.  If you are able to
create a shared memory segment, and programmatically force a situation
such that the nattach value is non-zero while there are no processes
attached to it, then that would be a bug.  

Your program does not show that -- the shared memory segment is removed
properly if the program is run to completion.  If your program is ended
prematurely with Ctrl-C, the nattach count is properly decremented to 0,
although the shared memory segment will hang around until explicitly removed.
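
For completeness, a program can check the nattch value itself via
shmctl(IPC_STAT); a minimal sketch (illustrative only, not part of the
original report):

/* shmstat.c -- sketch: inspect nattch programmatically via
 * shmctl(IPC_STAT).  Illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(int argc, char **argv)
{
    struct shmid_ds ds;
    int shmid;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <shmid>\n", argv[0]);
        return 1;
    }
    shmid = atoi(argv[1]);
    if (shmctl(shmid, IPC_STAT, &ds) < 0) {
        perror("shmctl");
        return 1;
    }
    /* shm_nattch is the same count that 'ipcs -m' reports as nattch. */
    printf("nattch = %lu\n", (unsigned long)ds.shm_nattch);
    return 0;
}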



Comment 5 Srividhya 2005-04-30 11:44:21 UTC
Hi Anderson.  I went through the explanation, but I still have a doubt
about the definition of nattch you gave:

> By definition, the "nattach" count is a count of individual virtual
> memory areas of 1 or more processes that reference all or part of the
> shared memory area

But, according to the man pages, nattch "is count of the number of
attaches either by the same process or different processes."

So, in that case, should mprotecting different parts of the same shared
memory segment increase the nattch count?  We are only attaching it
once, so nattch should be one.

Can you please clarify this?  I am confused.


Comment 6 Dave Anderson 2005-05-02 12:53:50 UTC
I guess it comes down to the definition of "attaches".

From the kernel's viewpoint, the definition of "attaches" is the number
of different virtual memory areas -- whether they be in the same process
or in different processes -- that reference the shared memory area.  That
bookkeeping must be kept intact in order to properly track all references
held by all virtual memory areas that reference it.  Note that the shm_close()
function is called for *each* virtual memory area in a process, and the
accounting must be kept as is so that the shared memory area won't be
left around unnecessarily or freed prematurely.

Doing an mprotect() on part of a previously-attached shared memory area
is essentially the same thing as a process doing an additional shmat().
So, since a particular process may have multiple attaches, regardless of
whether they came by shmat() or mprotect() calls, the man page is correct:

"it is count of the number of attaches either by the same process
or different processes." 



Comment 9 RHEL Program Management 2007-10-19 19:20:49 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products.  Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.

