Bug 431952 - GFS: gfs-kernel should use device major:minor
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: gfs-kmod
Version: 5.2
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Robert Peterson
QA Contact: GFS Bugs
Reported: 2008-02-07 18:09 EST by Robert Peterson
Modified: 2010-01-11 22:28 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-01-20 16:18:36 EST
Attachments:
Proposed patch to fix the problem (1.91 KB, patch)
2008-02-07 18:19 EST, Robert Peterson
Description Robert Peterson 2008-02-07 18:09:02 EST
+++ This bug was initially created as a clone of Bug #431945 +++
RHEL4 gfs-kmod bug cloned so I can crosswrite to RHEL5 gfs-kmod.

+++ This bug was initially created as a clone of Bug #421761 +++
Bug #421761 was cloned so I can do the gfs-kernel work.
This is similar to GFS2 bug pairs: 354201 (userland) and 363901 (kernel)
for RHEL5.  We need to fix this bug for GFS in both RHEL4 and 5.

I have a two node system with a GFS filesystem on external RAID array.

[root@aa-node-1 ~]# cat /proc/mounts
rootfs / rootfs rw 0 0
/proc /proc proc rw,nodiratime 0 0
none /dev tmpfs rw 0 0
/dev/root / ext3 rw 0 0
none /dev tmpfs rw 0 0
none /selinux selinuxfs rw 0 0
/proc /proc proc rw,nodiratime 0 0
/proc/bus/usb /proc/bus/usb usbfs rw 0 0
/sys /sys sysfs rw 0 0
none /dev/pts devpts rw 0 0
/dev/md1 /boot ext3 rw 0 0
none /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
/dev/cciss/c0d0p1 /diskarray gfs rw,noatime,nodiratime 0 0
[root@aa-node-1 ~]#

[root@aa-node-1 ~]# gfs_tool lockdump /diskarray
gfs_tool: unknown mountpoint /diskarray
[root@aa-node-1 ~]# 

strace tells me that it does get the gfs file list, and that it is finding the
mountpoint in /proc/mounts:

open("/proc/fs/gfs", O_RDWR|O_LARGEFILE) = 3
write(3, "list", 4)                     = 4
read(3, "4172492800 cciss/c0d0p1 6E0845C6"..., 1048575) = 45
close(3)                                = 0
open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e14000
read(3, "rootfs / rootfs rw 0 0\n/proc /pr"..., 1024) = 448
open("/proc/devices", O_RDONLY|O_LARGEFILE) = 4
fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e13000
read(4, "Character devices:\n  1 mem\n  4 /"..., 1024) = 414
close(4)                                = 0
munmap(0xb7e13000, 4096)                = 0
stat64("/dev/cciss/c0d0p1", {st_mode=S_IFBLK|0600, st_rdev=makedev(104, 1), ...}) = 0
close(3)                                = 0
munmap(0xb7e14000, 4096)                = 0
write(2, "gfs_tool: ", 10gfs_tool: )              = 10
write(2, "unknown mountpoint /diskarray\n", 30unknown mountpoint /diskarray
) = 30
exit_group(1)                           = ?
Process 24095 detached
[root@aa-node-1 ~]#

The problem lies in mp2cookie() in gfs_tool/util.c - it's failing to find a
cookie for the filesystem, because "cciss/c0d0p1" does not match
"/dev/cciss/c0d0p1".

The error message is misleading. The mountpoint does exist and is known to the
system; gfs_tool just can't find the cookie for it.

As a workaround, I can read the lockdump via:

[root@aa-node-1 ~]# exec 5<>/proc/fs/gfs
[root@aa-node-1 ~]# echo list >&5
[root@aa-node-1 ~]# cat <&5
4172492800 cciss/c0d0p1 6E0845C6A41911:FS1.0
cat: -: No such file or directory
[root@aa-node-1 ~]# exec 5<>/proc/fs/gfs
[root@aa-node-1 ~]# echo lockdump 4172492800 >&5
[root@aa-node-1 ~]# dd bs=4096k <&5 > /tmp/gfs.lockdump
dd: reading `standard input': No such file or directory
0+1 records in
0+1 records out
[root@aa-node-1 ~]#

-- Additional comment from charlieb-redhat-bugzilla@e-smith.com on 2007-12-12 10:33 EST --
> The problem lies in mp2cookie() in gfs_tool/util.c - it's failing to find a
> cookie for the filesystem, because "cciss/c0d0p1" does not match
> "/dev/cciss/c0d0p1".

No, I think the failing comparison is "cciss/c0d0p1" vs "c0d0p1", due to this
code in do_basename():

...
        if (stat(device, &st))
                goto punt;
        if (major(st.st_rdev) == major_number) {
                static char realname[16];
                snprintf(realname, 16, "dm-%u", minor(st.st_rdev));
                return realname;
        }

 punt:
        return basename(device);
}
...

Using basename() to strip a "/dev/" prefix appears naive.
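
To make the mismatch concrete, here is a minimal sketch, not the gfs_tool
source. Both helper names are hypothetical, and last_component() is assumed
to behave like basename() for paths without a trailing slash. It shows why the
last path component cannot match an s_id such as "cciss/c0d0p1", while
stripping only the "/dev/" prefix would.

```c
#include <string.h>

/* Hypothetical helper: return the last path component, which is what
 * basename() yields for a path without a trailing slash. */
const char *last_component(const char *path)
{
        const char *slash = strrchr(path, '/');
        return slash ? slash + 1 : path;
}

/* Hypothetical helper: drop a leading "/dev/" if present, keeping any
 * subdirectory such as "cciss/" in the result. */
const char *strip_dev_prefix(const char *path)
{
        static const char prefix[] = "/dev/";
        size_t len = sizeof(prefix) - 1;

        return strncmp(path, prefix, len) == 0 ? path + len : path;
}
```

For "/dev/cciss/c0d0p1" the first helper gives "c0d0p1" and the second gives
"cciss/c0d0p1". Prefix stripping still assumes the device node lives under
/dev with the kernel's name, so it is only an illustration of the mismatch,
not a proposed fix.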


-- Additional comment from rpeterso@redhat.com on 2007-12-12 11:04 EST --
I'll assume ownership of this one.  I fixed this for RHEL5.2.
This section of code is not straightforward.  It does lookups only
to turn around and do reverse lookups.


-- Additional comment from charlieb-redhat-bugzilla@e-smith.com on 2007-12-12 11:26 EST --
This code looks odd to me - is a mountpoint ever listed as the first item in the
output of "gfs_tool list", or is this just a backdoor way to allow "gfs_tool
lockdump cookie"?

...
        for (x = 0; *lines[x]; x++) {
                char s_id[256];
                sscanf(lines[x], "%s %s", cookie, s_id);
                if (dev) {
                        if (strcmp(s_id, dev) == 0)
                                return cookie;
                } else {
                        if (strcmp(cookie, mp) == 0)
                                return cookie;
                }
        }
...

Using the cookie as the index to 'gfs_tool lockdump' does work, so a simpler
workaround becomes:

gfs_tool lockdump $(gfs_tool list | awk '{ print $1 }')
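
The lookup loop quoted above can be sketched as a standalone function. This is
an illustration under assumptions, not the code in gfs_tool/util.c:
find_cookie() and its signature are hypothetical, the input is assumed to be a
NULL-terminated array of "cookie s_id ..." lines, and the sscanf field widths
are an addition that the quoted excerpt does not have.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan "cookie s_id ..." lines and copy out the cookie
 * whose s_id equals key, or whose cookie field equals key (which is what
 * lets a cookie be passed in place of a mountpoint).
 * Returns 0 if found, -1 otherwise. */
int find_cookie(const char *const *lines, const char *key,
                char *cookie, size_t len)
{
        for (size_t i = 0; lines[i] != NULL; i++) {
                char c[64], s_id[256];

                /* Field widths keep sscanf from overrunning the buffers. */
                if (sscanf(lines[i], "%63s %255s", c, s_id) != 2)
                        continue;
                if (strcmp(s_id, key) == 0 || strcmp(c, key) == 0) {
                        snprintf(cookie, len, "%s", c);
                        return 0;
                }
        }
        return -1;
}
```

With the list line from this report, looking up "cciss/c0d0p1" or the cookie
"4172492800" succeeds, while "c0d0p1" (the basename result) does not.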


-- Additional comment from rkenna@redhat.com on 2007-12-12 11:32 EST --
Marking for 4.7 consideration

-- Additional comment from pm-rhel@redhat.com on 2007-12-12 11:35 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

-- Additional comment from rpeterso@redhat.com on 2008-02-07 10:11 EST --
The "proper" way to fix this is to do the same thing I did with GFS2.
That is, to change the gfs kernel module so that it kicks out the
device major and minor number as "major:minor" rather than s_id, and
change gfs_tool to expect it that way accordingly.  That way it will
find the proper device no matter where it is or what it's called.
That would require a gfs-kernel crosswrite bugzilla.
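
A minimal userland sketch of the "major:minor" matching described above,
assuming the kernel side prints the device as decimal "major:minor". The
function names are hypothetical and this is not the attached patch.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Format a device number as "major:minor". Returns 0 on success,
 * -1 if the buffer is too small. */
int format_devno(dev_t rdev, char *buf, size_t len)
{
        int n = snprintf(buf, len, "%u:%u", major(rdev), minor(rdev));

        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: build the lookup key for a block device node,
 * wherever it lives and whatever it is called. */
int device_key(const char *device, char *buf, size_t len)
{
        struct stat st;

        if (stat(device, &st) != 0 || !S_ISBLK(st.st_mode))
                return -1;
        return format_devno(st.st_rdev, buf, len);
}

/* Compare a key listed by the kernel against a device number. */
int key_matches(const char *listed, dev_t rdev)
{
        char buf[32];

        return format_devno(rdev, buf, sizeof(buf)) == 0 &&
               strcmp(listed, buf) == 0;
}
```

For the device in this report, stat64() returned makedev(104, 1), so the key
would be "104:1" regardless of whether the node is /dev/cciss/c0d0p1 or some
other path to the same device.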
Comment 1 Robert Peterson 2008-02-07 18:19:43 EST
Created attachment 294286
Proposed patch to fix the problem

I did this RHEL5 fix first because I don't have a RHEL4 cluster
readily available at the moment.
Comment 2 Robert Peterson 2008-03-14 12:51:04 EDT
I pushed the changes for RHEL4.7 so I'm requesting some flags for
inclusion in RHEL5.
Comment 3 Robert Peterson 2008-04-09 18:10:00 EDT
I pushed the fix to the RHEL5 branch, so I'm changing the status to
MODIFIED.
Comment 6 errata-xmlrpc 2009-01-20 16:18:36 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0132.html
