Bug 1018232
Summary: | gfs_controld[1438]: cpg_initialize error 100 after fenced node rejoins and mounts GFS2 | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Jaroslav Kortus <jkortus> | ||||||||
Component: | corosync | Assignee: | Jan Friesse <jfriesse> | ||||||||
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> | ||||||||
Severity: | medium | Docs Contact: | |||||||||
Priority: | medium | ||||||||||
Version: | 6.5 | CC: | ccaulfie, cluster-maint, fdinitto, rpeterso, teigland, tlavigne | ||||||||
Target Milestone: | rc | ||||||||||
Target Release: | --- | ||||||||||
Hardware: | Unspecified | ||||||||||
OS: | Unspecified | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | corosync-1.4.1-18.el6 | Doc Type: | Bug Fix | ||||||||
Doc Text: |
This BZ doesn't need documentation. It's only for newer versions of kernels (newer then included in RHEL 6).
|
Story Points: | --- | ||||||||
Clone Of: | |||||||||||
: | 1019343 (view as bug list) | Environment: | |||||||||
Last Closed: | 2014-10-14 07:11:38 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | 1055584 | ||||||||||
Bug Blocks: | 1019343, 1023565 | ||||||||||
Attachments: |
|
Description
Jaroslav Kortus
2013-10-11 14:00:59 UTC
Successful run of the same command later: Oct 11 15:55:00 virt-139 qarshd[2642]: Running cmdline: mount -t gfs2 /dev/mapper/STSRHTS8560-STSRHTS85600 /mnt/STSRHTS85600 Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1438, uid = 0, gid = 0 Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1438, uid = 0, gid = 0 Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:00 virt-139 kernel: Slow work thread pool: Starting up Oct 11 15:55:00 virt-139 kernel: Slow work thread pool: Ready Oct 11 15:55:00 virt-139 kernel: GFS2 (built Sep 30 2013 12:03:53) installed Oct 11 15:55:00 virt-139 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "STSRHTS8560:STSRHTS85600" Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:00 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred found Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - client_pid = 1382, uid = 0, gid = 0 Oct 11 15:55:01 virt-139 corosync[1309]: [IPC ] IPC cred - Security valid Oct 11 15:55:01 virt-139 kernel: GFS2: fsid=STSRHTS8560:STSRHTS85600.1: Joined cluster. Now mounting FS... Oct 11 15:55:01 virt-139 kernel: GFS2: fsid=STSRHTS8560:STSRHTS85600.1: jid=1, already locked for use Oct 11 15:55:01 virt-139 kernel: GFS2: fsid=STSRHTS8560:STSRHTS85600.1: jid=1: Looking at journal... Oct 11 15:55:01 virt-139 kernel: GFS2: fsid=STSRHTS8560:STSRHTS85600.1: jid=1: Done root 1438 0.0 0.1 134248 1904 ? Ssl 15:39 0:00 gfs_controld root 1382 0.0 0.5 218228 5948 ? Ssl 15:39 0:00 dlm_controld It looks like something may have changed with IPC since 6.4. Jardo, thanks for running test with debug packages. It helped me reduce problem space a lot (internal uidgid handling looks ok). Problem is really in CRED CMSG. Sadly that code wasn't touched for ages, so it seems to be ether new feature of kernel of libc, or ... This is serious blocker (= must be included in 6.5) so I've changed priority. After digging to problem a little more. Problem seems to be kernel problem (not corosync one) and related to SO_PASSCRED. Next comment includes very simple (I would call it as minimal) reproducer. Steps to reproduce: gcc usock-to-stdout.c ./a.out /tmp/a on different console execute: while true;do echo "ahjo" | nc -U /tmp/a;done Reproducible with kernel 2.6.32-419.el6.x86_64. Created attachment 812543 [details]
Reproducer
Reproducer
Created attachment 813659 [details]
Proposed patch - part 1 - (required) Backport LibQB handling of CMSG (for loop)
Created attachment 813660 [details] Proposed patch - part 2 - Set SO_PASSCRED on listener socket This patch is needed only if http://patchwork.ozlabs.org/patch/284366/ is accepted without any changes. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-1508.html |