Description of problem:
I was attempting to mount a bunch of gfs filesystems on all 3 grant cluster nodes when gfs_controld segfaulted on grant-02.

Jan 16 14:35:31 grant-02 gfs_controld[9289]: mount_client_dead ci 106 no sysfs entry for fs
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:364"
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:370"
Jan 16 14:35:31 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:371"
Jan 16 14:35:31 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:31 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:31 grant-02 gfs_controld[9289]: mount_client_dead ci 109 no sysfs entry for fs
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:129"
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:135"
Jan 16 14:35:31 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:31 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:31 grant-02 gfs_controld[9289]: mount_client_dead ci 112 no sysfs entry for fs
Jan 16 14:35:31 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:144"
Jan 16 14:35:31 grant-02 gfs_controld[9289]: write errno 9
Jan 16 14:35:31 grant-02 gfs_controld[9289]: notify_mount_client: send failed -1
[...]
Jan 16 14:35:32 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:119"
Jan 16 14:35:32 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:32 grant-02 gfs_controld[9289]: mount_client_dead ci 207 no sysfs entry for fs
Jan 16 14:35:32 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:251"
Jan 16 14:35:32 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:32 grant-02 kernel: GFS: unknown option: 0
Jan 16 14:35:32 grant-02 kernel: GFS: invalid mount option(s)
Jan 16 14:35:32 grant-02 kernel: GFS: can't parse mount arguments
Jan 16 14:35:32 grant-02 kernel: gfs_controld[9289]: segfault at 0000000000000024 rip 000000000041038b rsp 00007fffcbf750e0 error 4
Jan 16 14:35:32 grant-02 kernel: Joined cluster. Now mounting FS...
Jan 16 14:35:32 grant-02 groupd[9269]: gfs daemon appears to be dead
Jan 16 14:35:32 grant-02 dlm_controld[9283]: cluster is down, exiting
Jan 16 14:35:32 grant-02 fenced[9277]: cluster is down, exiting
Jan 16 14:35:32 grant-02 kernel: dlm: closing connection to node 1
Jan 16 14:35:32 grant-02 kernel: dlm: closing connection to node 3
Jan 16 14:35:32 grant-02 kernel: dlm: closing connection to node 2
Jan 16 14:35:33 grant-02 kernel: Trying to join cluster "lock_dlm", "GRANT:38"

It's possible that this issue is the same as the already-closed bz 223271.

Version-Release number of selected component (if applicable):
2.6.18-128.el5
lvm2-2.02.40-6.el5            BUILT: Fri Oct 24 07:37:33 CDT 2008
lvm2-cluster-2.02.40-7.el5    BUILT: Wed Nov 26 07:19:19 CST 2008
device-mapper-1.02.28-2.el5   BUILT: Fri Sep 19 02:50:32 CDT 2008
kmod-gfs-0.1.31-3.el5         BUILT: Wed 17 Dec 2008 03:19:30 PM CST
gfs-utils-0.1.18-1.el5        BUILT: Thu 31 Jul 2008 04:34:07 PM CDT

I'll attach the core file...
Created attachment 329254 [details] gfs_controld corefile from grant-02
Adding Dave Teigland to the cc list. I don't have time to look at this one until next week.
I'm not sure how to make that core file useful; perhaps install debuginfo on grant-02 and run gdb on it there? It looks like there are other bad things happening before the segfault, too; mount_client_dead indicates mount.gfs is going away unexpectedly.
See also comment #1 of bz #475312 which might be connected.
Core was generated by `/sbin/gfs_controld'.
Program terminated with signal 11, Segmentation fault.
[New process 9289]
#0  0x000000000041038b in got_mount_result (mg=0xab08da0, result=-1, ci=207, another=0) at recover.c:2172
2172            log_group(mg, "got_mount_result: ci %d result %d another %d "

(gdb) bt
#0  0x000000000041038b in got_mount_result (mg=0xab08da0, result=-1, ci=207, another=0) at recover.c:2172
#1  0x0000000000403d0c in loop () at main.c:412
#2  0x0000000000404811 in main (argc=<value optimized out>, argv=<value optimized out>) at main.c:983

(gdb) p *mg
$1 = {list = {next = 0xab065b0, prev = 0x81ffe0}, id = 0, associated_ls_id = 0,
  members = {next = 0xab08db8, prev = 0xab08db8},
  members_gone = {next = 0xab08dc8, prev = 0xab08dc8}, memb_count = 0,
  resources = {next = 0xab08de0, prev = 0xab08de0},
  mountpoints = {next = 0xab0a570, prev = 0xab0a570},
  name = "21", '\0' <repeats 253 times>,
  table = "GRANT:21", '\0' <repeats 247 times>, type = "gfs\000",
  options = "rw", '\0' <repeats 1022 times>,
  dev = "/dev/mapper/vg2-lv1", '\0' <repeats 4077 times>,
  last_stop = 0, last_start = 0, last_finish = 0, last_callback = 0,
  start_event_nr = 0, start_type = 0, error_msg = '\0' <repeats 127 times>,
  mount_client = 207, mount_client_fd = 213, mount_client_notified = 0,
  mount_client_delay = 0, group_leave_on_finish = 0, remount_client = 0,
  init = 1, got_our_options = 0, got_our_journals = 0, delay_send_journals = 0,
  kernel_mount_error = 0, kernel_mount_done = 0, got_kernel_mount = 0,
  first_mount_pending_stop = 0, first_mounter = 0, first_mounter_done = 0,
  global_first_recover_done = 0, emulate_first_mounter = 0, wait_first_done = 0,
  low_nodeid = -1, master_nodeid = -1, save_plocks = 0, reject_mounts = 0,
  cp_handle = 0, last_checkpoint_time = 0, last_plock_time = 0,
  drop_resources_last = {tv_sec = 0, tv_usec = 0}, needs_recovery = 0,
  our_jid = 0, spectator = 0, readonly = 0, rw = 1, withdraw = 0,
  dmsetup_wait = 0, dmsetup_pid = 0,
  saved_messages = {next = 0xab0a548, prev = 0xab0a548}, start2_fn = 0x0}

(gdb) p memb
$2 = <value optimized out>
(gdb) p *memb
Cannot access memory at address 0x0
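One hedged reading of that backtrace (the struct fields come from the dump above; the lookup helper below is hypothetical): the faulting address 0x24 is a small offset from NULL, which is typical of reading a field through a NULL struct pointer, and the dump shows an empty mg->members list (memb_count = 0) with memb unreadable at 0x0. A minimal sketch of that failure pattern:

/* Illustrative sketch only -- not the actual recover.c code.  A member
 * lookup over an empty list returns NULL; an unchecked field access
 * through that NULL pointer then faults at the field's offset, which is
 * exactly the kind of small address (0x0000000000000024) the kernel
 * reported for the segfault. */
#include <stddef.h>

struct mg_member {
        struct mg_member *next;         /* stand-in for the real list_head */
        int nodeid;
        int jid;
};

struct mountgroup {
        struct mg_member *members;      /* empty in the core file */
        int memb_count;                 /* 0 in the core file */
};

static struct mg_member *find_memb(struct mountgroup *mg, int nodeid)
{
        struct mg_member *memb;

        for (memb = mg->members; memb; memb = memb->next)
                if (memb->nodeid == nodeid)
                        return memb;
        return NULL;                    /* empty members list: always NULL */
}

/* ... later, in got_mount_result()-style code:
 *      memb = find_memb(mg, nodeid);
 *      memb->jid = ...;        <-- faults when the list is empty      */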
<dct_home> corey, so when you say mounting a "bunch of filesystems" you mean how many hundred at once?
<corey> dct_home: "how many hundred at once"  you know me too well ;)
<corey> dct_home: 560
<dct_home> ok, thanks, just wanted that to put your bug into perspective for anyone who was actually worried about it ;-)
Created attachment 329366 [details] gfs_controld debug log extracted from the core file
FWIW, I was able to reproduce this bz with "just" 100 GFS filesystems. I tried fewer than that, but ended up hitting cluster deadlocks each time. I'll file a BZ for that now...
The immediate cause of the segfault isn't hard to fix, but fixing it alone isn't terribly helpful because other problems have already occurred by the time it happens. I'm going to approach this from the other direction -- find a number that works and increase from there. The hope is that we'll hit the various bugs incrementally instead of trying to diagnose and fix a whole bunch of them jumbled together.
My current testing is on a two node xen cluster, where one node does

  for i in `seq 1 32`; do mount -v /dev/dctxen/x$i /gfs$i & done

This will often work, but sometimes triggers an error where one mount process gets stuck:

 2519 pts/0    S      0:00 mount -v /dev/dctxen/x13 /gfs13
 2551 pts/0    S      0:00 /sbin/mount.gfs /dev/dctxen/x13 /gfs13 -v -o rw

mount.gfs is getting an empty response back from gfs_controld (prefixing all the mount.gfs output with the fs name or mount point would help in deciphering all the mixed-together output):

/sbin/mount.gfs: message to gfs_controld: asking to join mountgroup:
/sbin/mount.gfs: write "join /gfs13 gfs lock_dlm dctxen:13 rw /dev/mapper/dctxen-x13"
/sbin/mount.gfs: message from gfs_controld: response to join request:
/sbin/mount.gfs: lock_dlm_join: read "0"
/sbin/mount.gfs: message from gfs_controld: mount options:
/sbin/mount.gfs: lock_dlm_join: read ""
/sbin/mount.gfs: lock_dlm_join: hostdata: ""
/sbin/mount.gfs: lock_dlm_join: extra_plus: ""
/sbin/mount.gfs: parse_opts: opts = "rw"
/sbin/mount.gfs: clear flag 1 for "rw", flags = 0
/sbin/mount.gfs: parse_opts: flags = 0
/sbin/mount.gfs: parse_opts: extra = ""
/sbin/mount.gfs: parse_opts: hostdata = ""
/sbin/mount.gfs: parse_opts: lockproto = ""
/sbin/mount.gfs: parse_opts: locktable = ""

Instead of "", we *should* have gotten

  lock_dlm_join: read "hostdata=jid=0:id=917505:first=1"

The gfs-kernel error confirms that no proper option string is passed:

Jan 21 16:29:12 dct-xen-01 kernel: GFS: fsid=dctxen:13.4294967295: can't mount journal #4294967295
Jan 21 16:29:12 dct-xen-01 kernel: GFS: fsid=dctxen:13.4294967295: there are only 2 journals (0 - 1)

The gfs_controld debug log shows that the connection between mount.gfs and gfs_controld has been closed prematurely:

1232576938 client 18: join /gfs13 gfs lock_dlm dctxen:13 rw /dev/mapper/dctxen-x13
1232576938 mount: /gfs13 gfs lock_dlm dctxen:13 rw /dev/mapper/dctxen-x13
1232576938 13 cluster name matches: dctxen
1232576938 13 do_mount: rv 0
1232576938 client 18 fd 25 dead
1232576938 13 get open /sys/fs/gfs/dctxen:13/lock_module/id error -1 2
1232576938 mount_client_dead ci 18 no sysfs entry for fs
1232576938 groupd cb: set_id 13 e0001
1232576938 groupd cb: start 13 type 2 count 1 members 1
1232576938 13 start 14 init 1 type 2 member_count 1
1232576938 13 add member 1
1232576938 13 total members 1 master_nodeid -1 prev -1
1232576938 13 start_first_mounter
1232576938 13 start_done 14
1232576938 notify_mount_client: nodir not found for lockspace 13
1232576938 notify_mount_client: ccs_disconnect
1232576938 notify_mount_client: hostdata=jid=0:id=917505:first=1
1232576938 write errno 9
1232576938 notify_mount_client: send failed -1
1232576938 groupd cb: finish 13

gfs_controld is calling client_dead() on the connection from mount.gfs even though mount.gfs hasn't exited. That could be happening in one of two places; finding out which is the next step.
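For context, here is a minimal sketch of the dispatch pattern involved -- an assumed shape, not the actual main.c -- showing how stale bits in revents would get a healthy connection closed as dead:

/* Hypothetical sketch of a poll dispatch loop, not the gfs_controld
 * source: if pollfd[i].revents ever carries stale garbage (e.g. 0xFFFF),
 * the POLLERR/POLLHUP test below fires and a live mount.gfs connection
 * is torn down, producing "client N fd M dead" as in the debug log. */
#include <poll.h>
#include <unistd.h>

#define MAX_CLIENTS 8

static struct pollfd pollfd[MAX_CLIENTS];
static int client_maxi;

static void client_dead(int ci)
{
        close(pollfd[ci].fd);
        pollfd[ci].fd = -1;
}

static void loop_once(void)
{
        int i, rv;

        rv = poll(pollfd, client_maxi + 1, -1);
        if (rv < 0)
                return;

        for (i = 0; i <= client_maxi; i++) {
                if (pollfd[i].fd < 0)
                        continue;
                if (pollfd[i].revents & POLLIN)
                        /* process_connection(i) would go here */ ;
                if (pollfd[i].revents & (POLLERR | POLLHUP | POLLNVAL))
                        client_dead(i); /* stale revents bits land here */
        }
}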
I'm having a lot of difficulty reproducing this consistently. In a way that inconsistency supports my suspicion that uninitialized memory is the bug, but I haven't been able to confirm it. For a while, I was seeing poll revents of 0xFFFF cause the connections to be closed, which makes me think that the following may be the bug. It's a good patch regardless, so I'll probably commit it to the RHEL5 branch so we can get a new build for you to try.

diff --git a/group/gfs_controld/main.c b/group/gfs_controld/main.c
index b2edc32..7293938 100644
--- a/group/gfs_controld/main.c
+++ b/group/gfs_controld/main.c
@@ -208,6 +208,7 @@ static int client_add(int fd)
 		for (i = client_size - MAX_CLIENTS; i < client_size; i++) {
 			client[i].fd = -1;
 			pollfd[i].fd = -1;
+			pollfd[i].revents = 0;
 		}
 	}
 }
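As a hedged illustration of why that one-liner matters (the growth loop mirrors the hunk above; everything else is assumed): realloc() leaves the new tail of the pollfd array uninitialized, so until poll() first writes those slots, revents holds whatever garbage was already in memory:

/* Sketch of the client-array growth the patch touches; not the exact
 * main.c code.  The new slots from realloc() contain garbage, so every
 * field the dispatch loop might read -- fd *and* revents -- has to be
 * cleared here.  Without the revents = 0 line, a slot can look like
 * 0xFFFF (POLLERR|POLLHUP|...) before poll() has ever written it. */
#include <poll.h>
#include <stdlib.h>

#define MAX_CLIENTS 8

static struct pollfd *pollfd;
static int client_size;

static int grow_clients(void)
{
        struct pollfd *p;
        int i;

        p = realloc(pollfd, (client_size + MAX_CLIENTS) * sizeof(*p));
        if (!p)
                return -1;
        pollfd = p;
        client_size += MAX_CLIENTS;

        for (i = client_size - MAX_CLIENTS; i < client_size; i++) {
                pollfd[i].fd = -1;
                pollfd[i].revents = 0;  /* the fix: don't leave stale bits */
        }
        return 0;
}

Clearing revents at allocation time makes the array safe to scan even for slots poll() hasn't covered yet, which is why it's a reasonable patch whether or not this is the exact bug.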
Committed to the RHEL5 branch: e4c7af4521fdacd4663ccf44d21208973f891ede
Let's assume for now that that patch fixed it.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
- Cause: mounting around 100 GFS file systems in parallel.
- Consequence: some mount processes never complete.
- Fix: initialize a variable in gfs_controld so that the connection with mount.gfs is not broken.
- Result: mount.gfs receives correct data from gfs_controld, which prevents the mount error and hang.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1341.html