Description of problem:
-------------------------------------------
When a gluster volume is unmounted from the client, the client is seen to get rebooted.

The following is seen in /var/log/messages, around the same time that the machine got rebooted -

Nov 18 17:05:18 rhs rsyslogd: the last error occured in /etc/rsyslog.d/gluster.conf, line 17:"$ModLoad mmcount"
Nov 18 17:05:18 rhs rsyslogd-3003: invalid or yet-unknown config file command - have you forgotten to load a module? [try http://www.rsyslog.com/e/3003 ]
Nov 18 17:05:18 rhs rsyslogd: the last error occured in /etc/rsyslog.d/gluster.conf, line 18:"$mmcountKey gf_code # start counting value of gf_code"
Nov 18 17:05:18 rhs rsyslogd: the last error occured in /etc/rsyslog.d/gluster.conf, line 28:"if $app-name contains 'gluster' then :mmcount:"
Nov 18 17:05:18 rhs rsyslogd: warning: selector line without actions will be discarded
Nov 18 17:05:18 rhs rsyslogd: the last error occured in /etc/rsyslog.conf, line 31:"$IncludeConfig /etc/rsyslog.d/*.conf"
Nov 18 17:05:18 rhs rsyslogd-2124: CONFIG ERROR: could not interpret master config file '/etc/rsyslog.conf'. [try http://www.rsyslog.com/e/2124 ]

Version-Release number of selected component (if applicable):
On the client -
[root@rhs ~]# rpm -qa|grep glusterfs
glusterfs-fuse-3.4.0.42.1u2rhs-1.el6rhs.x86_64
glusterfs-libs-3.4.0.42.1u2rhs-1.el6rhs.x86_64
glusterfs-3.4.0.42.1u2rhs-1.el6rhs.x86_64

How reproducible:
Frequently

Steps to Reproduce:
1. Fuse mounted a gluster volume on the client.
2. Created data on the mount point, deleted some of it, multiple times.
3. Unmounted the volume.

Actual results:
When the command to unmount the volume was run, the client got rebooted.

Expected results:
Unmounting the volume should not cause the client to reboot.

Additional info:
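Step 2 of the reproduction (create data, delete some of it, repeat) could be scripted roughly as below. This is only a sketch under assumptions: the function name `churn`, the file naming, the file sizes, and the "delete every other file" pattern are all hypothetical and are not taken from the original test run. Mounting and unmounting the volume (steps 1 and 3) are deliberately left out and would be done by hand as root.

```python
import os
import tempfile


def churn(mount_point, rounds=3, files_per_round=20, size=4096):
    """Create files under mount_point, then delete every other one, per round.

    Returns the paths of the files that were left in place.
    All parameters are illustrative assumptions, not values from the report.
    """
    kept = []
    for r in range(rounds):
        created = []
        for i in range(files_per_round):
            path = os.path.join(mount_point, f"round{r}_file{i}.dat")
            with open(path, "wb") as fh:
                fh.write(os.urandom(size))
            created.append(path)
        # Delete files at even positions, keep those at odd positions.
        for path in created[::2]:
            os.remove(path)
        kept.extend(created[1::2])
    return kept


if __name__ == "__main__":
    # A scratch directory stands in for the FUSE mount point here; on a real
    # client the path of the mounted gluster volume would be passed instead.
    with tempfile.TemporaryDirectory() as scratch:
        survivors = churn(scratch)
        print(f"{len(survivors)} files left under {scratch}")
```

After a run against the real mount point, the volume would be unmounted to see whether the client survives.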
Created attachment 825564 [details] /var/log/messages
Created attachment 825566 [details] sosreport from the client machine
Regarding the messages from /var/log/messages: they are already taken care of by bz#1015630, and these rsyslog warnings have nothing to do with this bug.
Why is bz#1015630 listed as a dependency for fixing this bug?
I am sorry, the messages in the description pointed to bug#1015630. However, they are just a side effect of the reboot: when the client machine reboots, we simply see the rsyslog messages in the log. Removing the depends flag.
Logged in to the machine and checked for crash logs, as nothing was available in /var/log/messages. Found vm-core and vm-dmesg.txt related to the same reboot. Here is a snippet of the dmesg:

<4>------------[ cut here ]------------
<2>kernel BUG at fs/dcache.c:670!
<4>invalid opcode: 0000 [#1] SMP
<4>last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/virtio0/net/eth0/broadcast
<4>CPU 1
<4>Modules linked in: bridge stp llc fuse autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 microcode virtio_balloon snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
<4>
<4>Pid: 17008, comm: umount Not tainted 2.6.32-358.18.1.el6.x86_64 #1 Red Hat KVM
<4>RIP: 0010:[<ffffffff8119acd8>] [<ffffffff8119acd8>] shrink_dcache_for_umount_subtree+0x2a8/0x2b0
<4>RSP: 0018:ffff8801127ebdb8 EFLAGS: 00010292
<4>RAX: 0000000000000055 RBX: ffff8800c79043c0 RCX: 0000000000000000
<4>RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
<4>RBP: ffff8801127ebdf8 R08: 0000000000000000 R09: ffffffff8163fde0
<4>R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000005
<4>R13: ffffffff81a83fc0 R14: ffff8801017b7d80 R15: ffff8800c7904420
<4>FS: 00007fe0e7a6e740(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
<4>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>CR2: 00007fe0e70d73a0 CR3: 0000000111d26000 CR4: 00000000000006e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process umount (pid: 17008, threadinfo ffff8801127ea000, task ffff880111449500)
<4>Stack:
<4> ffff880111cd0e70 0000000000000000 ffff8801127ebdd8 ffff880111cd0c00
<4><d> ffffffffa01fc200 ffff8801124cc338 ffff880111cd0c00 ffff880113a011c0
<4><d> ffff8801127ebe18 ffffffff8119ad16 0000000000000000 ffff880111cd0c00
<4>Call Trace:
<4> [<ffffffff8119ad16>] shrink_dcache_for_umount+0x36/0x60
<4> [<ffffffff811835ff>] generic_shutdown_super+0x1f/0xe0
<4> [<ffffffff81183726>] kill_anon_super+0x16/0x60
<4> [<ffffffffa01f95d2>] fuse_kill_sb_anon+0x52/0x60 [fuse]
<4> [<ffffffff81183ec7>] deactivate_super+0x57/0x80
<4> [<ffffffff811a215f>] mntput_no_expire+0xbf/0x110
<4> [<ffffffff811a2bcb>] sys_umount+0x7b/0x3a0
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>Code: 50 30 4c 8b 0a 31 d2 48 85 f6 74 04 48 8b 56 40 48 05 70 02 00 00 48 89 de 48 c7 c7 f0 3a 7b 81 48 89 04 24 31 c0 e8 08 2e 37 00 <0f> 0b eb fe 0f 0b eb fe 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00
<1>RIP [<ffffffff8119acd8>] shrink_dcache_for_umount_subtree+0x2a8/0x2b0
<4> RSP <ffff8801127ebdb8>

Have asked Shruti (the reporter of this bug) to file a bug against the kernel with the above information and to make the current bug dependent on it.

A similar bug was filed against the same kernel version for a crash seen when a cifs mount is unmounted. The patch for that went into the cifs module, so it is possible that the bug still exists for fuse mounts. Here is the bug link:
https://bugzilla.redhat.com/show_bug.cgi?id=917890

Asking PM to remove the U2 tag from this bug, as we can't do much here.
This bug is not consistently reproducible and is a bug in the RHEL 6.4 kernel. Moving it out of corbett.
*** This bug has been marked as a duplicate of bug 981741 ***
Verified that the backtraces found in both vm-cores are the same. We have not seen the crash in any kernel version later than the "fixed in" kernel version.
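A comparison like the one above can also be done mechanically. The following is a minimal sketch, assuming the dmesg text of each vm-core is available as a string; the helper names `frames` and `same_backtrace` are hypothetical, and the check only compares the ordered function names under "Call Trace:", ignoring addresses and offsets' surrounding context such as registers and stack contents.

```python
import re

# Matches trace entries such as
#   [<ffffffff8119ad16>] shrink_dcache_for_umount+0x36/0x60
# and captures only the function name.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(dmesg_text):
    """Return the function names between 'Call Trace:' and 'Code:', in order."""
    _, found, tail = dmesg_text.partition("Call Trace:")
    if not found:
        return []
    # Drop everything from the 'Code:' line onwards (including the final RIP).
    tail = tail.split("Code:", 1)[0]
    return FRAME_RE.findall(tail)


def same_backtrace(dmesg_a, dmesg_b):
    """True when both texts contain a call trace with identical function names."""
    frames_a, frames_b = frames(dmesg_a), frames(dmesg_b)
    return bool(frames_a) and frames_a == frames_b
```

Comparing names rather than raw lines is a design choice for this sketch: load addresses of modules such as fuse can differ between boots, while the sequence of functions stays the same for the same crash.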