Bug 1207534 - glusterd : unable to start glusterd after hard reboot as one of the peer info file is truncated to 0 byte
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-03-31 06:52 UTC by Rachana Patel
Modified: 2016-06-22 06:24 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-22 06:24:45 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Rachana Patel 2015-03-31 06:52:34 UTC
Description of problem:
=======================
After a hard reboot of all servers, glusterd does not come up on one node, and a peer info file on that node is truncated to zero bytes.


Version-Release number of selected component (if applicable):
=============================================================
0.803.gitf64666f.el6.x86_64

How reproducible:
=================
Haven't tried to reproduce, but it appears intermittent.

Steps to Reproduce:
===================
1. Had a cluster of 3 nodes and ran a few tests.
2. Was unable to log in to the servers, so performed a hard reboot of all servers.
3. glusterd did not come up on one node after the reboot, and a peer info file was truncated to zero bytes.

[root@rhs-client38 ~]# service glusterd status
glusterd is stopped
[root@rhs-client38 ~]# service glusterd start
Starting glusterd:                                         [FAILED]
[root@rhs-client38 ~]# ls -l /var/lib/glusterd/peers/
total 4
-rw------- 1 root root 73 Mar 26 14:29 33fd732c-41c3-4fa5-a588-f7b352333724
-rw------- 1 root root  0 Mar 30 06:29 743cfed7-9578-4fae-8a93-47ef73be22ed
[root@rhs-client38 ~]# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-03-31 00:50:14.787013] I [glusterd.c:154:glusterd_uuid_init] 0-management: retrieved UUID: 9a7435a5-877e-45a4-a94e-4d7ef2ab9cbc
[2015-03-31 00:50:14.787107] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-03-31 00:50:14.787288] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-03-31 00:50:14.787416] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-03-31 00:50:14.787551] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-03-31 00:50:14.787676] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-03-31 00:50:16.050763] E [xlator.c:426:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2015-03-31 00:50:16.050795] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2015-03-31 00:50:16.050807] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-03-31 00:50:16.051269] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-: received signum (0), shutting down
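
The directory listing above shows one zero-byte entry under /var/lib/glusterd/peers/. A minimal sketch for spotting such entries before starting glusterd, assuming Python is available on the node. The function name find_empty_peer_files is hypothetical; only the directory path comes from this report.

```python
import os


def find_empty_peer_files(peers_dir="/var/lib/glusterd/peers"):
    """Return the sorted names of regular files in peers_dir that are 0 bytes long."""
    empty = []
    for name in os.listdir(peers_dir):
        path = os.path.join(peers_dir, name)
        # Only regular files are considered; subdirectories are skipped.
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(name)
    return sorted(empty)
```

On the node in this report, the call would be expected to return the 743cfed7-9578-4fae-8a93-47ef73be22ed entry, since that is the only file listed with size 0. The sketch only reports such files; it does not repair or remove them.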


Kernel log snippet:

Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): orphan cleanup on readonly fs
Mar 30 06:37:30 rhs-client38 kernel: ------------[ cut here ]------------
Mar 30 06:37:30 rhs-client38 kernel: WARNING: at fs/ext4/inode.c:3929 ext4_flush_unwritten_io+0x74/0x80 [ext4]() (Not tainted)
Mar 30 06:37:30 rhs-client38 kernel: Hardware name: X9DRW-3LN4F+/X9DRW-3TF+
Mar 30 06:37:30 rhs-client38 kernel: Modules linked in: ext4 jbd2 mbcache sd_mod crc_t10dif isci libsas scsi_transport_sas ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Mar 30 06:37:30 rhs-client38 kernel: Pid: 862, comm: mount Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Mar 30 06:37:30 rhs-client38 kernel: Call Trace:
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81074e4a>] ? warn_slowpath_null+0x1a/0x20
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00c8bb4>] ? ext4_flush_unwritten_io+0x74/0x80 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00f0937>] ? ext4_ext_truncate+0x37/0x1f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00cebf8>] ? ext4_truncate+0x4c8/0x6a0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81529445>] ? printk+0x41/0x44
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00dfed8>] ? ext4_msg+0x68/0x80 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00c61d3>] ? ext4_orphan_get+0xb3/0x1f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00e50d1>] ? ext4_fill_super+0x26e1/0x28f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff812969d4>] ? snprintf+0x34/0x40
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff8119156e>] ? get_sb_bdev+0x18e/0x1d0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00e29f0>] ? ext4_fill_super+0x0/0x28f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00de178>] ? ext4_get_sb+0x18/0x20 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811909bb>] ? vfs_kern_mount+0x7b/0x1b0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81190b62>] ? do_kern_mount+0x52/0x130
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811b270b>] ? do_mount+0x2fb/0x930
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81145734>] ? strndup_user+0x64/0xc0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811b2dd0>] ? sys_mount+0x90/0xe0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Mar 30 06:37:30 rhs-client38 kernel: ---[ end trace a0b4d05ba5881b34 ]---
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): 6 orphan inodes deleted
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): 1 truncate cleaned up
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): recovery complete
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
Mar 30 06:37:30 rhs-client38 kernel: dracut: Mounted root filesystem /dev/mapper/vg_rhsclient38-lv_root
Mar 30 06:37:30 rhs-client38 kernel: SELinux:  Disabled at runtime.
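
The kernel messages above record what ext4 recovery did at mount time, including "6 orphan inodes deleted" and "1 truncate cleaned up". A small sketch for pulling those counts out of a saved log, assuming the message wording shown in this snippet. The helper name summarize_ext4_recovery is hypothetical, and whether other kernel versions use the same wording is an assumption.

```python
import re

# Patterns follow the wording seen in the snippet above, for example
# "EXT4-fs (dm-0): 6 orphan inodes deleted" and
# "EXT4-fs (dm-0): 1 truncate cleaned up".
_ORPHAN_RE = re.compile(r"EXT4-fs \((?P<dev>[^)]+)\): (?P<n>\d+) orphan inodes? deleted")
_TRUNC_RE = re.compile(r"EXT4-fs \((?P<dev>[^)]+)\): (?P<n>\d+) truncates? cleaned up")


def summarize_ext4_recovery(lines):
    """Return {device: {"orphans_deleted": int, "truncates_cleaned": int}}."""
    summary = {}
    for line in lines:
        for key, pattern in (("orphans_deleted", _ORPHAN_RE),
                             ("truncates_cleaned", _TRUNC_RE)):
            match = pattern.search(line)
            if match:
                entry = summary.setdefault(
                    match.group("dev"),
                    {"orphans_deleted": 0, "truncates_cleaned": 0},
                )
                entry[key] += int(match.group("n"))
    return summary
```

A non-zero truncate count is consistent with a file having been truncated during recovery, but these messages do not name the file, so the sketch cannot confirm that the peer info file was the one affected.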


Actual results:
===============
- glusterd not starting
- peer info file truncated

Comment 3 Vivek Agarwal 2015-04-23 07:22:55 UTC
This seems to be a system issue; removing this from the tracker for 3.7.

Comment 4 Atin Mukherjee 2016-06-22 06:24:45 UTC
Based on comment 3, closing this bug.

