Bug 1207528 - glusterd : unable to start glusterd after hard reboot as one of the peer info file is truncated to 0 byte
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Bug Updates Notification Mailing List
QA Contact: amainkar
URL:
Whiteboard:
Depends On:
Blocks: qe_tracker_everglades
Reported: 2015-03-31 06:41 UTC by Rachana Patel
Modified: 2015-04-20 11:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-31 06:51:36 UTC
Embargoed:



Description Rachana Patel 2015-03-31 06:41:28 UTC
Description of problem:
=======================
After a hard reboot of all servers, glusterd does not come up on one node, and one of the peer info files on that node is truncated to zero bytes.


Version-Release number of selected component (if applicable):
=============================================================
0.803.gitf64666f.el6.x86_64

How reproducible:
=================
Haven't tried to reproduce, but it appears to be intermittent.

Steps to Reproduce:
===================
1. Had a cluster of 3 nodes and ran a few tests.
2. Was unable to log in to the servers, so performed a hard reboot of all servers.
3. glusterd did not come up on one node after the reboot, and one peer info file was truncated to zero bytes.

[root@rhs-client38 ~]# service glusterd status
glusterd is stopped
[root@rhs-client38 ~]# service glusterd start
Starting glusterd:                                         [FAILED]
[root@rhs-client38 ~]# ls -l /var/lib/glusterd/peers/
total 4
-rw------- 1 root root 73 Mar 26 14:29 33fd732c-41c3-4fa5-a588-f7b352333724
-rw------- 1 root root  0 Mar 30 06:29 743cfed7-9578-4fae-8a93-47ef73be22ed
[root@rhs-client38 ~]# tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
[2015-03-31 00:50:14.787013] I [glusterd.c:154:glusterd_uuid_init] 0-management: retrieved UUID: 9a7435a5-877e-45a4-a94e-4d7ef2ab9cbc
[2015-03-31 00:50:14.787107] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2015-03-31 00:50:14.787288] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2015-03-31 00:50:14.787416] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2015-03-31 00:50:14.787551] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2015-03-31 00:50:14.787676] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2015-03-31 00:50:16.050763] E [xlator.c:426:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
[2015-03-31 00:50:16.050795] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
[2015-03-31 00:50:16.050807] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
[2015-03-31 00:50:16.051269] W [glusterfsd.c:1212:cleanup_and_exit] (--> 0-: received signum (0), shutting down


Log snippet:

Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): orphan cleanup on readonly fs
Mar 30 06:37:30 rhs-client38 kernel: ------------[ cut here ]------------
Mar 30 06:37:30 rhs-client38 kernel: WARNING: at fs/ext4/inode.c:3929 ext4_flush_unwritten_io+0x74/0x80 [ext4]() (Not tainted)
Mar 30 06:37:30 rhs-client38 kernel: Hardware name: X9DRW-3LN4F+/X9DRW-3TF+
Mar 30 06:37:30 rhs-client38 kernel: Modules linked in: ext4 jbd2 mbcache sd_mod crc_t10dif isci libsas scsi_transport_sas ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Mar 30 06:37:30 rhs-client38 kernel: Pid: 862, comm: mount Not tainted 2.6.32-504.12.2.el6.x86_64 #1
Mar 30 06:37:30 rhs-client38 kernel: Call Trace:
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81074e4a>] ? warn_slowpath_null+0x1a/0x20
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00c8bb4>] ? ext4_flush_unwritten_io+0x74/0x80 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00f0937>] ? ext4_ext_truncate+0x37/0x1f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00cebf8>] ? ext4_truncate+0x4c8/0x6a0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81529445>] ? printk+0x41/0x44
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00dfed8>] ? ext4_msg+0x68/0x80 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00c61d3>] ? ext4_orphan_get+0xb3/0x1f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00e50d1>] ? ext4_fill_super+0x26e1/0x28f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff812969d4>] ? snprintf+0x34/0x40
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff8119156e>] ? get_sb_bdev+0x18e/0x1d0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00e29f0>] ? ext4_fill_super+0x0/0x28f0 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffffa00de178>] ? ext4_get_sb+0x18/0x20 [ext4]
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811909bb>] ? vfs_kern_mount+0x7b/0x1b0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81190b62>] ? do_kern_mount+0x52/0x130
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811b270b>] ? do_mount+0x2fb/0x930
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff81145734>] ? strndup_user+0x64/0xc0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff811b2dd0>] ? sys_mount+0x90/0xe0
Mar 30 06:37:30 rhs-client38 kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Mar 30 06:37:30 rhs-client38 kernel: ---[ end trace a0b4d05ba5881b34 ]---
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): 6 orphan inodes deleted
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): 1 truncate cleaned up
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): recovery complete
Mar 30 06:37:30 rhs-client38 kernel: EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: 
Mar 30 06:37:30 rhs-client38 kernel: dracut: Mounted root filesystem /dev/mapper/vg_rhsclient38-lv_root
Mar 30 06:37:30 rhs-client38 kernel: SELinux:  Disabled at runtime.


Actual results:
===============
- glusterd not starting
- peer info file truncated to zero bytes
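
The symptom above (a zero-byte file under /var/lib/glusterd/peers/) can be checked for before attempting to start glusterd. The following is only a minimal sketch, not part of glusterd or its tooling: the function name find_empty_peer_files is hypothetical, the default path is the one shown in the ls output in this report, and it assumes a find that supports -maxdepth (GNU findutils, as on RHEL 6).

```shell
#!/bin/sh
# Hypothetical helper (not a glusterd tool): list zero-byte files
# directly under a peers directory. The default path is taken from
# the ls output in this report.
find_empty_peer_files() {
    peers_dir="${1:-/var/lib/glusterd/peers}"
    # Nothing to report if the directory is absent on this host.
    [ -d "$peers_dir" ] || return 0
    # -size 0 matches files that hold no data; -maxdepth 1 keeps the
    # search to the peers directory itself.
    find "$peers_dir" -maxdepth 1 -type f -size 0
}

# Print any empty peer info files, one path per line.
find_empty_peer_files "$@"
```

On the node from this report, a check like this would be expected to list the 743cfed7-9578-4fae-8a93-47ef73be22ed file, since ls shows it with a size of 0. It only detects the condition; it does not repair or restore the file.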

