In my testing of the 3.1qa43 build, I managed to render glusterd inoperative, and it stayed that way even after restarting it with /etc/init.d/glusterd. Support will need a method to recover glusterd when it gets into this state. How do we reset glusterd?

gluster> volume log locate mirrorvol1 10.1.30.126
wrong brick type: 10.1.30.126, use <HOSTNAME>:<export-dir-abs-path>
getting log file location information failed
gluster> volume log locate mirrorvol1 10.1.30.126:/mnt2
log file location: /etc/glusterd/logs/bricks
gluster> quit
[root@alu-vm1 glusterd]# cd /etc/glusterd/logs/bricks
-bash: cd: /etc/glusterd/logs/bricks: No such file or directory
[root@alu-vm1 glusterd]# gluster volume info all

Volume Name: mirrorvol1
Type: Distributed-Replicate
Status: Created
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.30.127:/mnt1
Brick2: 10.1.30.127:/mnt2
Brick3: 10.1.30.126:/mnt1
Brick4: 10.1.30.126:/mnt2
[root@alu-vm1 glusterd]# gluster volume start mirrorvol1
Starting volume mirrorvol1 has been unsuccessful
[root@alu-vm1 glusterd]# gluster volume info all

Volume Name: mirrorvol1
Type: Distributed-Replicate
Status: Created
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.30.127:/mnt1
Brick2: 10.1.30.127:/mnt2
Brick3: 10.1.30.126:/mnt1
Brick4: 10.1.30.126:/mnt2
[root@alu-vm1 glusterd]# gluster volume info all

Volume Name: mirrorvol1
Type: Distributed-Replicate
Status: Created
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.30.127:/mnt1
Brick2: 10.1.30.127:/mnt2
Brick3: 10.1.30.126:/mnt1
Brick4: 10.1.30.126:/mnt2
[root@alu-vm1 glusterd]# /etc/init.d/glusterd restart
Stopping glusterd: [ OK ]
Starting glusterd: [ OK ]
[root@alu-vm1 glusterd]# gluster volume info all
[root@alu-vm1 glusterd]# /etc/init.d/glusterd restart
Stopping glusterd: [FAILED]
Starting glusterd: [ OK ]
[root@alu-vm1 glusterd]# gluster volume info all
[root@alu-vm1 glusterd]#
[root@alu-vm1 glusterd]# cd /etc/glusterd
[root@alu-vm1 glusterd]# ls -la
total 44
drwxr-xr-x 5 root root 4096 Oct 8 11:21 .
drwxr-xr-x 97 root root 12288 Oct 8 13:53 ..
-rw-r--r-- 1 root root 7194 Oct 8 14:36 .cmd_log_history
-rw-r--r-- 1 root root 42 Oct 8 11:21 glusterd.info
lrwxrwxrwx 1 root root 18 Oct 8 11:21 logs -> /var/log/glusterfs
drwxr-xr-x 3 root root 4096 Oct 8 14:24 nfs
drwxr-xr-x 2 root root 4096 Oct 8 12:39 peers
drwxr-xr-x 3 root root 4096 Oct 8 14:31 vols
[root@alu-vm1 glusterd]# cd vols
[root@alu-vm1 vols]# ls -la
total 12
drwxr-xr-x 3 root root 4096 Oct 8 14:31 .
drwxr-xr-x 5 root root 4096 Oct 8 11:21 ..
drwxr-xr-x 4 root root 4096 Oct 8 14:32 mirrorvol1
[root@alu-vm1 vols]# cd mirrorvol1
[root@alu-vm1 mirrorvol1]# ls -la
total 44
drwxr-xr-x 4 root root 4096 Oct 8 14:32 .
drwxr-xr-x 3 root root 4096 Oct 8 14:31 ..
drwxr-xr-x 2 root root 4096 Oct 8 14:31 bricks
-rw-r--r-- 1 root root 15 Oct 8 14:31 cksum
-rw-r--r-- 1 root root 214 Oct 8 14:31 info
-rw-r--r-- 1 root root 662 Oct 8 14:31 mirrorvol1.10.1.30.126.mnt1.vol
-rw-r--r-- 1 root root 662 Oct 8 14:31 mirrorvol1.10.1.30.126.mnt2.vol
-rw-r--r-- 1 root root 662 Oct 8 14:31 mirrorvol1.10.1.30.127.mnt1.vol
-rw-r--r-- 1 root root 662 Oct 8 14:31 mirrorvol1.10.1.30.127.mnt2.vol
-rw-r--r-- 1 root root 1546 Oct 8 14:31 mirrorvol1-fuse.vol
drwxr-xr-x 2 root root 4096 Oct 8 14:32 run
[root@alu-vm1 mirrorvol1]# cd run
[root@alu-vm1 run]# ls -la
total 8
drwxr-xr-x 2 root root 4096 Oct 8 14:32 .
drwxr-xr-x 4 root root 4096 Oct 8 14:32 ..
[root@alu-vm1 run]#
[root@alu-vm1 logs]# tail etc-glusterfs-glusterd.vol.log
[2010-10-08 14:39:00.186690] E [glusterfsd.c:323:get_volfp] glusterfsd: /etc/glusterfs/glusterd.vol: No such file or directory
[2010-10-08 14:39:00.186956] E [glusterfsd.c:1356:glusterfs_volumes_init] glusterfsd: Cannot reach volume specification file
[2010-10-08 14:41:50.136349] E [glusterfsd.c:323:get_volfp] glusterfsd: /etc/glusterfs/glusterd.vol: No such file or directory
[2010-10-08 14:41:50.136643] E [glusterfsd.c:1356:glusterfs_volumes_init] glusterfsd: Cannot reach volume specification file
[root@alu-vm1 logs]# cd /etc/glusterfs
[root@alu-vm1 glusterfs]# /etc/init.d/glusterd stop
Stopping glusterd: [FAILED]
[root@alu-vm1 glusterfs]# ps -ef | grep gluster
root 3486 1 0 14:24 ? 00:00:00 /usr/sbin/glusterfsd --xlator-option mirrorvol1-server.listen-port=24010 -s localhost --volfile-id mirrorvol1.10.1.30.127.mnt2 -p /etc/glusterd/vols/mirrorvol1/run/10.1.30.127-mnt2.pid --brick-name /mnt2 --brick-port 24010 -l /etc/glusterd/logs/bricks/mnt2.log
root 3781 3312 0 14:54 pts/0 00:00:00 grep gluster
[root@alu-vm1 glusterfs]# kill -9 3486
[root@alu-vm1 glusterfs]# ps -ef | grep gluster
root 3784 3312 0 14:54 pts/0 00:00:00 grep gluster
[root@alu-vm1 glusterfs]# /etc/init.d/glusterd start
Starting glusterd: [ OK ]
[root@alu-vm1 glusterfs]# ps -ef | grep gluster
root 3793 3312 0 14:54 pts/0 00:00:00 grep gluster
[root@alu-vm1 glusterfs]# ps -ef | grep gluster
root 3795 3312 0 14:54 pts/0 00:00:00 grep gluster
[root@alu-vm1 glusterfs]# /etc/init.d/glusterd restart
Stopping glusterd: [FAILED]
Starting glusterd: [ OK ]
[root@alu-vm1 glusterfs]# ps -ef | grep gluster
root 3815 3312 0 14:55 pts/0 00:00:00 grep gluster
[root@alu-vm1 glusterfs]# gluster volume info
Allen, could you please attach the glusterd log file?
There were no logs to be found. In order to get things back to normal, I had to reinstall the core package.

A suitable fix would be to give admins a way to recreate the data structures glusterd needs in order to function, without reinstalling.
> [root@alu-vm1 logs]# tail etc-glusterfs-glusterd.vol.log
> [2010-10-08 14:39:00.186690] E [glusterfsd.c:323:get_volfp] glusterfsd:
> /etc/glusterfs/glusterd.vol: No such file or directory
> [2010-10-08 14:39:00.186956] E [glusterfsd.c:1356:glusterfs_volumes_init]
> glusterfsd: Cannot reach volume specification file
> [2010-10-08 14:41:50.136349] E [glusterfsd.c:323:get_volfp] glusterfsd:
> /etc/glusterfs/glusterd.vol: No such file or directory
> [2010-10-08 14:41:50.136643] E [glusterfsd.c:1356:glusterfs_volumes_init]
> glusterfsd: Cannot reach volume specification file

I am thinking of keeping 'glusterd.vol' in memory instead of loading it from a volfile. Team, please reply with your ideas.
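The quoted log lines already name the missing file. As a rough sketch for support, the helper below pulls the missing paths out of a glusterd log. The function name `missing_volfiles` is hypothetical, and the pattern assumes the exact `get_volfp` line format shown in this report, with each log entry on a single line; other builds may log differently.

```shell
#!/bin/sh
# Hypothetical helper, not part of glusterfs.
# Prints each unique path that glusterfsd reported as missing.
# Assumes log lines of the form seen in this report:
#   [timestamp] E [glusterfsd.c:323:get_volfp] glusterfsd: <path>: No such file or directory
missing_volfiles() {
    grep 'get_volfp' "$1" \
        | sed -n 's/.*glusterfsd: \(.*\): No such file or directory.*/\1/p' \
        | sort -u
}
```

On the machine in this report it would be run against etc-glusterfs-glusterd.vol.log under /var/log/glusterfs, the target of the /etc/glusterd/logs symlink.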
(In reply to comment #2)
> There were no logs to be found. In order to get things back to normal, I had to
> reinstall the core package.
>
> A suitable fix to this is to allow admins a way to recreate the data structures
> needed for glusterd to be functional without reinstalling.

We can work around this problem by just copying the 'glusterd.vol' file to the proper path; there is no need to reinstall glusterfs.
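A minimal sketch of that workaround follows. The function name `restore_volfile` is hypothetical, and the source of a known-good 'glusterd.vol' is an assumption: this report does not say where a good copy should come from, so the caller has to supply one.

```shell
#!/bin/sh
# Hypothetical helper, not part of glusterfs.
# Usage: restore_volfile SRC DEST
# Copies a known-good volfile to DEST only when DEST is missing.
restore_volfile() {
    src=$1
    dest=$2
    if [ -f "$dest" ]; then
        # Nothing to do: never overwrite an existing volfile.
        echo "present: $dest"
        return 0
    fi
    if [ ! -f "$src" ]; then
        echo "no known-good copy at $src" >&2
        return 1
    fi
    # Create the parent directory if needed, then copy with mode and timestamps preserved.
    mkdir -p "$(dirname "$dest")" && cp -p "$src" "$dest" && echo "restored: $dest"
}
```

For the failure in this report, DEST would be /etc/glusterfs/glusterd.vol, followed by `/etc/init.d/glusterd start`.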
glusterd.vol was missing and hence the behavior. Resolving as invalid.