Description of problem:
There is no way to start a previously existing brick from a newly created empty folder (e.g. if you have replaced a failed disk). The brick will never start up because it is missing the trusted.glusterfs.volume-id extended attribute at the brick location.

E [posix.c:4288:init] 0-machines-posix: Extended attribute trusted.glusterfs.volume-id is absent

Version-Release number of selected component (if applicable):
3.4.0

How reproducible:
stop glusterd
unmount $brick
rm -rf $brick
mkfs.xfs $brick
mkdir $brick
mount $brick
start glusterd

Actual results:
Brick does not start and cannot be healed.

Expected results:
Brick will start empty and create the missing metadata so that it is a perfect target for healing. Either this happens automatically, or there needs to be a command line to initialize a folder with existing brick metadata.

Additional info:
The workaround is to manually recreate the trusted.glusterfs.volume-id extended attribute on the brick folder. JoeJulian on IRC came up with the following command to do just this:

(vol=myvol; brick=/tmp/brick1; setfattr -n trusted.glusterfs.volume-id -v $(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g') $brick)

This works perfectly well, and the brick starts fine (empty) and is healed perfectly afterwards.
For some reason the command got garbled. The correct command is:

(vol=myvol; brick=/tmp/brick1; setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g') $brick)
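For readability, here is the same one-liner unpacked step by step. This is only a sketch; the volume name and brick path are the placeholders from the example above, and the info-file location is the standard glusterd path already used in the command, so adjust them to your setup.

# Placeholders from the example above; adjust for your volume and brick.
vol=myvol
brick=/tmp/brick1

# glusterd keeps the canonical id in the volume's info file as a line like
# "volume-id=<hyphenated UUID>"; extract it and strip the dashes.
id=$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g')

# setfattr needs the raw 16-byte value, passed as a hex string prefixed with 0x.
setfattr -n trusted.glusterfs.volume-id -v 0x$id $brick

# Verify the attribute is now present on the brick root.
getfattr -n trusted.glusterfs.volume-id -e hex $brick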
The volume-id metadata is automatically created when one of the following commands is run:

1. gluster volume start <VOLNAME> force
2. gluster volume replace-brick <VOLNAME> <FAILED-BRICK> <NEW-BRICK> commit force

Thereafter, the self-heal can be triggered to copy the data to the new brick.
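The exact heal invocation is not quoted in this comment; as a sketch, a full self-heal is typically triggered with:

gluster volume heal <VOLNAME> full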
Force starting a volume does not work:

root@srv2:~# gluster volume start machines force
volume start: machines: failed: Failed to get extended attribute trusted.glusterfs.volume-id for brick dir /export/brick1. Reason : No data available

Force replacing a brick does not work:

root@srv2:~# gluster volume replace-brick machines srv2:/export/brick1 srv2:/export/brick1 commit force
volume replace-brick: failed: Brick: srv2:/export/brick1 not available. Brick may be containing or be contained by an existing brick

Details: on srv2 (out of 3 servers, all clean and fully running with one brick each)

root@srv2:~# service glusterfs-server stop

Existing brick:
/dev/mapper/ubuntu-brick1 on /export/brick1 type xfs (rw,nosuid,noatime)

root@srv2:~# killall glusterfsd
-> brick goes offline

Check existing attributes:
root@srv2:~# getfattr -m- -d /export/brick1/
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/
trusted.afr.machines-client-0=0sAAAAAAAAAAAAAAAA
trusted.afr.machines-client-1=0sAAAAAAAAAAAAAAAA
trusted.afr.machines-client-2=0sAAAAAAAAAAAAAAAA
trusted.gfid=0sAAAAAAAAAAAAAAAAAAAAAQ==
trusted.glusterfs.dht=0sAAAAAQAAAAAAAAAA/////w==
trusted.glusterfs.volume-id=0s4PICLVK6S5yX2v7X3dNtdg==

root@srv2:~# umount /export/brick1
root@srv2:~# mkfs.xfs -f /dev/mapper/ubuntu-brick1
-> New filesystem
root@srv2:~# mount /export/brick1

Check existing attributes:
root@srv2:~# getfattr -m- -d /export/brick1/
(there are none)

Start glusterfs server (brick does not start):
root@srv2:~# service glusterfs-server start
[2013-08-19 08:51:48.122318] E [posix.c:4288:init] 0-machines-posix: Extended attribute trusted.glusterfs.volume-id is absent
[2013-08-19 08:51:48.122450] E [xlator.c:390:xlator_init] 0-machines-posix: Initialization of volume 'machines-posix' failed, review your volfile again
[2013-08-19 08:51:48.122467] E [graph.c:292:glusterfs_graph_init] 0-machines-posix: initializing translator failed
[2013-08-19 08:51:48.122480] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed

Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick srv1:/export/brick1                      49152   Y       14621
Brick srv2:/export/brick1                      N/A     N       N/A
Brick srv3:/export/brick1                      49152   Y       13130
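Not part of the original comment, but a quick way to see the problem directly is to compare glusterd's stored volume-id with the xattr on the brick root; a small sketch using the paths from this reproduction:

# Expected id according to glusterd (hyphenated UUID from the volume's info file):
grep volume-id /var/lib/glusterd/vols/machines/info

# Id actually present on the brick, shown as hex; on the freshly recreated
# filesystem this reports the attribute as missing.
getfattr -n trusted.glusterfs.volume-id -e hex /export/brick1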
Following my example from Comment #3:

root@srv2:~# (vol=machines; brick=/export/brick1; setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g') $brick)

root@srv2:~# getfattr -m- -d /export/brick1/
getfattr: Removing leading '/' from absolute path names
# file: export/brick1/
trusted.glusterfs.volume-id=0s4PICLVK6S5yX2v7X3dNtdg==

root@srv2:~# gluster volume start machines force
volume start: machines: success

Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick srv1:/export/brick1                      49152   Y       14621
Brick srv2:/export/brick1                      49152   Y       22711
Brick srv3:/export/brick1                      49152   Y       13130
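The brick comes back online empty, so the remaining step is to let self-heal repopulate it. The following commands are not part of the comment above, just a sketch of the usual follow-up on this volume:

# List entries still pending heal on the volume.
gluster volume heal machines info

# Confirm all bricks are online again.
gluster volume status machines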
I tried the solution:

(vol=machines; brick=/export/sdb1/brick; setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g') $brick)
grep: /var/lib/glusterd/vols/machines/info: No such file or directory

And then this:

root@cluster1:/home/admincit# gluster volume start gv0 force
volume start: gv0: failed: Volume id mismatch for brick 192.168.100.170:/export/sdb1/brick. Expected volume id e9e31a9d-e194-4cbb-851c-a837ebc753d0, volume id 00000000-0000-0000-0000-000000000000 found

This has come about because I replaced the HD the brick was on. Is there another solution I can try, or does anyone know what I have to tweak in the above commands to get this volume-id thing sorted?

Thanks for the help,
Dan
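Judging from the error output, the volume on that setup is called gv0 rather than machines, which is why the grep hit a non-existent info file. A possible adjustment of the workaround (only a sketch based on the volume name and brick path shown in the errors above, not a verified fix) would be:

(vol=gv0; brick=/export/sdb1/brick; setfattr -n trusted.glusterfs.volume-id -v 0x$(grep volume-id /var/lib/glusterd/vols/$vol/info | cut -d= -f2 | sed 's/-//g') $brick)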
I'm seeing this as well, Ubuntu 12.04, Gluster 3.4.2. Comment 4 worked for me. I did not need to use the "force" option. I also needed to stop/start the volume if it was already started.
(In reply to nabber00 from comment #6)
> I'm seeing this as well, Ubuntu 12.04, Gluster 3.4.2. Comment 4 worked for
> me. I did not need to use the "force" option. I also needed to stop/start
> the volume if it was already started.

Sorry, that should have been 14.04.
GlusterFS 3.7.0 has been released (http://www.gluster.org/pipermail/gluster-users/2015-May/021901.html), and the Gluster project maintains N-2 supported releases. The last two releases before 3.7 are still maintained; at the moment these are 3.6 and 3.5.

This bug has been filed against the 3.4 release, and will not get fixed in a 3.4 version any more. Please verify if newer versions are affected by the reported problem. If that is the case, update the bug with a note, and update the version if you can. In case updating the version is not possible, leave a comment in this bug report with the version you tested, and set the "Need additional information the selected bugs from" field below the comment box to "bugs".

If there is no response by the end of the month, this bug will get automatically closed.
GlusterFS 3.4.x has reached end-of-life. If this bug still exists in a later release please reopen this and change the version or open a new bug.