Description of problem:
=====================
Created about 60 1x3 volumes on a 6-node setup as below:
- All volumes were created using bricks from n1, n2 and n3.
- Started to pump IOs from 3 different clients by writing one file into one volume at a time.
- Took snapshots, 1 for each volume.
- While IOs were going on, brought down v1.
- Then killed the brick process on n3 (so all volumes have brick3 down).
- Then, after leaving the setup overnight, restarted glusterd on n3 to bring b3 up.

I ended up seeing 2 glusterfsd processes on n3, as below:

[root@dhcp35-122 ~]# ps -ef|grep glusterfsd
root 21022 1 8 12:08 ? 00:01:27 /usr/sbin/glusterfsd -s dhcp35-122.lab.eng.blr.redhat.com --volfile-id cross3_10.dhcp35-122.lab.eng.blr.redhat.com.rhs-brick1-cross3_10 -p /var/lib/glusterd/vols/cross3_10/run/dhcp35-122.lab.eng.blr.redhat.com-rhs-brick1-cross3_10.pid -S /var/lib/glusterd/vols/cross3_10/run/daemon-dhcp35-122.lab.eng.blr.redhat.com.socket --brick-name /rhs/brick1/cross3_10 -l /var/log/glusterfs/bricks/rhs-brick1-cross3_10.log --xlator-option *-posix.glusterd-uuid=0d8eaf5c-e629-451b-b6d2-b0a32df473a0 --brick-port 49152 --xlator-option cross3_10-server.listen-port=49152
root 21088 1 0 12:08 ? 00:00:02 /usr/sbin/glusterfsd -s dhcp35-122.lab.eng.blr.redhat.com --volfile-id cross3_30.dhcp35-122.lab.eng.blr.redhat.com.rhs-brick2-cross3_30 -p /var/lib/glusterd/vols/cross3_30/run/dhcp35-122.lab.eng.blr.redhat.com-rhs-brick2-cross3_30.pid -S /var/lib/glusterd/vols/cross3_30/run/daemon-dhcp35-122.lab.eng.blr.redhat.com.socket --brick-name /rhs/brick2/cross3_30 -l /var/log/glusterfs/bricks/rhs-brick2-cross3_30.log --xlator-option *-posix.glusterd-uuid=0d8eaf5c-e629-451b-b6d2-b0a32df473a0 --brick-port 49153 --xlator-option cross3_30-server.listen-port=49153
root 22220 20760 0 12:25 pts/0 00:00:00 grep --color=auto glusterfsd
[root@dhcp35-122 ~]#

Version-Release number of selected component (if applicable):
======
3.8.4-22
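For quicker triage of output like the above, the ps listing can be reduced to a per-PID brick list with a small parser. This is only a sketch; the sample lines below are condensed from the output above, keeping just the flags the parser needs:

```python
import re

# Two lines condensed from the ps output in the bug description
# (most glusterfsd flags trimmed for brevity).
ps_output = """\
root 21022 1 8 12:08 ? 00:01:27 /usr/sbin/glusterfsd -s dhcp35-122.lab.eng.blr.redhat.com --brick-name /rhs/brick1/cross3_10 --brick-port 49152
root 21088 1 0 12:08 ? 00:00:02 /usr/sbin/glusterfsd -s dhcp35-122.lab.eng.blr.redhat.com --brick-name /rhs/brick2/cross3_30 --brick-port 49153
"""

# Map each glusterfsd PID to the bricks it serves; multiplexed bricks
# would show up under the same PID.
bricks_by_pid = {}
for line in ps_output.splitlines():
    m = re.search(r"^\S+\s+(\d+).*--brick-name\s+(\S+)", line)
    if m:
        pid, brick = m.group(1), m.group(2)
        bricks_by_pid.setdefault(pid, []).append(brick)

print(bricks_by_pid)
# → {'21022': ['/rhs/brick1/cross3_10'], '21088': ['/rhs/brick2/cross3_30']}
```

Two PIDs each serving a single brick shows at a glance that cross3_30's brick was not multiplexed into the other process.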
Note: the above setup had brick multiplexing enabled at the start.
Could you provide the gluster volume info/status output along with the log files, please?
Logs of n3, where I saw the problem, are available at http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/nchilaka/bug.1441939/ (this includes the vol info log too). Note that I was unable to collect vol status due to bz#1441946 - Brick Multiplexing: volume status showing "Another transaction is in progress". Also note that the setup is currently available (I don't know for how long I can guarantee that): 10.70.35.122
The following is my analysis and explains why this isn't a bug:

From the "volume info" data it seems that, for some reason, the snapshot create operation failed for volume cross3_30. This can be seen from the snapshot count for cross3_30, which is 0, unlike all the other volumes, for which the snapshot count is 1. Also, for all the other volumes, the option "features.barrier" has been reconfigured and set to "disabled", whereas for cross3_30 this option isn't configured. This can again be observed from the "volume info" data.

The brick multiplexing feature checks for brick compatibility before deciding to attach a brick to a particular brick process that already has one or more bricks. This compatibility check happens in 2 steps. First, volume options are compared between the two volumes that the bricks being checked for compatibility belong to. If the volume options do not match, then the bricks are deemed incompatible and a new brick process is spawned instead. The second step involves other brick-specific compatibility checks.

Here the volume options for cross3_30 do not match those of the other volumes, due to the option "features.barrier". A new brick process being spawned for brick "dhcp35-122.lab.eng.blr.redhat.com:/rhs/brick2/cross3_30" is thus expected brick-multiplexing behaviour, resulting from the first compatibility check's failure, and can't be considered a bug.
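The first compatibility step described above can be modelled roughly as follows. This is a simplified Python sketch, not glusterd's actual C implementation; the helper functions and volume names are illustrative:

```python
def options_compatible(vol_a_opts, vol_b_opts):
    """Step 1 of the check (simplified): bricks are mux-compatible
    only if the two volumes' reconfigured options match exactly."""
    return vol_a_opts == vol_b_opts

def choose_brick_process(new_vol_opts, running_procs):
    """Attach to an existing brick process whose volume options match;
    otherwise model spawning a new glusterfsd (hypothetical helper)."""
    for proc in running_procs:
        if options_compatible(new_vol_opts, proc["opts"]):
            return proc                      # multiplex into this process
    proc = {"opts": dict(new_vol_opts), "bricks": []}
    running_procs.append(proc)               # new glusterfsd
    return proc

# Model of the reported setup: snapshot create succeeded for most
# volumes (setting features.barrier to "disabled"), but failed for
# cross3_30, so that option was never set there.
procs = []
for vol in ["cross3_10", "cross3_20", "cross3_30"]:
    opts = {} if vol == "cross3_30" else {"features.barrier": "disabled"}
    proc = choose_brick_process(opts, procs)
    proc["bricks"].append(vol)

print(len(procs))                    # → 2, matching the ps output on n3
print([p["bricks"] for p in procs])  # → [['cross3_10', 'cross3_20'], ['cross3_30']]
```

The mismatched `features.barrier` option puts cross3_30's brick into its own process, which is exactly the second glusterfsd seen in the bug description.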
(In reply to Samikshan Bairagya from comment #5)
> The following is my analysis and explains why this isn't a bug:
>
> From the "volume info" data it seems like due to some reason the snapshot
> create operation failed for volume cross3_30. This can be seen from the
> snapshot count for cross3_30 which is 0, unlike all the other volumes for
> which the snapshot count is 1. Also, for all volumes, option
> "features.barrier" has been reconfigured and set to "disabled", whereas, for
> cross3_30, this option isn't configured. This can again be observed from the
> "volume info" data.
>
> The brick multiplexing feature works in a way that checks for brick
> compatibility before deciding to attach a brick to a particular brick
> process that already has one or more bricks. This compatibility check
> happens in 2 steps. First, volume options are compared between the two
> volumes that the bricks being checked for compatibility belong to. If the
> volume options do not match, then the bricks are deemed incompatible and a
> new brick process is spawned instead. The second step involves other
> brick-specific compatibility checks.

So are we saying that if the volume options of two volumes are different, then they won't be served by the same glusterfsd (meaning brick mux won't take effect)?

> Here the volume options for cross3_30 do not match those of the other
> volumes, due to the option "features.barrier". A new brick being spawned for
> brick "dhcp35-122.lab.eng.blr.redhat.com:/rhs/brick2/cross3_30" is thus an
> expected brick-multiplexing behaviour, resulting from the first
> compatibility check's failure, and can't be considered as a bug.
(In reply to nchilaka from comment #6)
> So are we saying that if the volume options of two volumes are different,
> then they won't be served by the same glusterfsd (meaning brick mux won't
> take effect)?

That's right!
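To restate the confirmed behaviour: bricks are grouped into glusterfsd processes by identical volume options, so the expected number of brick processes on a node is the number of distinct option-sets among its volumes. A minimal sketch, using hypothetical per-volume data mirroring this setup:

```python
from collections import defaultdict

# Hypothetical reconfigured options per volume on n3; in the reported
# setup, only cross3_30 lacks features.barrier=disabled because its
# snapshot create failed.
volume_opts = {
    "cross3_10": {"features.barrier": "disabled"},
    "cross3_20": {"features.barrier": "disabled"},
    "cross3_30": {},
}

# Group volumes by a canonical form of their options; each group
# corresponds to one multiplexed glusterfsd process.
groups = defaultdict(list)
for vol, opts in volume_opts.items():
    groups[tuple(sorted(opts.items()))].append(vol)

print(len(groups))  # → 2 distinct option-sets, hence 2 glusterfsd processes
```

Any volume whose options diverge, for whatever reason, ends up in its own group and therefore its own brick process.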