Description of problem:
Output of 'gluster volume status <vol-name> detail --xml' should have a <device> element for all the bricks in the volume, but it is missing for the arbiter bricks.

Version-Release number of selected component (if applicable):
3.7.17-1

How reproducible:
Always.

Steps to Reproduce:
1. Create a replica 3 with arbiter volume named myvol
2. Try to get the XML volume status details output from the command line with: "gluster volume status myvol detail --xml"

Actual results:
The output does not contain the <device> element for the arbiter bricks.

Expected results:
The output contains the <device> element for all the bricks in the volume, including the arbiter bricks.

Additional info:
This output format error causes VDSM to fail with a Python stack trace. The bug was discussed and diagnosed (by Ramesh Nachimuthu) on the oVirt and GlusterFS users' mailing lists, see:
http://www.gluster.org/pipermail/gluster-users/2016-December/029485.html
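To illustrate the failure mode on the consumer side, here is a hedged sketch (this is NOT VDSM's actual code; the element names follow the status XML shown later in this report) of how a parser that assumes <device> is always present blows up when an arbiter brick omits it:

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample of the status XML: the second <node> (standing in
# for an arbiter brick) is missing its <device> child, as in this report.
SAMPLE = """<cliOutput><volStatus><volumes><volume>
<node><hostname>h1</hostname><path>/b1</path><device>/dev/vdb1</device></node>
<node><hostname>h2</hostname><path>/b2</path></node>
</volume></volumes></volStatus></cliOutput>"""

def brick_devices(xml_text):
    """Collect the <device> value of every brick in the status output."""
    root = ET.fromstring(xml_text)
    # find('device') returns None for the arbiter brick, so accessing
    # .text raises AttributeError -- the kind of stack trace VDSM hits.
    return [node.find('device').text for node in root.iter('node')]

try:
    brick_devices(SAMPLE)
except AttributeError as exc:
    print('parsing failed:', exc)
```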
Trial run on my setup:

0:root@vm3 ~$ gluster --version
glusterfs 3.7.17 built on Jan 4 2017 12:39:59
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
0:root@vm3 ~$
============================================
root@vm3 ~$ gluster v create testvol replica 3 arbiter 1 127.0.0.2:/bricks/brick{1..3} force
root@vm3 ~$ gluster v start testvol
130:root@vm3 ~$ gluster v status testvol detail --xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
  <volStatus>
    <volumes>
      <volume>
        <volName>testvol</volName>
        <nodeCount>3</nodeCount>
        <node>
          <hostname>127.0.0.2</hostname>
          <path>/bricks/brick1</path>
          <peerid>78097816-11e9-4445-ae9a-dd1c5a8e89e3</peerid>
          <status>1</status>
          <port>49153</port>
          <ports>
            <tcp>49153</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>13736</pid>
          <sizeTotal>4283432960</sizeTotal>
          <sizeFree>4249432064</sizeFree>
          <device>/dev/vdb1</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>127.0.0.2</hostname>
          <path>/bricks/brick2</path>
          <peerid>78097816-11e9-4445-ae9a-dd1c5a8e89e3</peerid>
          <status>1</status>
          <port>49155</port>
          <ports>
            <tcp>49155</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>13755</pid>
          <sizeTotal>4283432960</sizeTotal>
          <sizeFree>4249432064</sizeFree>
          <device>/dev/vdb1</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>127.0.0.2</hostname>
          <path>/bricks/brick3</path>
          <peerid>78097816-11e9-4445-ae9a-dd1c5a8e89e3</peerid>
          <status>1</status>
          <port>49156</port>
          <ports>
            <tcp>49156</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>13774</pid>
          <sizeTotal>4283432960</sizeTotal>
          <sizeFree>4249432064</sizeFree>
          <device>/dev/vdb1</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
      </volume>
    </volumes>
  </volStatus>
</cliOutput>
Hi Giuseppe,

I am not able to recreate this issue on 3.7.17 or later. Are you sure there isn't something wrong with your setup? Can you check whether the glusterd logs on the node where you don't get the device id show any errors when you run the command?
Hi,

I checked /var/log/glusterfs/etc-glusterfs-glusterd.vol.log and you're right:

[2017-01-05 12:21:52.343032] E [MSGID: 106301] [glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : No brick details in volume home

while getting this:

[root@shockley tmp]# gluster volume status home detail --xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cliOutput>
  <opRet>0</opRet>
  <opErrno>0</opErrno>
  <opErrstr/>
  <volStatus>
    <volumes>
      <volume>
        <volName>home</volName>
        <nodeCount>9</nodeCount>
        <node>
          <hostname>read.gluster.private</hostname>
          <path>/srv/glusterfs/disk0/home_brick</path>
          <peerid>f1be76be-dec9-46be-98cb-a89c65aebde9</peerid>
          <status>1</status>
          <port>49152</port>
          <ports>
            <tcp>49152</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>2773</pid>
          <sizeTotal>3767015563264</sizeTotal>
          <sizeFree>3045302603776</sizeFree>
          <device>/dev/sdb4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>hall.gluster.private</hostname>
          <path>/srv/glusterfs/disk0/home_brick</path>
          <peerid>e391505d-372f-4148-9d3f-7dbdb8ad0366</peerid>
          <status>1</status>
          <port>49152</port>
          <ports>
            <tcp>49152</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>5052</pid>
          <sizeTotal>3767015563264</sizeTotal>
          <sizeFree>3045302202368</sizeFree>
          <device>/dev/sdb4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>shockley.gluster.private</hostname>
          <path>/srv/glusterfs/disk0/home_arbiter_brick</path>
          <peerid>3075fdea-4bb6-4fad-94b3-b09b13d7d6a7</peerid>
          <status>1</status>
          <port>49152</port>
          <ports>
            <tcp>49152</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>10557</pid>
          <sizeTotal>1767605006336</sizeTotal>
          <sizeFree>1764696260608</sizeFree>
          <device>/dev/sda4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>read.gluster.private</hostname>
          <path>/srv/glusterfs/disk1/home_brick</path>
          <peerid>f1be76be-dec9-46be-98cb-a89c65aebde9</peerid>
          <status>1</status>
          <port>49153</port>
          <ports>
            <tcp>49153</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>2763</pid>
          <sizeTotal>3767015563264</sizeTotal>
          <sizeFree>3059324719104</sizeFree>
          <device>/dev/sdc4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>hall.gluster.private</hostname>
          <path>/srv/glusterfs/disk1/home_brick</path>
          <peerid>e391505d-372f-4148-9d3f-7dbdb8ad0366</peerid>
          <status>1</status>
          <port>49153</port>
          <ports>
            <tcp>49153</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>5058</pid>
          <sizeTotal>3767015563264</sizeTotal>
          <sizeFree>3059324432384</sizeFree>
          <device>/dev/sdc4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>shockley.gluster.private</hostname>
          <path>/srv/glusterfs/disk1/home_arbiter_brick</path>
          <peerid>3075fdea-4bb6-4fad-94b3-b09b13d7d6a7</peerid>
          <status>1</status>
          <port>49153</port>
          <ports>
            <tcp>49153</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>10568</pid>
          <sizeTotal>1767605006336</sizeTotal>
          <sizeFree>1766170935296</sizeFree>
          <device>/dev/sdb4</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>read.gluster.private</hostname>
          <path>/srv/glusterfs/disk2/home_brick</path>
          <peerid>f1be76be-dec9-46be-98cb-a89c65aebde9</peerid>
          <status>1</status>
          <port>49171</port>
          <ports>
            <tcp>49171</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>2768</pid>
          <sizeTotal>3998831407104</sizeTotal>
          <sizeFree>3233375506432</sizeFree>
          <device>/dev/sda2</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>hall.gluster.private</hostname>
          <path>/srv/glusterfs/disk2/home_brick</path>
          <peerid>e391505d-372f-4148-9d3f-7dbdb8ad0366</peerid>
          <status>1</status>
          <port>49171</port>
          <ports>
            <tcp>49171</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>5064</pid>
          <sizeTotal>3998831407104</sizeTotal>
          <sizeFree>3233376612352</sizeFree>
          <device>/dev/sda2</device>
          <blockSize>4096</blockSize>
          <mntOptions>rw,seclabel,relatime,attr2,inode64,noquota</mntOptions>
          <fsName>xfs</fsName>
        </node>
        <node>
          <hostname>shockley.gluster.private</hostname>
          <path>/srv/glusterfs/disk2/home_arbiter_brick</path>
          <peerid>3075fdea-4bb6-4fad-94b3-b09b13d7d6a7</peerid>
          <status>1</status>
          <port>49171</port>
          <ports>
            <tcp>49171</tcp>
            <rdma>N/A</rdma>
          </ports>
          <pid>10579</pid>
          <sizeTotal>1767605006336</sizeTotal>
          <sizeFree>1764696260608</sizeFree>
          <blockSize>4096</blockSize>
        </node>
      </volume>
    </volumes>
  </volStatus>
</cliOutput>

Note that this last node (the arbiter brick on the third distribution subvolume) is missing its <device>, <mntOptions> and <fsName> elements.

Now that I think about it, I suspect that there is one trick in my setup that is not interpreted correctly. Let me explain (it is a bit involved, sorry ;-) ).

Since it is well known that the arbiter does not need the same disk space as the "full replica" nodes, I used smaller disks on the arbiter node (I have all arbiter bricks confined to the same node, so I call this node the "arbiter node"). Then, when I needed more storage, I kept adding disks to the "full replica" nodes but not to the arbiter one (since its capacity should already be more than enough for arbiter bricks). Note also that I use a separate mount point /srv/glusterfs/diskN for each new disk that I add, and then I create individual brick subdirectories for my GlusterFS volumes on each disk: /srv/glusterfs/diskN/volname_brick or /srv/glusterfs/diskN/volname_arbiter_brick.
Now all is well, but I like to have coherent output on all nodes (arbiter or full replica) too, so I used a higher-level symlink on the arbiter node to fake the presence of the additional disk. Here is the actual XFS filesystem view of my arbiter bricks:

[root@shockley tmp]# tree -L 3 /srv/
/srv/
├── ctdb
│   └── lockfile
└── glusterfs
    ├── disk0
    │   ├── ctdb_arbiter_brick
    │   ├── disk2
    │   ├── enginedomain_arbiter_brick
    │   ├── exportdomain_arbiter_brick
    │   ├── home_arbiter_brick
    │   ├── isodomain_arbiter_brick
    │   ├── share_arbiter_brick
    │   ├── software_arbiter_brick
    │   ├── src_arbiter_brick
    │   ├── tmp_arbiter_brick
    │   └── vmdomain_arbiter_brick
    ├── disk1
    │   ├── ctdb_arbiter_brick
    │   ├── enginedomain_arbiter_brick
    │   ├── exportdomain_arbiter_brick
    │   ├── home_arbiter_brick
    │   ├── isodomain_arbiter_brick
    │   ├── share_arbiter_brick
    │   ├── software_arbiter_brick
    │   ├── src_arbiter_brick
    │   ├── tmp_arbiter_brick
    │   └── vmdomain_arbiter_brick
    └── disk2 -> disk0/disk2

26 directories, 1 file

[root@shockley tmp]# tree -L 1 /srv/glusterfs/disk0/disk2
/srv/glusterfs/disk0/disk2
├── enginedomain_arbiter_brick
├── exportdomain_arbiter_brick
├── home_arbiter_brick
├── share_arbiter_brick
├── software_arbiter_brick
├── src_arbiter_brick
└── vmdomain_arbiter_brick

7 directories, 0 files

At this point you can guess that I added the bricks by specifying the "fake" (symlinked) path for the sake of uniformity. I thought that this would not cause issues, but it seems I was wrong :-(

Is this kind of setup really unsupported?
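Incidentally, the effect of that symlink is easy to see from Python: path resolution collapses the extra level, so any tool that resolves brick paths (e.g. to match them against mounted filesystems) lands on disk0's filesystem for every "disk2" arbiter brick. A small sketch using a scratch directory instead of the real /srv paths:

```python
import os
import tempfile

# Rebuild the shape of the arbiter-node layout in a scratch directory:
# disk2 is a symlink into disk0, as in /srv/glusterfs/disk2 -> disk0/disk2.
base = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(base, 'disk0', 'disk2', 'home_arbiter_brick'))
os.symlink(os.path.join('disk0', 'disk2'), os.path.join(base, 'disk2'))

brick = os.path.join(base, 'disk2', 'home_arbiter_brick')
# realpath() follows the symlink, so the "disk2" brick actually lives
# under disk0 -- statvfs()/mount lookups report disk0's device for it.
print(os.path.realpath(brick))
```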
It is generally advisable to use separate paths for bricks, but I don't think the symlink is the issue here. FWIW, I tried to create a setup similar to yours and volume status printed everything:

-------------------------
0:root@vm1 bricks$ gluster v info

Volume Name: testvol
Type: Replicate
Volume ID: 822fdbac-6c7d-460c-8d89-3e6222c0ec11
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 127.0.0.2:/bricks/disk0/brick1
Brick2: 127.0.0.2:/bricks/disk1/brick2
Brick3: 127.0.0.2:/bricks/disk2/brick3 (arbiter)

0:root@vm1 bricks$ ll /bricks/
total 0
drwxr-xr-x. 4 root root 33 Jan 6 11:26 disk0
drwxr-xr-x. 3 root root 20 Jan 6 11:26 disk1
lrwxrwxrwx. 1 root root 11 Jan 6 11:23 disk2 -> disk0/disk2

0:root@vm1 bricks$ tree /bricks/
/bricks/
├── disk0
│   ├── brick1
│   └── disk2
│       └── brick3
├── disk1
│   └── brick2
└── disk2 -> disk0/disk2

7 directories, 0 files
-------------------------

Changing the BZ component to glusterd for the glusterd folks to look at the error message in comment #3.
Hi,

We tried to recreate the issue once again and were not able to. We checked /var/log/glusterfs/etc-glusterfs-glusterd.vol.log and also looked into the code for the message "Staging of operation 'Volume Status' failed on localhost : No brick details in volume home".

Here is my observation: the above message is generated when we issue a wrong CLI command, for example:

[root@dhcp42-174 ~]# gluster v status testvol details

Log generated in the log file:

[2017-01-10 06:47:37.473855] E [MSGID: 106301] [glusterd-syncop.c:1281:gd_stage_op_phase] 0-management: Staging of operation 'Volume Status' failed on localhost : No brick details in volume testvol

One thing I want you to check: whether the log entries you quoted are actually related to the issue you have raised, because as per my debugging such logs are generated whenever a wrong command ("details" instead of "detail") is issued.
Hi,

I double checked, and my shell history does indeed contain a gluster command with the exact syntax error ("details" vs "detail") that you hinted at, probably issued while investigating the problem: I mistakenly attributed the resulting error message to the correct command. I apologize for the confusion generated.

This time I checked more carefully by issuing:

date; gluster volume status home detail --xml; echo $?; date

and the only log entry in that time interval is:

[2017-01-10 21:19:29.085296] I [MSGID: 106499] [glusterd-handler.c:4331:__glusterd_handle_status_volume] 0-management: Received status volume req for volume home

while the command still produces no <device> XML element for the arbiter brick on the third distribution subvolume of the distributed-replicated volume (and the command still exits with status 0).

Now: what kind of tests/logs should I produce to fully trace the XML output anomaly? Please consider that this setup is now in production (so any tests must not disrupt current activity, but maintenance windows can be arranged).
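In the meantime, one non-disruptive way to pin down exactly which bricks are affected is to scan the status XML for <node> entries lacking a <device> child. A rough sketch (not an official tool; element names are taken from the output quoted in comment #3, and the trimmed sample below stands in for the real command output):

```python
import xml.etree.ElementTree as ET

def bricks_missing_device(xml_text):
    """Return (hostname, path) for every <node> lacking a <device> child."""
    root = ET.fromstring(xml_text)
    return [(n.findtext('hostname'), n.findtext('path'))
            for n in root.iter('node')
            if n.find('device') is None]

# In practice, feed it the output of: gluster volume status home detail --xml
STATUS_XML = """<cliOutput><volStatus><volumes><volume>
<node><hostname>read.gluster.private</hostname><path>/srv/glusterfs/disk2/home_brick</path><device>/dev/sda2</device></node>
<node><hostname>shockley.gluster.private</hostname><path>/srv/glusterfs/disk2/home_arbiter_brick</path></node>
</volume></volumes></volStatus></cliOutput>"""

for host, path in bricks_missing_device(STATUS_XML):
    print('missing <device>: %s:%s' % (host, path))
```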
Changed the component back to cli, since it seems that glusterd/arbiter is not the culprit here. I'm still willing to investigate the issue further with any logs / debug actions you suggest. Many thanks.
This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check whether it still exists in newer releases of GlusterFS. If this bug still exists in a newer GlusterFS release, please reopen this bug against the newer release.