Description of problem:
Because gluster-block stores its metadata on the same volume as the data, and metadata updates are multi-client writes, gluster-block create hangs and goes into a loop before it dies. The reason is that the actual file size and the file size seen on the mount differ, so gluster-block cannot tell whether the operation succeeded.

[root@localhost block-meta]# ls -l /brick1/
block-meta/  block-store/  .glusterfs/  .shard/  .trashcan/
[root@localhost block-meta]# ls -l /brick1/block-meta/1
-rw-------. 2 root root 52304 May 20 19:36 /brick1/block-meta/1   <<<---- true size
[root@localhost block-meta]# ls -l 1
-rw-------. 1 root root 101 May 20 19:36 1   <<----- has truncated size

Either the metadata needs to be moved to a separate volume, or shard should not be enabled on the volume used by gluster-block.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Found the root cause:
When a file is opened with O_APPEND, the offset is ignored and the write buffer is always appended to the file. Shard, however, does not ignore the offset when the fd has O_APPEND. This leaves the size permanently stuck at 101 bytes, because that is the largest single write that lands on the file:

Thread 2 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200005391c, this=0x61f00001a4c0, fd=0x61100000b21c, vector=0x60800000cee0, count=1, offset=0, flags=0, iobref=0x60d00001d7c0, xdata=0x0) at shard.c:4827
4827        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
Missing separate debuginfos, use: dnf debuginfo-install json-c-0.12-7.fc24.x86_64 libacl-2.2.52-11.fc24.x86_64 libattr-2.4.47-16.fc24.x86_64 libstdc++-6.2.1-2.fc25.x86_64 sssd-client-1.14.2-1.fc25.x86_64
(gdb) dis 1
(gdb) c
Continuing.
[Switching to Thread 0x7fffe565a700 (LWP 9037)]

Thread 10 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x612000053c1c, cookie=0x61200005391c, this=0x61f0000196c0, op_ret=101, op_errno=0, prebuf=0x61b00001a68c, postbuf=0x61b00001a6fc, xdata=0x611000052d9c) at trace.c:232
232             char preopstr[4096] = {0, };
(gdb) p postbuf.ia_size
$1 = 101
(gdb) en 1
(gdb) c
Continuing.

Thread 10 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200002841c, this=0x61f00001a4c0, fd=0x61100003cf9c, vector=0x608000020be0, count=1, offset=0, flags=0, iobref=0x60d00003d530, xdata=0x0) at shard.c:4827
4827        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
(gdb) c
Continuing.
[Switching to Thread 0x7fffe0f08700 (LWP 9038)]

Thread 11 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x61200002871c, cookie=0x61200002841c, this=0x61f0000196c0, op_ret=21, op_errno=0, prebuf=0x61b00000cd8c, postbuf=0x61b00000cdfc, xdata=0x611000064bdc) at trace.c:232
232             char preopstr[4096] = {0, };
(gdb) p postbuf.ia_size
$2 = 101
(gdb) c
Continuing.
[New Thread 0x7fffe04e8700 (LWP 9040)]
[New Thread 0x7fffdfcc4700 (LWP 9041)]
[New Thread 0x7fffdf490700 (LWP 9042)]

Thread 11 "gluster-blockd" hit Breakpoint 1, shard_writev (frame=0x61200003dd1c, this=0x61f00001a4c0, fd=0x61100009479c, vector=0x608000032a60, count=1, offset=0, flags=0, iobref=0x60d00006c800, xdata=0x0) at shard.c:4827
4827        shard_common_inode_write_begin (frame, this, GF_FOP_WRITE, fd, vector,
(gdb) c
Continuing.
[Switching to Thread 0x7fffe565a700 (LWP 9037)]

Thread 10 "gluster-blockd" hit Breakpoint 2, trace_writev_cbk (frame=0x61200003e01c, cookie=0x61200003dd1c, this=0x61f0000196c0, op_ret=33, op_errno=0, prebuf=0x61b00002b78c, postbuf=0x61b00002b7fc, xdata=0x61100007e5dc) at trace.c:232
232             char preopstr[4096] = {0, };
(gdb) p postbuf.ia_size
$3 = 101
(gdb) q
A debugging session is active.
        Inferior 1 [process 9024] will be killed.

After fixing the issue:
[root@localhost r3]# gluster-block create r3/12 ha 3 192.168.122.61,192.168.122.123,192.168.122.113 1GiB
IQN: iqn.2016-12.org.gluster-block:1aef8052-2547-482e-9316-e41ba0e4b289
PORTAL(S): 192.168.122.61:3260 192.168.122.123:3260 192.168.122.113:3260
RESULT: SUCCESS
[root@localhost r3]# ls -l /brick1/block-meta/12
-rw-------. 2 root root 315 May 24 22:52 /brick1/block-meta/12
[root@localhost r3]# ls -l /mnt/block-meta/12
-rw-------. 1 root root 315 May 24 22:52 /mnt/block-meta/12
https://review.gluster.org/17387
Gluster-block create and delete work fine. In this particular case, the create was failing due to an invalid host. That issue is now fixed, and the volume create failure no longer occurs. Also verified that delete happens successfully. With multiple deletes we hit a vmcore issue, but that is tracked in a separate bug. So marking this bug as Verified.
Apologies for the wrong bug update.
Tested and verified this on the builds glusterfs-3.8.4-31 and gluster-block-0.2.1-4. Gluster-block create works; no issues/hangs are seen while executing this command. The parent volume (in which blocks are created) has all the required options set using the 'gluster volume set <volname> group' command. The backend data also shows the .shard folder and the required shards. Moving this bug to Verified in RHGS 3.3.

[root@dhcp47-115 ~]# gluster-block list nash
nb1
nb2
nb3
[root@dhcp47-115 ~]# gluster-block create nash/nb4
Inadequate arguments for create:
gluster-block create <volname/blockname> [ha <count>] [auth enable|disable] <HOST1[,HOST2,...]> <size> [--json*]
[root@dhcp47-115 ~]# gluster-block create nash/nb4 ha 1 10.70.47.115 20M
IQN: iqn.2016-12.org.gluster-block:2cb06c34-3c9d-493d-9511-fc061385b808
PORTAL(S): 10.70.47.115:3260
RESULT: SUCCESS
[root@dhcp47-115 ~]# cd -
/bricks/brick4/nash0/.shard
[root@dhcp47-115 .shard]# ls -l | wc -l
277
[root@dhcp47-115 ~]# gluster v info nash

Volume Name: nash
Type: Replicate
Volume ID: f1ea3d3e-c536-4f36-b61f-cb9761b8a0a6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.47.115:/bricks/brick4/nash0
Brick2: 10.70.47.116:/bricks/brick4/nash1
Brick3: 10.70.47.117:/bricks/brick4/nash2
Options Reconfigured:
server.allow-insecure: on
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: enable
performance.readdir-ahead: off
performance.open-behind: off
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
nfs.disable: on
cluster.brick-multiplex: disable
cluster.enable-shared-storage: enable
[root@dhcp47-115 ~]#
[root@dhcp47-115 ~]# rpm -qa | grep gluster
glusterfs-cli-3.8.4-31.el7rhgs.x86_64
gluster-block-0.2.1-4.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-10.el7.x86_64
glusterfs-libs-3.8.4-31.el7rhgs.x86_64
glusterfs-events-3.8.4-31.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.1.el7rhgs.noarch
glusterfs-api-3.8.4-31.el7rhgs.x86_64
python-gluster-3.8.4-31.el7rhgs.noarch
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
gluster-nagios-addons-0.2.9-1.el7rhgs.x86_64
samba-vfs-glusterfs-4.6.3-3.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-31.el7rhgs.x86_64
glusterfs-server-3.8.4-31.el7rhgs.x86_64
glusterfs-rdma-3.8.4-31.el7rhgs.x86_64
glusterfs-debuginfo-3.8.4-26.el7rhgs.x86_64
glusterfs-3.8.4-31.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-31.el7rhgs.x86_64
glusterfs-fuse-3.8.4-31.el7rhgs.x86_64
[root@dhcp47-115 ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774