Bug 1691320

Summary: glusterfs: write operations fail when the size is equal or greater than 1 GB
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Stefano Garzarella <sgarzare>
Component: glusterfs
Assignee: Rinku <rkothiya>
Status: CLOSED ERRATA
QA Contact: milind <mwaykole>
Severity: high
Docs Contact:
Priority: high
Version: rhgs-3.4
CC: ndevos, pasik, pprakash, puebele, ravishankar, rhs-bugs, rkothiya, sheggodu
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.5.z Batch Update 4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-6.0-53
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-04-29 07:20:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1678575

Attachments:

Description	Flags
glfs_write_bug.c useful to reproduce the bug	none

Description Stefano Garzarella 2019-03-21 11:57:55 UTC

Created attachment 1546452
glfs_write_bug.c useful to reproduce the bug

Description of problem:
While debugging BZ1678575 I discovered that the write APIs (e.g. glfs_write(), glfs_pwrite() or *async()) behave strangely when the size is >= 1 GB:
- an error is printed in the client log (no error in the server log)
- glfs_write() doesn't return any error, but the write operation is not executed
- subsequent operations fail with "Transport endpoint is not connected"

Version-Release number of selected component (if applicable):
glusterfs-server-3.12.2-40.el7rhgs.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Change the server, volume and path in glfs_write_bug.c
2. gcc `pkg-config --cflags glusterfs-api` `pkg-config --libs glusterfs-api` glfs_write_bug.c -o glfs_write_bug
3. ./glfs_write_bug

Actual results:
TEST glfs_write - size: 512 MiB pattern: 171
  glfs_write - size: 536870912 ret: 536870912
  glfs_read - size: 1024 ret: 1024
  PASS
TEST glfs_write - size: 1023 MiB pattern: 42
  glfs_write - size: 1072693248 ret: 1072693248
  glfs_read - size: 1024 ret: 1024
  PASS
TEST glfs_write - size: 1024 MiB pattern: 203
  glfs_write - size: 1073741824 ret: 1073741824
[2019-03-21 10:43:24.607883] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x131)[0x7f436bd83131] (--> /lib64/libgfrpc.so.0(+0xda01)[0x7f436bd48a01] (--> /lib64/libgfrpc.so.0(+0xdb22)[0x7f436bd48b22] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x94)[0x7f436bd4a0b4] (--> /lib64/libgfrpc.so.0(+0xfc50)[0x7f436bd4ac50] ))))) 0-gv33-client-0: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2019-03-21 10:43:24.607433 (xid=0x11)
  glfs_read - size: 1024 ret: -1
  glfs_read: Transport endpoint is not connected
END ret=-1

Expected results:
TEST glfs_write - size: 512 MiB pattern: 171
  glfs_write - size: 536870912 ret: 536870912
  glfs_read - size: 1024 ret: 1024
  PASS
TEST glfs_write - size: 1023 MiB pattern: 42
  glfs_write - size: 1072693248 ret: 1072693248
  glfs_read - size: 1024 ret: 1024
  PASS
TEST glfs_write - size: 1024 MiB pattern: 203
  glfs_write - size: 1073741824 ret: 1073741824
  glfs_read - size: 1024 ret: 1024
END ret=0

Additional info:
As clients, I used both RHEL8 (glusterfs-api-3.12.2-40.2.el8.x86_64) and F29 (glusterfs-api-5.5-1.fc29.x86_64)

I had the same issue also using a Fedora 29 server (glusterfs-server-5.5-1.fc29.x86_64):
TEST glfs_write - size: 1024 MiB pattern: 172
  glfs_write - size: 1073741824 ret: 1073741824
[2019-03-21 10:55:08.998979] E [rpc-clnt.c:346:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x131)[0x7fbf924ec131] (--> /lib64/libgfrpc.so.0(+0xda01)[0x7fbf924b1a01] (--> /lib64/libgfrpc.so.0(+0xdb22)[0x7fbf924b1b22] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x94)[0x7fbf924b30b4] (--> /lib64/libgfrpc.so.0(+0xfc50)[0x7fbf924b3c50] ))))) 0-gv0-client-0: forced unwinding frame type(GlusterFS 4.x v1) op(WRITE(13)) called at 2019-03-21 10:55:08.998126 (xid=0x11)
  glfs_read - size: 1024 ret: -1
  glfs_read: Transport endpoint is not connected

Comment 2 Stefano Garzarella 2019-03-25 12:46:21 UTC

Same issue also with client and server on Fedora 30 and GlusterFS 6 (glusterfs-server-6.0-1.fc30.x86_64, glusterfs-api-6.0-1.fc30.x86_64):
TEST glfs_write - size: 1024 MiB pattern: 131
  glfs_write - size: 1073741824 ret: 1073741824
[2019-03-25 12:45:13.013398] E [rpc-clnt.c:338:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f61de8e05f6] (--> /lib64/libgfrpc.so.0(+0xe2f4)[0x7f61de8822f4] (--> /lib64/libgfrpc.so.0(+0xe412)[0x7f61de882412] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61de8833b7] (--> /lib64/libgfrpc.so.0(+0xfff8)[0x7f61de883ff8] ))))) 0-gv0-client-0: forced unwinding frame type(GlusterFS 4.x v1) op(WRITE(13)) called at 2019-03-25 12:45:13.013037 (xid=0x10)
  glfs_read - size: 1024 ret: -1
  glfs_read: Transport endpoint is not connected

Comment 3 Amar Tumballi 2019-03-25 13:13:13 UTC

Looks like a valid bug.

For devel reference:

Check iobuf_get2() in libglusterfs/src/iobuf.c and notice that there is a check for '== -1' on a size_t value, which is an unsigned integer type.

Because of this check (at line 630), we should change the *get_pagesize() function to return ssize_t or int64_t, and also keep rounded_size as ssize_t or int64_t. With this, I guess we should be able to fix the issue.

Comment 5 Rinku 2020-09-23 15:29:55 UTC

This is expected behaviour, because I see that there is a check which has been put intentionally to return -1 when the total number of bytes is greater than or equal to 1 GB, as shown below:

 static int
 __socket_proto_state_machine(rpc_transport_t *this,
                              rpc_transport_pollin_t **pollin)
 {
   .
   .
   .

             case SP_STATE_READ_FRAGHDR:
 
                 in->fraghdr = ntoh32(in->fraghdr);
                 in->total_bytes_read += RPC_FRAGSIZE(in->fraghdr);
 
                 if (in->total_bytes_read >= GF_UNIT_GB) {     <<<<<<<<<<<<<<
                     ret = -1;
                     goto out;
                 }   

   .
   .
   .

Comment 6 Stefano Garzarella 2020-09-24 07:11:27 UTC

(In reply to Rinku from comment #5)
> This is expected behaviour, because I see that there is a check which has
> been put intentionally to return a -1 when the total bytes is greater than
> or equal to 1 GB, as shown below:

If it is expected, I think it should be documented, and glfs_write() must fail and return an error.

The main problem was that there wasn't any error returned or put in the log, and subsequent operations failed without an obvious correlation.

Comment 7 Rinku 2020-09-24 12:59:57 UTC

Agreed. 

Patch: https://review.gluster.org/#/c/glusterfs/+/25035/

Test results:

# tail -f /var/log/glusterfs/bricks/testxfs-brick-d1-repbrick.log
.
.
.
[2020-09-24 12:38:47.186697 +0000] E [socket.c:2365:__socket_proto_state_machine] 0-tcp.devvol-server: size >= 1073741824 is not allowed
.
.
.

Comment 8 Ravishankar N 2020-12-02 12:58:46 UTC

Hi Rinku,

The content finalization for RHGS-3.5.4 must be complete by 4th December 2020 (https://pp.engineering.redhat.com/pp/product/rhgs/release/rhgs-3-5.0/schedule/tasks).
This bug has all 3 (dev, qe and pm) acks for RHGS-3.5.4 and is in POST state. If all the patches needed to fix this BZ are not merged upstream by 3rd December 2020, please change the internal whiteboard to 3.5.5 with a justification as to why it needs to be dropped from 3.5.4.

Comment 16 Rinku 2021-01-22 10:17:53 UTC

Created attachment 1749689
Updated program to reproduce the problem

Please use the attached updated program to reproduce the problem.

Comment 25 errata-xmlrpc 2021-04-29 07:20:37 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1462