Bug 1691320
| Summary: | glusterfs: write operations fail when the size is equal or greater than 1 GB |
|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage |
| Component: | glusterfs |
| Version: | rhgs-3.4 |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | high |
| Reporter: | Stefano Garzarella <sgarzare> |
| Assignee: | Rinku <rkothiya> |
| QA Contact: | milind <mwaykole> |
| CC: | ndevos, pasik, pprakash, puebele, ravishankar, rhs-bugs, rkothiya, sheggodu |
| Keywords: | ZStream |
| Target Release: | RHGS 3.5.z Batch Update 4 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Fixed In Version: | glusterfs-6.0-53 |
| Doc Type: | No Doc Update |
| Type: | Bug |
| Last Closed: | 2021-04-29 07:20:37 UTC |
| Bug Blocks: | 1678575 |
Description
Stefano Garzarella
2019-03-21 11:57:55 UTC
Same issue also with client and server on Fedora 30 and GlusterFS 6 (glusterfs-server-6.0-1.fc30.x86_64, glusterfs-api-6.0-1.fc30.x86_64):

    TEST glfs_write - size: 1024 MiB pattern: 131
    glfs_write - size: 1073741824 ret: 1073741824
    [2019-03-25 12:45:13.013398] E [rpc-clnt.c:338:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x196)[0x7f61de8e05f6] (--> /lib64/libgfrpc.so.0(+0xe2f4)[0x7f61de8822f4] (--> /lib64/libgfrpc.so.0(+0xe412)[0x7f61de882412] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x97)[0x7f61de8833b7] (--> /lib64/libgfrpc.so.0(+0xfff8)[0x7f61de883ff8] ))))) 0-gv0-client-0: forced unwinding frame type(GlusterFS 4.x v1) op(WRITE(13)) called at 2019-03-25 12:45:13.013037 (xid=0x10)
    glfs_read - size: 1024 ret: -1
    glfs_read: Transport endpoint is not connected

Looks like a valid bug.

For devel reference: check iobuf_get2() in libglusterfs/src/iobuf.c, and notice that there is a check for '== -1' on a size_t value, which is an unsigned type. Because of this check at line 630, we should change the *get_pagesize() function to return ssize_t or int64_t, and then also keep rounded_size as ssize_t or int64_t. With this, I guess we should be able to fix the issue.

This is expected behaviour, because I see that there is a check which was put in intentionally to return -1 when the total bytes read are greater than or equal to 1 GB, as shown below:

    static int
    __socket_proto_state_machine(rpc_transport_t *this,
                                 rpc_transport_pollin_t **pollin)
    {
        . . .
        case SP_STATE_READ_FRAGHDR:
            in->fraghdr = ntoh32(in->fraghdr);
            in->total_bytes_read += RPC_FRAGSIZE(in->fraghdr);
            if (in->total_bytes_read >= GF_UNIT_GB) {    <<<<<<<<<<<<<<
                ret = -1;
                goto out;
            }
        . . .
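The check quoted above can be tried in isolation. The following is only a minimal sketch, not the GlusterFS implementation: the `sketch_*` and `SKETCH_*` names are hypothetical stand-ins, the 1073741824 limit is taken from the brick log message in this report, and the fragment header layout (top bit marks the last fragment, lower 31 bits carry the size) is assumed to follow standard ONC RPC record marking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for GF_UNIT_GB and RPC_FRAGSIZE (assumed values). */
#define SKETCH_UNIT_GB 1073741824ULL
#define SKETCH_FRAGSIZE(hdr) ((uint32_t)((hdr) & 0x7fffffffU))
#define SKETCH_LASTFRAG(hdr) ((uint32_t)((hdr) & 0x80000000U))

/* Build a host-order fragment header from a size and a last-fragment flag. */
static uint32_t sketch_make_fraghdr(uint32_t size, bool last)
{
    return (size & 0x7fffffffU) | (last ? 0x80000000U : 0U);
}

/*
 * Accumulate the fragment sizes of one RPC record.
 * Returns 0 while the running total stays below 1 GiB and -1 as soon as
 * the total reaches the limit, mirroring the ">= GF_UNIT_GB" comparison.
 */
static int sketch_record_allowed(const uint32_t *fraghdrs, size_t count)
{
    uint64_t total = 0;

    for (size_t i = 0; i < count; i++) {
        total += SKETCH_FRAGSIZE(fraghdrs[i]);
        if (total >= SKETCH_UNIT_GB)
            return -1;
        if (SKETCH_LASTFRAG(fraghdrs[i]))
            break; /* last fragment of this record */
    }
    return 0;
}
```

Under these assumptions, a record of exactly 1073741824 bytes is already rejected, and so is a 1 GiB write payload once any RPC header bytes are added on top of it. That would be consistent with the failure appearing at "equal or greater than 1 GB" in the summary.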
(In reply to Rinku from comment #5)
> This is expected behaviour, because I see that there is a check which has
> been put intentionally to return a -1 when the total bytes is greater than
> or equal to 1 GB, as shown below:

If it is expected, I think that should be documented, and glfs_write() must fail and return an error. The main problem was that no error was returned or written to the log, and subsequent operations failed without an obvious correlation.

Agreed.

Patch: https://review.gluster.org/#/c/glusterfs/+/25035/

Test results:

    # tail -f /var/log/glusterfs/bricks/testxfs-brick-d1-repbrick.log
    . . .
    [2020-09-24 12:38:47.186697 +0000] E [socket.c:2365:__socket_proto_state_machine] 0-tcp.devvol-server: size >= 1073741824 is not allowed
    . . .

Hi Rinku,

The content finalization for RHGS-3.5.4 must be complete by 4th December 2020 (https://pp.engineering.redhat.com/pp/product/rhgs/release/rhgs-3-5.0/schedule/tasks). This bug has all 3 (dev, qe and pm) acks for RHGS-3.5.4 and is in POST state. If all the patches needed to fix this BZ are not merged upstream by 3rd December 2020, please change the internal whiteboard to 3.5.5 with a justification as to why it needs to be dropped from 3.5.4.

Created attachment 1749689
Updated program to reproduce the problem

Please use the attached updated program to reproduce the problem.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1462