Bug 1570538

Summary: linux untar errors out at completion during disperse volume in-service upgrade
Product: [Community] GlusterFS
Reporter: Ashish Pandey <aspandey>
Component: disperse
Assignee: Ashish Pandey <aspandey>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: mainline
CC: amukherj, aspandey, bugs, jahernan, nchilaka, pkarampu, rhinduja, rhs-bugs, storage-qa-internal
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-5.0
Clone Of: 1558948
Last Closed: 2018-06-20 18:05:09 UTC
Type: Bug

Comment 1 Ashish Pandey 2018-04-23 07:13:20 UTC
I found that for ec_flush_cbk, 4 of the 6 bricks return xdata as NULL while the other 2 bricks return xdata as a valid address (i.e. an actual response).
When we combine these 6 answers, we decide that the 2 are bad and the 4 are good, and from then on we deal only with those 4 bricks.

If any one of those 4 bricks then returns a bad response, EC fails the operation with an I/O error. Logically this should not happen, since all 6 bricks are up and there is nothing to heal.
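
To make the failure mode concrete, here is a minimal, self-contained sketch of the combine step (hypothetical names throughout; the real logic lives in ec-combine.c and is far more involved): answers that disagree on whether xdata is NULL are split into a majority "good" group and a minority "bad" group, even though every brick actually succeeded.

    #include <stdio.h>

    #define BRICKS 6

    /* Hypothetical, simplified stand-in for one brick's flush callback
     * answer; the real structure in GlusterFS is ec_cbk_data_t. */
    struct cbk_answer {
        int   op_ret;
        void *xdata;   /* NULL on old bricks, a valid dict on upgraded ones */
    };

    /* Split answers into two groups by xdata presence and return a bitmask
     * of the larger ("good") group. */
    static unsigned int combine_answers(const struct cbk_answer ans[BRICKS])
    {
        unsigned int null_mask = 0, nonnull_mask = 0;
        int null_cnt = 0, nonnull_cnt = 0;

        for (int i = 0; i < BRICKS; i++) {
            if (ans[i].xdata == NULL) {
                null_mask |= 1u << i;
                null_cnt++;
            } else {
                nonnull_mask |= 1u << i;
                nonnull_cnt++;
            }
        }
        /* The majority wins: the minority is marked "bad" even though
         * every brick actually succeeded. */
        return (null_cnt >= nonnull_cnt) ? null_mask : nonnull_mask;
    }

    int main(void)
    {
        int dummy = 0;
        struct cbk_answer ans[BRICKS] = {
            { 0, NULL }, { 0, NULL }, { 0, NULL }, { 0, NULL }, /* 3.8.4 bricks */
            { 0, &dummy }, { 0, &dummy },                       /* upgraded bricks */
        };
        printf("good mask: 0x%02x\n", combine_answers(ans)); /* prints 0x0f */
        return 0;
    }

Assuming a 4+2 disperse configuration, the good mask already holds the bare minimum of 4 answers, so a single additional bad response drops EC below the number of fragments it needs and the operation fails with EIO.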

The reason we get a non-NULL xdata in 2 of the callbacks is that those are the bricks that were upgraded from 3.8.4 to 3.12.2-6.

In 3.12.2-6, posix_flush still returns NULL for xdata. However, when the call reaches pl_flush_cbk and unwinds using
   PL_STACK_UNWIND (flush, xdata, frame, op_ret, op_errno, xdata);
this xdata becomes a valid address because of a change in that code.
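
A toy illustration of that behavior (stand-in types; this is not the real PL_STACK_UNWIND macro, and pl_unwind_xdata is an invented name): the newer locks code allocates xdata on the unwind path when the lower xlator passed NULL, so the client suddenly sees a valid address.

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy stand-in for GlusterFS's dict_t / dict_new(). */
    typedef struct { int refcount; } dict_t;
    static dict_t *dict_new(void) { return calloc(1, sizeof(dict_t)); }

    /* Assumed shape of the problematic pattern: the newer locks xlator
     * allocates xdata during unwind when the lower xlator passed NULL,
     * so the client sees a valid address where it used to see NULL. */
    static dict_t *pl_unwind_xdata(dict_t *xdata)
    {
        if (xdata == NULL)
            xdata = dict_new(); /* NULL from posix_flush becomes non-NULL here */
        /* ... lock-related entries would be added to xdata here ... */
        return xdata;
    }

    int main(void)
    {
        dict_t *from_posix = NULL; /* what posix_flush returned */
        dict_t *unwound = pl_unwind_xdata(from_posix);
        printf("xdata after unwind: %s\n", unwound ? "non-NULL" : "NULL");
        free(unwound);
        return 0;
    }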

Comment 2 Worker Ant 2018-04-25 11:25:05 UTC
REVIEW: https://review.gluster.org/19938 (protocol/server : unwind as per op version) posted (#1) for review on master by Ashish Pandey

Comment 3 Worker Ant 2018-05-03 07:16:48 UTC
COMMIT: https://review.gluster.org/19938 committed in master by "Xavi Hernandez" <xhernandez> with commit message: protocol/server: unwind as per op version

Change-Id: Id6717640ac14881b490e512c4682e45ffffa7f5b
fixes: bz#1570538
BUG: 1570538
Signed-off-by: Ashish Pandey <aspandey>
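
The idea of the patch, sketched as self-contained C (invented names and an assumed op-version threshold; the real code keys off the client's op-version inside protocol/server): unwind exactly what the lower xlators returned for old clients, and only attach the new payload for clients recent enough to expect it, so answers stay consistent across mixed-version bricks.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { int refcount; } dict_t; /* toy dict stand-in */
    static dict_t *dict_new(void) { return calloc(1, sizeof(dict_t)); }

    /* Assumed threshold; the real constants are GD_OP_VERSION_* macros. */
    #define OP_VERSION_WITH_FLUSH_XDATA 31200

    /* Sketch of the fix: unwind exactly what the lower xlators returned
     * for old clients, and only add the new payload for clients that
     * expect it. */
    static dict_t *server_unwind_xdata(dict_t *xdata, int client_op_version)
    {
        if (client_op_version < OP_VERSION_WITH_FLUSH_XDATA)
            return xdata;      /* old client: leave xdata as-is (possibly NULL) */
        if (xdata == NULL)
            xdata = dict_new();
        /* ... new payload would be added here ... */
        return xdata;
    }

    int main(void)
    {
        printf("old client: %s\n",
               server_unwind_xdata(NULL, 30800) ? "non-NULL" : "NULL");
        dict_t *d = server_unwind_xdata(NULL, 31200);
        printf("new client: %s\n", d ? "non-NULL" : "NULL");
        free(d);
        return 0;
    }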

Comment 4 Worker Ant 2018-05-18 06:36:31 UTC
REVIEW: https://review.gluster.org/20031 (feature/locks: Unwind response based on client version) posted (#1) for review on master by Ashish Pandey

Comment 5 Worker Ant 2018-05-28 02:45:36 UTC
COMMIT: https://review.gluster.org/20031 committed in master by "Amar Tumballi" <amarts> with commit message: feature/locks: Unwind response based on client version

Change-Id: I6fc7755cca0d6f61cb775363618036228925842c
fixes: bz#1570538
Signed-off-by: Ashish Pandey <aspandey>
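
The locks-side counterpart, sketched the same way (invented names and threshold; a simplified stand-in for what a pl_flush_cbk-style callback would do): decide per client whether to create xdata at all, rather than unconditionally allocating it during unwind. With both patches in place, every brick answers an old client with xdata = NULL, and the combine step from comment 1 sees six matching answers again.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct { int refcount; } dict_t; /* toy dict stand-in */
    static dict_t *dict_new(void) { return calloc(1, sizeof(dict_t)); }

    struct client { int op_version; }; /* assumed per-connection client info */

    /* Sketch of the locks-side fix: only allocate and fill xdata for
     * clients whose version understands the extra response data. */
    static dict_t *pl_flush_unwind_xdata(const struct client *cl, dict_t *xdata)
    {
        if (cl->op_version < 31200) /* assumed threshold */
            return xdata;           /* old client: xdata stays NULL */
        if (xdata == NULL)
            xdata = dict_new();
        /* ... lock counts etc. would be added here ... */
        return xdata;
    }

    int main(void)
    {
        struct client old_cl = { 30800 }, new_cl = { 31200 };
        printf("old client: %s\n",
               pl_flush_unwind_xdata(&old_cl, NULL) ? "non-NULL" : "NULL");
        dict_t *d = pl_flush_unwind_xdata(&new_cl, NULL);
        printf("new client: %s\n", d ? "non-NULL" : "NULL");
        free(d);
        return 0;
    }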

Comment 6 Shyamsundar 2018-06-20 18:05:09 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem persists with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/

Comment 7 Shyamsundar 2018-10-23 15:07:33 UTC
This bug is being closed because a release that should address the reported issue has been made available. If the problem persists with glusterfs-5.0, please open a new bug report.

glusterfs-5.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000115.html
[2] https://www.gluster.org/pipermail/gluster-users/