Bug 1347251 - fix the issue of Rolling upgrade or non-disruptive upgrade of disperse or erasure code volume to work [NEEDINFO]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: disperse
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Ashish Pandey
QA Contact: nchilaka
URL:
Whiteboard:
Depends On:
Blocks: 1347686 1351522 1351530 1360152 1360174
 
Reported: 2016-06-16 11:44 UTC by nchilaka
Modified: 2019-04-03 09:28 UTC (History)
7 users

Fixed In Version: glusterfs-3.8.4-1
Doc Type: Bug Fix
Doc Text:
Online rolling upgrades were not possible from Red Hat Gluster Storage 3.1.x to 3.1.y (where y is more recent than x) because of client limitations. Red Hat Gluster Storage 3.2 enables online rolling upgrades from 3.2.x to 3.2.y (where y is more recent than x).
Clone Of:
: 1347686 1422539 (view as bug list)
Environment:
Last Closed: 2017-03-23 05:36:44 UTC
rhinduja: needinfo? (asrivast)
lbailey: needinfo? (aspandey)




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:0486 normal SHIPPED_LIVE Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update 2017-03-23 09:18:45 UTC

Description nchilaka 2016-06-16 11:44:31 UTC
Description of problem:
========================
When we do a rolling upgrade of an EC volume from 3.1.2 to 3.1.3, I hit an IO error after updating one node successfully and moving on to the second node.


Version-Release number of selected component (if applicable):
==================================
before upgrade:
[root@node1-dhcp35-213 rhn]# cat /etc/redhat-*
Red Hat Enterprise Linux Server release 6.7 (Santiago)
Red Hat Gluster Storage Server 3.1 Update 2
[root@node1-dhcp35-213 rhn]# rpm -qa|grep gluster
vdsm-gluster-4.16.30-1.3.el6rhs.noarch
glusterfs-server-3.7.5-19.el6rhs.x86_64
glusterfs-api-3.7.5-19.el6rhs.x86_64
gluster-nagios-common-0.2.3-1.el6rhs.noarch
glusterfs-3.7.5-19.el6rhs.x86_64
glusterfs-client-xlators-3.7.5-19.el6rhs.x86_64
glusterfs-geo-replication-3.7.5-19.el6rhs.x86_64
glusterfs-cli-3.7.5-19.el6rhs.x86_64
glusterfs-rdma-3.7.5-19.el6rhs.x86_64
python-gluster-3.7.5-19.el6rhs.noarch
gluster-nagios-addons-0.2.5-1.el6rhs.x86_64
glusterfs-libs-3.7.5-19.el6rhs.x86_64
glusterfs-fuse-3.7.5-19.el6rhs.x86_64


after upgrade:
[root@node3-dhcp35-62 rhn]# cat /etc/redhat-*
Red Hat Enterprise Linux Server release 6.8 (Santiago)
Red Hat Gluster Storage Server 3.1 Update 3
[root@node3-dhcp35-62 rhn]# rpm -qa|grep gluster
glusterfs-rdma-3.7.9-10.el6rhs.x86_64
glusterfs-libs-3.7.9-10.el6rhs.x86_64
python-gluster-3.7.9-10.el6rhs.noarch
glusterfs-api-3.7.9-10.el6rhs.x86_64
glusterfs-client-xlators-3.7.9-10.el6rhs.x86_64
gluster-nagios-common-0.2.4-1.el6rhs.noarch
vdsm-gluster-4.16.30-1.5.el6rhs.noarch
glusterfs-3.7.9-10.el6rhs.x86_64
glusterfs-fuse-3.7.9-10.el6rhs.x86_64
glusterfs-server-3.7.9-10.el6rhs.x86_64
gluster-nagios-addons-0.2.7-1.el6rhs.x86_64
glusterfs-geo-replication-3.7.9-10.el6rhs.x86_64
glusterfs-cli-3.7.9-10.el6rhs.x86_64
[root@node3-dhcp35-62 rhn]# 


How reproducible:
===================
always


Steps to Reproduce:
=================
1. Have a 3-node cluster with 4 bricks each, and make sure all configuration needed for an update is in place (channel registration, storage subscription, etc.).
2. Create a dist-ec volume such that each node hosts at most 2 bricks of one dht-subvol.
3. Start the volume.
4. Enable quotas.
5. Mount the volume on a client using FUSE.
6. Trigger IO by downloading a kernel tarball and start untarring it.
7. On one node, bring down all gluster processes, including bricks, daemons, and glusterd, using pkill glusterfs, pkill glusterfsd, and service glusterd stop.
8. Issue a yum update to update to the latest packages, including gluster.
===> IO should continue without any issue.
9. Once the update is successful, start glusterd ==> this too will work, and IO will still be happening.
10. Now that node3 is updated successfully, move on to the next node, say node2.
11. Kill glusterfs, glusterfsd, and glusterd on node2.
====> You will now hit an IO error, or IO will stop abruptly.
To recheck, just create a dir and try to copy the kernel tarball into it; it fails as below:
[root@nchilaka-rhel6-fuseclient1-43-172 kern]# cp linux-4.6.2.tar.xz dir.6
cp: reading `linux-4.6.2.tar.xz': Input/output error

Comment 2 nchilaka 2016-06-20 11:36:06 UTC
Note for QE and Dev: once this bug is fixed, make sure the documentation is updated accordingly for that release. Refer to bug 1347252 - [DOC]: Have a note saying non-disruptive or rolling upgrade is not supported for a disperse volume.

Comment 4 Atin Mukherjee 2016-09-17 13:28:20 UTC
Upstream mainline : http://review.gluster.org/14761
Upstream 3.8 : http://review.gluster.org/15013

And the fix is available in rhgs-3.2.0 as part of rebase to GlusterFS 3.8.4.

Comment 7 nchilaka 2016-11-09 10:45:23 UTC
Still seeing EIO:
Had a 4+2 volume on a 3-node setup, i.e., 2 EC bricks on each node.

Pumping IO (Linux kernel untar) from one client.
Upgraded node#1 and healing completed.
Then brought down node#2.
Seeing EIO as soon as node#2 is down:
tar: linux-4.8.6/drivers/nfc/nfcmrvl/spi.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nfcmrvl/uart.c
tar: linux-4.8.6/drivers/nfc/nfcmrvl/uart.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nfcmrvl/usb.c
tar: linux-4.8.6/drivers/nfc/nfcmrvl/usb.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nfcsim.c
tar: linux-4.8.6/drivers/nfc/nfcsim.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nfcwilink.c
tar: linux-4.8.6/drivers/nfc/nfcwilink.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/
tar: linux-4.8.6/drivers/nfc/nxp-nci: Cannot mkdir: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/Kconfig
tar: linux-4.8.6/drivers/nfc/nxp-nci/Kconfig: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/Makefile
tar: linux-4.8.6/drivers/nfc/nxp-nci/Makefile: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/core.c
tar: linux-4.8.6/drivers/nfc/nxp-nci/core.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/firmware.c
tar: linux-4.8.6/drivers/nfc/nxp-nci/firmware.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/i2c.c
tar: linux-4.8.6/drivers/nfc/nxp-nci/i2c.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/nxp-nci/nxp-nci.h
tar: linux-4.8.6/drivers/nfc/nxp-nci/nxp-nci.h: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/
tar: linux-4.8.6/drivers/nfc/pn533: Cannot mkdir: Input/output error
linux-4.8.6/drivers/nfc/pn533/Kconfig
tar: linux-4.8.6/drivers/nfc/pn533/Kconfig: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/Makefile
tar: linux-4.8.6/drivers/nfc/pn533/Makefile: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/i2c.c
tar: linux-4.8.6/drivers/nfc/pn533/i2c.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/pn533.c
tar: linux-4.8.6/drivers/nfc/pn533/pn533.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/pn533.h
tar: linux-4.8.6/drivers/nfc/pn533/pn533.h: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn533/usb.c
tar: linux-4.8.6/drivers/nfc/pn533/usb.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn544/
tar: linux-4.8.6/drivers/nfc/pn544: Cannot mkdir: Input/output error
linux-4.8.6/drivers/nfc/pn544/Kconfig
tar: linux-4.8.6/drivers/nfc/pn544/Kconfig: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn544/Makefile
tar: linux-4.8.6/drivers/nfc/pn544/Makefile: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn544/i2c.c
tar: linux-4.8.6/drivers/nfc/pn544/i2c.c: Cannot open: Input/output error
linux-4.8.6/drivers/nfc/pn544/mei.c




Another note:
On a 6-node setup, I am seeing spurious heal info entries (the only mistake or configuration change on my side was that the clients were already upgraded).

I tried this with a 4+2 volume on a 6-node setup.
I fuse-mounted the volume on two clients and triggered a Linux kernel untar plus a dd of 10000 files on each client.
I then brought down node#1 and node#2 and upgraded them.

The upgrade went smoothly.
But heal info never completes due to spurious entries; that is, files being written at that time are shown as heal pending.

This leaves the admin confused, as heal info never shows as complete until IO is stopped.

Hence the admin can never proceed with completing the upgrade.
Discussed with Pranith and hence moving to failed_qa.
(At best, this is still a blocker for verifying this bug.)


[root@dhcp35-239 ~]# rpm -qa|grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-server-3.7.9-12.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-12.el7rhgs.x86_64
python-gluster-3.7.9-12.el7rhgs.noarch
glusterfs-libs-3.7.9-12.el7rhgs.x86_64
glusterfs-api-3.7.9-12.el7rhgs.x86_64
glusterfs-cli-3.7.9-12.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-12.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.el7rhgs.noarch
glusterfs-rdma-3.7.9-12.el7rhgs.x86_64
glusterfs-3.7.9-12.el7rhgs.x86_64
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64
glusterfs-fuse-3.7.9-12.el7rhgs.x86_64

Comment 8 Ashish Pandey 2016-11-14 11:26:06 UTC
Main issue of this BZ has been fixed on client side - 
------------------------------------
This issue arises when we do a rolling update from 3.7.5 to 3.7.9. For a 4+2 volume running 3.7.5, if we update 2 nodes and, after heal completes, kill 2 older nodes, this problem can be seen. After the update and the killing of bricks, 2 nodes will return the inodelk count key in the dict, while the other 2 nodes will not have it. The same is true for get-link-count. During the dictionary match (ec_dict_compare), this leads to a mismatch of answers, and the file operation on the mount point fails with an IO error.

To solve this, do not match the inode, entry, and link count keys while comparing two dictionaries. However, while combining the data in ec_dict_combine, go through all the dictionaries and select the maximum value received across the different dicts for each of these keys.

Because of this, the client has to be upgraded first, to make sure that we do not compare, on the client side, the inodelk counts received from the servers, which differ.

This is what was mentioned in the previous comment.

-----------------------------------
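The compare/combine behaviour described above can be sketched in Python. This is a simplified model of the idea, not the actual C implementation in ec_dict_compare/ec_dict_combine, and the key names are illustrative:

```python
# Model of the fix: when comparing answers from different bricks,
# ignore the per-brick count keys (which a mixed-version cluster may
# or may not return); when combining, keep the maximum value seen.

COUNT_KEYS = {"inodelk-count", "entrylk-count", "link-count"}  # illustrative names

def answers_match(d1, d2):
    """Compare two brick answers, ignoring the count keys."""
    strip = lambda d: {k: v for k, v in d.items() if k not in COUNT_KEYS}
    return strip(d1) == strip(d2)

def combine(dicts):
    """Merge brick answers, taking the maximum for each count key."""
    result = dict(dicts[0])
    for d in dicts[1:]:
        for k in COUNT_KEYS:
            if k in d:
                result[k] = max(result.get(k, 0), d[k])
    return result

# An old brick omits the count key, a new brick returns it; with the
# fix the answers still match, and the combined dict keeps the maximum.
old_brick = {"size": 4096}
new_brick = {"size": 4096, "inodelk-count": 2}
assert answers_match(old_brick, new_brick)
assert combine([old_brick, new_brick])["inodelk-count"] == 2
```

Without the fix, the whole-dict comparison fails as soon as one brick returns a count key that another omits, which is exactly the mixed-version situation during a rolling upgrade.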

However, this fix brings a new issue.

In patch http://review.gluster.org/#/c/13733/, for every update fop we set the dirty flag: we create index entries, then perform the fop, then unset the dirty flag.
So while a fop is in progress and we use the heal info command, it also shows the files on which IO is going on.
Setting and clearing the dirty flag is triggered from the client side.

To solve this issue, we sent patch http://review.gluster.org/#/c/15543/.
In this patch we take a LOCK on a file for which an index entry is created. We check the version and size, and only when these two differ do we list the file in heal info.
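The heal-info decision from that patch can be sketched as follows. This is a simplification (the helper name is made up; the real patch takes the lock and reads per-brick xattrs):

```python
def needs_heal(versions, sizes):
    """List a file in heal info only when bricks disagree on version
    or size (checked under lock in the real patch); an index entry
    alone (dirty flag set mid-fop) is not sufficient."""
    return len(set(versions)) > 1 or len(set(sizes)) > 1

# A file with an index entry but matching version/size across bricks
# (IO merely in flight) is no longer reported as pending heal.
assert not needs_heal([7, 7, 7], [4096, 4096, 4096])
assert needs_heal([7, 7, 8], [4096, 4096, 4096])
```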


The problem is -
- We have upgraded the client, which means it will set the dirty flag and
  create index entries on the servers.
- Now we are upgrading nodes to the new version one by one. However, in this
  process, some old nodes will NOT have the second patch, which takes LOCKS
  on files and checks whether they really need heal or not.

So the nodes which have the old version will list those indices as heal NEEDED.

Comment 12 Ashish Pandey 2016-11-16 08:02:11 UTC
I am setting require_doc_text to ? because I think customers should know about this rolling upgrade issue for 3.1.x, although it is not fixed for 3.1.x.

Comment 15 nchilaka 2017-02-06 13:59:08 UTC
Also make sure to retest (as part of sanity):
1347257 - spurious heal info as pending heal entries never end on an EC volume while IOs are going on

Comment 19 nchilaka 2017-02-15 13:59:55 UTC
Cloned this bug to 1422539 - IO error seen with rolling or non-disruptive upgrade of a distribute-disperse (EC) volume from 3.1.2 to 3.1.3.

I closed BZ#1422539 as CANTFIX, since it is not possible to change a release that has already shipped.

Hence I am changing the title of this BZ to reflect the actual problem, i.e., rolling upgrade of an EC volume.

Comment 20 nchilaka 2017-02-15 14:01:47 UTC
Based on the above comments,
I am moving this to VERIFIED.

Comment 21 nchilaka 2017-02-15 14:16:06 UTC
Points to note with this fix:
=============================
1) Rolling or non-disruptive upgrade will work from a base release of 3.2.0 to any later release, i.e., 3.2.0 to anything beyond it, say 3.3.
2) Rolling or non-disruptive upgrade will NOT work from a base release earlier than 3.2 to any supported release. E.g., 3.1.3 to 3.2, 3.1.2 to 3.1.3, or 3.1.2 to 3.2 will not work; do a disruptive upgrade instead.
3) To test the fix, we tested between dev builds of 3.2.0, i.e., between 3.8.4-3 and 3.8.4-12; the rolling upgrade worked in this case.
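The support matrix above boils down to a simple rule that can be sketched as a tiny check. The helper is hypothetical (there is no such Gluster tool); it just encodes "rolling upgrade is supported only when the source version is already 3.2.0 or later":

```python
def rolling_upgrade_supported(src, dst):
    """Return True if a rolling (non-disruptive) upgrade from src to
    dst is supported per the matrix above: source must be >= 3.2 and
    destination must not be older than the source."""
    parse = lambda v: tuple(int(x) for x in v.split("."))
    return parse(src) >= (3, 2) and parse(dst) >= parse(src)

assert rolling_upgrade_supported("3.2.0", "3.3")       # point 1
assert not rolling_upgrade_supported("3.1.3", "3.2")   # point 2
assert not rolling_upgrade_supported("3.1.2", "3.1.3") # point 2
```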

Comment 24 errata-xmlrpc 2017-03-23 05:36:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html

