Bug 1684029
Summary: | upgrade from 3.12, 4.1 and 5 to 6 broken | |||
---|---|---|---|---|
Product: | [Community] GlusterFS | Reporter: | hari gowtham <hgowtham> | |
Component: | core | Assignee: | Sanju <srakonde> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
Severity: | urgent | Docs Contact: | ||
Priority: | high | |||
Version: | 6 | CC: | amgad.saleh, amukherj, bugs, pasik, srakonde | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | gluster-test-day | |||
Fixed In Version: | glusterfs-6.0 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1685120 (view as bug list) | Environment: | ||
Last Closed: | 2019-03-08 14:46:11 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1685120 | |||
Bug Blocks: | 1672818, 1732875 |
Description
hari gowtham
2019-02-28 09:54:59 UTC
The peers are running into the rejected state because there is a mismatch in the volfiles. The differences are:

1. Newer volfiles have "option transport.socket.ssl-enabled off", whereas older volfiles do not have this option.
2. The order of quick-read and open-behind has changed.

Commit 4e0fab4 introduced this issue. Previously we didn't have any default value for the option transport.socket.ssl-enabled, so this option was not captured in the volfile. With the above commit we are adding a default value, so it is now captured in the volfile.

Commit 4e0fab4 has a fix for https://bugzilla.redhat.com/show_bug.cgi?id=1651059. I feel this commit has less significance, so we can revert this change. If we do so, the first problem goes away. I am not sure why the order of quick-read and open-behind has changed.

Atin, do let me know your thoughts on the proposal of reverting commit 4e0fab4.

Thanks,
Sanju

Root cause: Commit 5a152a changed the mechanism for computing the checksum. Because of this change, in a heterogeneous cluster, glusterd on an upgraded node uses the new mechanism for computing the cksum, while non-upgraded nodes use the old mechanism. So the cksum on the upgraded node doesn't match the cksum on the non-upgraded nodes, which results in the peer rejection issue.

Thanks,
Sanju

REVIEW: https://review.gluster.org/22313 (core: make compute_cksum function op_version compatible) posted (#1) for review on release-6 by Sanju Rakonde

REVIEW: https://review.gluster.org/22319 (core: make compute_cksum function op_version compatible) posted (#1) for review on release-6 by Sanju Rakonde

REVIEW: https://review.gluster.org/22319 (core: make compute_cksum function op_version compatible) merged (#3) on release-6 by Shyamsundar Ranganathan

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.
glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/

Upgrade from 3.12.15 to 6.3-1 failed:

1) Have a cluster of 3 nodes on 3.12.15.
2) Upgraded the 1st node to 6.3-1; bricks on that volume went offline and could not be brought online until backed out to 3.12.15.

The following two lines are reported in /var/log/glusterfs/glusterd.log:

[2019-07-08 03:11:18.641072] E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.3/rpc-transport/socket.so: undefined symbol: xlator_api
The message "E [MSGID: 101097] [xlator.c:218:xlator_volopt_dynload] 0-xlator: dlsym(xlator_api) missing: /usr/lib64/glusterfs/6.3/rpc-transport/socket.so: undefined symbol: xlator_api" repeated 7 times between [2019-07-08 03:11:18.641072] and [2019-07-08 03:11:18.641729]

This is really blocking!

amgad: Isn't that a different/separate issue? If so, you should open a new bugzilla entry for it.

Amgad, I tried upgrading from 3.12.15 to 6.3 and I haven't observed any issue with bricks coming up. They are online. You will face https://bugzilla.redhat.com/show_bug.cgi?id=1728126 during an in-service upgrade. The fix for that bug is posted.

Thanks,
Sanju

Hi Sanju:

This is one issue. I opened https://bugzilla.redhat.com/show_bug.cgi?id=1727682 for all the issues, including glusterd not starting with the default port 24007.

Regards,
Amgad

Which release is this fix going into, and when will it be available? How about the heal issue with the online rollback, https://bugzilla.redhat.com/show_bug.cgi?id=1687051: does that fix address it? It should be in the same area!
You may expect to have it in 6.3; please follow the bug to know in which release the fix will be present. I have replied to you on the bugs you mentioned above, so please do get back with the information.

Thanks,
Sanju

Thanks Sanju:
> You may expect to have it in 6.3
Do you mean 6.3-2 or 6.4? 6.3-1 is already out; that's the one I experienced the issue with.
The bug fix states "release 6".
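
The root cause given earlier in this thread (commit 5a152a changed how the checksum is computed, so upgraded and non-upgraded nodes disagree) and the direction of the fix ("make compute_cksum function op_version compatible") can be illustrated with a minimal sketch. This is not glusterd code: the function names, the two checksum algorithms, and the op-version values below are assumptions chosen only to show why a scheme selected by the local binary causes a mismatch, while a scheme selected by the cluster-wide op-version does not.

```python
import zlib

# Hypothetical op-version from which the new checksum scheme is used.
# The actual threshold is not stated in this bug report.
NEW_CKSUM_OP_VERSION = 60000


def cksum_old(data: bytes) -> int:
    """Stand-in for the pre-upgrade scheme (arbitrary choice: CRC-32)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def cksum_new(data: bytes) -> int:
    """Stand-in for the post-upgrade scheme (arbitrary choice: Adler-32)."""
    return zlib.adler32(data) & 0xFFFFFFFF


def cksum_ungated(data: bytes, binary_is_upgraded: bool) -> int:
    """Scheme depends only on the local binary: mixed clusters disagree."""
    return cksum_new(data) if binary_is_upgraded else cksum_old(data)


def cksum_gated(data: bytes, binary_is_upgraded: bool,
                cluster_op_version: int) -> int:
    """Upgraded binary uses the new scheme only once the cluster allows it."""
    if binary_is_upgraded and cluster_op_version >= NEW_CKSUM_OP_VERSION:
        return cksum_new(data)
    return cksum_old(data)


if __name__ == "__main__":
    # Illustrative volume data and an illustrative pre-upgrade op-version.
    data = b"volume test-client-0\n    type protocol/client\nend-volume\n"
    mixed_cluster_op_version = 31202

    ungated_match = cksum_ungated(data, True) == cksum_ungated(data, False)
    gated_match = (
        cksum_gated(data, True, mixed_cluster_op_version)
        == cksum_gated(data, False, mixed_cluster_op_version)
    )
    print("ungated checksums match:", ungated_match)
    print("gated checksums match:", gated_match)
```

In this sketch the upgraded node keeps producing the old checksum for as long as the cluster op-version stays below the threshold, which is the situation during a rolling upgrade where some peers still run the older release.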