| Summary: | In-service upgrade of nfs-ganesha from 3.1 to 3.1.2 failed and triggered a shutdown of one of the nodes. | ||
|---|---|---|---|
| Product: | Red Hat Gluster Storage | Reporter: | Shashank Raj <sraj> |
| Component: | nfs-ganesha | Assignee: | Kaleb KEITHLEY <kkeithle> |
| Status: | CLOSED NOTABUG | QA Contact: | storage-qa-internal <storage-qa-internal> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | rhgs-3.1 | CC: | akhakhar, jthottan, kkeithle, ndevos, nlevinki, sashinde, skoduri |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-06-20 12:19:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
In-service upgrade is not supported.
Description of problem:
In-service upgrade of nfs-ganesha from 3.1 to 3.1.2 failed and triggered a shutdown of one of the nodes.

Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19

How reproducible:
Once

Steps to Reproduce:
1. Create a 4-node cluster and install RHGS 3.1 on all the nodes.
2. Configure all the required settings and set up ganesha on the cluster.
3. Mount the volume through the VIP of node1 and start some dd from the client.
4. Upgrade node1 by following the procedure below (a consolidated script sketch is given at the end of this report):

   service nfs-ganesha stop      <- failover happened from node1 to another node
   service glusterd stop
   pkill glusterfs
   pkill glusterfsd
   pcs cluster standby node1
   pcs cluster stop node1
   (enable puddles for 3.1.2 latest)
   yum update nfs-ganesha
   pcs cluster start node1
   pcs cluster unstandby node1
   service glusterd start
   service nfs-ganesha start

5. Node1 got upgraded properly, without any issues.
6. Followed the same steps for the upgrade of node2 (volume mounted on the client through the VIP of node2):

   service nfs-ganesha stop      <- failover happened from node2 to node1
   service glusterd stop
   pkill glusterfs
   pkill glusterfsd
   pcs cluster standby node2
   pcs cluster stop node2        <- IO was going on
   yum update nfs-ganesha        <- all the packages got updated
   pcs cluster start node2
   pcs cluster unstandby node2

After this, pcs status gives the below output:

Full list of resources:
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ nfs1 nfs2 nfs3 ]
     Stopped: [ nfs4 ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ nfs1 nfs2 nfs3 ]
     Stopped: [ nfs4 ]
 nfs1-cluster_ip-1   (ocf::heartbeat:IPaddr):   Started nfs3
 nfs1-trigger_ip-1   (ocf::heartbeat:Dummy):    Started nfs3
 nfs2-cluster_ip-1   (ocf::heartbeat:IPaddr):   Started nfs3
 nfs2-trigger_ip-1   (ocf::heartbeat:Dummy):    Started nfs3
 nfs3-cluster_ip-1   (ocf::heartbeat:IPaddr):   Started nfs3
 nfs3-trigger_ip-1   (ocf::heartbeat:Dummy):    Started nfs3
 nfs4-cluster_ip-1   (ocf::heartbeat:IPaddr):   Started nfs3
 nfs4-trigger_ip-1   (ocf::heartbeat:Dummy):    Started nfs3
 nfs2-dead_ip-1      (ocf::heartbeat:Dummy):    Started nfs2
 nfs1-dead_ip-1      (ocf::heartbeat:Dummy):    Started nfs1

Then node4 got shut down, IO stopped with "Remote I/O error", and the below messages were observed in /var/log/messages:

Feb 18 08:10:32 nfs2 stonith-ng[14598]: warning: get_xpath_object: No match for //@st_delegate in /st-reply
Feb 18 08:10:32 nfs2 stonith-ng[14598]: notice: remote_op_done: Operation reboot of nfs4 by nfs1 for stonith_admin.cman.12399: No such device
Feb 18 08:10:32 nfs2 crmd[14602]: notice: tengine_stonith_notify: Peer nfs4 was not terminated (reboot) by nfs1 for nfs1: No such device (ref=d6228995-f151-4a89-8b62-69a98ac5d76a) by client stonith_admin.cman.12399
Feb 18 08:10:34 nfs2 root: warning: pcs resource create nfs2-dead_ip-1 ocf:heartbeat:Dummy failed
Feb 18 08:10:35 nfs2 stonith-ng[14598]: warning: get_xpath_object: No match for //@st_delegate in /st-reply
Feb 18 08:10:35 nfs2 stonith-ng[14598]: notice: remote_op_done: Operation reboot of nfs4 by nfs1 for stonith_admin.cman.12432: No such device
Feb 18 08:10:35 nfs2 crmd[14602]: notice: tengine_stonith_notify: Peer nfs4 was not terminated (reboot) by nfs1 for nfs1: No such device (ref=a2c22e17-bf30-4bf4-915b-9129959f1d7c) by client stonith_admin.cman.12432

Actual results:
The in-service upgrade failed: node4 was shut down and client IO stopped with "Remote I/O error".

Expected results:
Upgrade should be successful.

Additional info:
sos reports and ganesha logs are placed under http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1309984
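For anyone re-running this scenario, the per-node sequence from steps 4 and 6 can be collected into a single script. This is a minimal sketch, not part of the original report: it assumes it runs as root on the node being upgraded, that the 3.1.2 repositories (puddles) have already been enabled, and the NODE argument is a hypothetical parameter naming the pcs node (e.g. nfs1).

#!/bin/bash
# Minimal sketch of the per-node in-service upgrade sequence from the
# reproduction steps above. Assumptions: runs as root on the node being
# upgraded; 3.1.2 repositories are already enabled; NODE is the pcs node name.
set -e
NODE="$1"    # hypothetical parameter, e.g. nfs1

# Stop NFS and gluster services; the VIP should fail over to a surviving node.
service nfs-ganesha stop
service glusterd stop
pkill glusterfs  || true   # pkill exits non-zero if nothing matched
pkill glusterfsd || true

# Take the node out of the Pacemaker cluster before touching packages.
pcs cluster standby "$NODE"
pcs cluster stop "$NODE"

# Update nfs-ganesha from the already enabled 3.1.2 repositories
# (-y added here for unattended runs; the report used plain "yum update").
yum -y update nfs-ganesha

# Rejoin the cluster and bring the services back.
pcs cluster start "$NODE"
pcs cluster unstandby "$NODE"
service glusterd start
service nfs-ganesha start

Note that in the report this same sequence completed cleanly on node1, but repeating it on node2 while IO was running left nfs1-dead_ip-1 and nfs2-dead_ip-1 resources behind and node4 ended up shut down.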