Bug 1128421
Summary: | gluster nfs server process was crashed multiple time while mounting volume and starting volume using force option | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rachana Patel <racpatel> | |
Component: | gluster-nfs | Assignee: | Brad Hubbard <bhubbard> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | ||
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 2.1 | CC: | bhubbard, mzywusko, nchilaka, ndevos, rcyriac, saujain, vagarwal | |
Target Milestone: | --- | Keywords: | Reopened | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1196520 | Environment: | |
Last Closed: | 2015-08-10 07:43:27 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1196520 | |||
Bug Blocks: |
Description
Rachana Patel
2014-08-10 11:34:05 UTC
1) Root cause: NLM was not able to register with portmapper, which prevented NFS from starting.

Log snippet:
[2014-08-09 10:44:48.216245] E [rpcsvc.c:1260:rpcsvc_program_register_portmap] 0-rpc-service: Could not register with portmap
[2014-08-09 10:44:48.216278] E [nfs.c:341:nfs_init_versions] 0-nfs: Program NLM4 registration failed
[2014-08-09 10:44:48.216291] E [nfs.c:1327:init] 0-nfs: Failed to initialize protocols

2) I was not able to reproduce the issue after several attempts, so I am closing the bug for now. Please feel free to reopen it if you see it again, but please check why NLM (or ACL, MOUNT, or NFS) fails to register with portmapper, without which NFS can't work.

Thanks,
Santosh

Reopening this as I can reproduce it with the reproducer in bz1196520. See comments 6, 7 and 8.

[2015-02-28 09:35:51.873805] I [socket.c:3537:socket_init] 0-socket.NLM: using system polling thread
[2015-02-28 09:35:51.900285] E [nfs.c:341:nfs_init_versions] 0-nfs: Program NLM4 registration failed
[2015-02-28 09:35:51.900347] E [nfs.c:1327:init] 0-nfs: Failed to initialize protocols
[2015-02-28 09:35:51.900367] E [xlator.c:423:xlator_init] 0-nfs-server: Initialization of volume 'nfs-server' failed, review your volfile again
[2015-02-28 09:35:51.900386] E [graph.c:292:glusterfs_graph_init] 0-nfs-server: initializing translator failed
[2015-02-28 09:35:51.900404] E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-02-28 09:35:51
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.0.72rhs

I did the following as part of QE validation for the fix:
1) Had a 3-node cluster: A, B, C
2) Had a client
3) Created a volume with 2 bricks, one each on nodes A and B (distribute only)
4) Started the volume
5) Killed one NFS process on one of the nodes (A) and one brick process (B)
6) Did a force restart and mounted the volume on node A and node C using NFS and on node C using FUSE

The mount was successful without any crash.

Also did the following:
1-4) Same as above
5) Mounted the volume on the NFS client using node A's server IP, and a FUSE mount using C
6) Killed the brick and NFS processes of node A
7) The NFS mount point was not responding because the NFS process on A was down; FUSE was responding
8) NFS mount using node C worked fine
9) Restarted the volume using force
10) The mount point using node A was stuck for about 3-4 minutes, but then started to respond

Did the following on a dist-rep volume:
1-4) As above
5) Mounted using NFS of node A
6) Kept appending to a file
7) Killed the NFS processes of A and B and the brick of A
8) The writes stopped (append stopped)
9) Mounted using FUSE with C's IP
10) Saw the contents of the file; the append from the NFS mount of A had stopped, as expected
11) Did a force start
12) Mounted using NFS from node C on the client and saw that the append resumed from where it had stopped, and both the node A and node C mounts were responding

Hence moving the bug to verified.

Server version:
[root@nchilaka-nfsv3-6 yum.repos.d]# rpm -qa|grep gluster
gluster-nagios-common-0.2.0-1.el6rhs.noarch
glusterfs-3.7.1-9.el6rhs.x86_64
glusterfs-cli-3.7.1-9.el6rhs.x86_64
gluster-nagios-addons-0.2.4-4.el6rhs.x86_64
glusterfs-libs-3.7.1-9.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-9.el6rhs.x86_64
glusterfs-api-3.7.1-9.el6rhs.x86_64
glusterfs-server-3.7.1-9.el6rhs.x86_64
glusterfs-rdma-3.7.1-9.el6rhs.x86_64
vdsm-gluster-4.16.20-1.2.el6rhs.noarch
python-gluster-3.7.1-8.el6rhs.x86_64
glusterfs-fuse-3.7.1-9.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-9.el6rhs.x86_64
[root@nchilaka-nfsv3-6 yum.repos.d]# cat /etc/redhat-*
Red Hat Enterprise Linux Server release 6.7 (Santiago)
Red Hat Gluster Storage Server 3.1
[root@nchilaka-nfsv3-6 yum.repos.d]# gluster --version
glusterfs 3.7.1 built on Jul 12 2015 22:27:42
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.
[root@nchilaka-nfsv3-6 yum.repos.d]#

NFS client version:
[root@nchilaka-nfs-client-6 distrep]# cat /etc/redhat-*
cat: /etc/redhat-access-insights: Is a directory
cat: /etc/redhat-lsb: Is a directory
Red Hat Enterprise Linux Server release 6.7 (Santiago)
[root@nchilaka-nfs-client-6 distrep]# rpm -qa|grep gluster

FUSE mount client version:
[root@nchilaka-fuse-client-6 distrep]# cat /etc/redhat-
cat: /etc/redhat-: No such file or directory
[root@nchilaka-fuse-client-6 distrep]# cat /etc/redhat-*
Red Hat Enterprise Linux Server release 6.7 (Santiago)
Red Hat Gluster Storage Server 3.1
[root@nchilaka-fuse-client-6 distrep]# rpm -qa|grep gluster
gluster-nagios-common-0.2.0-1.el6rhs.noarch
gluster-nagios-addons-0.2.4-4.el6rhs.x86_64
glusterfs-3.7.1-11.el6rhs.x86_64
glusterfs-fuse-3.7.1-11.el6rhs.x86_64
glusterfs-devel-3.7.1-11.el6rhs.x86_64
glusterfs-geo-replication-3.7.1-11.el6rhs.x86_64
python-gluster-3.7.1-9.el6rhs.x86_64
glusterfs-libs-3.7.1-11.el6rhs.x86_64
glusterfs-client-xlators-3.7.1-11.el6rhs.x86_64
glusterfs-cli-3.7.1-11.el6rhs.x86_64
glusterfs-api-devel-3.7.1-11.el6rhs.x86_64
glusterfs-rdma-3.7.1-11.el6rhs.x86_64
vdsm-gluster-4.16.20-1.2.el6rhs.noarch
glusterfs-api-3.7.1-11.el6rhs.x86_64
glusterfs-server-3.7.1-11.el6rhs.x86_64
[root@nchilaka-fuse-client-6 distrep]# ifconfig
eth0 Link encap:Ethernet HWaddr 52:54:00:11:06:C8
inet addr:10.70.43.157 Bcast:10.70.43.255 Mask:255.255.252.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:151875 errors:0 dropped:0 overruns:0 frame:0
TX packets:3857 errors:0 dropped:0 overruns:0 carrier:0

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
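
The log lines quoted in this report appear to share one layout: a bracketed timestamp, a one-letter level, a bracketed file:line:function triple, a domain, and a message. As a minimal sketch for anyone triaging a similar failure, the Python below splits such lines into fields and keeps only the error-level ones, which makes the chain from the portmap registration failure to "Failed to initialize protocols" easier to follow. The pattern is inferred only from the lines quoted in this bug and is not an official format specification; the names LOG_LINE, LogEntry, parse_log_line and error_entries are hypothetical helpers, not part of GlusterFS.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Layout inferred from the log lines quoted in this bug (an assumption,
# not an official format specification):
#   [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+"
    r"(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<function>[^\]]+)\]\s+"
    r"(?P<domain>\S+):\s+"
    r"(?P<message>.*)"
)


class LogEntry(NamedTuple):
    """One parsed log line (hypothetical helper type)."""
    timestamp: str
    level: str
    file: str
    line: int
    function: str
    domain: str
    message: str


def parse_log_line(text: str) -> Optional[LogEntry]:
    """Return the parsed fields, or None if the line does not fit the layout."""
    match = LOG_LINE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return LogEntry(**fields)


def error_entries(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only the lines that parse and carry the error level 'E'."""
    parsed = (parse_log_line(text) for text in lines)
    return [entry for entry in parsed if entry is not None and entry.level == "E"]


# Lines copied from the log excerpts in this bug report.
SAMPLE = [
    "[2014-08-09 10:44:48.216245] E [rpcsvc.c:1260:rpcsvc_program_register_portmap] "
    "0-rpc-service: Could not register with portmap",
    "[2014-08-09 10:44:48.216278] E [nfs.c:341:nfs_init_versions] "
    "0-nfs: Program NLM4 registration failed",
    "[2014-08-09 10:44:48.216291] E [nfs.c:1327:init] "
    "0-nfs: Failed to initialize protocols",
    "[2015-02-28 09:35:51.873805] I [socket.c:3537:socket_init] "
    "0-socket.NLM: using system polling thread",
    "pending frames:",
]

if __name__ == "__main__":
    for entry in error_entries(SAMPLE):
        print(f"{entry.timestamp} {entry.function}: {entry.message}")
```

Lines that do not fit the layout, such as the crash dump fields ("pending frames:", "signal received: 11"), are skipped rather than guessed at, so the crash section still has to be read by hand.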