Description of problem:
------------------------
Running `./autogen.sh' from a git clone of glusterfs on a gluster-nfs mount hangs for hours. The same on a FUSE mount takes only a few minutes to complete.

Version-Release number of selected component (if applicable):
--------------------------------------------------------------
glusterfs-3.7.1-6.el6rhs.x86_64

How reproducible:
------------------
100%

Steps to Reproduce:
--------------------
1. On an NFS mount of a distribute-replicate (1x2) volume, run ./autogen.sh from a git clone of the glusterfs source.

Actual results:
----------------
The command hangs on the mount point.

Expected results:
-----------------
The command is not expected to hang and should go through.

Additional info:
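A minimal reproduction sketch over NFSv3 (the server name `server1' and the upstream clone URL are assumptions, not taken from the report; the volume name `rep' is from the volume info below):

# gluster-nfs serves NFSv3 only, so mount with vers=3
mount -t nfs -o vers=3 server1:/rep /mnt
cd /mnt
git clone https://github.com/gluster/glusterfs.git
cd glusterfs
./autogen.sh    # hangs here on the NFS mount; completes in minutes on a FUSE mount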
Volume configuration -

# gluster v info rep

Volume Name: rep
Type: Replicate
Volume ID: 364ec34f-c989-47b7-b2e4-a07185e84b79
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.70.37.168:/rhs/brick6/b1
Brick2: 10.70.37.199:/rhs/brick6/b1
Options Reconfigured:
cluster.consistent-metadata: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.uss: on
performance.readdir-ahead: on
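A comparable volume could be recreated with something like the following sketch (assuming the brick directories already exist; quota is normally enabled through the dedicated quota command rather than a plain volume-set):

gluster volume create rep replica 2 10.70.37.168:/rhs/brick6/b1 10.70.37.199:/rhs/brick6/b1
gluster volume start rep
gluster volume quota rep enable                           # sets features.quota and features.inode-quota
gluster volume set rep features.quota-deem-statfs on
gluster volume set rep features.uss on
gluster volume set rep cluster.consistent-metadata on
gluster volume set rep performance.readdir-ahead on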
Under investigation...
This is not easily reproducible; I have run the "git checkout" and "./autogen.sh" a few times now, but it continues to succeed for me. I tested both with the default volume options and the ones given in comment #3. How many times out of how many runs does this fail for you? Do you have the logs somewhere so that I can have a look?

A typical run on my test environment looks like this:

[root@vm016 ~]# time /tmp/clone-and-autogen.sh
+ mount -t nfs -o vers=3 vm017.example.com:/bz1238404 /mnt/
+ pushd /mnt/
/mnt ~
+ git clone /srv/src/glusterfs/
Cloning into 'glusterfs'...
done.
Checking out files: 100% (1877/1877), done.
+ pushd glusterfs/
/mnt/glusterfs /mnt ~
+ ./autogen.sh

... GlusterFS autogen ...

Running aclocal...
Running autoheader...
Running libtoolize...
Running autoconf...
Running automake...
configure.ac:249: installing './config.guess'
configure.ac:249: installing './config.sub'
configure.ac:16: installing './install-sh'
configure.ac:16: installing './missing'
api/examples/Makefile.am: installing './depcomp'
geo-replication/syncdaemon/Makefile.am:3: installing './py-compile'
parallel-tests: installing './test-driver'
Running autogen.sh in argp-standalone ...
configure.ac:10: installing './install-sh'
configure.ac:10: installing './missing'
Makefile.am: installing './depcomp'

Please proceed with configuring, compiling, and installing.
+ popd
/mnt ~
+ rm -rf glusterfs
+ popd
~
+ umount /mnt

real    4m29.031s
user    0m28.391s
sys     0m6.374s
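For reference, a script matching the trace above might look like this (reconstructed from the `+' trace lines; the shebang and the `set' flags are assumptions):

#!/bin/bash
set -e -x

# mount the volume over NFSv3, build glusterfs on it, then clean up
mount -t nfs -o vers=3 vm017.example.com:/bz1238404 /mnt/
pushd /mnt/
git clone /srv/src/glusterfs/
pushd glusterfs/
./autogen.sh
popd
rm -rf glusterfs
popd
umount /mnt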
I've taken a look at the nfs.log from the sosreports mentioned in comment #2. There are quite a few obvious messages in sosreport-dhcp37-168.1238404-20150702010409.tar.xz, and I wonder if you have missed those? I do not know if that is the NFS server you mounted the volume from; the other sosreport does not have them. It is also not clear which NFS client you used, and whether you have an sosreport from that one. It would be trivial to check whether a firewall and/or rpcbind is enabled and running there...

[2015-07-01 19:18:28.943961] E [MSGID: 112167] [nlm4.c:1013:nlm4_establish_callback] 0-nfs-NLM: Unable to get NLM port of the client. Is the firewall running on client? OR Are RPC services running (rpcinfo -p)?
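To rule that out on the NFS client, a quick check along these lines should do (a sketch assuming a RHEL 6 style client; on RHEL 7 the service status would be queried with systemctl instead):

rpcinfo -p               # rpcbind must answer, and nlockmgr (NLM) must be registered
service rpcbind status   # confirm rpcbind is running
iptables -L -n           # look for rules that could block the NLM callback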
Shruti/Niels,

The local policy module was provided by Milos in comment #17. Could you test with it and confirm that it works?
(In reply to Prasanth from comment #18)
> Shruti/Niels,
>
> The local policy module was provided by Milos in comment #17. Could you
> test with it and confirm that it works?

Seems to work for me, thanks!
(In reply to Prasanth from comment #18)
> Shruti/Niels,
>
> The local policy module was provided by Milos in comment #17. Could you
> test with it and confirm that it works?

Worked for me too.
I have tested this on selinux-policy-3.13.1-32.el7 (which is the Fixed In Version of RHEL BZ #1240584). See https://bugzilla.redhat.com/show_bug.cgi?id=1240584#c3

Any reason why the Fixed In Version for this BZ is higher, selinux-policy-3.13.1-34.el7?
Moving back to MODIFIED; the BZ was moved to ON_QA by the errata tool. To be moved back to ON_QA after the 11 Aug batch update.
To rectify this problem, apply the following workaround on all the servers:

Step 1:

# cat bz1238404.te
policy_module(bz1238404, 1.0)

require {
        type glusterd_t;
}

corenet_tcp_connect_portmap_port(glusterd_t)

Step 2:

# make -f /usr/share/selinux/devel/Makefile
Compiling targeted bz1238404 module
/usr/bin/checkmodule: loading policy configuration from tmp/bz1238404.tmp
/usr/bin/checkmodule: policy configuration loaded
/usr/bin/checkmodule: writing binary representation (version 10) to tmp/bz1238404.mod
Creating targeted bz1238404.pp policy package
rm tmp/bz1238404.mod tmp/bz1238404.mod.fc

Step 3:

# semodule -i bz1238404.pp
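The corenet_tcp_connect_portmap_port() interface allows glusterd_t to make outgoing TCP connections to the portmapper port, which is what the NLM callback needs. To verify the module took effect, a sketch (the ausearch filter is an assumption and requires auditd to be running):

semodule -l | grep bz1238404                   # the module should be listed as loaded
ausearch -m avc -ts recent | grep glusterd_t   # no new portmap-related denials should appear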
Filling in the Doc Text as required.
Verified as fixed in selinux-policy-3.7.19-279.el6_7.1.noarch.