Bug 1238318 - NFS mount throws Remote I/O error
Summary: NFS mount throws Remote I/O error
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.7.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-07-01 15:12 UTC by Vladimir Mitiouchev
Modified: 2017-03-08 11:04 UTC
CC: 5 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-03-08 11:04:53 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
network communication (101.54 KB, application/octet-stream)
2015-07-01 15:12 UTC, Vladimir Mitiouchev

Description Vladimir Mitiouchev 2015-07-01 15:12:28 UTC
Created attachment 1045126
network communication

Description of problem:
The NFS mount returns Remote I/O errors when creating files. Disabling and then re-enabling the nfs.acl option makes it work.

Version-Release number of selected component (if applicable):
CentOS Linux release 7.1.1503 (Core) 
Kernel: Linux 4.0.5-1.el7.centos.x86_64 
GlusterFS: glusterfs 3.7.2 built on Jun 24 2015 11:51:59

How reproducible:
Always

Steps to Reproduce:
1. create and start a distributed volume
2. mount the gluster volume over NFSv3
3. try to touch a file (see the example commands below)
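
For reference, a rough command sequence for the steps above (the volume name, brick hosts, and brick paths are taken from the volfile and mount output below; the exact commands used on this system may have differed):

# gluster volume create gv0 172.31.25.38:/pool0/brick0 172.31.26.220:/pool0/brick0
# gluster volume start gv0
# mount -t nfs -o vers=3 127.0.0.1:/gv0 /nfs
# touch /nfs/zzzzz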

Actual results:
[root@ip-172-31-26-220 ~]# touch /nfs/zzzzz
touch: cannot touch ‘/nfs/zzzzz’: Remote I/O error
[root@ip-172-31-26-220 ~]# gluster volume set gv0 nfs.acl off
volume set: success
[root@ip-172-31-26-220 ~]# touch /nfs/zzzzz
[root@ip-172-31-26-220 ~]# gluster volume set gv0 nfs.acl on
volume set: success
[root@ip-172-31-26-220 ~]# touch /nfs/zzzzz2
[root@ip-172-31-26-220 ~]# 


Expected results:
[root@ip-172-31-26-220 ~]# touch /nfs/zzzzz
[root@ip-172-31-26-220 ~]# 

Additional info:
# mount -vvv -t nfs 127.0.0.1:/gv0 /nfs
mount.nfs: timeout set for Wed Jul  1 14:58:09 2015
mount.nfs: trying text-based options 'vers=4,addr=127.0.0.1,clientaddr=127.0.0.1'
mount.nfs: mount(2): Protocol not supported
mount.nfs: trying text-based options 'addr=127.0.0.1'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 127.0.0.1 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: portmap query retrying: RPC: Program not registered
mount.nfs: prog 100005, trying vers=3, prot=6
mount.nfs: trying 127.0.0.1 prog 100005 vers 3 prot TCP port 38465

mount options:
127.0.0.1:/gv0 on /nfs type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=127.0.0.1,mountvers=3,mountport=38465,mountproto=tcp,local_lock=none,addr=127.0.0.1)

nfs.log:
[2015-07-01 13:44:22.441442] W [socket.c:642:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (No data available)
[2015-07-01 13:44:23.797116] W [glusterfsd.c:1219:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-07-01 13:45:41.267495] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 3.7.2 (args: /usr/local/sbin/glusterfs -s localhost --volfile-id gluster/nfs -p /var/lib/glusterd/nfs/run/nfs.pid -l /var/log/glusterfs/nfs.log -S /var/run/gluster/e09d4d8e1a2e7e95fc758343486c0d55.socket)
[2015-07-01 13:45:41.310041] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2015-07-01 13:45:41.435880] I [rpcsvc.c:2213:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 16
[2015-07-01 13:45:41.442879] W [MSGID: 112153] [mount3.c:3910:mnt3svc_init] 0-nfs-mount: Exports auth has been disabled!
[2015-07-01 13:45:41.448427] W [MSGID: 112178] [nlm4.c:2534:nlm4svc_init] 0-nfs-NLM: unable to get pid of rpc.statd from /var/run/rpc.statd.pid 
[2015-07-01 13:45:41.498169] I [rpc-drc.c:694:rpcsvc_drc_init] 0-rpc-service: DRC is turned OFF
[2015-07-01 13:45:41.498190] I [MSGID: 112110] [nfs.c:1494:init] 0-nfs: NFS service started
[2015-07-01 13:45:41.501042] W [graph.c:357:_log_if_unknown_option] 0-nfs-server: option 'rpc-auth.auth-glusterfs' is not recognized
[2015-07-01 13:45:41.501103] W [graph.c:357:_log_if_unknown_option] 0-nfs-server: option 'rpc-auth-allow-insecure' is not recognized
[2015-07-01 13:45:41.501157] W [graph.c:357:_log_if_unknown_option] 0-nfs-server: option 'transport-type' is not recognized
[2015-07-01 13:45:41.501222] I [MSGID: 114020] [client.c:2118:notify] 0-gv0-client-0: parent translators are ready, attempting connect on transport
[2015-07-01 13:45:41.504008] I [MSGID: 114020] [client.c:2118:notify] 0-gv0-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
  1: volume gv0-client-0
  2:     type protocol/client
  3:     option ping-timeout 42
  4:     option remote-host 172.31.25.38
  5:     option remote-subvolume /pool0/brick0
  6:     option transport-type socket
  7:     option username 5441a511-4d11-4a5e-852b-210895b5ec75
  8:     option password 82e3aa21-63a4-47d1-8540-9ed797076f19
  9:     option send-gids true
 10: end-volume
 11:  
 12: volume gv0-client-1
 13:     type protocol/client
 14:     option ping-timeout 42
 15:     option remote-host 172.31.26.220
 16:     option remote-subvolume /pool0/brick0
 17:     option transport-type socket
 18:     option username 5441a511-4d11-4a5e-852b-210895b5ec75
 19:     option password 82e3aa21-63a4-47d1-8540-9ed797076f19
 20:     option send-gids true
 21: end-volume
 22:  
 23: volume gv0-dht
 24:     type cluster/distribute
 25:     subvolumes gv0-client-0 gv0-client-1
 26: end-volume
 27:  
 28: volume gv0-write-behind
 29:     type performance/write-behind
 30:     subvolumes gv0-dht
 31: end-volume
 32:  
 33: volume gv0
 34:     type debug/io-stats
 35:     option latency-measurement off
 36:     option count-fop-hits off
 37:     subvolumes gv0-write-behind
 38: end-volume
 39:  
 40: volume nfs-server
 41:     type nfs/server
 42:     option rpc-auth.auth-glusterfs on
 43:     option rpc-auth.auth-unix on
 44:     option rpc-auth.auth-null on
 45:     option rpc-auth.ports.insecure on
 46:     option rpc-auth-allow-insecure on
 47:     option transport-type socket
 48:     option transport.socket.listen-port 2049
 49:     option nfs.dynamic-volumes on
 50:     option nfs.nlm on
 51:     option nfs.drc off
 52:     option rpc-auth.addr.gv0.allow *
 53:     option nfs3.gv0.volume-id 45f15b5d-5b89-49d8-88bf-8049ad1609da
 54:     option nfs.acl on
 55:     subvolumes gv0
 56: end-volume
 57:  
+------------------------------------------------------------------------------+
[2015-07-01 13:45:41.506752] I [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2
[2015-07-01 13:45:41.512862] E [MSGID: 114058] [client-handshake.c:1525:client_query_portmap_cbk] 0-gv0-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-07-01 13:45:41.512907] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-gv0-client-1: disconnected from gv0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2015-07-01 13:45:41.513226] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-gv0-client-0: changing port to 49152 (from 0)
[2015-07-01 13:45:41.516482] I [MSGID: 114057] [client-handshake.c:1438:select_server_supported_programs] 0-gv0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-01 13:45:41.517000] I [MSGID: 114046] [client-handshake.c:1214:client_setvolume_cbk] 0-gv0-client-0: Connected to gv0-client-0, attached to remote volume '/pool0/brick0'.
[2015-07-01 13:45:41.517013] I [MSGID: 114047] [client-handshake.c:1225:client_setvolume_cbk] 0-gv0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-01 13:45:41.517847] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-0: Server lk version = 1
[2015-07-01 13:45:41.518444] I [dht-layout.c:697:dht_layout_normalize] 0-gv0-dht: Found anomalies in / (gfid = 00000000-0000-0000-0000-000000000001). Holes=1 overlaps=0
[2015-07-01 13:45:41.518471] W [MSGID: 109005] [dht-selfheal.c:1778:dht_selfheal_directory] 0-gv0-dht: Directory selfheal failed: 1 subvolumes down.Not fixing. path = /, gfid = 00000000-0000-0000-0000-000000000001
[2015-07-01 13:45:45.300071] I [rpc-clnt.c:1819:rpc_clnt_reconfig] 0-gv0-client-1: changing port to 49152 (from 0)
[2015-07-01 13:45:45.302695] I [MSGID: 114057] [client-handshake.c:1438:select_server_supported_programs] 0-gv0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-07-01 13:45:45.302910] I [MSGID: 114046] [client-handshake.c:1214:client_setvolume_cbk] 0-gv0-client-1: Connected to gv0-client-1, attached to remote volume '/pool0/brick0'.
[2015-07-01 13:45:45.302923] I [MSGID: 114047] [client-handshake.c:1225:client_setvolume_cbk] 0-gv0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-07-01 13:45:45.303030] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-gv0-client-1: Server lk version = 1
[2015-07-01 14:56:09.740053] W [rpcsvc.c:278:rpcsvc_program_actor] 0-rpc-service: RPC program version not available (req 100003 4) for 127.0.0.1:687
[2015-07-01 14:56:09.740087] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2015-07-01 14:57:16.723283] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:16.737239] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:16.737427] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:16.737572] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:16.738848] I [MSGID: 112108] [nfs.c:1329:nfs_reconfigure_state] 0-nfs: ACL is manually disabled
[2015-07-01 14:57:16.739231] I [rpc-drc.c:694:rpcsvc_drc_init] 0-rpc-service: DRC is turned OFF
[2015-07-01 14:57:16.739306] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-01 14:57:16.739330] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-01 14:57:16.739349] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-01 14:57:19.523509] E [rpcsvc.c:565:rpcsvc_check_and_reply_error] 0-rpcsvc: rpc actor failed to complete successfully
[2015-07-01 14:57:22.145386] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:22.159005] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:22.159263] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:22.159446] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-07-01 14:57:22.160306] I [MSGID: 112108] [nfs.c:1329:nfs_reconfigure_state] 0-nfs: ACL is manually enabled
[2015-07-01 14:57:22.160623] I [rpc-drc.c:694:rpcsvc_drc_init] 0-rpc-service: DRC is turned OFF
[2015-07-01 14:57:22.160688] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-01 14:57:22.160712] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-07-01 14:57:22.160730] I [glusterfsd-mgmt.c:1512:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing

Comment 1 Vlad Ionescu 2015-09-14 03:29:00 UTC
I am experiencing a similar issue. For me, any open() or getxattr() operation on a gluster NFS mount fails with EREMOTEIO. The interesting thing is that operations succeed if and only if the file already exists.

For example:

[root@gluster-2 ~]$ mount -t nfs gluster-1.local:/engine /mnt
[root@gluster-2 ~]$ cat > /mnt/test
-bash: /mnt/test: Remote I/O error
[root@gluster-2 ~]$ cat > /mnt/test
asdf
^D
[root@gluster-2 ~]$

Setting nfs.acl on the volume to off has fixed the issue for me. I have tried a myriad of other things, all to no avail.
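
For reference, that is the same command as in the original report, applied to my volume:

# gluster volume set engine nfs.acl off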

This is a blocking bug and, had it not been for my stumbling upon this bug report, I would not be able to use Gluster's NFS exports.

Comment 2 Niels de Vos 2016-01-18 12:53:41 UTC
Please note that mounting NFS over localhost on a Gluster server is not expected to work flawlessly. There are several services (mainly locking and ACLs) that have conflicts between the kernel component (needed for the NFS-client) and the Gluster component (needed for the NFS-server).

If you need to use Gluster volumes on a Gluster server, you really should mount them with FUSE and not with NFS.
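
For example, a FUSE mount of the volume from the original report would be along these lines:

# mount -t glusterfs 127.0.0.1:/gv0 /nfs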

Could you confirm that this problem only occurs with mounting over NFS on Gluster servers that have the NFS-server running?

Comment 3 Vladimir Mitiouchev 2016-07-06 09:38:49 UTC
I can no longer reproduce the bug with gluster 3.7.11.

Comment 4 Kaushal 2017-03-08 11:04:53 UTC
This bug is being closed because GlusterFS-3.7 has reached its end-of-life.

Note: This bug is being closed using a script. No verification has been performed to check if it still exists on newer releases of GlusterFS.
If this bug still exists in newer GlusterFS releases, please reopen this bug against the newer release.

