Bug 765261 - (GLUSTER-3529) Clients mounting replicated FS via NFS return input / output error on some files
Product: GlusterFS
Classification: Community
Component: access-control
Hardware: x86_64 Linux
Priority: medium, Severity: low
Assigned To: shishir gowda
Reported: 2011-09-08 11:55 EDT by Jonathan Barber
Modified: 2013-12-08 20:26 EST
Doc Type: Bug Fix
Mount Type: nfs
Documentation: DP

Description Jonathan Barber 2011-09-08 11:55:29 EDT
I have two nodes running RHEL 5u4 x86_64 and glusterfs 3.2.3-1 from the RPMs here:

They have a replicated GlusterFS created with:
gluster volume create TEST replica 2 transport tcp server1:/exp server2:/exp

This volume is exported via NFS to two clients (RHEL 5u2 i386). Each client mounts one of the servers via autofs with the mount options:

i.e. the setup is like this:
client1 <---NFS---> server1 <---GlusterFS---> server2 <---NFS---> client2

When I create files on client1 under the NFS mount with the command:
for i in {1..1000}; do
  sudo -u user1 dd if=/dev/urandom of=/repository/info150/test$RANDOM.vox bs=1024 count=36
done

where user1 is:
uid=501(user1) gid=10000(group1) groups=10000(group1),11000(group2)

and then read the files on client2 with the command:
cat /repository/info150/test*.vox

as the user user2:
uid=500(user2) gid=10000(group1) groups=10000(group1),11000(group2)

I get errors with some files (not all files):
cat: /repository/info150/test6129.vox: Input/output error
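
To list every affected file rather than reading cat's errors one by one, a loop along these lines could be used. This is only a minimal sketch: the function name list_unreadable is hypothetical and not part of the original report.

```shell
#!/usr/bin/env bash
# list_unreadable FILE...
# Hypothetical helper: prints the name of every file that cannot be read in full.
list_unreadable() {
  local f
  for f in "$@"; do
    # Read the whole file and discard the data; only the exit status matters.
    cat -- "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
  done
}
```

Run as user2 on client2, it would be called as: list_unreadable /repository/info150/test*.vox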

In a TCP dump of the NFS exchange (viewed in Wireshark), the error returned by the NFS server is NFS3ERR_PERM.

Setting the brick log level to TRACE with:
gluster volume set TEST diagnostics.brick-log-level TRACE

generates the following output in /var/log/glusterfs/bricks/exp.log:
[2011-09-08 11:31:44.618975] T [rpcsvc.c:958:rpcsvc_handle_rpc_call] rpcsvc: Client port: 1023
[2011-09-08 11:31:44.619074] T [rpcsvc-auth.c:276:rpcsvc_auth_request_init] rpc-service: Auth handler: AUTH_GLUSTERFS
[2011-09-08 11:31:44.619090] T [rpcsvc.c:887:rpcsvc_request_create] rpc-service: recieved rpc-message (XID: 0x14a0, Ver: 2, Program: 1298437, ProgVers: 310, Proc: 11) from rpc-transport (tcp.TEST-server)
[2011-09-08 11:31:44.619106] T [auth-glusterfs.c:185:auth_glusterfs_authenticate] rpc-service: Auth Info: pid: 0, uid: 500, gid: 10000, owner: 12343
[2011-09-08 11:31:44.619121] T [rpcsvc.c:723:rpcsvc_program_actor] rpc-service: Actor found: GlusterFS-3.1.0 - OPEN
[2011-09-08 11:31:44.619170] T [server-resolve.c:127:resolve_loc_touchup] TEST-server: return value inode_path 21
[2011-09-08 11:31:44.619240] T [access-control.c:210:ac_test_access] access-control: Testing owner access
[2011-09-08 11:31:44.619276] T [access-control.c:220:ac_test_access] access-control: Testing group access
[2011-09-08 11:31:44.619296] T [access-control.c:231:ac_test_access] access-control: Testing other access
[2011-09-08 11:31:44.619315] T [access-control.c:239:ac_test_access] access-control: No access allowed
[2011-09-08 11:31:44.619349] D [server3_1-fops.c:1283:server_open_cbk] TEST-server: 5280: OPEN /info150/test6129.vox (49238) ==> -1 (Operation not permitted)
[2011-09-08 11:31:44.619398] T [rpcsvc.c:1516:rpcsvc_submit_generic] rpc-service: Tx message: 16
[2011-09-08 11:31:44.619417] T [rpcsvc.c:1151:rpcsvc_record_build_header] rpc-service: Reply fraglen 40, payload: 16, rpc hdr: 24
[2011-09-08 11:31:44.619463] T [rpcsvc.c:1555:rpcsvc_submit_generic] rpc-service: submitted reply for rpc-message (XID: 0x5280x, Program: GlusterFS-3.1.0, ProgVers: 310, Proc: 11) to rpc-transport (tcp.TEST-server)
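
For comparison, the conventional POSIX mode-bit check (owner class, then group class, then other) can be sketched as below. This is not the GlusterFS access-control translator's code; the function name posix_may_read is hypothetical, and the file mode 644 used in the usage note is an assumption, since the report does not state the mode of the created files. Root's bypass of the mode bits is deliberately left out.

```shell
#!/usr/bin/env bash
# posix_may_read UID GID "SUPPLEMENTARY_GIDS" FILE_UID FILE_GID MODE_OCTAL
# Hypothetical model of the usual Unix read-permission check.
# Returns 0 if a read would be allowed, 1 otherwise.
posix_may_read() {
  local uid=$1 gid=$2 groups=$3 fuid=$4 fgid=$5 mode=$((8#$6))
  local g
  if [ "$uid" -eq "$fuid" ]; then
    # The caller owns the file: only the owner bits apply.
    [ $(( mode & 0400 )) -ne 0 ]; return
  fi
  for g in $gid $groups; do
    if [ "$g" -eq "$fgid" ]; then
      # The caller is in the file's group: only the group bits apply.
      [ $(( mode & 0040 )) -ne 0 ]; return
    fi
  done
  # Otherwise the "other" bits decide.
  [ $(( mode & 0004 )) -ne 0 ]
}
```

With the IDs from this report and the assumed mode, posix_may_read 500 10000 "10000 11000" 501 10000 644 succeeds through the group class, which is why the "No access allowed" result in the trace is unexpected.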

After reading the file as root on client2, user2 can read the file without generating errors.

Upgrading to the 3.2.3-1 RPMs fixes this problem, but I wanted to document it in case anyone else runs into it.
Comment 1 shishir gowda 2011-09-12 21:01:24 EDT
With the POSIX ACL support introduced in the 3.2.2 release, the issue seems to be fixed. The bug can be documented in the release notes.
