Bug 1463472
Summary: Failed setxattr operation: Invalid argument for nfs4_setfacl as sec=null when mounting NetApp v4.1 by default
Product: Red Hat Enterprise Linux 7
Reporter: Yongcheng Yang <yoyang>
Component: nfs4-acl-tools
Assignee: Steve Dickson <steved>
Status: CLOSED NOTABUG
QA Contact: Yongcheng Yang <yoyang>
Severity: high
Priority: unspecified
Version: 7.4
CC: bfields, xzhou, yoyang
Target Milestone: rc
Hardware: All
OS: Linux
Last Closed: 2019-03-11 09:50:26 UTC
Type: Bug
Description
Yongcheng Yang
2017-06-21 02:20:00 UTC
(In reply to Yongcheng Yang from comment #0)
> Steps to Reproduce:
> 1. mount netapp server with v4.1
> 2. nfs4_setfacl -a "A::${some_number}:RW" $mountpoint/${some_file}

Not sure whether the above operation is common or not. If it's the reconfiguration issue, it looks like another issue we need to doc (like Bug1450447).

This problem is also present in CentOS Linux release 7.5.1804 (Core).
On the versions of the protocol, NFS 4.0 and 4.1 give similar errors, which are described in the chapter.

[root@testserver ~]# cat /etc/exports
/opt *(rw,acl)
[root@testserver ~]# mount -t nfs -o vers=4.0 127.0.0.1:/opt /mnt/
[root@testserver ~]# mount
127.0.0.1:/opt on /mnt type nfs4 (rw,relatime,vers=4.0,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1)
[root@testserver ~]# nfs4_getfacl /mnt/test
A::OWNER@:rwaxtTcCy
A::GROUP@:rwaxtcy
A::EVERYONE@:rwaxtcy
[root@testserver ~]# nfs4_setfacl -a A::test.0.1:rw /mnt/test
Failed setxattr operation: Invalid argument
[root@testserver ~]# echo $?
255

(In reply to Yongcheng Yang from comment #0)
> Description of problem:
>
> During bz1427974 testing with Netapp server, if mounted with
> NFSv4.1, it always get error "Failed setxattr operation: Invalid
> argument" when execute "nfs4_setfacl" setting facl to users other
> than "OWNER", "GROUP" and "EVERYONE". This operation can pass if
> mounted with v4.0.

That's very strange! Would it be possible to get network traces showing both the successful (4.0) and unsuccessful (4.1) cases?

(In reply to Andrey from comment #2)
> This problem is also present in CentOS Linux release 7.5.1804 (Core).
> On the versions of the protocol, NFS 4.0 and 4.1 give similar errors, which
> are described in the chapter.

That (and the fact that you're testing against a CentOS server instead of Netapp) suggests to me that this is a different bug (or possibly a misconfiguration of some sort).

I'd also suggest starting with a network trace. My first suspicion would be something about the name test.0.1 that the server doesn't like.

Created attachment 1474935: successful (4.0) network traces

Created attachment 1474936: failed (4.1) network traces
(In reply to J. Bruce Fields from comment #3)
> ...
> Would it be possible to get network traces showing both the successful (4.0)
> and unsuccessful (4.1) cases?

Sorry for not investigating that before.

I just found that in the bad case, the server returned "NFS4ERR_BADOWNER" for SETATTR. However, I checked and the created file was owned by "nobody:nobody" in both v4.0 and v4.1. I still can't figure out where the difference is.

Simple tshark output:
~~~~~~~~~~~~~~~~~~~~
[root ~]# tshark -n -r v4.0.pcap | grep -A1 SETATTR
...
--
 61 19 10.73.4.149 -> 10.73.4.5 NFS 362 V4 Call SETATTR FH: 0x80fbef23
 62 19 10.73.4.5 -> 10.73.4.149 NFS 130 V4 Reply (Call In 61)[Malformed Packet]
[root ~]#
[root ~]# tshark -n -r v4.1.pcap | grep -A1 SETATTR
...
--
 66 29 10.73.4.149 -> 10.73.4.5 NFS 334 V4 Call SETATTR FH: 0x7d021756
 67 29 10.73.4.5 -> 10.73.4.149 NFS 170 V4 Reply (Call In 66) SETATTR Status: NFS4ERR_BADOWNER
 68 29 10.73.4.149 -> 10.73.4.5 NFS 334 V4 Call SETATTR FH: 0x7d021756
 69 29 10.73.4.5 -> 10.73.4.149 NFS 170 V4 Reply (Call In 68) SETATTR Status: NFS4ERR_BADOWNER
 70 29 10.73.4.149 -> 10.73.4.5 TCP 66 756 > 2049 [ACK] Seq=4705 Ack=5537 Win=52864 Len=0 TSval=281894477 TSecr=877673443
[root ~]#

I wonder why the client's using AUTH_NULL in the 4.1 case and AUTH_UNIX in the 4.0 case?

It shouldn't make a difference, but I'm curious.

Is /sys/module/nfsd/parameters/nfs4_disable_idmapping on the server the same in both cases?

(In reply to J. Bruce Fields from comment #8)
Hi Bruce,
> Is /sys/module/nfsd/parameters/nfs4_disable_idmapping on the server the same
> in both cases?

The tests are all with the same nfs server, i.e. Netapp (see "Additional info" in comment #0).

> I wonder why the client's using AUTH_NULL in the 4.1 case and AUTH_UNIX in
> the 4.0 case?
>
> It shouldn't make a difference, but I'm curious.

Thanks for the hint! It does matter whether AUTH_NULL or AUTH_UNIX is used:

####################################
# When specifying sec=sys explicitly
####################################
[root ~]# mount -o vers=4.1,sec=sys netapp-pnfs-02.rhts.eng.pek2.redhat.com:/export/qe-test /mnt
[root ~]# nfsstat -m
/mnt from netapp-pnfs-02.rhts.eng.pek2.redhat.com:/export/qe-test
 Flags: rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.66.144.84,local_lock=none,addr=10.73.4.5
[root ~]# touch /mnt/testfile
[root ~]# nfs4_getfacl /mnt/testfile
# file: /mnt/testfile
A::OWNER@:rwatTnNcCy
A:g:GROUP@:rtncy
A::EVERYONE@:rtncy
[root ~]# nfs4_setfacl -a "A::10000:RW" /mnt/testfile
[root ~]# echo $?
0
^^^ <<<<<<<<<<<<<<<<< success!
[root ~]# umount /mnt/
[root ~]#

####################################
# It's sec=null by default with v4.1
####################################
[root ~]# mount -o vers=4.1 netapp-pnfs-02.rhts.eng.pek2.redhat.com:/export/qe-test /mnt
[root ~]# nfsstat -m
/mnt from netapp-pnfs-02.rhts.eng.pek2.redhat.com:/export/qe-test
 Flags: rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=null,clientaddr=10.66.144.84,local_lock=none,addr=10.73.4.5
[root ~]# nfs4_setfacl -a "A::10000:RW" /mnt/testfile
Failed setxattr operation: Invalid argument
[root ~]# echo $?
255
[root ~]# rm -f /mnt/testfile
[root ~]# umount /mnt/

So the question now may be "why does mounting v4.1 with NetApp turn out to be sec=null instead of sec=sys".

(In reply to Yongcheng Yang from comment #9)
> (In reply to J. Bruce Fields from comment #8)
> Hi Bruce,
> > Is /sys/module/nfsd/parameters/nfs4_disable_idmapping on the server the same
> > in both cases?
>
> The tests are all with the same nfs server, i.e. Netapp (see
> "Additional info" in comment #0).

Oh, thanks for pointing out my oversight.

This definitely looks like a server bug. Note that according to RFC 7530, servers and clients should allow numeric id's (like "1000") on the wire only when not using krb5. I wonder if the server has incorrectly implemented that check so that it also rejects them over auth_null.

Anyway, we should report this to Netapp.

That said, my guess would be that auth_null is not that widely used, so perhaps our tests should just be using auth_sys.

I'm closing this as NOTABUG for the RHEL client according to comment #10 and comment #9.

However, I haven't reported this issue to Netapp, as I have no account there.
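
For anyone scripting the workaround suggested above (run the ACL tests over auth_sys rather than auth_null), here is a minimal Python sketch that reads the negotiated security flavor out of a Flags line like the ones printed by nfsstat -m in comment #9. It is not part of nfs4-acl-tools or nfs-utils; the helper names sec_flavor and acl_test_precondition are hypothetical, and the two sample strings are simply copied from the outputs in this report.

```python
from typing import Optional


def sec_flavor(flags: str) -> Optional[str]:
    """Return the value of the sec= option in a comma-separated flags string."""
    for option in flags.split(","):
        key, sep, value = option.strip().partition("=")
        if key == "sec" and sep:
            return value
    return None


def acl_test_precondition(flags: str) -> str:
    """Hypothetical helper: decide whether a numeric-ID nfs4_setfacl test should run."""
    flavor = sec_flavor(flags)
    if flavor == "sys":
        return "ok: sec=sys"
    # Assumption based on this report only: the NetApp server rejected
    # numeric owners with NFS4ERR_BADOWNER when the mount used sec=null.
    return f"skip: sec={flavor}, remount with -o sec=sys"


# Flags lines copied from the two nfsstat -m outputs in comment #9.
EXPLICIT_SYS = (
    "rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,"
    "timeo=600,retrans=2,sec=sys,clientaddr=10.66.144.84,local_lock=none,"
    "addr=10.73.4.5"
)
DEFAULT_V41 = (
    "rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,"
    "timeo=600,retrans=2,sec=null,clientaddr=10.66.144.84,local_lock=none,"
    "addr=10.73.4.5"
)

if __name__ == "__main__":
    print(acl_test_precondition(EXPLICIT_SYS))
    print(acl_test_precondition(DEFAULT_V41))
```

A test harness could call acl_test_precondition on the live Flags line before running nfs4_setfacl -a "A::10000:RW", and remount with -o vers=4.1,sec=sys when it reports a skip.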