Bug 964319
| Summary: | Upgrading to 3.9 kernel breaks NFS | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Julian Sikorski <belegdol> |
| Component: | kernel | Assignee: | nfs-maint |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 18 | CC: | aidanamarks, bfields, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| URL: | https://github.com/sahlberg/libnfs/issues/33 | | |
| Whiteboard: | | | |
| Fixed In Version: | 3.9.5-201.fc18.x86_64 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2013-06-13 19:30:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Julian Sikorski
2013-05-17 20:50:34 UTC
Updating to 3.9.3-201.fc18 does not help.

I wonder what libnfs does. Might be worth strace'ing xbmc to see what the mount system call looks like and how that compares with the one done by mount.

"Eventually I have discovered that downgrading back to kernel-3.8.11-200.fc18 has solved the problem"
Is it the client or server that you're upgrading and downgrading here?

Comparing network traces in the two cases might also be interesting. (tcpdump -s0 -wtmp.pcap, then look at tmp.pcap in wireshark or attach it to this bug.)

It is the server that I'm downgrading. Keep in mind that the client is a minimalistic Linux distribution (openelec), and thus it might be hard to debug the problem from that end. xbmc is started by openelec automatically, so I'm not sure how to strace it. When it comes to tcpdump, should I run it on the client or the server?

Created attachment 752837: tcpdump file
OK, there seems to be neither strace nor tcpdump on the client. On the server, the command you gave returns the following:
$ tcpdump -s0 -wtmp.pcap
tcpdump: no suitable device found
Naming the interface explicitly worked:
$ tcpdump -s0 -i wlan0 -wtmp.pcap
Let me know if there is any useful information in that file.
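
Before attaching a capture like tmp.pcap, it can help to confirm that the file is a well-formed capture and that the snapshot length requested with -s0 took effect. The following is a minimal sketch, assuming the file is in the classic pcap format with its 24-byte global header; the function name read_pcap_header is hypothetical and is not part of tcpdump or any library.

```python
import struct

# Classic pcap magic number as it appears on disk, mapped to the struct
# byte-order prefix of the machine that wrote the file.
PCAP_MAGICS = {
    b"\xa1\xb2\xc3\xd4": ">",  # written by a big-endian host
    b"\xd4\xc3\xb2\xa1": "<",  # written by a little-endian host
}


def read_pcap_header(data: bytes) -> dict:
    """Parse the 24-byte global header of a classic pcap file.

    Hypothetical helper for illustration only.
    """
    if len(data) < 24:
        raise ValueError("too short to be a pcap file")
    endian = PCAP_MAGICS.get(data[:4])
    if endian is None:
        raise ValueError("unknown magic: not a classic pcap file")
    # Fields after the magic: version major/minor, timezone offset,
    # timestamp accuracy, snapshot length, link-layer type.
    major, minor, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24]
    )
    return {
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,
    }


if __name__ == "__main__":
    # Synthetic header standing in for the first 24 bytes of tmp.pcap.
    sample = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    print(read_pcap_header(sample))
```

For a real capture, pass the first 24 bytes of the file, for example open("tmp.pcap", "rb").read(24). A small snaplen in the result would mean packets were truncated and the RPC credentials might not be fully visible in wireshark.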
Thanks! I assume that trace was taken in the failing case? The last call there is a LOOKUP which returns a badcred authentication error.

OK, I see--this is another consequence of recent user namespace changes, which tend to treat -1 ids as invalid. And for some reason xbmc is sending that last lookup with a gid of 0xffff. I'll see if I can come up with a patch....

Created attachment 752881: [PATCH] svcrpc: fix failures to handle -1 uid's and gid's
Could you see whether this patch helps?
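
To make the "-1 ids are treated as invalid" diagnosis concrete, here is a small illustrative model of that kind of check. It is not the kernel code and not the attached patch; the names as_unsigned, id_is_valid and check_credential are hypothetical. The comment above writes the gid as 0xffff without stating the field width, so the sketch takes the width as a parameter rather than assuming one.

```python
def as_unsigned(value: int, bits: int = 32) -> int:
    """Reinterpret an integer as an unsigned value of the given width."""
    return value & ((1 << bits) - 1)


def id_is_valid(wire_id: int, bits: int = 32) -> bool:
    """Model of a check that reserves the all-ones value (-1) as 'no id'.

    Assumption: this mirrors the behaviour described in the comment,
    where an id equal to -1 is rejected as invalid.
    """
    return as_unsigned(wire_id, bits) != as_unsigned(-1, bits)


def check_credential(uid: int, gid: int, bits: int = 32) -> str:
    """Return 'badcred' if either id is the reserved -1 value, else 'ok'."""
    if not id_is_valid(uid, bits) or not id_is_valid(gid, bits):
        return "badcred"
    return "ok"


if __name__ == "__main__":
    # -1 as it would appear in a 16-bit and a 32-bit unsigned field.
    print(hex(as_unsigned(-1, 16)), hex(as_unsigned(-1, 32)))
    # An ordinary credential versus one carrying gid -1.
    print(check_credential(uid=1000, gid=1000))
    print(check_credential(uid=0, gid=-1))
```

Under this model, a client that sends -1 as its gid is rejected before the request is processed at all, which would be consistent with the badcred reply to the LOOKUP seen in the trace.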
Hi, I hit this issue with openelec too (sending gid 0xffff), getting a bad cred error when moving from 3.8 to 3.9. I tried your patch on top of gentoo-sources 3.9.3, but it still returns bad cred.

The patch seems to work for me, thank you. Aidan, why don't you try it with the Fedora kernel: http://belegdol.fedorapeople.org/nfs-xbmc-fix/ I am uploading the files as of writing this comment; there is also a source rpm if you prefer to rebuild the kernel yourself.

Sorry, my mistake, yes the patch does work. Please commit to master and 3.9 branches.

I have now uploaded a fixed 3.9.4-200.fc18 kernel.

Thanks for the testing! Should be going upstream as well in the next few days.

BTW this should also be reported to xbmc (or whoever maintains their nfs library); -1 is an extremely poor choice of gid and it wouldn't be surprising to see it cause problems for other NFS servers as well.

I think it might be fixed already: https://github.com/sahlberg/libnfs/commit/43e0e7a7e6cbec9ba55db89eac368d42e969ad55