Bug 2063929
| Summary: | Fedora/FCOS 35 seems to have an issue writing to CephFS mounted filesystems | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | John Fortin <fortinj66> |
| Component: | kernel | Assignee: | Jeff Layton <jlayton> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 35 | CC: | acaringi, adeza, adscvr, airlied, alciregi, aos-bugs, branto, bruce_link, bskeggs, cglombek, danmick, david, dustymabe, fedora, hdegoede, hpa, i, jan.grieb, jarodwilson, jglisse, jlayton, jonathan, josef, kernel-maint, kkeithle, lgoncalv, linville, loic, masami256, mchehab, me, m_liker, philipp.dallig, ptalbert, ramkrsna, sricharan.ramanujam, steved, steve, travier, xiubli |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-05-24 18:13:06 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
John Fortin
2022-03-14 16:24:50 UTC
It may be something different. Look at the permissions on the data directory:

    sh-4.4$ ls -al /registry/docker/registry/v2/repositories/dev-shop-micro/artifact-micro-addressbook-jdk11/_uploads/16726db4-2946-4ce6-9e83-28ce1980fcd0/
    ls: cannot access '/registry/docker/registry/v2/repositories/dev-shop-micro/artifact-micro-addressbook-jdk11/_uploads/16726db4-2946-4ce6-9e83-28ce1980fcd0/data': Permission denied
    total 1
    drwxr-xr-x. 2 1000330000 root 2 Mar 14 15:44 .
    drwxr-xr-x. 5 1000330000 root 3 Mar 14 16:32 ..
    -?????????? ? ? ? ? ? data
    -rw-r--r--. 1 1000330000 root 20 Mar 14 15:44 startedat

So maybe the registry is doing something odd with the data file permissions?

We experience this issue, with odd permissions and strange denials of write access, on any CephFS volume. It happens both on a system with a clean OKD 4.10 installation with Rook 1.8.7 / Ceph 16.2.7 and on an OKD 4.9 installation updated to 4.10. Both installations were based on FCOS 35.

Creating more than one new file on a CephFS volume leads to permission denied errors. Waiting more than a minute between the requests seems to help (but is obviously no solution).

    sh-4.4$ echo 1 > /test/1.txt
    sh-4.4$ echo 2 > /test/2.txt
    sh: /test/2.txt: Permission denied
    sh-4.4$ echo 3 > /test/3.txt
    sh: /test/3.txt: Permission denied
    sh-4.4$ ls -la /test/
    ls: cannot access '/test/3.txt': Permission denied
    ls: cannot access '/test/2.txt': Permission denied
    total 1
    drwxrwxrwx. 2 root root 3 Mar 17 22:46 .
    dr-xr-xr-x. 1 root root 40 Mar 17 22:44 ..
    -rw-r--r--. 1 rook rook 2 Mar 17 22:46 1.txt
    -?????????? ? ? ? ? ? 2.txt
    -?????????? ? ? ? ? ? 3.txt
    sh-4.4$ sleep 120
    sh-4.4$ echo > /test/4.txt
    sh-4.4$ ls -la /test/
    ls: cannot access '/test/3.txt': Permission denied
    ls: cannot access '/test/2.txt': Permission denied
    total 1
    drwxrwxrwx. 2 root root 4 Mar 17 22:48 .
    dr-xr-xr-x. 1 root root 40 Mar 17 22:44 ..
    -rw-r--r--. 1 rook rook 2 Mar 17 22:46 1.txt
    -?????????? ? ? ? ? ? 2.txt
    -?????????? ? ? ? ? ? 3.txt
    -rw-r--r--. 1 rook rook 1 Mar 17 22:48 4.txt

No problems with RBD block volumes. No hints in logs or on the Ceph status dashboard.

xref'ing the reports/discussions on GitHub for further context:
https://github.com/openshift/okd/discussions/1153
https://github.com/openshift/okd/issues/1160

This actually seems to be an issue with Fedora/FCOS, so I am redirecting this to the Fedora team (hopefully).

I managed to track down what I believe to be the root cause of this issue. See https://github.com/openshift/okd/issues/1160#issuecomment-1105940751 and https://github.com/openshift/okd/issues/1160#issuecomment-1105980765 for more details.

It looks like https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f7a67b463fb83a4b9b11ceaa8ec4950b8fb7f902 is the root cause of the issue we're seeing. Enabling async dirops for Ceph seems to cause SELinux contexts (possibly all xattrs in general, though I did not explicitly test that) to not be propagated correctly on the local mountpoint (i.e., the mount on which the files were originally created).

I don't have a Ceph cluster or CephFS volumes outside of OKD to test this with, but I believe this repro is sound so long as your Ceph volume mounts have SELinux enabled.

Reproduction steps:
1. Mount a CephFS volume.
2. Create a new directory in that volume.
3. Rapidly create a bunch of files in that directory - perhaps with a bash loop like this:
       for i in $(seq 1 1000); do echo "testing123" > test$i.txt; done
4. Note that the SELinux contexts for all of these newly created files EXCEPT FOR THE FIRST ONE MADE (usually test1.txt) are NOT what you expect, but are instead system_u:object_r:unlabeled_t:s0
5. Unmount the volume and re-mount it.
6. Navigate back to your test folder and note that all of the SELinux contexts are now correct.

Re-mounting the volume with the wsync option (-o wsync) and performing the test should show that all the created files have the correct context on them right away.
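The reproduction steps above can be sketched as a few shell functions. This is only a minimal sketch, not the script anyone in this report actually ran: the function names (`create_files`, `context_of`, `list_context_mismatches`) are made up for illustration, and it assumes GNU `ls`, whose `-Z` option prints each file's security context before its name.

```shell
#!/usr/bin/env bash
# Illustrative helpers for the reproduction steps; all names are hypothetical.

# Step 3: create COUNT small files in DIR in quick succession.
create_files() {
    local dir=$1 count=$2 i
    mkdir -p "$dir" || return 1
    for i in $(seq 1 "$count"); do
        echo "testing123" > "$dir/test$i.txt" || return 1
    done
}

# Print the security context of a single path, as reported by `ls -Z`.
# On a system without SELinux, GNU ls prints "?" here.
context_of() {
    ls -Zd -- "$1" | awk '{ print $1 }'
}

# Step 4: read "CONTEXT NAME" lines on stdin (the `ls -Z1` format) and
# print the name of every entry whose context differs from REF ($1).
# Names containing whitespace are not handled in this sketch.
list_context_mismatches() {
    awk -v ref="$1" '$1 != ref { print $2 }'
}
```

A possible run, assuming a CephFS volume mounted at the hypothetical path /mnt/cephfs, would be `create_files /mnt/cephfs/seltest 1000`, then `ref=$(context_of /mnt/cephfs/seltest/test1.txt)`, then `ls -Z1 /mnt/cephfs/seltest | list_context_mismatches "$ref"`. If the description above holds, an affected mount would be expected to list most of the files, and a mount using `-o wsync` (or a fresh remount) would be expected to list none.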
Thanks Sri. Can you follow up with the upstream committer (jlayton) on the issue? Should we move this bug to the `kernel` component?

Sounds like a plan. I'll take a look soon. Thanks.

It looks like we are sending off the SELinux context when we create the files, but then we end up creating the inode locally without setting the xattr. I'm still looking at what the right fix is at this point and I'll need to set up a reproducer to ensure I understand the problem correctly.

Doing a bit of testing with this today, I made a script to create 10 files, and after the first one (which is created synchronously), the others don't have the same SELinux context:

    [jlayton@client1 ~]$ ls -lZ /mnt/test/mkfiles
    total 0
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 0
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 1
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 2
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 3
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 4
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 5
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 6
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 7
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 8
    -rw-r--r--. 1 root root system_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 9

But... if we unmount and mount, then we do see it:

    [jlayton@client1 ~]$ sudo umount /mnt/test
    [jlayton@client1 ~]$ sudo mount /mnt/test
    [jlayton@client1 ~]$ ls -lZ /mnt/test/mkfiles
    total 0
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 0
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 1
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 2
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 3
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 4
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 5
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 6
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 7
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 8
    -rw-r--r--. 1 root root unconfined_u:object_r:unlabeled_t:s0 0 Apr 25 12:54 9

So the files are being created with the correct contexts, but the asynchronously created inodes aren't inheriting them correctly, so you won't see the right contexts until you remount (or the inodes get flushed out of the cache). I think we're not passing the xattr blob along correctly to finish the async create. I'll have a look at a fix.

I just sent a patch that should fix this. I'll see about getting Xiubo to mark it for stable too, since this is a regression now that async dirops are enabled.

Thanks Jeff!

Did this ever land? If so, is there an upstream commit hash (or link)?

Yes:
commit 620239d9a32e9fe27c9204ec11e40058671aeeb6
Author: Jeff Layton <jlayton>
Date: Mon Apr 25 15:54:27 2022 -0400
ceph: fix setting of xattrs on async created inodes
Looks like that made it into 5.17.9 too:
commit 25633e355cbea61e5a18b938a56f391b7185cf60
Refs: v5.17.8-92-g25633e355cbe
Author: Jeff Layton <jlayton>
AuthorDate: Mon Apr 25 15:54:27 2022 -0400
Commit: Greg Kroah-Hartman <gregkh>
CommitDate: Wed May 18 10:28:21 2022 +0200
ceph: fix setting of xattrs on async created inodes
5.17.9 is now in F35+ so I think we can close this one.
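For anyone checking whether a given host already runs a kernel at or past 5.17.9, a rough version comparison could look like the sketch below. The helper names and the sample release string are made up for illustration, and `sort -V` assumes GNU coreutils. It only compares version numbers, so it says nothing about distribution kernels that carry the fix as a backport under a different version.

```shell
#!/usr/bin/env bash
# Illustrative helpers; names are hypothetical.

# Strip the distribution suffix from a `uname -r` style string,
# e.g. "5.17.9-200.fc35.x86_64" becomes "5.17.9".
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}

# Succeed if version $1 is greater than or equal to version $2.
# Uses version sort: the smaller of the two versions sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

One way to use it is `version_at_least "$(kernel_base_version "$(uname -r)")" 5.17.9 && echo "at or past 5.17.9"`.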