Description of problem:
The ceph.dir.layout xattr does not exist for subdirs on a non-default pool. The test steps are below:

[root@client ~]# df -h
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/rhel-root   45G  1.8G   43G   4% /
devtmpfs               4.8G     0  4.8G   0% /dev
tmpfs                  4.9G     0  4.9G   0% /dev/shm
tmpfs                  4.9G  8.4M  4.8G   1% /run
tmpfs                  4.9G     0  4.9G   0% /sys/fs/cgroup
/dev/sda1             1014M  143M  872M  15% /boot
tmpfs                  984M     0  984M   0% /run/user/0
10.72.35.187:6789:/    855G  968M  854G   1% /mnt/cephfs
ceph-fuse              271G     0  271G   0% /mnt/ceph-fuse

setfattr -n ceph.dir.layout -v "stripe_unit=524288 stripe_count=8 object_size=4194304 pool=cephfs_data" /mnt/ceph-fuse/test

getfattr -n ceph.dir.layout.pool /mnt/ceph-fuse/test
getfattr: Removing leading '/' from absolute path names
# file: mnt/ceph-fuse/test
ceph.dir.layout.pool="cephfs_data"

mkdir -p /mnt/ceph-fuse/test/test_new

getfattr -n ceph.dir.layout.pool /mnt/ceph-fuse/test/test_new
/mnt/ceph-fuse/test/test_new: ceph.dir.layout.pool: No such attribute   <<here

touch /mnt/ceph-fuse/test/test_new/test_file;getfattr -n ceph.file.layout.pool /mnt/ceph-fuse/test/test_new/test_file
getfattr: Removing leading '/' from absolute path names
# file: mnt/ceph-fuse/test/test_new/test_file
ceph.file.layout.pool="cephfs_data"

The subdir should also display its inherited dir.layout.pool value, but it does not.

Version-Release number of selected component (if applicable):
ceph-common-12.2.4-10.el7cp.x86_64      Tue Jun 12 15:35:27 2018
ceph-fuse-12.2.4-10.el7cp.x86_64        Tue Jun 12 15:35:10 2018
libcephfs2-12.2.4-10.el7cp.x86_64       Tue Jun 12 15:34:34 2018
python-cephfs-12.2.4-10.el7cp.x86_64    Tue Jun 12 15:34:38 2018

How reproducible:
100% reproduced

Steps to Reproduce:
1.
2.
3.

Actual results:
The subdir should also display its inherited dir.layout.pool value, but it does not.

Expected results:
The subdir should also display its inherited dir.layout.pool value.

Additional info:
For the log collection, please check below:
http://collab-shell.usersys.redhat.com/02170308/
If I recall correctly, this is how it has always worked -- you see the layout vxattr on the inode where it is set, and child directories have it blank (meaning "inherit"). If we showed the layout vxattrs on all child directories, then the user could not distinguish between where a layout is really set persistently, vs. where the layout is just being shown because an ancestor has one.
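For anyone who needs the effective pool of a subdirectory under the behaviour described above, one workaround is to walk up the tree until a directory that actually carries the vxattr is found. The following is a minimal sketch, not part of Ceph: the helper names (read_xattr, effective_pool) are hypothetical, and it assumes a Linux client where a missing attribute is reported as ENODATA ("No such attribute", as in the getfattr output in the description).

```python
import errno
import os
from pathlib import PurePosixPath
from typing import Callable, Optional, Tuple

XATTR = "ceph.dir.layout.pool"


def read_xattr(path: str, name: str) -> Optional[str]:
    """Return the xattr value, or None when the attribute is absent."""
    try:
        return os.getxattr(path, name).decode()
    except OSError as exc:
        # ENODATA corresponds to "No such attribute" on Linux.
        if exc.errno == errno.ENODATA:
            return None
        raise


def effective_pool(
    path: str,
    mount_root: str,
    getter: Callable[[str, str], Optional[str]] = read_xattr,
) -> Optional[Tuple[str, str]]:
    """Find the nearest directory at or above `path` with an explicit layout.

    Returns (directory, pool), or None if no directory up to `mount_root`
    has the vxattr set (assumed here to mean the filesystem default applies).
    """
    current = PurePosixPath(path)
    root = PurePosixPath(mount_root)
    while True:
        value = getter(str(current), XATTR)
        if value is not None:
            return str(current), value
        # Stop at the mount root, or at "/" if path was outside the mount.
        if current == root or current == current.parent:
            return None
        current = current.parent
```

With the directories from the description, effective_pool("/mnt/ceph-fuse/test/test_new", "/mnt/ceph-fuse") would be expected to report /mnt/ceph-fuse/test as the place where the layout is set, which keeps the "set here" vs. "inherited" distinction visible to the caller.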
(In reply to John Spray from comment #6)
> If I recall correctly, this is how it has always worked -- you see the
> layout vxattr on the inode where it is set, and child directories have it
> blank (meaning "inherit").
>
> If we showed the layout vxattrs on all child directories, then the user
> could not distinguish between where a layout is really set persistently, vs.
> where the layout is just being shown because an ancestor has one.

That's good background, thanks John. Maybe we should display a value like "@inherited:pool" or set a ceph.dir.layout.inherited=<true|false>?
(In reply to Patrick Donnelly from comment #7)
> (In reply to John Spray from comment #6)
> > If I recall correctly, this is how it has always worked -- you see the
> > layout vxattr on the inode where it is set, and child directories have it
> > blank (meaning "inherit").
> >
> > If we showed the layout vxattrs on all child directories, then the user
> > could not distinguish between where a layout is really set persistently, vs.
> > where the layout is just being shown because an ancestor has one.
>
> That's good background, thanks John. Maybe we should display a value like
> "@inherited:pool" or set a ceph.dir.layout.inherited=<true|false>?

Edit for clarification: "@inherited:pool" or similar which is presumably not a valid pool name.
> That's good background, thanks John. Maybe we should display a value like "@inherited:pool" or set a ceph.dir.layout.inherited=<true|false>?

I'm struggling to think of a common use case for reading the actual value, but perhaps it's worth having the field there to make the actual behaviour a bit more obvious + avoid people thinking their layouts haven't taken effect because their subdirs don't show one.

If we knew what the situation was that led to someone picking up on the existing behaviour as unexpected, that might help inform the decision.
Updating the QA Contact to Hemant. Hemant will reroute these bugs to the appropriate QE Associate.

Regards,
Giri
I had posted this to the wrong place before. Adding it here now:

The kernel client only copies off the layout when given Fw or Fr caps. We could change the MDS to gratuitously set the layout field for any inode in the traces, and then just have the client always copy them. The expectation would be that you can't actually _do_ anything with the layout without Fr or Fw, and we can just ensure that it's updated otherwise. We might need to do something to both client and MDS for that.

Another possibility: The client currently does a getattr on most ceph vxattrs. The cap mask is usually 0, unless it's ceph.rstats* in which case it sets CEPH_CAP_FILE_WREXTEND. Is there a cap mask we could request that would give us the layout? Maybe we could issue a getattr for Fr caps, and just assume that we'll get the layout as a matter of course?
Moving this out to 5.2; it's still in progress upstream and I'm not aware of any urgent need.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5997
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days