Bug 764245 (GLUSTER-2513)

Summary: [FEAT] glusterfs requires CAP_SYS_ADMIN capability for "trusted" extended attributes - container unfriendly
Product: [Community] GlusterFS Reporter: igor sviridov <sia>
Component: posix Assignee: bugs <bugs>
Status: CLOSED EOL QA Contact:
Severity: low Docs Contact:
Priority: medium    
Version: mainline CC: bugs, cww, gluster-bugs, redhatbugzilla, rwheeler
Target Milestone: --- Keywords: FutureFeature, Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-22 15:46:38 UTC Type: ---
Regression: --- Mount Type: fuse
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description igor sviridov 2011-03-11 01:11:06 UTC
hi,

The GlusterFS posix backend stores data in extended file attributes, specifically "trusted" extended attributes.
These require the CAP_SYS_ADMIN capability to access or change.

This requirement makes it impractical to run glusterfs in light-weight virtualization environments (lxc/openvz/linux vserver), since delegating CAP_SYS_ADMIN to guests is usually undesirable.

One approach would be to allow using "user" extended attributes instead (via a config or command-line option).

Here is an example discussion with common workarounds:
http://serverfault.com/questions/29615/how-to-mount-glusterfs-inside-a-openvz-container

Here is an example of the error when glusterfs is started in a container without the CAP_SYS_ADMIN capability:

[2011-03-08 20:48:28.747038] C [posix.c:4339:init] gvol-posix: Extended attribute not supported, exiting.
[2011-03-08 20:48:28.747053] E [xlator.c:909:xlator_init] gvol-posix: Initialization of volume 'gvol-posix' failed, review your volfile again
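
For reference, the difference between the two namespaces can be checked outside GlusterFS with a small probe. This is only a minimal sketch, assuming a Linux host and Python 3; the function name and the attribute name "example.probe" are made up for illustration and are not taken from posix.c. Without CAP_SYS_ADMIN the "trusted" probe is expected to fail (typically with EPERM), while the "user" probe succeeds if the filesystem supports user xattrs.

```python
import errno
import os


def probe_xattr_namespace(path, namespace):
    """Try to set and then remove a test xattr in the given namespace.

    Returns "ok" on success, otherwise the symbolic errno name
    (for example "EPERM" or "ENOENT").
    """
    # Hypothetical attribute name, used only for this probe.
    name = f"{namespace}.example.probe"
    try:
        os.setxattr(path, name, b"working")
        os.removexattr(path, name)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
```

Calling probe_xattr_namespace(brick_dir, "trusted") and probe_xattr_namespace(brick_dir, "user") on the brick directory, inside the container, shows which namespace is usable there.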

--igor

Comment 1 Harshavardhana 2011-11-07 20:49:46 UTC
(In reply to comment #0)
> hi,
> 
> Glusterfs posix backend stores data in extended file attributes, specifically
> Trusted extended attributes.
> Those require CAP_SYS_ADMIN capability to access/change.
> 
> This requirement makes it impractical to run glusterfs in light-weight
> virtualization environments (lxc/openvz/linux vserver), since delegating
> CAP_SYS_ADMIN to guests is usually undesirable.
> 
The problem is the other way round: CAP_SYS_ADMIN is the standard Linux mechanism by which an application is allowed to operate in the 'extended' attribute space.

That lxc/openvz/linux vserver fail because of this is their problem rather than a GlusterFS problem.

For a better virtualization environment, KVM is recommended.

lxc/openvz/linux vserver are not tested in-house, which in turn invites unexpected issues.

Comment 2 igor sviridov 2011-11-07 21:57:31 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > Glusterfs posix backend stores data in extended file attributes, specifically
> > Trusted extended attributes.
> > Those require CAP_SYS_ADMIN capability to access/change.
> > 
> > This requirement makes it impractical to run glusterfs in light-weight
> > virtualization environments (lxc/openvz/linux vserver), since delegating
> > CAP_SYS_ADMIN to guests is usually undesirable.
> > 
> Problem is the other way round, CAP_SYS_ADMIN is a standard mechanism where a
> given application is allowed to behave in Linux in 'extended' attribute space. 
> Now that lxc/openvz/linux vserver fail with this is their problem rather than
> GlusterFS itself. 

I believe providing an option to use "user" extended attributes instead of "trusted" extended attributes could allow for better security trade-offs in some configurations using light-weight virtualization environments. It's possible to allow CAP_SYS_ADMIN for each container, but in many configurations using "user" attributes instead would be a better choice.
  
> For better virtualization environments recommended is KVM.
> lxc/openvz/linux vserver are not tested in-house, in-turn calls for unwarranted issues.

Well, KVM and container-based virtualization are just two different beasts; you cannot say one is better than the other. They simply have different purposes/profiles and user bases.

--igor

Comment 3 Niels de Vos 2014-11-09 11:17:10 UTC
Using containers to host bricks requires writing the "trusted.*" xattrs to the underlying filesystem. The brick processes need to have the CAP_SYS_ADMIN capability (see: man 7 capabilities). Containers are expected to have few privileges, and granting CAP_SYS_ADMIN to a container is frowned upon.
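
Whether a process inside a container actually holds CAP_SYS_ADMIN can be checked from its effective capability mask. The following is a rough sketch, assuming Linux procfs and Python 3; the helper names are invented for this example. CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>.

```python
CAP_SYS_ADMIN = 21  # bit number from <linux/capability.h>


def has_capability(cap_mask_hex, cap_bit=CAP_SYS_ADMIN):
    """Return True if bit `cap_bit` is set in a hex capability mask."""
    return bool((int(cap_mask_hex, 16) >> cap_bit) & 1)


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff hex mask of the current process from procfs."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise RuntimeError("CapEff not found in " + status_path)
```

Run inside the container, has_capability(effective_caps()) reports whether the calling process could write "trusted.*" xattrs at all.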

Proposed solution:
- instead of using "trusted.*" xattrs on the bricks, use "user.*" xattrs

Implementation details/notes/ideas:
- provide a volume or mount option to specify that "user.*" should be used
- it is impractical (and would hurt compatibility) to replace all the xattrs
  with "user.*" throughout the whole source tree. It would be simpler and more
  compatible to have the posix-xlator handle the volume/mount option. A fallback
  that checks for the other xattr prefix would be more efficient there too.
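
The idea of handling the prefix in one place, with a fallback to the other prefix, could look roughly like the sketch below. It is written in Python for brevity and is not GlusterFS code: the function names and the "preferred" parameter are assumptions standing in for the proposed volume/mount option, and "trusted.gfid" is used only as an example key. The getter is passed in so that os.getxattr (or a test double) can be used.

```python
import errno

PREFIXES = ("trusted.", "user.")


def on_disk_name(logical, prefix):
    """Map a logical "trusted.*" key to the name stored under `prefix`."""
    if logical.startswith("trusted."):
        return prefix + logical[len("trusted."):]
    return logical  # keys outside "trusted." are left untouched


def get_with_fallback(getter, path, logical, preferred="user."):
    """Read a key under the preferred prefix, then under the other one."""
    other = PREFIXES[1 - PREFIXES.index(preferred)]
    for prefix in (preferred, other):
        try:
            return getter(path, on_disk_name(logical, prefix))
        except OSError as exc:
            # ENODATA: not present under this name; EPERM: namespace
            # not accessible. Anything else is a real error.
            if exc.errno not in (errno.ENODATA, errno.EPERM):
                raise
    raise OSError(errno.ENODATA, "no such attribute", logical)
```

With os.getxattr as the getter, get_with_fallback(os.getxattr, path, "trusted.gfid") would first look for "user.gfid" and only then for "trusted.gfid", which keeps bricks written with the old prefix readable.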

Comment 4 Kaleb KEITHLEY 2015-10-22 15:46:38 UTC
Because of the large number of bugs filed against it, the "mainline" version is ambiguous and about to be removed as a choice.

If you believe this is still a bug, please change the status back to NEW and choose the appropriate, applicable version for it.

Comment 5 Andrew Miller 2020-12-14 11:35:59 UTC
I have written a tiny library, loadable with LD_PRELOAD, that translates attribute names (trusted. into user.tr.). I will test how well GlusterFS behaves when glusterd is run for all bricks in an unprivileged (no CAP_SYS_ADMIN) docker container, with xattr names under user.

Initial testing hasn't found anything broken by it.

Here is my library in case other people want to try it: https://gitlab.com/A1kmm/fake-trusted-xattr

You could consider either incorporating this approach into the Docker container, or potentially making it an option in GlusterFS itself.
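
To illustrate the name translation described above (trusted. into user.tr.), here is a small sketch of the mapping in Python. It is not the code of the linked library, which works at the C library level via LD_PRELOAD; the function names are invented, and the reverse mapping for attribute listings is an assumption about what such a shim would need.

```python
TRUSTED = "trusted."
SHADOW = "user.tr."  # prefix taken from the comment above


def to_disk(name):
    """Translate a requested name into the name stored on disk."""
    if name.startswith(TRUSTED):
        return SHADOW + name[len(TRUSTED):]
    return name


def from_disk(name):
    """Translate a stored name back into the name the caller expects."""
    if name.startswith(SHADOW):
        return TRUSTED + name[len(SHADOW):]
    return name


def translate_listing(disk_names):
    """Apply the reverse mapping to the result of a list call."""
    return [from_disk(name) for name in disk_names]
```

Using a dedicated sub-prefix such as user.tr. rather than plain user. keeps the translated names apart from "user.*" attributes that applications set themselves.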