We have identified a bug in the OpenVZ kernel FUSE framework that shows up when the gluster client runs inside an OpenVZ container. The client does not resolve the uid/gid information correctly before making requests to the server: it uses the host pid, rather than the OpenVZ-remapped container pid, to discover the UID/GID information.

We are considering an unsupported change to the /etc/glusterd/vols files that would resolve the issue in our environment: excluding the "type features/access-control" volume (skipping the ACL translator) on our servers and relying entirely on the clients for uid/gid permission mapping.

We believe a better resolution is to use the kernel FUSE 2.8 capability to get the list of supplementary groups for the owner of a given pid, rather than parsing it out of /proc.

The essence of the problem is highlighted here:
https://github.com/gluster/glusterfs/blob/v3.2.3/xlators/mount/fuse/src/fuse-helpers.c#L151

    ret = snprintf (filename, 128, "/proc/%d/status", frame->root->pid);

Within the OpenVZ environment the host pid is returned, not the remapped container pid that carries the correct information.

The following posts seem related to the issue:
http://sourceforge.net/mailarchive/message.php?msg_id=26973994
http://lwn.net/Articles/259217/

Here is an example of the function call to get group info from FUSE 2.8:
http://fuse.sourcearchive.com/documentation/2.8.4-1/fuse__lowlevel_8h_57f4dabcf044aafcdba6c4682b3a1869.html

We are using the OpenVZ kernel 2.6.18-274.el5.028stab093.2, which is built by OpenVZ for CentOS 5.6, but it is not clear which kernel FUSE version it contains, and we are having trouble finding the OpenVZ kernel patch set too.

Thanks to my colleagues who identified and narrowed down this issue.

Can we provide more info to reproduce the problem? Is there a better interim fix?

Regards,
-George
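To make the failure mode concrete, here is a minimal sketch of the kind of /proc lookup described above. It is not the glusterfs code; the function names parse_groups_line and groups_for_pid are hypothetical, and the only assumption taken from the report is that the supplementary groups are read from the "Groups:" line of /proc/<pid>/status. If the pid passed in belongs to a different pid namespace than the one /proc was mounted in, the file is missing or describes an unrelated process, which is the symptom reported here.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: parse one "Groups:" line of /proc/<pid>/status.
 * Stores up to max gids in out[] and returns how many were stored,
 * or -1 if the line is not a "Groups:" line. */
int parse_groups_line(const char *line, gid_t *out, int max)
{
    static const char prefix[] = "Groups:";
    const char *p;
    char *end;
    int n = 0;

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = line + sizeof(prefix) - 1;

    while (n < max) {
        unsigned long id;

        errno = 0;
        id = strtoul(p, &end, 10);   /* skips leading tabs and spaces */
        if (end == p || errno != 0)  /* no more numbers on the line */
            break;
        out[n++] = (gid_t)id;
        p = end;
    }
    return n;
}

/* Hypothetical helper: look up the supplementary groups of pid via /proc.
 * Returns the number of gids found, or -1 if the status file cannot be
 * opened or has no "Groups:" line. The pid must be valid in the pid
 * namespace that the mounted /proc belongs to. */
int groups_for_pid(pid_t pid, gid_t *out, int max)
{
    char filename[128];
    char line[4096];
    FILE *fp;
    int n = -1;

    snprintf(filename, sizeof(filename), "/proc/%d/status", (int)pid);
    fp = fopen(filename, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        n = parse_groups_line(line, out, max);
        if (n >= 0)
            break;
    }
    fclose(fp);
    return n;
}
```

Calling groups_for_pid(getpid(), ...) from a process inside the same pid namespace as /proc returns that process's own groups; the same call with a pid taken from another namespace does not.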
Is this a reasonable workaround?
The mapping of the pids that the kernel provides through the FUSE protocol to a FUSE server running in an OpenVZ container should be done in-kernel. AFAICS OpenVZ adds only generic compatibility glue to FUSE, which makes it possible to use this kernel service from a container; no semantic fixes are provided.

(For reference, as I checked, these are the OpenVZ modifications for FUSE:

- For 2.6.18, i.e. your kernel:
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=1e1b7d1
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=9fd1662
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=7a016ad
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=5f35bc7
- For 2.6.32, i.e. the latest kernel they support:
  http://git.openvz.org/?p=linux-2.6.32-openvz;a=commitdiff;h=8f48174c
)

So I think the issue should be reported to / handled by the OpenVZ project. It would be possible to work around this in glusterfs code
The mapping of the pids that the kernel provides through the FUSE protocol to a FUSE server running in an OpenVZ container should be done in-kernel. AFAICS OpenVZ adds only generic compatibility glue to FUSE, which makes it possible to use this kernel service from a container; no semantic fixes are provided.

(For reference, as I checked, these are the OpenVZ modifications for FUSE:

- For 2.6.18, i.e. your kernel:
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=1e1b7d1
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=9fd1662
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=7a016ad
  http://git.openvz.org/?p=linux-2.6.18-openvz;a=commitdiff;h=5f35bc7
- For 2.6.32, i.e. the latest kernel they support:
  http://git.openvz.org/?p=linux-2.6.32-openvz;a=commitdiff;h=8f48174c
)

So I think the issue should be reported to / handled by the OpenVZ project.

@avati: should we provide a way to mount the fs with "default_permissions" and have the brick servers not do ACL? I know there is the "--acl" glusterfs option that controls client behavior, but why can we not control this for bricks?

%%%%%%%%%%%%%%%%%%%% TL;DR: %%%%%%%%%%%%%%%%%%%%%%%%%%%%

Therefore, if no objection is raised against the above analysis, I will close this bug as RESOLVED/WONTFIX. For now, I have set it to severity normal, as glusterfs has no official support for OpenVZ.

If we happen to agree that a "volume set"-able option should be introduced to control the bricks' ACL loading, let that be a separate enhancement entry.
No objection was raised by anyone against my claim that the fix for this issue is out of glusterfs' scope.