Bug 1053579 - i/o error when one user tries to access RHS volume over NFS with 100+ GIDs
Summary: i/o error when one user tries to access RHS volume over NFS with 100+ GIDs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1044646 1096425
Reported: 2014-01-15 12:50 UTC by santosh pradhan
Modified: 2014-11-11 08:26 UTC (History)

Fixed In Version: glusterfs-3.6.0beta1
Clone Of: 1044646
Clones: 1096425
Environment:
Last Closed: 2014-11-11 08:26:59 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Comment 1 Anand Avati 2014-01-15 13:09:01 UTC
REVIEW: http://review.gluster.org/6715 (gNFS: I/O Error with more than 128 aux-gids) posted (#1) for review on master by Santosh Pradhan (spradhan)

Comment 2 Anand Avati 2014-01-15 13:18:35 UTC
REVIEW: http://review.gluster.org/6715 (gNFS: I/O Error with more than 128 aux-gids) posted (#2) for review on master by Santosh Pradhan (spradhan)

Comment 3 Anand Avati 2014-01-15 13:21:09 UTC
REVIEW: http://review.gluster.org/6715 (gNFS: I/O Error with more than 128 aux-gids) posted (#3) for review on master by Santosh Pradhan (spradhan)

Comment 4 Anand Avati 2014-03-06 16:40:02 UTC
REVIEW: http://review.gluster.org/7202 (rpc: warn and truncate grouplist when more then 93 groups are used) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 5 Anand Avati 2014-03-07 17:32:10 UTC
REVIEW: http://review.gluster.org/7202 (rpc: warn and truncate grouplist when more then 93 groups are used) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 6 Anand Avati 2014-03-20 17:21:44 UTC
REVIEW: http://review.gluster.org/7202 (rpc: warn and truncate grouplist if RPC/AUTH can not hold everything) posted (#3) for review on master by Niels de Vos (ndevos)

Comment 8 Anand Avati 2014-04-08 17:51:04 UTC
COMMIT: http://review.gluster.org/7202 committed in master by Vijay Bellur (vbellur) 
------
commit 8235de189845986a535d676b1fd2c894b9c02e52
Author: Niels de Vos <ndevos>
Date:   Thu Mar 20 18:13:49 2014 +0100

    rpc: warn and truncate grouplist if RPC/AUTH can not hold everything
    
    The GlusterFS protocol currently uses AUTH_GLUSTERFS_V2 in the RPC/AUTH
    header. This header contains the uid, gid and auxiliary groups of the
    user/process that accesses the Gluster Volume.
    
    The AUTH_GLUSTERFS_V2 structure allows up to 65535 auxiliary groups to
    be passed on. Unfortunately, the RPC/AUTH header is limited to 400 bytes
    by the RPC specification: http://tools.ietf.org/html/rfc5531#section-8.2
    
    In order to not cause complete failures on the client-side when trying
    to encode an AUTH_GLUSTERFS_V2 that would result in more than 400 bytes,
    we can calculate the expected size of the other elements:
    
        1 | pid
        1 | uid
        1 | gid
        1 | groups_len
       XX | groups_val (GF_MAX_AUX_GROUPS=65535)
        1 | lk_owner_len
       YY | lk_owner_val (GF_MAX_LOCK_OWNER_LEN=1024)
      ----+-------------------------------------------
        5 | total xdr-units
    
      one XDR-unit is defined as BYTES_PER_XDR_UNIT = 4 bytes
      MAX_AUTH_BYTES = 400 is the maximum, this is 100 xdr-units.
      XX + YY can be 95 to fill the 100 xdr-units.
    
      Note that the on-wire protocol has tighter requirements than the
      internal structures. It is possible for xlators to use more groups and
      a bigger lk_owner than that can be sent by a GlusterFS-client.
    
    This change prevents overflows when allocating the RPC/AUTH header. Two
    new macros are introduced to calculate the number of groups that fit in
    the RPC/AUTH header, taking the size of the lk_owner into account. In
    case the list of groups exceeds the maximum possible, only the first
    groups are passed over the RPC/GlusterFS protocol to the bricks.
    A warning is added to the logs, so that most system administrators will
    get informed.
    
    Reducing the number of groups is not a new invention. The
    RPC/AUTH header (AUTH_SYS or AUTH_UNIX) that NFS uses has a limit of 16
    groups. Most, if not all, NFS-clients will reduce any bigger number of
    groups to 16. (nfs.server-aux-gids can be used to work around the limit
    of 16 groups, but the Gluster NFS-server will be limited to a maximum of
    93 groups, or fewer in case the lk_owner structure contains more items.)
    
    Change-Id: I8410e59d0fd246d601b54b961d3ae9cb5a858c10
    BUG: 1053579
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/7202
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Harshavardhana <harsha>
    Reviewed-by: Santosh Pradhan <spradhan>
    Reviewed-by: Vijay Bellur <vbellur>
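
The size calculation in the commit message above can be sketched as follows. This is an illustration only, not the code from the change: the function names are hypothetical, and it assumes the lk_owner length is given in bytes and rounded up to whole XDR units. The 8-byte lk_owner in the usage note is likewise an assumption, chosen because it reproduces the 93-group figure.

```python
import logging

BYTES_PER_XDR_UNIT = 4   # from the commit message
MAX_AUTH_BYTES = 400     # RPC/AUTH limit, RFC 5531 section 8.2
FIXED_XDR_UNITS = 5      # pid, uid, gid, groups_len, lk_owner_len

log = logging.getLogger("auth-glusterfs-sketch")


def xdr_units(num_bytes):
    """Round a byte count up to whole 4-byte XDR units (assumed encoding)."""
    return (num_bytes + BYTES_PER_XDR_UNIT - 1) // BYTES_PER_XDR_UNIT


def max_groups(lk_owner_len):
    """Groups that still fit next to an lk_owner of lk_owner_len bytes."""
    budget = MAX_AUTH_BYTES // BYTES_PER_XDR_UNIT - FIXED_XDR_UNITS  # 95 units
    return max(budget - xdr_units(lk_owner_len), 0)


def truncate_groups(groups, lk_owner_len):
    """Keep only the first groups that fit and log a warning when dropping any."""
    limit = max_groups(lk_owner_len)
    if len(groups) > limit:
        log.warning("truncating group list from %d to %d groups",
                    len(groups), limit)
        return list(groups[:limit])
    return list(groups)
```

With these assumptions, `max_groups(8)` gives 93 and `max_groups(0)` gives 95, and a user in 200 groups would have the list cut to the first 93 entries.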

Comment 9 Anand Avati 2014-04-17 16:43:07 UTC
REVIEW: http://review.gluster.org/7501 (protocol: implement server.manage-gids for group resolving on the bricks) posted (#1) for review on master by Niels de Vos (ndevos)

Comment 10 Anand Avati 2014-04-25 12:44:20 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#2) for review on master by Niels de Vos (ndevos)

Comment 11 Anand Avati 2014-04-26 06:52:47 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#3) for review on master by Niels de Vos (ndevos)

Comment 12 Anand Avati 2014-04-26 09:43:04 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#4) for review on master by Niels de Vos (ndevos)

Comment 13 Anand Avati 2014-04-26 10:23:05 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#5) for review on master by Niels de Vos (ndevos)

Comment 14 Anand Avati 2014-04-26 10:38:37 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#6) for review on master by Niels de Vos (ndevos)

Comment 15 Anand Avati 2014-04-26 16:59:40 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#7) for review on master by Niels de Vos (ndevos)

Comment 16 Anand Avati 2014-04-27 12:22:30 UTC
REVIEW: http://review.gluster.org/7501 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#8) for review on master by Niels de Vos (ndevos)

Comment 17 Anand Avati 2014-05-09 19:22:46 UTC
COMMIT: http://review.gluster.org/7501 committed in master by Anand Avati (avati) 
------
commit 2fd499d148fc8865c77de8b2c73fe0b7e1737882
Author: Niels de Vos <ndevos>
Date:   Thu Apr 17 18:32:07 2014 +0200

    rpc: implement server.manage-gids for group resolving on the bricks
    
    The new volume option 'server.manage-gids' can be enabled in
    environments where a user belongs to more than the current absolute
    maximum of 93 groups. This option triggers the following behavior:
    
    1. The AUTH_GLUSTERFS structure sent by GlusterFS clients (fuse, nfs or
       libgfapi) will contain only one (1) auxiliary group, instead of
       a full list. This reduces network usage and prevents problems in
       encoding the AUTH_GLUSTERFS structure which should fit in 400 bytes.
    2. The single group in the RPC Calls received by the server is replaced
       by resolving the groups server-side. Permission checks and similar in
       lower xlators are applied against the full list of groups to which
       the user belongs, and not the single auxiliary group that the client
       sent.
    
    Change-Id: I9e540de13e3022f8b63ff893ecba511129a47b91
    BUG: 1053579
    Signed-off-by: Niels de Vos <ndevos>
    Reviewed-on: http://review.gluster.org/7501
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Santosh Pradhan <spradhan>
    Reviewed-by: Harshavardhana <harsha>
    Reviewed-by: Anand Avati <avati>
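
The two steps described in the commit message above can be sketched roughly as below. This is a simplified model, not the GlusterFS implementation: the function names, the dictionary layout, and the `lookup_groups` callable are hypothetical, and sending the first group of the list is an assumption (the commit message only says that one auxiliary group is sent).

```python
def client_credentials(uid, gid, groups, manage_gids):
    """Credentials a client would place in the RPC/AUTH header (hypothetical layout)."""
    if manage_gids:
        # Step 1: send a single auxiliary group instead of the full list.
        # Which group is picked is an assumption of this sketch.
        groups = groups[:1]
    return {"uid": uid, "gid": gid, "groups": list(groups)}


def server_resolve(creds, lookup_groups, manage_gids):
    """Replace the group list from the client with one resolved on the server."""
    if not manage_gids:
        return creds
    resolved = dict(creds)
    # Step 2: ignore what the client sent and resolve the groups for this
    # uid server-side. lookup_groups stands in for the system group lookup.
    resolved["groups"] = list(lookup_groups(creds["uid"]))
    return resolved
```

Passing the lookup as a callable keeps the sketch independent of any particular user database. Permission checks would then run against `resolved["groups"]` rather than the single group received from the client.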

Comment 19 Niels de Vos 2014-09-22 12:35:03 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify if the release solves this bug report for you. In case the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update (possibly an "updates-testing" repository) infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 20 Niels de Vos 2014-11-11 08:26:59 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

