Bug 1096425
Summary: | i/o error when one user tries to access RHS volume over NFS with 100+ GIDs | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | Niels de Vos <ndevos>
Component: | nfs | Assignee: | Niels de Vos <ndevos>
Status: | CLOSED CURRENTRELEASE | QA Contact: |
Severity: | urgent | Docs Contact: |
Priority: | urgent | |
Version: | 3.5.0 | CC: | gluster-bugs, spradhan, vumrao
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | glusterfs-3.5.1beta2 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | 1053579 | Environment: |
Last Closed: | 2014-06-24 11:05:21 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1053579, 1104997 | |
Bug Blocks: | 1071800 | |
Description Niels de Vos 2014-05-10 02:16:25 UTC
REVIEW: http://review.gluster.org/7829 (rpc: warn and truncate grouplist if RPC/AUTH can not hold everything) posted (#1) for review on release-3.5 by Niels de Vos (ndevos)

REVIEW: http://review.gluster.org/7830 (rpc: implement server.manage-gids for group resolving on the bricks) posted (#1) for review on release-3.5 by Niels de Vos (ndevos)

COMMIT: http://review.gluster.org/7829 committed in release-3.5 by Niels de Vos (ndevos)

------

commit 57ec16e7f6d08b9a1c07f8ece3db630b08557372
Author: Niels de Vos <ndevos>
Date: Sun May 11 22:51:15 2014 -0300

rpc: warn and truncate grouplist if RPC/AUTH can not hold everything

The GlusterFS protocol currently uses AUTH_GLUSTERFS_V2 in the RPC/AUTH header. This header contains the uid, gid and auxiliary groups of the user/process that accesses the Gluster volume. The AUTH_GLUSTERFS_V2 structure allows up to 65535 auxiliary groups to be passed on. Unfortunately, the RPC/AUTH header is limited to 400 bytes by the RPC specification: http://tools.ietf.org/html/rfc5531#section-8.2

In order not to cause complete failures on the client side when encoding an AUTH_GLUSTERFS_V2 structure that would result in more than 400 bytes, we can calculate the expected size of the other elements:

       1 | pid
       1 | uid
       1 | gid
       1 | groups_len
      XX | groups_val (GF_MAX_AUX_GROUPS=65535)
       1 | lk_owner_len
      YY | lk_owner_val (GF_MAX_LOCK_OWNER_LEN=1024)
    ----+-------------------------------------------
       5 | total xdr-units for the fixed fields

One XDR-unit is defined as BYTES_PER_XDR_UNIT = 4 bytes. MAX_AUTH_BYTES = 400 is the maximum, which is 100 xdr-units. XX + YY can therefore be at most 95 to fill the 100 xdr-units.

Note that the on-wire protocol has tighter requirements than the internal structures. It is possible for xlators to use more groups and a bigger lk_owner than can be sent by a GlusterFS client.

This change prevents overflows when allocating the RPC/AUTH header. Two new macros are introduced to calculate the number of groups that fit in the RPC/AUTH header, taking the size of the lk_owner into account. In case the list of groups exceeds the maximum possible, only the first groups are passed over the RPC/GlusterFS protocol to the bricks. A warning is added to the logs, so that most system administrators will get informed.

Reducing the number of groups is not a new invention. The RPC/AUTH header (AUTH_SYS or AUTH_UNIX) that NFS uses has a limit of 16 groups. Most, if not all, NFS clients will reduce any bigger number of groups to 16. (nfs.server-aux-gids can be used to work around the limit of 16 groups, but the Gluster NFS server will be limited to a maximum of 93 groups, or fewer in case the lk_owner structure contains more items.)

Cherry picked from commit 8235de189845986a535d676b1fd2c894b9c02e52:
> BUG: 1053579
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: http://review.gluster.org/7202
> Tested-by: Gluster Build System <jenkins.com>
> Reviewed-by: Harshavardhana <harsha>
> Reviewed-by: Santosh Pradhan <spradhan>
> Reviewed-by: Vijay Bellur <vbellur>

Change-Id: I8410e59d0fd246d601b54b961d3ae9cb5a858c10
BUG: 1096425
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: http://review.gluster.org/7829
Reviewed-by: Lalatendu Mohanty <lmohanty>
Tested-by: Gluster Build System <jenkins.com>
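The arithmetic above can be checked with a few lines of C. This is a minimal sketch, not GlusterFS source: the constant values are the ones quoted in the commit message, but the macro and function names here are illustrative reconstructions.

```c
#include <stdio.h>

/* Limits quoted in the commit message: RFC 5531 caps the RPC/AUTH
 * body at 400 bytes, and one XDR unit is 4 bytes. */
#define BYTES_PER_XDR_UNIT 4
#define MAX_AUTH_BYTES     400

/* Fixed AUTH_GLUSTERFS_V2 fields: pid, uid, gid, groups_len and
 * lk_owner_len take one xdr-unit each, 5 in total. */
#define AUTH_V2_FIXED_UNITS 5

/* Illustrative helper: how many auxiliary groups still fit once
 * lk_owner_len bytes are reserved for the lock owner. */
static int max_aux_groups(int lk_owner_len)
{
    int lk_owner_units = (lk_owner_len + BYTES_PER_XDR_UNIT - 1)
                         / BYTES_PER_XDR_UNIT;
    return MAX_AUTH_BYTES / BYTES_PER_XDR_UNIT  /* 100 units   */
           - AUTH_V2_FIXED_UNITS                /*  - 5 fixed  */
           - lk_owner_units;                    /*  - lk_owner */
}

int main(void)
{
    /* An 8-byte lk_owner occupies 2 xdr-units: 100 - 5 - 2 = 93,
     * matching the "maximum of 93 groups" mentioned above. */
    printf("max aux groups: %d\n", max_aux_groups(8));
    return 0;
}
```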
COMMIT: http://review.gluster.org/7830 committed in release-3.5 by Niels de Vos (ndevos)

------

commit 6b624e5502193b9d57116fb341119c8468f9758f
Author: Niels de Vos <ndevos>
Date: Tue May 20 16:12:03 2014 +0200

rpc: implement server.manage-gids for group resolving on the bricks

The new volume option 'server.manage-gids' can be enabled in environments where a user belongs to more than the current absolute maximum of 93 groups. This option triggers the following behavior:

1. The AUTH_GLUSTERFS structure sent by GlusterFS clients (fuse, nfs or libgfapi) will contain only one (1) auxiliary group, instead of a full list. This reduces network usage and prevents problems in encoding the AUTH_GLUSTERFS structure, which should fit in 400 bytes.

2. The single group in the RPC calls received by the server is replaced by resolving the groups server-side. Permission checks and the like in lower xlators are applied against the full list of groups the user belongs to, not the single auxiliary group that the client sent.

Cherry picked from commit 2fd499d148fc8865c77de8b2c73fe0b7e1737882:
> BUG: 1053579
> Signed-off-by: Niels de Vos <ndevos>
> Reviewed-on: http://review.gluster.org/7501
> Tested-by: Gluster Build System <jenkins.com>
> Reviewed-by: Santosh Pradhan <spradhan>
> Reviewed-by: Harshavardhana <harsha>
> Reviewed-by: Anand Avati <avati>

Change-Id: I9e540de13e3022f8b63ff893ecba511129a47b91
BUG: 1096425
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: http://review.gluster.org/7830
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Santosh Pradhan <spradhan>
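Server-side resolving, as described in point 2, amounts to a uid-to-grouplist lookup on the brick. Here is a minimal sketch of that idea using the standard getgrouplist(3) interface; resolve_groups and its surroundings are illustrative names, not the actual brick-side code.

```c
#define _DEFAULT_SOURCE /* for getgrouplist() in glibc */
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Resolve the full group list for a uid, the way a brick with
 * server.manage-gids enabled would, instead of trusting the single
 * auxiliary group sent by the client. */
static int resolve_groups(uid_t uid, gid_t primary_gid)
{
    struct passwd *pw = getpwuid(uid);
    if (pw == NULL)
        return -1;

    /* First call with a zero-sized list just to learn the count. */
    int ngroups = 0;
    getgrouplist(pw->pw_name, primary_gid, NULL, &ngroups);

    gid_t *groups = malloc(sizeof(*groups) * ngroups);
    if (groups == NULL)
        return -1;

    if (getgrouplist(pw->pw_name, primary_gid, groups, &ngroups) < 0) {
        free(groups);
        return -1;
    }

    /* A real brick would attach these groups to the request's
     * call-stack for the lower xlators; here we only print them. */
    for (int i = 0; i < ngroups; i++)
        printf("gid[%d] = %d\n", i, (int)groups[i]);

    free(groups);
    return 0;
}

int main(void)
{
    return resolve_groups(getuid(), getgid()) == 0 ? 0 : 1;
}
```

Enabling the option itself is an ordinary volume setting, e.g. `gluster volume set <VOLNAME> server.manage-gids on`; as the comments below describe, doing so on 3.5.0 runs into an op-version problem.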
The first (and last?) beta for GlusterFS 3.5.1 has been released [1]. Please verify whether the release solves this bug report for you. In case the glusterfs-3.5.1beta release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-May/040377.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

The server.manage-gids option causes an issue related to the op-version. Setting server.manage-gids increases the op-version to 4 (intentional), but the maximum op-version that glusterd supports in 3.5.0 is 3. A restart of glusterd then fails, because glusterd can not start a volume that requires a higher op-version than it supports.

Discussion on the devel list:
- http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6699

REVIEW: http://review.gluster.org/8010 (glusterd: Better op-version values and ranges) posted (#1) for review on release-3.5 by Niels de Vos (ndevos)

COMMIT: http://review.gluster.org/8010 committed in release-3.5 by Niels de Vos (ndevos)

------

commit 6648b92e980c9d59c719a461b37951109839182e
Author: Niels de Vos <ndevos>
Date: Sun Jun 8 18:39:55 2014 +0200

glusterd: Better op-version values and ranges

Until now, the op-version was an integer that was incremented by 1 for every Y release (when using the X.Y.Z release numbering). This is not flexible enough to handle backports of features into Z releases.

Going forward, from the upcoming 3.6.0 and 3.5.1 releases, the op-versions will be multi-digit integer values composed of the version numbers, instead of a simple incrementing integer. An X.Y.Z release will have XYZ as its op-version; Y and Z will always be 2 digits wide and will be padded with 0 if required. This way of bumping op-versions allows for gaps between subsequent Y releases, and these gaps allow backporting features from new Y releases into old Z releases (a numeric sketch of this scheme appears at the end of this report).

Change-Id: Ib6a09989f03521146e299ec0588fe36273191e47
Depends-on: http://review.gluster.org/7963
BUG: 1096425
Signed-off-by: Niels de Vos <ndevos>
Reviewed-on: http://review.gluster.org/8010
Tested-by: Gluster Build System <jenkins.com>
Reviewed-by: Atin Mukherjee <amukherj>

The second (and last?) beta for GlusterFS 3.5.1 has been released [1]. Please verify whether the release solves this bug report for you. In case the glusterfs-3.5.1beta2 release does not resolve this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-June/040547.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

This bug is being closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.1, please reopen this bug report.

glusterfs-3.5.1 has been announced on the Gluster Users mailinglist [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailinglist [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-June/040723.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
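To make the op-version scheme from commit 6648b92 concrete, here is a minimal sketch; op_version is an illustrative helper, not a glusterd function, and it assumes Y and Z stay below 100 as the two-digit padding requires.

```c
#include <stdio.h>

/* Map an X.Y.Z release to its op-version: the digits XYZ, with Y
 * and Z zero-padded to two positions each. */
static int op_version(int x, int y, int z)
{
    return x * 10000 + y * 100 + z;
}

int main(void)
{
    /* 3.5.1 -> 30501 and 3.6.0 -> 30600; the unused values between
     * them are the gaps that allow backporting features from new Y
     * releases into old Z releases. */
    printf("%d %d\n", op_version(3, 5, 1), op_version(3, 6, 0));
    return 0;
}
```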