Description of problem:

We're using OpenAFS on our systems, and most of our web pages are stored in AFS. We have a lot of small projects for which a separate server would be a waste of 'metal', even in a virtual environment, so we host many Apache instances on a single machine. Because suexec doesn't work in an AFS environment, each instance is started by root with its own IP address (to be able to talk HTTPS) and in a PAG with a separate token for a service user (to isolate the projects). Although each Apache instance switches over to the service user, the initial tokens are acquired by root.

On RHEL 3 with the old 2.4 kernel this was never a problem. With RHEL 5, however, the kernel keyring quotas are too restrictive for our environment. As a result, processes run unauthenticated and are unable to deliver the requested data. In addition, login becomes impossible on these systems for users with home directories in AFS (our webmasters get limited access). We hit this wall while migrating some of our web servers from RHEL 3 to RHEL 5.

[root@lvr11 ~]# cat /proc/key-users
0: 99 98/98 96/100 1681/10000
32: 2 2/2 2/100 56/10000
38: 2 2/2 2/100 56/10000
43: 2 2/2 2/100 56/10000
51: 2 2/2 2/100 56/10000
68: 2 2/2 2/100 56/10000
81: 2 2/2 2/100 56/10000
99: 2 2/2 2/100 56/10000
348: 2 2/2 2/100 58/10000
42216: 2 2/2 2/100 62/10000
55188: 3 3/3 3/100 72/10000
56537: 2 2/2 2/100 62/10000
63743: 2 2/2 2/100 62/10000
68054: 2 2/2 2/100 62/10000
....

Btw.: We have some machines (RHEL 3) with about a hundred (!) different projects that need tokens. For us, these limits are a real showstopper for our migration from RHEL 3 to RHEL 5. On our web servers RHEL 5 is nearly useless at the moment.

There is a patch available from David Howells which makes these limits configurable via /proc/sys:
http://lkml.org/lkml/2008/3/28/225

We request a backport of this patch to the RHEL 5 kernel as soon as possible.

Version-Release number of selected component (if applicable):
2.6.18-53.1.13.el5

How reproducible:
Each time

Steps to Reproduce:
1. In an AFS environment, call pagsh in different terminal windows (lot of...)
2. Call 'klog <user>' to get a token (or kinit in a krb5 environment)
3. Try to read data only available for <user> in AFS

Actual results:
Some processes are unauthenticated in AFS.

Expected results:
Each process is authenticated in AFS.

Additional info:
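
To spot which UIDs are close to the key quota in output like the /proc/key-users listing above, a small parser can help. The following is a minimal sketch, not part of the original report. It assumes the column layout described in the kernel keys documentation (UID, usage count, total/instantiated keys, key count/quota, byte count/quota); the names `KeyUserEntry`, `parse_key_users` and `near_key_quota` are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class KeyUserEntry(NamedTuple):
    """One line of /proc/key-users (field meanings are an assumption
    taken from the kernel keys documentation)."""
    uid: int
    usage: int      # structure refcount
    nkeys: int      # total number of keys
    nikeys: int     # number of instantiated keys
    qnkeys: int     # keys counted against the quota
    maxkeys: int    # key count quota
    qnbytes: int    # bytes counted against the quota
    maxbytes: int   # byte quota


def parse_key_users(text: str) -> List[KeyUserEntry]:
    """Parse /proc/key-users style text, skipping lines that do not match."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5 or not parts[0].endswith(":"):
            continue  # e.g. the "...." placeholder line or blank lines
        try:
            uid = int(parts[0][:-1])
            usage = int(parts[1])
            nkeys, nikeys = (int(x) for x in parts[2].split("/"))
            qnkeys, maxkeys = (int(x) for x in parts[3].split("/"))
            qnbytes, maxbytes = (int(x) for x in parts[4].split("/"))
        except ValueError:
            continue  # malformed line, ignore it
        entries.append(KeyUserEntry(uid, usage, nkeys, nikeys,
                                    qnkeys, maxkeys, qnbytes, maxbytes))
    return entries


def near_key_quota(entries: List[KeyUserEntry],
                   fraction: float = 0.9) -> List[KeyUserEntry]:
    """Return entries whose key count is at or above `fraction` of the quota."""
    return [e for e in entries
            if e.maxkeys > 0 and e.qnkeys >= fraction * e.maxkeys]


if __name__ == "__main__":
    # Reads the live file; requires a kernel with key management enabled.
    with open("/proc/key-users") as fh:
        for entry in near_key_quota(parse_key_users(fh.read())):
            print(f"uid {entry.uid}: {entry.qnkeys}/{entry.maxkeys} keys")
```

Applied to the listing in the description, only UID 0 (96 of 100 keys) would be reported at the default 90% threshold.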
Correct Template
----------------

SEG RFE Template

[Customer/Frontline driven section] -- This section should be completed by the front-line engineer with the assistance of the customer.

1.) Who is the customer?

University of Cologne | Universität zu Köln
http://www.pressoffice.uni-koeln.de/

2.) What is the exact nature of the problem trying to be solved with this request?

Customer is requesting a kernel modification: keyring quotas controllable through /proc/sys.

---
The problem is caused by the value of KEYQUOTA_MAX_KEYS (100) in the kernel source. Kernel keyring quotas are too restrictive for the customer's environment. The limit of 100 keyrings for a single user is too small for their use case; OpenAFS is the application that hits this limit here, but any other application that reaches it would have the very same problem.

We are not the only people to have been bitten by this limit, hence the patch from David Howells (a Red Hat employee).

As a result of this problem, processes run unauthenticated and are unable to deliver the requested data. In addition, login is impossible for users with home directories in AFS. We hit this wall while migrating some of our web servers from RHEL 3 to RHEL 5.
---

3.) What, if any, business requirements are satisfied by this request? (What is the use case context?)

They're using OpenAFS on their systems, and most of their web pages are stored in AFS. They have a lot of small projects for which a separate server would be a waste of 'metal', even in a virtual environment, so several Apache instances are hosted on a single machine. Because suexec doesn't work in an AFS environment, each instance is started by root with its own IP address (to be able to talk HTTPS) and in a PAG with a separate token for a service user (to isolate the projects). Although each Apache instance switches over to the service user, the initial tokens are acquired by root. On RHEL 3 with the old 2.4 kernel this was never a problem.
They have some machines (RHEL 3) with about a hundred (!) different projects that need tokens. For the customer, these limits are a real showstopper for their migration from RHEL 3 to RHEL 5. On all their web servers RHEL 5 is nearly useless at the moment.

4.) List the functional requirement(s) for performing the action(s) that are not presently possible. Please focus on describing the problem related requirements without projecting any specific solution.

The only functional requirement is to be able to change the KEYQUOTA_MAX_KEYS kernel parameter via /proc/sys and /etc/sysctl.conf.

5.) Each functional requirement must have clear acceptance criteria so Red Hat understands what success looks like. If test cases can be provided this would be even more ideal (bonus points for RHTS test cases).

A patch was written by a Red Hat engineer (David Howells), and apparently there are no technical reasons why this patch can't be used in RHEL-5. Keys are now also used by CIFS, possibly in RHEL-5 as well, and the patch doesn't break kABI in any way.

6.) What is the desired release vehicle to satisfy these requirements? Major or Minor release?

Hotfix, Minor. In the next kernel update if possible.

7.) Please justify with reference to the release vehicle policy described in the RHEL Inclusion Criteria wiki page

Since the customer is still stuck on RHEL 3 because of this missing functionality, I think they wouldn't mind using the latest kernel.

8.) What package(s) are affected by this RFE? (List "new" if new technology is likely to be required)

kernel.

[Red Hat Sales/Frontline] -- This section should be completed by the front line engineer with the assistance of the account manager/sales rep.

9.) Who is the sales sponsor?

Daniel Stiff

10.) What is the Red Hat business opportunity with this customer?

They currently have over 170 RHEL subscriptions, which we have been trying to co-term for 6 months now!
They also participate in a statewide program of several North Rhine-Westphalian universities that is evaluating the Satellite server for its multi-org functionality.

11.) What is the status and risk to the contract if this RFE is not satisfied?

That we lose a possible sales opportunity of up to 80K Euros!

[Red Hat Engineering] -- this section will be completed by development engineering

12.) What is the scope of this request for work required and risk?

*** Answer 12 ***

13.) What technology (specific list of packages) is affected by this RFE if not fully captured above?

*** Answer 13 ***

Internal Status set to 'Waiting on SEG'
Severity set to: High
Priority set to: 2

This event sent from IssueTracker by jwest
 issue 173344
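
The functional requirement in item 4 (limits readable and changeable via /proc/sys) can be checked on a given kernel with a short script. This is a sketch under stated assumptions: upstream kernels that carry the configurable-quota change expose the files `maxkeys`, `maxbytes`, `root_maxkeys` and `root_maxbytes` under /proc/sys/kernel/keys, but whether a RHEL 5 backport would use the same names and location is an assumption here. The function name `read_key_limits` is hypothetical.

```python
import os

# Tunable names as used by upstream kernels; their presence in a
# RHEL 5 backport is an assumption, not something this report confirms.
TUNABLES = ("maxkeys", "maxbytes", "root_maxkeys", "root_maxbytes")


def read_key_limits(base_dir: str = "/proc/sys/kernel/keys") -> dict:
    """Return the key quota tunables found under base_dir.

    Files that are missing or unreadable are skipped, so an empty dict
    means the running kernel does not expose configurable key quotas
    at this location.
    """
    limits = {}
    for name in TUNABLES:
        path = os.path.join(base_dir, name)
        try:
            with open(path) as fh:
                limits[name] = int(fh.read().strip())
        except (OSError, ValueError):
            continue
    return limits


if __name__ == "__main__":
    found = read_key_limits()
    if not found:
        print("no configurable key quotas found")
    for name, value in sorted(found.items()):
        print(f"kernel.keys.{name} = {value}")
```

On a kernel where these files exist, the printed lines use the same `kernel.keys.<name> = <value>` form that /etc/sysctl.conf accepts, so they could serve as a starting point for a persistent setting.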
Updating PM score.
Is anything happening with this? We are hit by this as well: we have somewhere between 50 and 75 users concurrently using OpenAFS on the same machine, and after a while the users end up in the uid_session keyring instead of getting a new session keyring when logging in.
Event posted on 04-14-2010 06:31pm EDT by dmosby

Justification appears below. I am told that they will have to move away from Red Hat if they cannot get this limit increased. I have no way to know if that is true.

Initially they asked for the patch that was first mentioned in this Issue Tracker. Only when we told them that they would not be getting it did they explain the problem further, and we found that they did not actually require all the features of the patch. All they require is raising the hard-coded limit of 100. It seems a pretty safe fix to change that to 1000 and recompile.

-----------------------------------------------------
We are using large machines (48-way systems) with a grid-based job queuing system, where each job requires 4 keyrings and the systems can handle 96 jobs. So, with the number of jobs that these systems can run, we are exceeding the hard-coded limit of 100 keyrings (for root). This means that we are only utilizing a fraction of the system. This seems to be a fairly widespread issue that I'm sure other companies and users are running into, especially since systems are getting more and more cores/capacity.

Because of this lost capacity in the grid, many of our hardware design team's schedules will slip dramatically. 10+ highly visible projects will be impacted if this is not fixed.

Please let me know as soon as possible, one way or the other, whether there is hope of getting an official hotfix. If not, we need to make other plans as soon as possible.

This event sent from IssueTracker by jkachuck
 issue 500943
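
The capacity argument in the customer's justification is simple arithmetic, sketched below using only the figures quoted above (96 jobs, 4 keyrings per job, a limit of 100). The function names are hypothetical, and the calculation ignores any other keys root may already hold, so the real number of jobs that fit would be somewhat lower.

```python
def keyrings_needed(jobs: int, keyrings_per_job: int) -> int:
    """Total keyrings required to run `jobs` concurrently."""
    return jobs * keyrings_per_job


def max_jobs_within_quota(quota: int, keyrings_per_job: int) -> int:
    """Largest number of concurrent jobs whose keyrings fit in `quota`."""
    return quota // keyrings_per_job


if __name__ == "__main__":
    jobs, per_job, quota = 96, 4, 100
    print("keyrings needed:", keyrings_needed(jobs, per_job))
    print("jobs that fit:", max_jobs_within_quota(quota, per_job))
```

With these figures, a full load needs 384 keyrings against a limit of 100, so at most 25 of the 96 possible jobs fit. The proposed limit of 1000 would cover the full load.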
in kernel-2.6.18-198.el5

You can download this test kernel from http://people.redhat.com/jwilson/el5

Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0017.html