Description of problem:
Company standard image has default umask of 077 - following procedures does not produce a working install.

Version-Release number of selected component (if applicable):

How reproducible:
Failed on two different install environments with different umask values.

Steps to Reproduce:
1. SA sets default umask to modified value
2. follow install guide
3. fails

Actual results:

Expected results:

Additional info:
Umask requirement should be in documentation.
Would you mind clarifying exactly what fails?
It's been a few weeks since the PoC kicked off, but if memory serves, the first problem we ran into was incorrect permissions on the passenger sockets located in /tmp on the broker. If you override those perms, apache/passenger are happy, but the next issue arises on the nodes when you attempt to provision a gear: the perms are wrong for the gear's container. Shelling into the gear fails and app delivery fails, causing errors in the UI. Explicitly specifying a default umask of 022 in the install docs for both broker and node, as a prerequisite prior to any package installation, will be sufficient for now, but we may need additional work done at a later date to override the system umask in the packages. (Hope it doesn't come to that.)
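For illustration, the 022 prerequisite could be checked in the shell that will run the package installation. This is only a sketch: the function name `umask_is_022` and the messages are hypothetical and do not come from the install docs.

```shell
# Hypothetical pre-install check; the function name and messages are illustrative only.
# Succeeds only when the given umask (default: the current shell's) is 022.
umask_is_022() {
    current="${1:-$(umask)}"
    # Shells may print the mask as "0022" or "022"; accept both forms.
    case "$current" in
        0022|022) return 0 ;;
        *) return 1 ;;
    esac
}

# Set the expected value for this shell session, then verify it.
umask 022
if umask_is_022; then
    echo "umask OK"
else
    echo "umask is $(umask); set it to 022 before installing" >&2
fi
```

Note that this only covers the interactive shell; it says nothing about the umask that daemons inherit at boot.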
Back in April, we ran into a similar issue with a customer's installation where the broker wouldn't start because Bundler had created Gemfile.lock with the wrong file mode and (after fixing that problem) the node failed to create applications because the useradd command silently failed. We probably would have run into additional problems if we had continued to work around the problems. We discovered that /etc/sysconfig/init on the customer's system had the command `umask 027`, and changing that resolved the issues. We could add a check for /etc/sysconfig/init, but chances are that there are other places where umask can get set to a problematic value. I also looked into adding a check that would get the umask of the mcollective daemon's running process, but the only ways to do that were convoluted (the simplest involved hooking into the process using gdb). So checking umask turns out to be a tricky problem to do well and thoroughly. One approach we could take is to make the initscripts set an explicit umask (and hope nothing manages to override that setting). We can combine that with checks on file modes (bug 1002559), which might catch some problems but wouldn't catch all and would at best catch problems after the fact.
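A rough sketch of the "explicit umask" idea, assuming the daemon is an external command started from a shell script. The helper name `run_with_umask` is hypothetical and is not taken from any existing initscript; the probe file only stands in for a file the service would create.

```shell
# Hypothetical helper: run a command with an explicit umask in a subshell,
# so the caller's own umask is left unchanged.
# Works for external commands only, because the subshell uses exec.
run_with_umask() {
    mask="$1"
    shift
    ( umask "$mask" && exec "$@" )
}

# Demonstration: create a probe file under a deliberately restrictive caller umask
# and show the mode it ends up with.
probe_dir=$(mktemp -d)
(
    umask 077
    run_with_umask 022 touch "$probe_dir/probe"
)
ls -l "$probe_dir/probe" | cut -c1-10
```

This does not address the harder question of detecting the umask of a daemon that is already running; it only pins the value at launch time.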
In this case, the umask was overridden here: /etc/profile

A quick spelunk around an OpenShift broker node on RHEL 6.4 shows the following places that have umask explicitly set:

./rc5.d/S12rsyslog: umask 077
./rc6.d/K88rsyslog: umask 077
./ltrace.conf:octal umask(octal);
./ltrace.conf:octal SYS_umask(octal);
./rc4.d/S12rsyslog: umask 077
./rc1.d/K88rsyslog: umask 077
./rc.d/rc5.d/S12rsyslog: umask 077
./rc.d/rc6.d/K88rsyslog: umask 077
./rc.d/rc4.d/S12rsyslog: umask 077
./rc.d/rc1.d/K88rsyslog: umask 077
./rc.d/rc2.d/S12rsyslog: umask 077
./rc.d/rc0.d/K88rsyslog: umask 077
./rc.d/rc3.d/S12rsyslog: umask 077
./rc.d/init.d/rsyslog: umask 077
./rc.d/init.d/functions:umask 022
./csh.cshrc: umask 002
./csh.cshrc: umask 022
./sysconfig/network-scripts/ifup-post: oldumask=$(umask)
./sysconfig/network-scripts/ifup-post: umask 022
./sysconfig/network-scripts/ifup-post: umask $oldumask
./sysconfig/network-scripts/ifup-ippp: umask 066
./sysconfig/network-scripts/ifup-ippp: umask 022
./sysconfig/network-scripts/ifup-isdn: umask 066
./sysconfig/network-scripts/ifup-isdn: umask 022
./latrace.d/stat.conf:__mode_t umask(__mode_t mask);
./latrace.d/stat.conf:__mode_t getumask();
./latrace.d/sysdeps/x86_64/syscall.conf: SYS_umask = 95,
./rc2.d/S12rsyslog: umask 077
./bashrc: umask 002
./bashrc: umask 022
./profile: umask 002
./profile: umask 022
./rc0.d/K88rsyslog: umask 077
./rc3.d/S12rsyslog: umask 077
./init.d/rsyslog: umask 077
./init.d/functions:umask 022
./lvm/lvm.conf: umask = 077
./lvm/lvm.conf: #umask = 022
./ssl/certs/make-dummy-cert:umask 077
./ssl/certs/Makefile: umask 77 ; \
./ssl/certs/Makefile: umask 77 ; \
./ssl/certs/Makefile: umask 77 ; \
./ssl/certs/Makefile: umask 77 ; \
./ssl/certs/Makefile: umask 77 ; \
./ssl/certs/Makefile: umask 77 ; \
./pki/tls/certs/make-dummy-cert:umask 077
./pki/tls/certs/Makefile: umask 77 ; \
./pki/tls/certs/Makefile: umask 77 ; \
./pki/tls/certs/Makefile: umask 77 ; \
./pki/tls/certs/Makefile: umask 77 ; \
./pki/tls/certs/Makefile: umask 77 ; \
./pki/tls/certs/Makefile: umask 77 ; \

Most of these can likely be discarded, but a number of them (especially the init scripts and function source) could be added to a check.

-Art
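A check along those lines might look roughly like the sketch below. It is an assumption about how such a scan could be written, not the command that produced the listing above; the function name `find_umask_settings` is hypothetical. It is deliberately narrower than a plain search for the word "umask": it only reports lines that begin with a umask command followed by an octal value, so entries such as ltrace.conf, lvm.conf, and the `$oldumask` restore line would not be reported.

```shell
# Hypothetical scan: list lines under a directory (default: /etc) that begin
# with a umask command followed by a 2-4 digit octal value.
# Commented-out lines and "umask = NNN" config entries are intentionally skipped.
find_umask_settings() {
    dir="${1:-/etc}"
    grep -rE '^[[:space:]]*umask[[:space:]]+[0-7]{2,4}' "$dir" 2>/dev/null
}
```

Running `find_umask_settings /etc` as root would print each match as `path:line`. Any value other than 022 in a file that login shells or init scripts source would then be a candidate for a closer look.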
I don't have access to the failing system anymore, so I can't provide more information than Aurthur provided.
Cloned this to BZ#1141884 to track any further engineering discussion around this topic there. This BZ will continue to track docs changes/additions that can go in currently.
A new "Default umask Setting" section will be included with the next OSE 2 Deploy Guide publication. Follow BZ#1141884 for developments on any changes to the product itself regarding these issues.
New section published: https://access.redhat.com/documentation/en-US/OpenShift_Enterprise/2/html-single/Deployment_Guide/index.html#Default_umask_Setting