Bug 1428915
Summary: | [UPDATES] selinux prevents haproxy stat socket creation | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Yurii Prokulevych <yprokule> |
Component: | puppet-tripleo | Assignee: | RHOS Maint <rhos-maint> |
Status: | CLOSED ERRATA | QA Contact: | Tomas Jamrisko <tjamrisk> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 11.0 (Ocata) | CC: | berrange, bperkins, dasmith, eglynn, fdinitto, jjoyce, jschluet, kchamart, lbezdick, lhh, mburns, mcornea, mgrepl, michele, rhallise, rohara, royoung, sbauza, sferdjao, sgordon, slinaber, srevivo, tvignaud, ushkalim, vromanso |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | 11.0 (Ocata) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | puppet-tripleo-6.3.0-5.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-05-17 20:04:43 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1394025 |
Description
Yurii Prokulevych
2017-03-03 15:23:17 UTC
So the error was from the nova cell command failing to list the domains:

```
Mar  3 10:08:24 localhost os-collect-config: Error: Failed to apply catalog: Command: 'openstack ["domain", "list", "--quiet", "--format", "csv", []]' has been running for more than 40 seconds (tried 4, for a total of 170 seconds)
Mar  3 10:08:24 localhost os-collect-config: [2017-03-03 15:08:24,550] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/57c9d1cf-5bf6-4074-93a9-37215261fc1e.pp. [1]
```

The cluster is okay except for haproxy (which has constraints with the VIPs, so those are down as a consequence):

```
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Fri Mar  3 15:34:07 2017
Last change: Fri Mar  3 14:51:10 2017 by root via cibadmin on controller-0

*** Resource management is DISABLED ***
The cluster will not attempt to start, stop or recover services

3 nodes and 20 resources configured

Online: [ controller-0 controller-1 controller-2 ]

Full list of resources:

 Master/Slave Set: galera-master [galera] (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-2 (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-1 (unmanaged)
     galera (ocf::heartbeat:galera): Master controller-0 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started controller-2 (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started controller-1 (unmanaged)
     rabbitmq (ocf::heartbeat:rabbitmq-cluster): Started controller-0 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)
     redis (ocf::heartbeat:redis): Slave controller-2 (unmanaged)
     redis (ocf::heartbeat:redis): Master controller-1 (unmanaged)
     redis (ocf::heartbeat:redis): Slave controller-0 (unmanaged)
 ip-192.168.24.11 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 ip-10.0.0.101 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 ip-172.17.1.18 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 ip-172.17.1.11 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 ip-172.17.3.12 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2): Stopped (unmanaged)
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-backup (systemd:openstack-cinder-backup): Started controller-1 (unmanaged)
 openstack-cinder-volume (systemd:openstack-cinder-volume): Started controller-0 (unmanaged)
```

HAProxy fails to start because:

```
Mar  3 09:15:21 localhost systemd: Started Cluster Controlled haproxy.
Mar  3 09:15:21 localhost systemd: Starting Cluster Controlled haproxy...
Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [WARNING] 061/141521 (284010) : Setting tune.ssl.default-dh-param to 1024 by default, if your workload permits it you should set it to at least 2048. Please set a value >= 1024 to make this warning disappear.
Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [ALERT] 061/141521 (284010) : Starting frontend GLOBAL: error when trying to preserve previous UNIX socket [/var/run/haproxy.sock]
Mar  3 09:15:21 localhost haproxy-systemd-wrapper: haproxy-systemd-wrapper: exit, haproxy RC=1
Mar  3 09:15:21 localhost systemd: haproxy.service: main process exited, code=exited, status=1/FAILURE
Mar  3 09:15:21 localhost systemd: Unit haproxy.service entered failed state.
Mar  3 09:15:21 localhost systemd: haproxy.service failed.
```
```
Mar  3 09:15:23 localhost crmd[283248]: notice: Result of start operation for haproxy on controller-0: 7 (not running)
Mar  3 09:15:26 localhost crmd[283248]: notice: Result of stop operation for haproxy on controller-0: 0 (ok)
```

The corresponding haproxy configuration is:

```
ssl-default-bind-ciphers !SSLv2:kEECDH:kRSA:kEDH:kPSK:+3DES:!aNULL:!eNULL:!MD5:!EXP:!RC4:!SEED:!IDEA:!DES
ssl-default-bind-options no-sslv3
stats socket /var/run/haproxy.sock mode 600 level user
stats timeout 2m
user haproxy
```

This looks like an SELinux issue:

```
var/log/audit/audit.log:type=AVC msg=audit(1488550521.483:4475): avc: denied { link } for pid=284010 comm="haproxy" name="haproxy.sock" dev="tmpfs" ino=330803 scontext=system_u:system_r:haproxy_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file
var/log/messages:Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [ALERT] 061/141521 (284010) : Starting frontend GLOBAL: error when trying to preserve previous UNIX socket [/var/run/haproxy.sock]
```

The file is mislabeled. We'll have to assign a context to it; let me check upstream:

```
./policy/modules/contrib/rhcs.fc:/var/run/haproxy\.sock.* -- gen_context(system_u:object_r:haproxy_var_run_t,s0)
```

... going back to RHEL 7.2.z. Whatever is creating the file needs to call restorecon; it is being created with the incorrect label:

```
[root@localhost ~]# touch /var/run/haproxy.sock
[root@localhost ~]# ls -lZ /var/run/haproxy.sock
-rw-r--r--. root root unconfined_u:object_r:var_run_t:s0 /var/run/haproxy.sock
[root@localhost ~]# restorecon /var/run/haproxy.sock
[root@localhost ~]# ls -lZ /var/run/haproxy.sock
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_run_t:s0 /var/run/haproxy.sock
```

So we added the stats socket by adding this to the global section of the haproxy configuration:

```
global
    ...
    stats socket /var/run/haproxy.sock mode 600 level user
    stats timeout 2m
    ...
```

Does haproxy need to be modified to call restorecon, or is something else needed here?
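The labeling behavior above can be sketched as a toy model: SELinux file-context rules map a path pattern to a label, and `restorecon` relabels a file to whatever the most applicable rule says. This is a simplified illustration, not the real matching algorithm (the actual policy has thousands of entries and specificity ordering); the `/var/run/haproxy\.sock.*` rule is the `rhcs.fc` entry quoted above, while the `/var/lib/haproxy` rule and the `var_run_t` fallback are inferred from the `ls -lZ` output in this report.

```python
import re

# Toy file-context table. The first entry is the rhcs.fc rule quoted above;
# the second and the fallback are inferred from the labels observed in this bug.
FILE_CONTEXTS = [
    (r"/var/run/haproxy\.sock.*", "system_u:object_r:haproxy_var_run_t:s0"),
    (r"/var/lib/haproxy(/.*)?",   "system_u:object_r:haproxy_var_lib_t:s0"),
]
# Generic label a fresh file under /var/run gets without a matching rule,
# which is exactly the mislabeling that made haproxy's AVC denial fire.
DEFAULT = "system_u:object_r:var_run_t:s0"

def expected_label(path: str) -> str:
    """Return the label restorecon would apply to path under these toy rules."""
    for pattern, context in FILE_CONTEXTS:
        if re.fullmatch(pattern, path):
            return context
    return DEFAULT

print(expected_label("/var/run/haproxy.sock"))   # matches the rhcs.fc rule
print(expected_label("/var/lib/haproxy/stats"))  # the RHEL default location
print(expected_label("/var/run/other.sock"))     # falls back to var_run_t
```

This is why option 1 below works without any restorecon call: files created under /var/lib/haproxy inherit a label haproxy_t is already allowed to use.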
The stats socket differs from the default in the RHEL-7.3 branch:

```
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
```

```
[root@localhost ~]# touch /var/lib/haproxy/sock
[root@localhost ~]# ls -lZ /var/lib/haproxy/sock
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_lib_t:s0 /var/lib/haproxy/sock
[root@localhost ~]# touch /var/lib/haproxy/stats
[root@localhost ~]# ls -lZ /var/lib/haproxy
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_lib_t:s0 stats
```

So either:

1) Follow the RHEL location for the stats socket (/var/lib/haproxy/stats), or
2) Add a patch to puppet-haproxy that creates the stats socket and calls restorecon on it.

Either fix is likely specific to running on RHEL; haproxy is allowed to use both haproxy_var_run_t and haproxy_var_lib_t. The problem with (2) is that it might not work if something unlinks /var/run/haproxy.sock.

I assume you'd want to change this downstream-only. Other distributions might be fine with /var/run/haproxy.sock.

Thanks Lon. Since the wrong path is generated via puppet-tripleo, I will fix it there. Note that I will go for option 1) and later make the path a parameter in case other distros/operators need a different path.

ACK, sounds good.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days
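For concreteness, option 1) would change the generated global section of haproxy.cfg to something like the following. This is a hypothetical rendering of the fix (the actual template lives in puppet-tripleo and the fixed build is puppet-tripleo-6.3.0-5.el7ost); the point is simply that the socket moves under /var/lib/haproxy, where new files inherit haproxy_var_lib_t:

```
global
    ...
    stats socket /var/lib/haproxy/stats mode 600 level user
    stats timeout 2m
    ...
```

Anything reading the stats socket (monitoring scripts, `echo "show stat" | socat /var/lib/haproxy/stats stdio`, etc.) would need to be pointed at the new path as well.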