Bug 1428915

Summary: [UPDATES] selinux prevents haproxy stat socket creation
Product: Red Hat OpenStack
Component: puppet-tripleo
Version: 11.0 (Ocata)
Target Release: 11.0 (Ocata)
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Fixed In Version: puppet-tripleo-6.3.0-5.el7ost
Severity: urgent
Priority: urgent
Keywords: Triaged
Reporter: Yurii Prokulevych <yprokule>
Assignee: RHOS Maint <rhos-maint>
QA Contact: Tomas Jamrisko <tjamrisk>
CC: berrange, bperkins, dasmith, eglynn, fdinitto, jjoyce, jschluet, kchamart, lbezdick, lhh, mburns, mcornea, mgrepl, michele, rhallise, rohara, royoung, sbauza, sferdjao, sgordon, slinaber, srevivo, tvignaud, ushkalim, vromanso
Bug Blocks: 1394025
Type: Bug
Last Closed: 2017-05-17 20:04:43 UTC

Description Yurii Prokulevych 2017-03-03 15:23:17 UTC
Description of problem:
-----------------------
RHOS-11 minor update fails:

openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerDeployment_Step3.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 124cac89-bd17-44d9-8192-02d21bdfb6d0
  status: UPDATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1
  deploy_stdout: |
    ...
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-gnocchi_wsgi.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-horizon_vhost.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/10-panko_wsgi.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.d/openstack-dashboard.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.modules.d/remoteip.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.modules.d/remoteip.load]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.modules.d/status.conf]/ensure: removed
    Notice: /Stage[main]/Apache/File[/etc/httpd/conf.modules.d/status.load]/ensure: removed
    Notice: /Stage[main]/Apache/Concat[/etc/httpd/conf/ports.conf]/File[/etc/httpd/conf/ports.conf]/content: content changed '{md5}35f25b87e1f8ad89b39518066549ea6e' to '{md5}5d60f0a60394ddc109afc8df5168fa5b'
    Notice: /Stage[main]/Apache::Service/Service[httpd]: Triggered 'refresh' from 1 events
    (truncated, view all with --long)
  deploy_stderr: |
    ...
    Warning: Scope(Oslo::Messaging::Rabbit[keystone_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[glance_api_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[glance_registry_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[neutron_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[ceilometer_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[aodh_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Oslo::Messaging::Rabbit[sahara_config]): The oslo_messaging rabbit_host, rabbit_hosts, rabbit_port, rabbit_userid, rabbit_password, rabbit_virtual_host parameters have been deprecated by the [DEFAULT]\transport_url. Please use oslo::messaging::default::transport_url instead.
    Warning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.
    Error: /Stage[main]/Nova::Cell_v2::Simple_setup/Nova::Cell_v2::Cell[default]/Exec[nova-cell_v2-cell-default]/unless: Check "nova-manage  cell_v2 list_cells | grep -q default" exceeded timeout
    Error: Failed to apply catalog: Command: 'openstack ["domain", "list", "--quiet", "--format", "csv", []]' has been running for more than 40 seconds (tried 4, for a total of 170 seconds)
    (truncated, view all with --long)



Version-Release number of selected component (if applicable):
-------------------------------------------------------------
openstack-tripleo-heat-templates-6.0.0-0.20170222195630.46117f4.el7ost.noarch

openstack-nova-cert-15.0.1-0.20170224183627.6087675.el7ost.noarch
python-novaclient-7.1.0-0.20170208162119.f6e0128.el7ost.noarch
python-nova-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-placement-api-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-common-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-scheduler-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-novncproxy-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-conductor-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-api-15.0.1-0.20170224183627.6087675.el7ost.noarch
openstack-nova-console-15.0.1-0.20170224183627.6087675.el7ost.noarch
puppet-nova-10.3.0-0.20170220173041.97656fb.el7ost.noarch
openstack-nova-compute-15.0.1-0.20170224183627.6087675.el7ost.noarch

Steps to Reproduce:
-------------------
1. Deploy RHOS-11 (2017-02-24.2)
2. Setup latest repos on undercloud and overcloud
3. Update undercloud
4. Update overcloud

Actual results:
---------------
Update fails

Comment 2 Michele Baldessari 2017-03-06 09:46:38 UTC
So the error came from the openstack client timing out while listing keystone domains during the catalog run:
Mar  3 10:08:24 localhost os-collect-config: Error: Failed to apply catalog: Command: 'openstack ["domain", "list", "--quiet", "--format", "csv", []]' has been running for more than 40 seconds (tried 4, for a total of 170 seconds)
Mar  3 10:08:24 localhost os-collect-config: [2017-03-03 15:08:24,550] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/57c9d1cf-5bf6-4074-93a9-37215261fc1e.pp. [1]
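
The puppet providers shell out to the openstack client, so the timeout can be reproduced by hand from a controller. A minimal sketch, assuming admin credentials are sourced; the 40-second budget mirrors the provider's own:

# Same command the catalog run was executing, with the provider's timeout.
# With haproxy and the VIPs down this hangs, and the fallback message fires.
timeout 40 openstack domain list --quiet --format csv \
    || echo "keystone unreachable (haproxy/VIPs down?)"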

The cluster is okay except for haproxy (the VIPs are constrained to haproxy, so they are down as a consequence; see the constraint check after the status output):
Cluster name: tripleo_cluster
Stack: corosync
Current DC: controller-1 (version 1.1.15-11.el7_3.4-e174ec8) - partition with quorum
Last updated: Fri Mar  3 15:34:07 2017          Last change: Fri Mar  3 14:51:10 2017 by root via cibadmin on controller-0
         
              *** Resource management is DISABLED *** 
  The cluster will not attempt to start, stop or recover services
         
3 nodes and 20 resources configured
         
Online: [ controller-0 controller-1 controller-2 ]
         
Full list of resources:
         
 Master/Slave Set: galera-master [galera] (unmanaged)
     galera     (ocf::heartbeat:galera):        Master controller-2 (unmanaged)
     galera     (ocf::heartbeat:galera):        Master controller-1 (unmanaged)
     galera     (ocf::heartbeat:galera):        Master controller-0 (unmanaged)
 Clone Set: rabbitmq-clone [rabbitmq] (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started controller-2 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started controller-1 (unmanaged)
     rabbitmq   (ocf::heartbeat:rabbitmq-cluster):      Started controller-0 (unmanaged)
 Master/Slave Set: redis-master [redis] (unmanaged)                                                                                                                                           
     redis      (ocf::heartbeat:redis): Slave controller-2 (unmanaged)
     redis      (ocf::heartbeat:redis): Master controller-1 (unmanaged)
     redis      (ocf::heartbeat:redis): Slave controller-0 (unmanaged)
 ip-192.168.24.11       (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 ip-10.0.0.101  (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 ip-172.17.1.18 (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 ip-172.17.1.11 (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 ip-172.17.3.12 (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 ip-172.17.4.18 (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
 Clone Set: haproxy-clone [haproxy] (unmanaged)
     Stopped: [ controller-0 controller-1 controller-2 ]
 openstack-cinder-backup        (systemd:openstack-cinder-backup):      Started controller-1 (unmanaged)
 openstack-cinder-volume        (systemd:openstack-cinder-volume):      Started controller-0 (unmanaged)
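
The VIP/haproxy dependency can be confirmed from the constraints themselves. A minimal check, using the RHEL 7 pcs syntax:

# List the colocation/ordering constraints that tie the VIPs to
# haproxy-clone; with haproxy stopped, every constrained VIP stays stopped.
pcs constraint show | grep -i haproxy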


Haproxy fails to start because:
messages:Mar  3 09:15:21 localhost systemd: Started Cluster Controlled haproxy.
messages:Mar  3 09:15:21 localhost systemd: Starting Cluster Controlled haproxy...
messages:Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [WARNING] 061/141521 (284010) : Setting tune.ssl.default-dh-param to 1024 by default, if your workload permits it you should set it to at least 2048. Please set a value >= 1024 to make this warning disappear.
messages:Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [ALERT] 061/141521 (284010) : Starting frontend GLOBAL: error when trying to preserve previous UNIX socket [/var/run/haproxy.sock]
messages:Mar  3 09:15:21 localhost haproxy-systemd-wrapper: haproxy-systemd-wrapper: exit, haproxy RC=1
messages:Mar  3 09:15:21 localhost systemd: haproxy.service: main process exited, code=exited, status=1/FAILURE
messages:Mar  3 09:15:21 localhost systemd: Unit haproxy.service entered failed state.
messages:Mar  3 09:15:21 localhost systemd: haproxy.service failed.
messages:Mar  3 09:15:23 localhost crmd[283248]:  notice: Result of start operation for haproxy on controller-0: 7 (not running)
messages:Mar  3 09:15:26 localhost crmd[283248]:  notice: Result of stop operation for haproxy on controller-0: 0 (ok)


The corresponding haproxy configuration is:
  ssl-default-bind-ciphers  !SSLv2:kEECDH:kRSA:kEDH:kPSK:+3DES:!aNULL:!eNULL:!MD5:!EXP:!RC4:!SEED:!IDEA:!DES
  ssl-default-bind-options  no-sslv3
  stats  socket /var/run/haproxy.sock mode 600 level user
  stats  timeout 2m
  user  haproxy 


Seems like an SELinux issue: haproxy tries to link() the existing stats socket when restarting, and that link operation is what gets denied:
var/log/audit/audit.log:type=AVC msg=audit(1488550521.483:4475): avc:  denied  { link } for  pid=284010 comm="haproxy" name="haproxy.sock" dev="tmpfs" ino=330803 scontext=system_u:system_r:haproxy_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file
var/log/messages:Mar  3 09:15:21 localhost haproxy-systemd-wrapper: [ALERT] 061/141521 (284010) : Starting frontend GLOBAL: error when trying to preserve previous UNIX socket [/var/run/haproxy.sock]
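
A minimal triage sketch for a denial like this (assuming matchpathcon from libselinux-utils and audit2why from policycoreutils-python are available):

# Compare the socket's actual label with the label the policy expects,
# then ask audit2why to explain the recorded denial.
ls -lZ /var/run/haproxy.sock
matchpathcon /var/run/haproxy.sock
grep haproxy /var/log/audit/audit.log | audit2why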

Comment 5 Lon Hohberger 2017-03-08 13:32:22 UTC
The file is mislabeled. We'll have to assign a context to it; let me check upstream.

Comment 6 Lon Hohberger 2017-03-08 13:42:56 UTC
./policy/modules/contrib/rhcs.fc:/var/run/haproxy\.sock.*        --  gen_context(system_u:object_r:haproxy_var_run_t,s0)

... going back to RHEL 7.2.z

Whatever's creating the file needs to be calling restorecon; it's being created with the incorrect label.

[root@localhost ~]# touch /var/run/haproxy.sock
[root@localhost ~]# ls -lZ !$
ls -lZ /var/run/haproxy.sock
-rw-r--r--. root root unconfined_u:object_r:var_run_t:s0 /var/run/haproxy.sock
[root@localhost ~]# restorecon !$
restorecon /var/run/haproxy.sock
[root@localhost ~]# ls -lZ /var/run/haproxy.sock
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_run_t:s0 /var/run/haproxy.sock
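
The file-context rule that restorecon applied above can also be queried from the installed policy (hypothetical session; semanage ships in policycoreutils-python):

[root@localhost ~]# semanage fcontext -l | grep haproxy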

Comment 7 Michele Baldessari 2017-03-08 13:49:50 UTC
So we added the stats socket in haproxy by adding this to the global section:
global
...
  stats  socket /var/run/haproxy.sock mode 600 level user
  stats  timeout 2m
...

Does haproxy need to be modified to call restorecon or is something else needed here?

Comment 8 Lon Hohberger 2017-03-08 13:52:46 UTC
The stats socket differs from the default in RHEL-7.3 branch:

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

Comment 9 Lon Hohberger 2017-03-08 13:55:13 UTC
[root@localhost ~]# touch /var/lib/haproxy/sock
[root@localhost ~]# ls -lZ !$
ls -lZ /var/lib/haproxy/sock
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_lib_t:s0 /var/lib/haproxy/sock

Comment 10 Lon Hohberger 2017-03-08 13:57:10 UTC
[root@localhost ~]# ls -lZ /var/lib/haproxy
[root@localhost ~]# touch /var/lib/haproxy/stats
[root@localhost ~]# ls -lZ /var/lib/haproxy
-rw-r--r--. root root unconfined_u:object_r:haproxy_var_lib_t:s0 stats

Comment 11 Lon Hohberger 2017-03-08 14:01:19 UTC
1) Follow the location RHEL uses for the stats socket (/var/lib/haproxy/stats), as sketched at the end of this comment, or

2) Add a patch somewhere in puppet-haproxy to create the stats socket path and call restorecon on it.


I believe either of these patches is likely specific to running on RHEL.

haproxy is allowed to use both haproxy_var_run_t and haproxy_var_lib_t.
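
A minimal sketch of option 1, assuming only the stats stanza needs to move (the rest of the global section is unchanged):

  stats  socket /var/lib/haproxy/stats mode 600 level user
  stats  timeout 2m

# Per comment 10, the policy labels new files under /var/lib/haproxy as
# haproxy_var_lib_t, so no restorecon is needed at this path.
matchpathcon /var/lib/haproxy/stats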

Comment 12 Lon Hohberger 2017-03-08 14:12:01 UTC
The problem with (2) is that it might not work if something unlinks /var/run/haproxy.sock

Comment 13 Lon Hohberger 2017-03-08 14:12:57 UTC
I assume you'd want to change this downstream-only. Other distributions might be fine with /var/run/haproxy.sock.

Comment 14 Michele Baldessari 2017-03-08 14:19:44 UTC
Thanks Lon. Since the wrong path is generated via puppet-tripleo, I will fix it there.

Comment 15 Michele Baldessari 2017-03-08 14:29:01 UTC
Note that I will go for option 1, and later make the path a parameter in case other distros/operators need a different path.

Comment 17 Lon Hohberger 2017-03-08 14:33:31 UTC
ACK, sounds good.

Comment 24 errata-xmlrpc 2017-05-17 20:04:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

Comment 25 Red Hat Bugzilla 2023-09-14 03:54:39 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days