Bug 1543914 - [OSP13]Cannot launch instance - "message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 7e6fc2de-ffed-4a7d-b004-b164d42ab117. Last exception: internal error: process exited while connecting to monitor"
Summary: [OSP13]Cannot launch instance - "message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 7e6fc2de-ffed-4a7d-b004-b164d42ab117. Last exception: internal error: process exited while connecting to monitor"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director-images
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Lon Hohberger
QA Contact: Ofer Blaut
URL:
Whiteboard:
Duplicates: 1545350 1545532 1551617 1551651 1562586 1569226
Depends On:
Blocks:
 
Reported: 2018-02-09 14:50 UTC by Artem Hrechanychenko
Modified: 2021-02-08 20:18 UTC
CC: 29 users

Fixed In Version: rhosp-director-images-13.0-20180430.1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:24:35 UTC
Target Upstream Version:
Embargoed:


Attachments
audit.log (1.04 MB, text/plain) - 2018-02-09 16:45 UTC, Ollie Walsh
audit.log with disabled dontaudit rules (1.09 MB, text/plain) - 2018-04-03 12:33 UTC, Artem Hrechanychenko


Links
Red Hat Product Errata RHEA-2018:2083 - Last Updated: 2018-06-27 13:26:04 UTC

Description Artem Hrechanychenko 2018-02-09 14:50:38 UTC
Description of problem:
OSP13
3ctrl+3compute+3ceph + IPv4+GRE

{"message": "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 7e6fc2de-ffed-4a7d-b004-b164d42ab117. Last exception: internal error: process exited while connecting to monitor", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 564, in build_instances
    filter_properties, instances[0].uuid)

Nova compute logs are empty -> https://bugzilla.redhat.com/show_bug.cgi?id=1541431

Workaround: setenforce 0 on compute nodes
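
A minimal sketch of applying and reverting the temporary workaround on one compute node (this only confirms that SELinux enforcement is the trigger; it is not a fix):

[root@compute-0 heat-admin]# getenforce     # expect "Enforcing"
[root@compute-0 heat-admin]# setenforce 0   # permissive: instance launch should now succeed
[root@compute-0 heat-admin]# setenforce 1   # restore enforcing mode after testing
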
Version-Release number of selected component (if applicable):
OSP13
openstack-zaqar-6.0.0-0.20180123030653.1feeb36.el7ost.noarch
openstack-nova-placement-api-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
openstack-selinux-0.8.13-0.20180108164838.c6158ce.el7ost.noarch
openstack-glance-16.0.0-0.20180123074513.466c104.el7ost.noarch
python-openstackclient-lang-3.13.0-0.20180109064941.8c5f755.el7ost.noarch
openstack-swift-object-2.15.2-0.20180118220641.449d83f.el7ost.noarch
openstack-neutron-openvswitch-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
python2-openstackclient-3.13.0-0.20180109064941.8c5f755.el7ost.noarch
openstack-heat-api-10.0.0-0.20180123134729.85a3e98.el7ost.noarch
openstack-tripleo-heat-templates-8.0.0-0.20180122224017.el7ost.noarch
openstack-mistral-engine-6.0.0-0.20180122153726.ae7950e.el7ost.noarch
openstack-ironic-common-10.0.1-0.20180123173432.2c630d9.el7ost.noarch
openstack-swift-account-2.15.2-0.20180118220641.449d83f.el7ost.noarch
openstack-heat-engine-10.0.0-0.20180123134729.85a3e98.el7ost.noarch
openstack-tripleo-ui-8.1.1-0.20180122135122.aef02d8.el7ost.noarch
openstack-tripleo-common-containers-8.3.1-0.20180123050219.el7ost.noarch
puppet-openstack_extras-12.2.0-0.20180119072029.b090b97.el7ost.noarch
openstack-tripleo-image-elements-8.0.0-0.20180117094122.02d0985.el7ost.noarch
openstack-nova-api-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
openstack-nova-conductor-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
openstack-nova-compute-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
openstack-neutron-ml2-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-swift-proxy-2.15.2-0.20180118220641.449d83f.el7ost.noarch
openstack-heat-common-10.0.0-0.20180123134729.85a3e98.el7ost.noarch
openstack-ironic-conductor-10.0.1-0.20180123173432.2c630d9.el7ost.noarch
openstack-tempest-17.2.1-0.20180123175631.fe655fe.el7ost.noarch
openstack-mistral-api-6.0.0-0.20180122153726.ae7950e.el7ost.noarch
openstack-tripleo-validations-8.1.1-0.20180119231917.2ff3c79.el7ost.noarch
openstack-neutron-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-0.20180117092204.120eca8.el7ost.noarch
openstack-nova-common-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
openstack-swift-container-2.15.2-0.20180118220641.449d83f.el7ost.noarch
openstack-neutron-common-12.0.0-0.20180123043113.d32ad6e.el7ost.noarch
openstack-heat-api-cfn-10.0.0-0.20180123134729.85a3e98.el7ost.noarch
openstack-ironic-staging-drivers-0.8.1-0.20180118014800.ae60c0b.el7ost.noarch
puppet-openstacklib-12.2.0-0.20180119083603.12107fd.el7ost.noarch
openstack-tripleo-common-8.3.1-0.20180123050219.el7ost.noarch
openstack-mistral-executor-6.0.0-0.20180122153726.ae7950e.el7ost.noarch
openstack-nova-scheduler-17.0.0-0.20180123163703.27eadbc.el7ost.noarch
python2-openstacksdk-0.9.19-0.20171105041723.bd60aa4.el7ost.noarch
openstack-keystone-13.0.0-0.20180123044720.9cd5f19.el7ost.noarch
openstack-ironic-api-10.0.1-0.20180123173432.2c630d9.el7ost.noarch
openstack-ironic-inspector-7.0.1-0.20180122225049.0cf3316.el7ost.noarch
openstack-puppet-modules-11.0.0-0.20171011152327.71ad01c.el7ost.noarch
openstack-mistral-common-6.0.0-0.20180122153726.ae7950e.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. Deploy OSP13 with 3 controller + 3 compute + 3 ceph nodes + IPv4+GRE
2. Try to launch an instance

Actual results:
Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 7e6fc2de-ffed-4a7d-b004-b164d42ab117. Last exception: internal error: process exited while connecting to monitor

Expected results:
Instance is launched successfully

Additional info:

Comment 2 Ollie Walsh 2018-02-09 15:16:06 UTC
Traceback:

 Instance failed to spawn: libvirtError: internal error: process exited while connecting to monitor
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2244, in _build_resources
     yield resources
   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2037, in _build_and_run_instance
     block_device_info=block_device_info)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2926, in spawn
     destroy_disks_on_failure=True)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5361, in _create_domain_and_network
     destroy_disks_on_failure)
   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
     self.force_reraise()
   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
     six.reraise(self.type_, self.value, self.tb)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5331, in _create_domain_and_network
     post_xml_callback=post_xml_callback)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5266, in _create_domain
     guest.launch(pause=pause)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
     self._encoded_xml, errors='ignore')
   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
     self.force_reraise()
   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
     six.reraise(self.type_, self.value, self.tb)
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
     return self._domain.createWithFlags(flags)
   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
     result = proxy_call(self._autowrap, f, *args, **kwargs)
   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
     rv = execute(f, *args, **kwargs)
   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
     six.reraise(c, e, tb)
   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
     rv = meth(*args, **kwargs)
   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1099, in createWithFlags
     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
 libvirtError: internal error: process exited while connecting to monitor

[root@compute-0 heat-admin]# grep denied /var/log/audit/audit.log 
type=AVC msg=audit(1518136909.628:1688): avc:  denied  { write } for  pid=54924 comm="qemu-kvm" path="pipe:[285743]" dev="pipefs" ino=285743 scontext=system_u:system_r:svirt_t:s0:c780,c989 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1518136909.628:1688): avc:  denied  { write } for  pid=54924 comm="qemu-kvm" path="pipe:[285743]" dev="pipefs" ino=285743 scontext=system_u:system_r:svirt_t:s0:c780,c989 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1518136909.628:1688): avc:  denied  { write } for  pid=54924 comm="qemu-kvm" path="pipe:[286029]" dev="pipefs" ino=286029 scontext=system_u:system_r:svirt_t:s0:c780,c989 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1518188712.923:3319): avc:  denied  { write } for  pid=233891 comm="qemu-kvm" path="pipe:[4062348]" dev="pipefs" ino=4062348 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1518188722.025:3332): avc:  denied  { append } for  pid=233891 comm=43505520302F4B564D path="pipe:[4062463]" dev="pipefs" ino=4062463 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1518188723.198:3334): avc:  denied  { append } for  pid=233891 comm=43505520302F4B564D path="pipe:[4062463]" dev="pipefs" ino=4062463 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
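
A more structured way to collect these denials than grepping the raw log (a sketch, assuming auditd is running on the compute node, as it is here) is to feed the AVC records to audit2allow:

[root@compute-0 heat-admin]# ausearch -m avc -ts recent | audit2allow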

Comment 3 Kashyap Chamarthy 2018-02-09 16:27:01 UTC
Can you try relabelling the filesystem (and rebooting for it to take effect):

    $ touch /.autorelabel; reboot

Comment 4 Ollie Walsh 2018-02-09 16:45:09 UTC
Created attachment 1393870 [details]
audit.log

Comment 6 Lon Hohberger 2018-02-14 17:05:42 UTC
This is a bit concerning:

type=SELINUX_ERR msg=audit(1518189489.547:3336): op=security_compute_av reason=bounds scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=dir perms=ioctl,read,getattr,lock,search,open
type=AVC msg=audit(1518189489.547:3336): avc:  denied  { search } for  pid=233891 comm="qemu-kvm" name="48323" dev="proc" ino=188612 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=dir
type=SELINUX_ERR msg=audit(1518189489.547:3336): op=security_compute_av reason=bounds scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=file perms=ioctl,read,getattr,lock,open
type=AVC msg=audit(1518189489.547:3336): avc:  denied  { read } for  pid=233891 comm="qemu-kvm" name="cmdline" dev="proc" ino=189961 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=file
type=AVC msg=audit(1518189489.547:3336): avc:  denied  { open } for  pid=233891 comm="qemu-kvm" path="/proc/48323/cmdline" dev="proc" ino=189961 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=file
type=SYSCALL msg=audit(1518189489.547:3336): arch=c000003e syscall=2 success=yes exit=42 a0=55c60b9e5920 a1=0 a2=7fffe91e8750 a3=5a7dbbb1 items=0 ppid=48307 pid=233891 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c577,c824 key=(null)
type=PROCTITLE msg=audit(1518189489.547:3336): proctitle=2F7573722F6C6962657865632F71656D752D6B766D002D6E616D650067756573743D696E7374616E63652D30303030303030622C64656275672D746872656164733D6F6E002D53002D6F626A656374007365637265742C69643D6D61737465724B6579302C666F726D61743D7261772C66696C653D2F7661722F6C69622F6C69
type=AVC msg=audit(1518189489.550:3337): avc:  denied  { getattr } for  pid=233891 comm="qemu-kvm" path="/proc/48323/cmdline" dev="proc" ino=189961 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=file
type=SYSCALL msg=audit(1518189489.550:3337): arch=c000003e syscall=5 success=yes exit=0 a0=2a a1=7fffe91e8660 a2=7fffe91e8660 a3=5a7dbbb1 items=0 ppid=48307 pid=233891 auid=4294967295 uid=107 gid=107 euid=107 suid=107 fsuid=107 egid=107 sgid=107 fsgid=107 tty=(none) ses=4294967295 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c577,c824 key=(null)
type=PROCTITLE msg=audit(1518189489.550:3337): proctitle=2F7573722F6C6962657865632F71656D752D6B766D002D6E616D650067756573743D696E7374616E63652D30303030303030622C64656275672D746872656164733D6F6E002D53002D6F626A656374007365637265742C69643D6D61737465724B6579302C666F726D61743D7261772C66696C653D2F7661722F6C69622F6C69
type=SELINUX_ERR msg=audit(1518189489.550:3338): op=security_compute_av reason=bounds scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file perms=ioctl,read,write,getattr,lock,append,open
type=AVC msg=audit(1518189489.550:3338): avc:  denied  { write } for  pid=233891 comm="qemu-kvm" path="pipe:[4062348]" dev="pipefs" ino=4062348 scontext=system_u:system_r:svirt_t:s0:c577,c824 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file


I don't like the idea that a confined svirt domain needs access to write to files with spc_t labels. That seems to transcend the boundaries provided by svirt.

Comment 7 Lon Hohberger 2018-02-14 17:27:47 UTC
Dan, is there anything new in container-selinux for this kind of thing? It's running a VM in an spc_t that then transitions to svirt, ... but now needs to do something with spc_t again.

Comment 8 Alan Pevec 2018-02-14 18:36:07 UTC
libvirt is running in a container and this is qemu writing to the qemu monitor pipe back to libvirt?

Comment 9 David Hill 2018-02-14 18:42:55 UTC
This would be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1538651

Comment 10 Alan Pevec 2018-02-14 18:54:57 UTC
(In reply to David Hill from comment #9)
> This would be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1538651

That one was for qemu where context was svirt_tcg_t, here we have svirt_t source context.

Comment 11 Lee Yarwood 2018-02-14 19:02:15 UTC
*** Bug 1545350 has been marked as a duplicate of this bug. ***

Comment 12 Daniel Berrangé 2018-02-15 10:48:18 UTC
We need clarity on what the 'spc_t' type is used for and why QEMU is attempting to write to it before we can suggest a solution. That is not something I've heard of before, so it's questionable whether a config fix is required vs. an SELinux policy extension.

Comment 13 Daniel Walsh 2018-02-15 14:35:43 UTC
spc_t is the label of a "super privileged container"; basically, this is the label that you get when you run a container from docker without confining it with SELinux. It is an unconfined domain and we let most domains communicate with it.

These AVCs show that it is walking through the process data of the container process, which is not a problem and should be allowed.

These AVCs are allowed in the current selinux-policy.

 audit2allow  -i /tmp/t


#============= svirt_t ==============

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:dir search;

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:fifo_file write;

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:file { getattr open read };

Are you getting these errors on RHEL 7.5? This might be a change to container-selinux that is causing the typebounds checks, and needs to be fixed in container-selinux.
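
A quick way to verify whether these rules really are present in the loaded policy on the affected compute node (a verification sketch; sesearch is provided by the setools packages) would be:

[root@compute-0 heat-admin]# sesearch --allow -s svirt_t -t spc_t -c fifo_file -p write
[root@compute-0 heat-admin]# sesearch --allow -s svirt_t -t spc_t -c file -p read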

Comment 14 Artem Hrechanychenko 2018-02-15 15:03:30 UTC
Also got the following message when the number of compute nodes == 1

{"message": "Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 23416b2e-c06c-4523-b8d4-58d0d60364df.", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 578, in build_instances
    raise exception.MaxRetriesExceeded(reason=msg)

Comment 15 Ollie Walsh 2018-02-15 16:26:15 UTC
selinux context of virtlogd on a 7.5 deployment:

[root@compute-0 heat-admin]# ps -lxZ  | grep virtlogd
system_u:system_r:spc_t:s0      4     0   27956   27939  20   0 196016  4388 poll_s Ssl  ?          0:00 /usr/sbin/virtlogd --config /etc/libvirt/virtlogd.conf

Comment 16 Lon Hohberger 2018-02-16 12:02:05 UTC
Yes, it's on RHEL 7.5

Comment 17 Lon Hohberger 2018-02-16 12:03:35 UTC
So, we can wait for a fix to container-selinux and use workarounds a la comment #13 in the meantime. That's fine.

Comment 18 Lon Hohberger 2018-02-16 12:07:52 UTC
Ollie, can you let us know what container-selinux package is installed?

Comment 19 Lee Yarwood 2018-02-16 14:14:03 UTC
(In reply to Lon Hohberger from comment #18)
> Ollie, can you let us know what container-selinux package is installed?

owlash is out today but AFAICT we used container-selinux-2.41-1.git126c1c0.el7.noarch.

Comment 20 Ronnie Rasouli 2018-02-18 12:40:36 UTC
*** Bug 1545532 has been marked as a duplicate of this bug. ***

Comment 21 Ollie Walsh 2018-02-19 18:45:32 UTC
(In reply to Lon Hohberger from comment #18)
> Ollie, can you let us know what container-selinux package is installed?

Would need to check Artem.

Comment 22 Ollie Walsh 2018-02-19 18:45:58 UTC
(In reply to Lon Hohberger from comment #18)
> Ollie, can you let us know what container-selinux package is installed?

Would need to check with Artem.

Comment 23 Lon Hohberger 2018-02-19 19:54:50 UTC
Confirmed; this should be working on 7.5 w/ 

[root@localhost ~]# audit2allow -i audit-1393870.txt 


#============= svirt_t ==============

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:dir search;

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:fifo_file { append write };

#!!!! This avc is allowed in the current policy
allow svirt_t spc_t:file { getattr open read };

[root@localhost ~]# rpm -q openstack-selinux selinux-policy container-selinux
openstack-selinux-0.8.13-0.20180108164838.c6158ce.el7ost.noarch
selinux-policy-3.13.1-190.el7.noarch
container-selinux-2.41-1.git126c1c0.el7.noarch

Comment 26 Alan Pevec 2018-02-19 22:30:03 UTC
I'm also getting "This avc is allowed in the current policy" for attached audit.log on CentOS7.4 with:
selinux-policy-3.13.1-166.el7_4.7.noarch
container-selinux-2.36-1.gitff95335.el7.noarch
openstack-selinux-0.8.13-0.20180108164838.c6158ce.el7.centos.noarch

Please report version for the above on the test system.

Comment 27 Artem Hrechanychenko 2018-02-20 15:07:08 UTC
[heat-admin@compute-0 ~]$ sudo rpm -qa "*container-selinux*"
container-selinux-2.41-1.git126c1c0.el7.noarch
openstack-selinux-0.8.13-0.20180108164838.c6158ce.el7ost.noarch
selinux-policy-3.13.1-186.el7.noarch

Comment 31 Kashyap Chamarthy 2018-03-09 14:25:15 UTC
*** Bug 1551651 has been marked as a duplicate of this bug. ***

Comment 32 Lee Yarwood 2018-03-12 10:35:12 UTC
*** Bug 1551617 has been marked as a duplicate of this bug. ***

Comment 40 Artem Hrechanychenko 2018-04-03 12:33:00 UTC
Created attachment 1416727 [details]
audit.log with disabled dontaudit rules

Comment 41 Lon Hohberger 2018-04-03 13:27:06 UTC
type=AVC msg=audit(1522752025.041:226): avc:  denied  { read } for  pid=20007 comm="sshd" name="lastlog" dev="vda2" ino=8628871 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_log_t:s0 tclass=file

This is the same as:
  https://bugzilla.redhat.com/show_bug.cgi?id=1562586

The problem there is that something recreated /var/log/lastlog with the wrong file label. When you create a file in /var/log, it will inherit /var/log's directory type. This is 'var_log_t'. The SELinux base policy in RHEL 7 has /var/log/lastlog labeled as 'lastlog_t', however, so there are two options when creating this file:

  - The creating process needs to call restorecon, or
  - we need a file transition rule

Doing the former simply requires calling 'restorecon' on /var/log/lastlog from whatever process recreated it.

To do the latter, we would need to know the security context of the process which deleted/recreated /var/log/lastlog; file transitions are extremely precise (need to know source context, target directory label (e.g. var_log_t), target file label (lastlog_t), and filename (lastlog) ).

Anyway, if the label were correct, lastlog_t would be accessible. Default security policies allow:

  allow sshd_t lastlog_t : file { ioctl read write create getattr setattr lock append open } ;
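
For illustration, the two options would look roughly like this. The restorecon variant is simply:

  restorecon -v /var/log/lastlog

A named file transition rule would look like the line below. This is an illustration only: SOURCE_T is a placeholder, because the real source context of whatever recreates the file is exactly the unknown here.

  # files named "lastlog" created by SOURCE_T in a var_log_t directory get lastlog_t
  type_transition SOURCE_T var_log_t : file lastlog_t "lastlog";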

Comment 42 Lon Hohberger 2018-04-03 13:54:29 UTC
*** Bug 1562586 has been marked as a duplicate of this bug. ***

Comment 43 Lon Hohberger 2018-04-03 14:38:45 UTC
At this point, the problem is figuring out what is deleting/recreating /var/log/lastlog.
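
One possible way to catch the culprit (assuming auditd is available on the node) is to put an audit watch on the path, reproduce, and then search by key:

[root@compute-0 heat-admin]# auditctl -w /var/log/lastlog -p wa -k lastlog-watch
  ... reproduce (redeploy / first ssh login) ...
[root@compute-0 heat-admin]# ausearch -k lastlog-watch -i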

Comment 44 Ollie Walsh 2018-04-05 14:43:01 UTC
Noticed this on a custom image (based on the latest overcloud-full) that I couldn't ssh to because of issues with lastlog.

compute-0/var/log/secure:Apr  5 09:08:34 computerealtime-0 sshd[3899]: lastlog_openseek: Couldn't stat /var/log/lastlog: No such file or directory
compute-0/var/log/secure:Apr  5 09:09:31 computerealtime-0 su: pam_lastlog(su:session): file /var/log/lastlog created
compute-0/var/log/secure:Apr  5 09:11:03 computerealtime-0 sshd[9590]: pam_lastlog(sshd:session): unable to open /var/log/lastlog: Permission denied

I did hit this with a newer build of the custom image. 2 significant differences:
1 - I ran a virt-customize --selinux-relabel on the image after building it (although lastlog doesn't exist).
2 - I waited much longer to ssh after the overcloud VM was created. When I initially hit this I tried to ssh almost as soon as the VM booted.

Comment 45 Ollie Walsh 2018-04-05 14:43:56 UTC
s/I did hit this with a newer build of the custom image/I didn't hit this with a newer build of the custom image/

Comment 46 Lon Hohberger 2018-04-11 12:46:05 UTC
Should we have the image build somehow do a restorecon or something like that? The problem is that something is (re)creating lastlog with the wrong label, but we don't know what it is.

Comment 47 Lon Hohberger 2018-04-19 15:43:00 UTC
*** Bug 1569226 has been marked as a duplicate of this bug. ***

Comment 48 Lon Hohberger 2018-04-24 19:48:53 UTC
The goal here is to make the images have /var/log/lastlog. The virt-sysprep step is deleting it; the default operations delete /var/log/lastlog*.
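
A hedged sketch of how to check and override that during image post-processing (the operation name "logfiles" is an assumption about which default operation removes /var/log/lastlog*, and the image path is illustrative):

  virt-sysprep --list-operations
  virt-sysprep -a overcloud-full.qcow2 --operations defaults,-logfiles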

Comment 49 Lon Hohberger 2018-04-24 19:51:10 UTC
pam seems to be recreating it, but with the wrong label.

Comment 50 Lon Hohberger 2018-04-25 18:17:19 UTC
We're going to recreate lastlog, restorecon on it, and then re-nuke machine-id during image builds to ensure that lastlog exists and has the correct label.
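
A minimal sketch of that kind of image-build step, assuming virt-customize is used for post-processing (the exact commands in the shipped image build are not shown in this bug, the image path is illustrative, and --selinux-relabel stands in here for the restorecon step):

  virt-customize -a overcloud-full.qcow2 \
    --run-command 'touch /var/log/lastlog' \
    --truncate /etc/machine-id \
    --selinux-relabel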

Comment 57 Artem Hrechanychenko 2018-05-08 14:03:23 UTC
VERIFIED

(overcloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+--------------+--------+------------+-------------+--------------------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks                             |
+--------------------------------------+--------------+--------+------------+-------------+--------------------------------------+
| 4d621325-ce47-4751-aa2b-c26b7de8f8ac | after_deploy | ACTIVE | -          | Running     | tenantvxlan=192.168.32.7, 10.0.0.185 |
| 0d325810-caf5-4573-95a6-596f85206b92 | after_reboot | ACTIVE | -          | Running     | tenantvxlan=192.168.32.3, 10.0.0.184 |
+--------------------------------------+--------------+--------+------------+-------------+--------------------------------------+

puddle - 2018-05-07.2

[heat-admin@compute-0 ~]$ sudo sestatus
SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Max kernel policy version:      31

Comment 60 errata-xmlrpc 2018-06-27 13:24:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2083

