Description Martin Schuppert
2019-10-22 14:10:46 UTC
+++ This bug was initially created as a clone of Bug #1764240 +++
+++ This bug was initially created as a clone of Bug #1761373 +++
Description of problem:
While rolling out the newest updates from RHOSP13z8, all instances on our hosts were automatically rebooted.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Instances should not be rebooted
Additional info:
--- Additional comment from Martin Schuppert on 2019-10-21 13:47:28 UTC ---
* The Ansible host prepare step started libvirt-guests correctly:
Oct 9 11:40:41 overcloud-compute-15 ansible-stat: Invoked with checksum_algorithm=sha1 get_checksum=True follow=False checksum_algo=sha1 path=/etc/systemd/system/libvirt-guests.service get_md5=None get_mime=True get_attributes=True
Oct 9 11:40:41 overcloud-compute-15 ansible-copy: Invoked with directory_mode=None force=True remote_src=None _original_basename=tmpcf2YY1 owner=None follow=False local_follow=None group=None unsafe_writes=None setype=None content=NOT_LOGGING_PARAMETER serole=None dest=/etc/systemd/system/libvirt-guests.service selevel=None regexp=None validate=None src=/root/.ansible/tmp/ansible-tmp-1570614041.02-199123311360310/source checksum=b05237d34e522f44407f65882217e7e518b356dc seuser=None delimiter=None mode=None attributes=None backup=False
Oct 9 11:40:41 overcloud-compute-15 ansible-systemd: Invoked with no_block=False force=None name=libvirt-guests enabled=True daemon_reload=True state=started masked=None user=False
Oct 9 11:40:41 overcloud-compute-15 systemd: Reloading.
Oct 9 11:40:41 overcloud-compute-15 systemd: Started Flexible Branding Service.
Oct 9 11:40:42 overcloud-compute-15 systemd: Reloading.
Oct 9 11:40:42 overcloud-compute-15 systemd: Reached target Libvirt guests shutdown.
Oct 9 11:40:42 overcloud-compute-15 systemd: Starting Suspend/Resume Running libvirt Guests...
Oct 9 11:40:42 overcloud-compute-15 systemd: Started Flexible Branding Service.
Oct 9 11:40:42 overcloud-compute-15 systemd: Started Suspend/Resume Running libvirt Guests.
...
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: TASK [is Nova Resume Guests State On Host Boot enabled] ************************
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: ok: [localhost]
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: TASK [libvirt-guests unit to stop nova_compute container before shutdown VMs] ***
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: changed: [localhost]
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: TASK [libvirt-guests enable VM shutdown on compute reboot/shutdown] ************
Oct 9 11:40:58 overcloud-compute-15 os-collect-config: changed: [localhost]
* Then puppet-tripleo was called because the OS::TripleO::Services::NovaLibvirtGuests service had been added to the compute role:
Oct 9 11:53:56 overcloud-compute-15 puppet-user[560442]: Compiled catalog for overcloud-compute-15.xyz in environment production in 2.96 seconds
Oct 9 11:53:57 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Main/Package_manifest[/var/lib/tripleo/installed-packages/overcloud_Compute]/ensure) created
Oct 9 11:53:57 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Tripleo::Profile::Base::Nova::Compute::Libvirt_guests/File[/etc/systemd/system/virt-guest-shutdown.target.wants]/ensure) created
Oct 9 11:53:58 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Tripleo::Profile::Base::Kernel/Kmod::Load[nf_conntrack_proto_sctp]/Exec[modprobe nf_conntrack_proto_sctp]/returns) executed successfully
Oct 9 11:53:58 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Tripleo::Profile::Base::Nova::Compute::Libvirt_guests/Systemd::Unit_file[paunch-container-shutdown.service]/File[/etc/systemd/system/virt-guest-shutdown.target.wants/paunch-container-shutdown.service]/ensure) created
- Note we also do a systemctl daemon reload in [1]
Oct 9 11:53:58 overcloud-compute-15 systemd: Reloading.
Oct 9 11:53:59 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Systemd::Systemctl::Daemon_reload/Exec[systemctl-daemon-reload]) Triggered 'refresh' from 1 events
Oct 9 11:53:59 overcloud-compute-15 systemd: Started Flexible Branding Service.
Oct 9 11:53:59 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Nova::Compute::Libvirt_guests/File_line[/etc/sysconfig/libvirt-guests ON_BOOT]/ensure) created
Oct 9 11:53:59 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Nova::Compute::Libvirt_guests/File_line[/etc/sysconfig/libvirt-guests ON_SHUTDOWN]/ensure) created
Oct 9 11:53:59 overcloud-compute-15 puppet-user[560442]: (/Stage[main]/Nova::Compute::Libvirt_guests/File_line[/etc/sysconfig/libvirt-guests SHUTDOWN_TIMEOUT]/ensure) created
- The libvirt-guests stop was a result of the puppet-tripleo/puppet-nova run:
Oct 9 11:53:59 overcloud-compute-15 systemd: Stopping Suspend/Resume Running libvirt Guests...
Oct 9 11:54:05 overcloud-compute-15 journal: 2019-10-09 09:54:05.820+0000: 557266: info : libvirt version: 4.5.0, package: 23.el7_7.1 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2019-08-16-11:33:27, x86-vm-28.build.eng.bos.redhat.com)
Oct 9 11:54:05 overcloud-compute-15 journal: 2019-10-09 09:54:05.820+0000: 557266: info : hostname: overcloud-compute-15
Oct 9 11:54:05 overcloud-compute-15 journal: 2019-10-09 09:54:05.820+0000: 557266: error : virNetSocketReadWire:1806 : End of file while reading data: Input/output error
Oct 9 11:54:05 overcloud-compute-15 dockerd-current: time="2019-10-09T11:54:05.931711967+02:00" level=warning msg="dcb3e6206fe2a41e7d9888fe9a8fe0577516e50ac5f6fb18d6cb8494d0d3e26b cleanup: failed to unmount secrets: invalid argument"
Oct 9 11:54:05 overcloud-compute-15 docker: nova_compute
Oct 9 11:54:06 overcloud-compute-15 libvirt-guests.sh: Running guests on default URI: instance-0000111d, instance-00000e59, instance-000015ea, instance-00000d36
Oct 9 11:54:06 overcloud-compute-15 libvirt-guests.sh: Shutting down guests on default URI...
Oct 9 11:54:06 overcloud-compute-15 libvirt-guests.sh: Starting shutdown on guest: instance-0000111d
Oct 9 11:54:08 overcloud-compute-15 libvirt-guests.sh: Waiting for guest instance-0000111d to shut down, 300 seconds left
...
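To check other computes for the same sequence, something along these lines could be run against their syslog. This is only a sketch: the function name find_guest_shutdowns is made up for illustration, and the two patterns are taken from the messages quoted above, so they may need adjusting for other libvirt or systemd versions.

```python
import re

# Patterns are based on the log messages quoted in this comment (assumption:
# other hosts log the same wording).
STOP_RE = re.compile(r"systemd: Stopping Suspend/Resume Running libvirt Guests")
SHUTDOWN_RE = re.compile(r"libvirt-guests\.sh: Starting shutdown on guest: (\S+)")


def find_guest_shutdowns(lines):
    """Return (stop_seen, guests) for an iterable of syslog lines.

    stop_seen becomes True once systemd reports that libvirt-guests is
    stopping; guests lists the instances that libvirt-guests.sh started
    to shut down after that point.
    """
    stop_seen = False
    guests = []
    for line in lines:
        if STOP_RE.search(line):
            stop_seen = True
            continue
        match = SHUTDOWN_RE.search(line)
        if stop_seen and match:
            guests.append(match.group(1))
    return stop_seen, guests
```

Calling find_guest_shutdowns(open("/var/log/messages")) on a compute would then show whether the service was stopped and which instances were affected. The log path is an assumption and depends on how syslog is configured on the host.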
--- Additional comment from Martin Schuppert on 2019-10-22 14:04:25 UTC ---
If there is a config change to /etc/sysconfig/libvirt-guests, the libvirt-guests service is notified
and restarted, which stops the instances on the compute via libvirt-guests.
Because NovaResumeGuestsStateOnHostBoot is set to true, the instances are later started
again by nova.
Working on a patch to remove the restart on config change. /usr/libexec/libvirt-guests.sh
sources /etc/sysconfig/libvirt-guests on each run, so the restart is not required.
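To illustrate why the restart is unnecessary, here is a small Python analogy. It is not the actual shell script: load_sysconfig, shutdown_timeout and demo are hypothetical names, and the file contents are example values. The point is only that a program which re-reads its config file on every run picks up a change without being restarted.

```python
import os
import tempfile


def load_sysconfig(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    settings = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip().strip('"')
    return settings


def shutdown_timeout(path, default=0):
    """Re-read the file on every call, like a script that sources its config."""
    return int(load_sysconfig(path).get("SHUTDOWN_TIMEOUT", default))


def demo():
    """Change the config between two calls; no restart step is involved."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "libvirt-guests")
        with open(path, "w") as fh:
            fh.write("ON_SHUTDOWN=shutdown\nSHUTDOWN_TIMEOUT=300\n")
        before = shutdown_timeout(path)
        # Simulate the config change made by the puppet run (example values).
        with open(path, "w") as fh:
            fh.write("ON_SHUTDOWN=shutdown\nSHUTDOWN_TIMEOUT=600\n")
        after = shutdown_timeout(path)
        return before, after
```

Here demo() returns the old value for the first call and the new value for the second, which mirrors the argument above: the next invocation sees the updated file on its own.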
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2020:0643