Bug 1427849
| Summary: | [atomic] The engine starts HA VM if the VM powered off from the guest OS | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Artyom <alukiano> |
| Component: | ovirt-guest-agent | Assignee: | Tomáš Golembiovský <tgolembi> |
| Status: | CLOSED ERRATA | QA Contact: | Jiri Belka <jbelka> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.1.7 | CC: | alukiano, bugs, lsurette, michal.skrivanek, mkenneth, pbrilla, rbalakri, srevivo, tgolembi, tjelinek, ykaul, ylavi |
| Target Milestone: | ovirt-4.1.8 | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | Previously, during Atomic host shutdown, the container was killed before the Guest Agent had a chance to send the 'session-shutdown' message to the VDSM host. This is now fixed. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-01-05 16:12:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Artyom
2017-03-01 12:03:24 UTC
Please check if verification of bug 1341106 (by you) is satisfactory. If not, please retest.

Alright, can you please attach vdsm.log?

Created attachment 1319070
vdsm and engine logs

You can start looking from:
2017-08-28 15:37:30,453+03 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-13) [] VM '06fefa09-a213-410c-ae12-149b0de90f42'(atomic-vm) moved from 'Up' --> 'Down'
2017-08-28 15:37:30,521+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-13) [] EVENT_ID: VM_DOWN_ERROR(119), VM atomic-vm is down with error. Exit message: VM has been terminated on the host.

Hi Artyom, we still need to see the guest-side logs from ovirt-guest-agent and/or vdsm logs with debug level. The agent needs to shut down cleanly and send the session-shutdown message in the last 240s before the actual qemu termination in order for the shutdown to be detected correctly on Atomic.

Created attachment 1359462
logs

Checked again on:
# atomic images info 867512d0966f
Image Name: 867512d0966f
architecture: x86_64
atomic.type: system
authoritative-source-url: registry.access.redhat.com
build-date: 2017-11-22T18:15:26.566179
com.redhat.build-host: rcm-img-docker02.build.eng.bos.redhat.com
com.redhat.component: ovirt-guest-agent-docker
description: The ovirt-guest-agent is providing information about the virtual machine and allows to restart / shutdown the machine via the RHV Portal. This image is intended to be used with virtual machines running RHEL 7 Atomic Host.
distribution-scope: public
io.k8s.description: The ovirt-guest-agent is providing information about the virtual machine and allows to restart / shutdown the machine via the RHV Portal. This image is intended to be used with virtual machines running RHEL 7 Atomic Host.
io.k8s.display-name: oVirt Guest Agent
io.openshift.tags: base rhel7
license: ASL 2.0
maintainer: Tomas Golembiovsky <tgolembi>
name: rhev4/ovirt-guest-agent
release: 40
summary: The oVirt Guest Agent
url: https://access.redhat.com/containers/#/registry.access.redhat.com/rhev4/ovirt-guest-agent/images/1.0.13-40
vcs-ref: 4cc91717604b2ed1e495c2001dcefe9a73309388
vcs-type: git
vendor: Red Hat, Inc.
version: 1.0.13

and vdsm-4.20.8-1.el7ev.x86_64

For some reason, ovirt-guest-agent does not fill the log (I believe it is a container issue), so I am just providing a snapshot from journalctl -u ovirt-guest-agent.service.

You can start looking from:
2017-11-27 15:24:58,154+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [] EVENT_ID: VM_DOWN_ERROR(119), VM atomic-vm is down with error. Exit message: VM has been terminated on the host.

I can also provide the environment, so please just ping me.

It seems we kill the container before the agent can send the session-shutdown message.

Which would mean the Atomic OS is not set up correctly on shutdown and it does not wait for ovirt-ga-docker to terminate cleanly. Is that possible to configure somehow?

It's a bug in our container, not in Atomic. Easy to fix though.

Looks ok.
# atomic containers list --no-trunc
CONTAINER ID IMAGE NAME COMMAND CREATED STATE BACKEND RUNTIME
ovirt-guest-agent-rhevm-4.1-rhel-7-docker-candidate-59820-20171218213722 brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/rhev4/ovirt-guest-agent:rhevm-4.1-rhel-7-docker-candidate-59820-20171218213722 ovirt-guest-agent-rhevm-4.1-rhel-7-docker-candidate-59820-20171218213722 /usr/bin/python /usr/share/ovirt-guest-agent/ovirt-guest-agent.py 2017-12-19 15:27 running ostree runc

No tuned inside container.

ok, ovirt-guest-agent-docker-1.0.14-3

2018-01-03 12:40:51,380+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler2) [55e0befd] VM '027ef96e-544c-4ced-a267-7ea89cc9464a'(jbelka-atomic-02) moved from 'PoweringUp' --> 'Up'
2018-01-03 12:40:51,407+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler2) [55e0befd] EVENT_ID: USER_RUN_VM(32), Correlation ID: f57ac60c-bbb8-4f07-802d-39076a2f57a5, Job ID: 75ca4d30-4960-47c1-94cd-be69dcd1924b, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM jbelka-atomic-02 started on Host slot-7c
2018-01-03 12:42:54,926+01 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [50c96e8b] VM '027ef96e-544c-4ced-a267-7ea89cc9464a'(jbelka-atomic-02) moved from 'Up' --> 'Down'
2018-01-03 12:42:55,118+01 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [50c96e8b] EVENT_ID: VM_DOWN(61), Correlation ID: null, Call Stack: null, Custom ID: null, Custom Event ID: -1, Message: VM jbelka-atomic-02 is down. Exit message: User shut down from within the guest

from brew task id 635712

# runc list
ID PID STATUS BUNDLE CREATED OWNER
ovirt-guest-agent-rhevm-4.1-rhel-7-docker-candidate-96798-20180102144148 831 running /var/lib/containers/atomic/ovirt-guest-agent-rhevm-4.1-rhel-7-docker-candidate-96798-20180102144148.0 2018-01-03T11:52:12.875767314Z root

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:0049

BZ<2>Jira re-sync
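
For readers who want a concrete picture of the failure mode discussed in this bug (the container being killed before the agent can send session-shutdown), here is a minimal Python sketch of the general pattern: catch SIGTERM, send one final notification, then let the main loop exit. It is not the actual ovirt-guest-agent code or the fix shipped in the erratum. The `ShutdownNotifier` class, the `send` callable, and the JSON payload layout are all hypothetical names chosen for illustration.

```python
import json
import signal
import threading


class ShutdownNotifier:
    """Hypothetical helper: emits one final message when asked to stop."""

    def __init__(self, send):
        # `send` is any callable that accepts a single text line, e.g. a
        # wrapper around the channel to the host (assumption for this sketch).
        self._send = send
        self._sent = False
        self.stop_event = threading.Event()

    def notify(self):
        # Send the notification at most once, even if several signals arrive.
        if self._sent:
            return
        self._sent = True
        # Payload layout is illustrative only, not the agent's wire format.
        self._send(json.dumps({"event": "session-shutdown"}) + "\n")

    def handle_signal(self, signum, frame):
        # Notify first, then ask the main loop to stop.
        self.notify()
        self.stop_event.set()

    def install(self):
        # Must be called from the main thread of the process.
        signal.signal(signal.SIGTERM, self.handle_signal)
```

A main loop would call `install()` once and then wait on `stop_event`. The handler only helps if the process actually receives SIGTERM and is given time to finish before being killed, which is the part that went wrong in the container described above.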