Bug 1844965
| Summary: | engine logs are not copied | ||
|---|---|---|---|
| Product: | [oVirt] ovirt-ansible-collection | Reporter: | Yedidyah Bar David <didi> |
| Component: | hosted-engine-setup | Assignee: | Yedidyah Bar David <didi> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Nikolai Sednev <nsednev> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | unspecified | CC: | bugs, mavital, nsednev, pelauter |
| Target Milestone: | ovirt-4.4.3-1 | Keywords: | ZStream |
| Target Release: | 1.2.1 | Flags: | pm-rhel: ovirt-4.4+, pelauter: planning_ack+, sbonazzo: devel_ack+, mavital: testing_ack+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ovirt-ansible-collection-1.2.1 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2020-11-06 14:01:26 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1892378 | ||
| Bug Blocks: | |||
Description
Yedidyah Bar David
2020-06-08 06:54:30 UTC
https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/pull/325 was merged yesterday. I now tried a deploy with it, and it still did not collect logs: it did not call 'sync'. The flow seems to have been:

- otopi calls:

2020-06-16 13:15:27,659+0300 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:188 ansible-playbook: cmd: ['/bin/ansible-playbook', '--module-path=/usr/share/ovirt-hosted-engine-setup/ansible', '--inventory=localhost,didi-centos8-he-engine.lab.eng.tlv2.redhat.com', '--extra-vars=@/tmp/tmpru00p_pu', '--tags=bootstrap_local_vm', '--skip-tags=always', '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml']

engine-setup fails (due to another bug), and this playbook terminates (and does not call sync).

- otopi calls:

2020-06-16 13:38:49,812+0300 DEBUG otopi.ovirt_hosted_engine_setup.ansible_utils ansible_utils.run:188 ansible-playbook: cmd: ['/bin/ansible-playbook', '--module-path=/usr/share/ovirt-hosted-engine-setup/ansible', '--inventory=localhost,', '--extra-vars=@/tmp/tmp30niqg_9', '--tags=final_clean', '--skip-tags=always', '/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml']

and this (in final_clean.yml -> fetch_engine_logs.yml) collects the logs, but does not call sync either.

Places that run things on the engine do so with:

delegate_to: "{{ groups.engine[0] }}"

but the last run above has only 'localhost' in its inventory. Perhaps I should somehow also add the engine there and delegate to it. Need to see how to do that.

For now, keeping the bug on NEW, as the most important collection, in the end, is still empty.

Moving to 4.4.4 since we reached development freeze for 4.4.3 and this is not marked as blocker.

Found at least one significant reason for not being able to collect the logs. Relevant steps, in their run order:

1. otopi [1] calls ansible [2], which also creates the local vm and sets otopi_localvm_dir, to be returned to otopi and used by it [3].
2. otopi reads the result, gets otopi_localvm_dir and sets OVEHOSTED_CORE/localVMDir in its own env.
3. Later, otopi calls ansible [4] and passes OVEHOSTED_CORE/localVMDir as the ansible var he_local_vm_dir.
4. [4] also tries to collect engine logs, checking the local vm dir.

If ansible [2] fails in the middle, before setting otopi_localvm_dir, otopi won't get it, so it will not be able to pass it on, and we fail to collect logs.

I didn't yet check how this works when deploying from cockpit.

[1] src/plugins/gr-he-ansiblesetup/core/misc.py:_closeup
[2] bootstrap_local_vm
[3] bootstrap_local_vm/02_create_local_vm.yml
[4] final_clean.yml

QE: This, together with bug 1892378, should handle several more cases of a failed hosted-engine deploy with empty engine-logs directories. For reproduction/verification, you should deploy hosted-engine as usual, but make it fail in the middle, after the local engine machine is up. One flow I tried that does not work is pressing ^C :-(, so don't use that. One that does work, in ovirt-system-tests, is: https://gerrit.ovirt.org/111926. It's a simple patch, which you can adapt to manual or other automated testing.

From now on, if you notice any failed hosted-engine deployment that does not provide engine-logs on the host, please open a bug. Ideally, I'd like to cover all relevant flows.

Yes, I agree that pressing ^C is a relevant flow, although I didn't open a bug for it. It's not high priority, IMO: if users pressed ^C and the deploy failed, they should know why it failed.

$ git tag --contains 7bde5f3
1.2.0-1
1.2.1-1

This can be tested with 4.4.3.

QE: I don't mind if the current bug is not explicitly verified. I'd like, though, to raise your awareness of it. If you do a hosted-engine deployment and the engine-logs-* directory is empty, please open a bug. Ideally, I'd like this to never happen.

Nikolai - setting needinfo on you as QE owner, but it's not only for you :-). Not sure who else should be aware. Thanks.
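The hand-off in steps 1 to 4 above can be sketched as a small Python model, to make the failure mode concrete. This is only an illustration under stated assumptions: every function name below is hypothetical and does not exist in otopi or ovirt-hosted-engine-setup, and the directory path is made up. Only the variable names otopi_localvm_dir, OVEHOSTED_CORE/localVMDir and he_local_vm_dir come from the comment above.

```python
# Hypothetical model of the variable hand-off described in steps 1-4.
# None of these functions exist in otopi or ovirt-hosted-engine-setup.


def run_bootstrap_local_vm(fail_before_dir_is_set):
    """Stand-in for the bootstrap_local_vm ansible run (step 1)."""
    results = {}
    if fail_before_dir_is_set:
        # The playbook failed early, so otopi_localvm_dir was never returned.
        return False, results
    results["otopi_localvm_dir"] = "/var/tmp/localvm-example"  # made-up path
    return True, results


def update_environment(environment, results):
    """Step 2: copy otopi_localvm_dir into the otopi environment, if present."""
    if "otopi_localvm_dir" in results:
        environment["OVEHOSTED_CORE/localVMDir"] = results["otopi_localvm_dir"]


def final_clean_vars(environment):
    """Step 3: build the extra vars passed to the final_clean ansible run."""
    return {"he_local_vm_dir": environment.get("OVEHOSTED_CORE/localVMDir")}


def can_collect_engine_logs(extra_vars):
    """Step 4: log collection needs to know the local VM directory."""
    return extra_vars["he_local_vm_dir"] is not None


def deploy(fail_before_dir_is_set):
    """Run the four steps in order and report whether logs can be collected."""
    environment = {}
    _ok, results = run_bootstrap_local_vm(fail_before_dir_is_set)
    update_environment(environment, results)
    return can_collect_engine_logs(final_clean_vars(environment))
```

In this model, deploy(False) reaches log collection with a directory, while deploy(True) arrives at step 4 with he_local_vm_dir unset, which corresponds to the empty engine-logs directory described above.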
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.

alma07 ~]# ll -lsha /var/log/ovirt-hosted-engine-setup/
total 2.5M
4.0K drwx------. 4 root root 4.0K Nov 3 18:12 .
4.0K drwxr-xr-x. 20 root root 4.0K Nov 15 03:27 ..
4.0K drwx------. 3 root root 4.0K Nov 3 18:12 engine-logs-2020-11-03T16:04:34Z
4.0K drwx------. 2 root root 4.0K Nov 3 18:12 engine-logs-2020-11-03T16:12:16Z
616K -rw-r--r--. 1 root root 609K Nov 3 18:12 ovirt-hosted-engine-setup-20201103173400-nra9bo.log
804K -rw-r--r--. 1 root root 800K Nov 3 17:57 ovirt-hosted-engine-setup-ansible-bootstrap_local_vm-20201103173807-gseya8.log
144K -rw-r--r--. 1 root root 140K Nov 3 18:01 ovirt-hosted-engine-setup-ansible-create_storage_domain-20201103175958-qh0eed.log
416K -rw-r--r--. 1 root root 412K Nov 3 18:12 ovirt-hosted-engine-setup-ansible-create_target_vm-20201103180431-1540en.log
120K -rw-r--r--. 1 root root 116K Nov 3 18:12 ovirt-hosted-engine-setup-ansible-final_clean-20201103181213-rkih66.log
104K -rw-r--r--. 1 root root 100K Nov 3 17:34 ovirt-hosted-engine-setup-ansible-get_network_interfaces-20201103173410-p1h5g8.log
252K -rw-r--r--. 1 root root 245K Nov 3 17:38 ovirt-hosted-engine-setup-ansible-initial_clean-20201103173702-5ych7f.log

ovirt-hosted-engine-setup-2.4.8-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.5-1.el8ev.noarch

alma07 ~]# ll -lsha /var/log/ovirt-hosted-engine-setup/engine-logs-2020-11-03T16:04:34Z
total 176K
4.0K drwx------. 3 root root 4.0K Nov 3 18:12 .
4.0K drwx------. 4 root root 4.0K Nov 3 18:12 ..
164K -rw-r--r--. 1 root root 158K Nov 3 18:12 messages
4.0K drwx------. 12 108 108 4.0K Nov 3 17:54 ovirt-engine

alma07 ~]# ll -lsha /var/log/ovirt-hosted-engine-setup/engine-logs-2020-11-03T16:04:34Z/ovirt-engine
total 2.0M
4.0K drwx------. 12 108 108 4.0K Nov 3 17:54 .
4.0K drwx------. 3 root root 4.0K Nov 3 18:12 ..
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 ansible
1.4M -rw-r--r--. 1 108 108 1.4M Nov 3 17:56 ansible-runner-service.log
8.0K -rw-r--r--. 1 108 108 5.3K Nov 3 17:53 boot.log
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 brick-setup
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 cinderlib
4.0K -rw-r--r--. 1 108 108 669 Nov 3 17:53 console.log
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 db-manual
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 dump
448K -rw-r--r--. 1 108 108 444K Nov 3 18:06 engine.log
4.0K drwx------. 2 108 108 4.0K Nov 3 17:54 host-deploy
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 notifier
4.0K drwx------. 2 108 108 4.0K Sep 14 18:43 ova
4.0K drwxr-xr-x. 2 root root 4.0K Jul 29 09:39 ovirt-log-collector
100K -rw-r--r--. 1 108 108 96K Nov 3 18:06 server.log
4.0K drwx------. 2 root root 4.0K Nov 3 17:48 setup
0 -rw-r--r--. 1 108 108 0 Nov 3 17:51 ui.log
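For anyone who wants to automate the check requested above (reporting any engine-logs-* directory that ends up empty), the following is a minimal Python sketch. The function name empty_engine_log_dirs is hypothetical and not part of any oVirt package; the default path is the one shown in the listing above.

```python
from pathlib import Path


def empty_engine_log_dirs(log_root="/var/log/ovirt-hosted-engine-setup"):
    """Return the engine-logs-* directories under log_root that have no entries.

    Hypothetical helper, not part of ovirt-hosted-engine-setup.
    """
    root = Path(log_root)
    if not root.is_dir():
        return []
    return sorted(
        d for d in root.glob("engine-logs-*")
        if d.is_dir() and not any(d.iterdir())
    )


if __name__ == "__main__":
    for directory in empty_engine_log_dirs():
        print(f"empty: {directory}")
```

Since the directories in the listing are mode drwx------ and owned by root, the script would need to run as root on the host to read them.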