Bug 1303316
Summary: | vm.conf does not get updated if hosted engine is installed on block storage | ||
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Martin Tessun <mtessun> |
Component: | ovirt-hosted-engine-ha | Assignee: | Simone Tiraboschi <stirabos> |
Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.6.0 | CC: | amureini, bobby.prins, dfediuck, gklein, juwu, lsurette, nsoffer, sbonazzo, stirabos, tnisan, ykaul, ylavi |
Target Milestone: | ovirt-3.6.3 | Keywords: | Triaged |
Target Release: | 3.6.3 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2016-03-09 19:50:34 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1283458 |
Description
Martin Tessun
2016-01-30 17:46:15 UTC
Just some additional observations:
After Maintenance-Mode and restart of the hypervisor, the HV is able to get the vm.conf:

MainThread::INFO::2016-01-30 18:50:24,137::hosted_engine::710::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Ensuring lease for lockspace hosted-engine, host id 1 is acquired (file: /var/run/vdsm/storage/6de103bc-84b1-404d-bf32-126ce75984d1/57787128-ae20-4a74-9dda-2608a9ef6b4d/e09865c4-b7ad-4a01-a51f-0fc2bd08c2fa)
MainThread::INFO::2016-01-30 18:50:45,153::hosted_engine::744::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) Acquired lock on host id 1
MainThread::INFO::2016-01-30 18:50:45,154::upgrade::944::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(upgrade) Host configuration is already up-to-date
MainThread::INFO::2016-01-30 18:50:45,154::hosted_engine::424::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Reloading vm.conf from the shared storage domain
MainThread::INFO::2016-01-30 18:50:45,154::config::205::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2016-01-30 18:50:46,432::ovf_store::101::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:7736c611-be93-4fd1-9f82-73d4f804dabe, volUUID:68ad3016-a133-4d51-a58f-27ce000061f1
MainThread::INFO::2016-01-30 18:50:47,903::ovf_store::101::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:576e711d-2432-49e0-a1e8-b85788c9528d, volUUID:513cc1ff-7c4a-4ec6-a039-67627c5b87c9
MainThread::INFO::2016-01-30 18:50:51,235::ovf_store::110::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2016-01-30 18:50:51,236::ovf_store::117::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/blockSD/6de103bc-84b1-404d-bf32-126ce75984d1/images/576e711d-2432-49e0-a1e8-b85788c9528d/513cc1ff-7c4a-4ec6-a039-67627c5b87c9
MainThread::INFO::2016-01-30 18:50:51,266::config::225::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
MainThread::INFO::2016-01-30 18:50:51,270::config::230::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE

This happens because after a reboot we don't call prepareImage on the OVF_STORE images since we don't know their UUIDs and getImagesList is failing since we are still not connected to the storagePool.
We need to have the fix for 1274622 backported to 3.6 in order to be able to fix this one.

I don't think we can fix for 3.6.3 if getImagesList is broken there.

Allon, can someone have a look for a solution for 3.6?

(In reply to Yaniv Dary from comment #6)
> Allon, can someone have a look for a solution for 3.6?

Could someone actually explain the flow and what exactly fails instead of trying to ask for specific patches backports?

(In reply to Allon Mureinik from comment #7)
> Could someone actually explain the flow and what exactly fails instead of
> trying to ask for specific patches backports?

The flow:
https://bugzilla.redhat.com/show_bug.cgi?id=1274622#c9

The issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1274622

(In reply to Simone Tiraboschi from comment #8)
> (In reply to Allon Mureinik from comment #7)
> > Could someone actually explain the flow and what exactly fails instead of
> > trying to ask for specific patches backports?
>
> The flow:
> https://bugzilla.redhat.com/show_bug.cgi?id=1274622#c9
>
> The issue:
> https://bugzilla.redhat.com/show_bug.cgi?id=1274622

You should import the domain, and force an update of the OVF_STORE, like Martin suggested in https://gerrit.ovirt.org/#/c/51842/.
Once you do that, store the UUID.
I don't see any reason to read the OVF_STORE in a domain outside of the pool.

(In reply to Allon Mureinik from comment #9)
> You should import the domain, and force an update of the OVF_STORE, like
> Martin suggested in https://gerrit.ovirt.org/#/c/51842/.
> Once you do that, store the UUID.

Here we have two distinct components involved in this flow: the engine and ovirt-ha-agent.
The engine auto-imports the hosted-engine storage domain; only after that does it generate the OVF_STORE and learn its UUID. ovirt-ha-agent doesn't know it.

Now let's see what happens when we reboot the host: we have just ovirt-ha-agent, still no engine. We are still not connected to any storagePool.
ovirt-ha-agent has to fetch the latest engine VM configuration from the OVF_STORE, so it has to call prepareImage on it, but it doesn't know the OVF_STORE UUID.
We can call getImagesList to discover it, but it's failing due to https://bugzilla.redhat.com/show_bug.cgi?id=1274622

OK, I think I'm starting to get there...

A few more questions though - isn't the HA agent supposed to be aware of the pool? Why aren't we starting up VDSM and connecting it to the pool straight away?

Also, bug 1274622 (and its fix) are specifically about file storage. This one is about block storage, so I don't see how backporting it (assuming it were possible) would help.

(In reply to Allon Mureinik from comment #11)
> OK, I think I'm starting to get there...
>
> A few more questions though - isn't the HA agent supposed to be aware of the
> pool? Why aren't we starting up VDSM and connecting it to the pool straight
> away?

No, it's not.
The hosted-engine storage domain gets attached to the storagePool of the datacenter which contains the hosted-engine cluster only when the engine auto-imports it (it will do that only once the datacenter is up, and for that the user has to add at least one additional storage domain).
So the HA agent is agnostic of the storage pool.
> Also, bug 1274622 (and its fix) are specifically about file storage. This
> one is about block storage, so I don't see how backporting it (assuming it
> were possible) would help.

This is true: 1274622 is specific to file-based storage, but because of it we avoid calling getImagesList and prepareImage on the images we don't know about.
On NFS (and on iSCSI too) it seems that the images become available as a side effect of something else, even if we don't prepare them.

A possible, but weird and not that smart, solution is to always call getImagesList, ignoring any failure there; if and only if we get something back (as should happen on block devices AFAIK), we can scan for OVF_STORE images in order to prepare them.
On file-based devices, we'll simply cross our fingers, hoping that accessing the OVF_STORE will continue working as it does today without the need to prepare the images.

If this bug requires doc text for errata release, please provide draft text in the doc text field in the following format:

Cause:
Consequence:
Fix:
Result:

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.
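The best-effort workaround proposed in the comment above (call getImagesList, tolerate a failure, and prepare OVF_STORE images only when something comes back) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: `discover_ovf_store_images` and its three callables are hypothetical stand-ins for the VDSM verbs named in this thread, and their signatures are invented for the example.

```python
from typing import Callable, Iterable, List


def discover_ovf_store_images(
    get_images_list: Callable[[str], Iterable[str]],
    is_ovf_store: Callable[[str, str], bool],
    prepare_image: Callable[[str, str], None],
    sd_uuid: str,
) -> List[str]:
    """Best-effort discovery and preparation of OVF_STORE images.

    All three callables are hypothetical wrappers (assumptions of this
    sketch) around getImagesList, an OVF_STORE check, and prepareImage.
    Returns the image UUIDs that were prepared.
    """
    try:
        images = list(get_images_list(sd_uuid))
    except Exception:
        # Deliberately broad: the proposal is to ignore any failure here
        # (the file-storage case) and fall back to accessing the
        # OVF_STORE without preparing it.
        return []

    prepared = []
    for img_uuid in images:
        # Only images identified as OVF_STORE need to be prepared.
        if is_ovf_store(sd_uuid, img_uuid):
            prepare_image(sd_uuid, img_uuid)
            prepared.append(img_uuid)
    return prepared
```

Passing the verbs in as callables keeps the sketch independent of any particular VDSM client; an empty result means "nothing prepared", which covers both a failed listing and a domain without OVF_STORE images.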
Verified on ovirt-hosted-engine-ha-1.3.4.3-1.el7ev.noarch (over iSCSI storage)

MainThread::INFO::2016-02-25 17:23:50,491::config::205::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
MainThread::INFO::2016-02-25 17:23:50,701::ovf_store::100::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:a5e49dd0-be39-4374-9e49-8dce6395b758, volUUID:6ba4dd03-8903-415b-b0d6-b46c21e3e96f
MainThread::INFO::2016-02-25 17:23:51,270::ovf_store::100::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:8147492c-b641-4790-8eae-c9301f7a5e31, volUUID:e8af7c51-dc7e-4791-96f8-5904dc5d62c6
MainThread::INFO::2016-02-25 17:23:51,271::ovf_store::109::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
MainThread::INFO::2016-02-25 17:23:51,272::ovf_store::116::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/blockSD/4bf73cdc-7ee1-4309-8d69-94a29d9fe36d/images/8147492c-b641-4790-8eae-c9301f7a5e31/e8af7c51-dc7e-4791-96f8-5904dc5d62c6
MainThread::INFO::2016-02-25 17:23:51,281::config::225::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Found an OVF for HE VM, trying to convert
MainThread::INFO::2016-02-25 17:23:51,284::config::230::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Got vm.conf from OVF_STORE

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0422.html
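For anyone checking their own agent log against the verification output above, a small parser along these lines may help pull out the OVF_STORE image and volume UUIDs. It is only a sketch: the `::`-separated field layout is inferred from the log samples in this report rather than from any agent documentation, and the helper names `parse_agent_log_line` and `find_ovf_stores` are made up for the example.

```python
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Field layout inferred from the samples in this report (an assumption):
# thread::LEVEL::timestamp::module::lineno::logger::(function) message
LOG_LINE = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<message>.*)$"
)

OVF_STORE = re.compile(
    r"Found OVF_STORE: imgUUID:(?P<img>[0-9a-f-]{36}), "
    r"volUUID:(?P<vol>[0-9a-f-]{36})"
)


def parse_agent_log_line(line: str) -> Optional[Dict[str, str]]:
    """Split one agent log line into named fields, or return None."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def find_ovf_stores(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (imgUUID, volUUID) pairs from 'Found OVF_STORE' messages."""
    found = []
    for line in lines:
        record = parse_agent_log_line(line)
        if record is None:
            continue  # Skip lines that do not follow the inferred layout.
        hit = OVF_STORE.search(record["message"])
        if hit:
            found.append((hit.group("img"), hit.group("vol")))
    return found
```

If `find_ovf_stores` returns nothing for a log captured after a reboot on block storage, the agent most likely did not discover the OVF_STORE, which matches the symptom described in this bug.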