Created attachment 1470868 [details]
Logs (as requested by Nir)

I'm doing some "torture testing" for oVirt 4.2.4 - powering off the HE, then the host (there is only 1 node), and then rebooting everything.

One of the weird things that happens is that when the machine boots and starts the HE, it mounts the storage domains and everything works. However, after a few moments, 3 of my 4 storage domains (ISO, export, and another storage domain, but not the hosted_engine storage domain) are automatically deactivated, with the following errors:

VDSM command GetFileStatsVDS failed: Storage domain does not exist: (u'f241db01-2282-4204-8fe0-e27e36b3a909',)
Refresh image list failed for domain(s): ISO (ISO file type). Please check domain activity.
Storage Domain ISO (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.
Storage Domain data-NAS3 (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.
Storage Domain export (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.

However, when I see those messages and manually re-activate those storage domains, all of them get the status "UP", there are no errors, and I can see disks, images, etc...

Related log files are attached as ZIP.
Just to clarify - my "torture test" is powering off the HE VM first and, after it's off, powering off the node (using the poweroff command, not yanking the power cable). I just changed the severity to "high".
Hetz, thanks for reporting this. From your description, I understand that the flow is:

1. Set up HE with one node
2. Add ISO, Data, and Export domains on NFS
3. Wait until all domains are up
4. Power off the hosted engine vm (how do you power it off?)
5. Power off the single node (how?)
6. Boot the single node
7. Hosted engine starts
8. Connect to the engine UI - everything is UP
   Are you sure your single node is SPM? Most likely it is in "Contending" mode.
   Are you sure the DC is up? Are all storage domains in the UP state or the DOWN state?
9. After several minutes, domains are deactivated:
   Storage Domain ISO (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.
   Storage Domain data-NAS3 (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.
   Storage Domain export (Data Center HetzLabs) was deactivated by system because it's not visible by any of the hosts.
10. Wait (how much time?)
11. Activate the storage domains manually; the system is UP

I guess that you expect that after step 8 the system is UP. Please fill in the missing details; see my questions above.

I did not look at the logs yet, but what you did was an unclean shutdown of the node. In this case sanlock has stale data in the lockspace, and activating the storage domains will take more time. After a clean shutdown, sanlock takes 20-50 seconds to activate a storage domain (multiple storage domains are activated in parallel). After an unclean shutdown, it can take several minutes. To shut down a node cleanly, you need to put it in maintenance mode before powering down the host.

I guess what happens is that the engine connects to the storage, but since sanlock is blocked for a few minutes because of the stale data in the lockspace from the unclean shutdown, the engine decides that the storage domains are not visible and deactivates them. The engine will try to recover and reconnect the storage domains every 5 minutes.
I guess that if you wait about 5 minutes after the domains were deactivated, they will be activated automatically by the system. If the domains do not activate automatically, this is a high severity bug. If they do activate, this is the expected behavior of the current system.
4. Powering off the HE and the node was done using ssh and then issuing the "poweroff" command. First to the HE, wait a few minutes, and then ssh to the node and run the "poweroff" command.
8. The issue happens even when the setup is only 1 node, so it's also SPM.
10. The wait (on this slow server) is 2-4 minutes after I logged in to the HE.
11. I expect that once the HE is up, all the domains should be up (which they are) and should remain "up".

As for a clean shutdown - there is no way to do that. Let's say that I have only 1 oVirt node running the HE. How do I cleanly shut down the node including the HE? The documentation doesn't mention anything.
(In reply to Hetz Ben Hamo from comment #3)
> 4. Powering off the HE and the node was done using ssh and then issuing the
> "poweroff" command. First to the HE, wait a few minutes, and then ssh to the
> node and run the "poweroff" command.

Ok, this is not the recommended way to shut down an oVirt node. We support unclean shutdown of course, but the user experience cannot be as slick as with a clean shutdown.

> 8. The issue happens even when the setup is only 1 node, so it's also SPM.

If you had another host still running, or shut down cleanly, that host would be used for the SPM, and all the domains would be UP quickly, since the other host can access them. The issue in your setup with 1 host is that all the domains are not accessible to sanlock in the first few minutes, because of stale data left after the unclean shutdown.

> 10. The wait (on this slow server) is 2-4 minutes after I logged in to the HE.

Try to wait at least 5 minutes, since storage recovery tries to recover inactive domains every 5 minutes.

> 11. I expect that once the HE is up, all the domains should be up (which
> they are) and should remain "up".

Maybe the UI shows the domains as UP - but actually they are not, since sanlock needs several minutes to join the lockspace after an unclean shutdown.

> As for a clean shutdown - there is no way to do that. Let's say that I have
> only 1 oVirt node running the HE. How do I cleanly shut down the node
> including the HE? The documentation doesn't mention anything.

Simone, can you add more details on this?
(In reply to Nir Soffer from comment #4)
> > As for a clean shutdown - there is no way to do that. Let's say that I
> > have only 1 oVirt node running the HE. How do I cleanly shut down the
> > node including the HE? The documentation doesn't mention anything.
>
> Simone, can you add more details on this?

We have https://github.com/oVirt/ovirt-ansible-shutdown-env just for that.

Nir, as the last step there, the last HE host is shut down with 'shutdown -h now'. Do we also have to manually disconnect all the storage domains before that?
Simone, will this be part of oVirt/RHV 4.3? Are there going to be any GUI modifications for the HE (or Cockpit) to use this playbook? Any docs?
(In reply to Hetz Ben Hamo from comment #6)
> Simone, will this be part of oVirt/RHV 4.3?

4.2.6

> Are there going to be any GUI modifications for the HE (or Cockpit) to use
> this playbook?

GUI integration will come with 4.3; for now you can simply run an ansible playbook that triggers that role directly.

> Any docs?

https://github.com/oVirt/ovirt-ansible-shutdown-env/blob/master/README.md
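For reference, a wrapper playbook triggering the role could look roughly like the sketch below. This is only an illustration: the role name and the variable names (engine URL/credentials) are assumptions here, so check the role's README for the actual interface before running anything.

```yaml
# Hypothetical wrapper playbook for ovirt-ansible-shutdown-env.
# Role name and variable names are assumptions -- see the role's README.
- hosts: localhost
  connection: local
  vars:
    engine_url: https://engine.example.com/ovirt-engine/api  # assumed variable
    engine_user: admin@internal                              # assumed variable
    engine_password: "{{ vault_engine_password }}"           # assumed variable
  roles:
    - oVirt.shutdown-env  # assumed role name
```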
(In reply to Simone Tiraboschi from comment #5)
> We have https://github.com/oVirt/ovirt-ansible-shutdown-env just for that.
>
> Nir, as the last step there, the last HE host is shut down with
> 'shutdown -h now'. Do we also have to manually disconnect all the storage
> domains before that?

Yes. Hosted engine is responsible for bringing the engine up, so during boot it does:

1. Connect to the hosted engine storage domain
2. Prepare the hosted engine disk
3. Start the engine
4. The engine connects to the other storage domains

During shutdown, we need to do:

1. Disconnect from all storage domains except the hosted engine storage domain
2. Shut down the engine
3. Tear down the hosted engine disk
4. Disconnect from the hosted engine storage domain

At this point the user can use poweroff safely.
(In reply to Nir Soffer from comment #9)
> During shutdown, we need to do:
>
> 1. Disconnect from all storage domains except the hosted engine storage
> domain

The question is how to do it from engine APIs (and maybe with the ansible wrapper). AFAIK we don't have engine APIs to directly deal with low-level storage connections, just with storage domains. For instance, in
https://docs.ansible.com/ansible/2.6/modules/ovirt_storage_domains_module.html#ovirt-storage-domains-module
we can set the SD to maintenance or to unattached, but both will have side effects on restarts.

> 2. Shut down the engine
> 3. Tear down the hosted engine disk

OK

> 4. Disconnect from the hosted engine storage domain

OK

> At this point the user can use poweroff safely.

Is there anything we can do locally on the host without side effects on the engine DB?
(In reply to Simone Tiraboschi from comment #10)
> The question is how to do it from engine APIs (and maybe with the ansible
> wrapper).
> AFAIK we don't have engine APIs to directly deal with low-level storage
> connections, just with storage domains.

This is basically what is done when you move a host to maintenance. The difference is not modifying the hosted engine storage domain.

Maor, can you explain how to do this from the engine API?

> We can set the SD to maintenance or to unattached, but both will have side
> effects on restarts.

You don't want to do this. You need to deactivate the storage domains only on the host you want to shut down.

> Is there anything we can do locally on the host without side effects on the
> engine DB?

It does not matter if the engine DB is modified or not, only that you modify it correctly.
(In reply to Nir Soffer from comment #11)
> This is basically what is done when you move a host to maintenance. The
> difference is not modifying the hosted engine storage domain.
>
> Maor, can you explain how to do this from the engine API?

I'm not sure what the issue is, but I found the following examples of the engine API which I hope can be helpful:

Deactivate a storage domain, which should disconnect the storage server from all the hosts:
https://www.rubydoc.info/gems/ovirt-engine-sdk/OvirtSDK4/AttachedStorageDomainService#deactivate-instance_method

Remove a storage server connection:
https://www.rubydoc.info/gems/ovirt-engine-sdk/OvirtSDK4/StorageServerConnectionService#remove-instance_method
(In reply to Maor from comment #12)
> Deactivate a storage domain, which should disconnect the storage server
> from all the hosts:
> https://www.rubydoc.info/gems/ovirt-engine-sdk/OvirtSDK4/AttachedStorageDomainService#deactivate-instance_method

This is what we should *not* do. We want to deactivate all storage domains only on one host, basically moving the host to maintenance. But since the engine running on this host uses the hosted engine storage domain, we cannot put the host into maintenance.
(In reply to Nir Soffer from comment #11)
> > Is there anything we can do locally on the host without side effects on
> > the engine DB?
>
> It does not matter if the engine DB is modified or not, only that you
> modify it correctly.

Not really: if I set an SD to maintenance mode via the REST APIs, it will remain in that status also when the user powers the system on again, and the user has to manually exit maintenance mode.

(In reply to Maor from comment #12)
> Deactivate a storage domain, which should disconnect the storage server
> from all the hosts:
> https://www.rubydoc.info/gems/ovirt-engine-sdk/OvirtSDK4/AttachedStorageDomainService#deactivate-instance_method

If I'm not wrong, this is the equivalent of setting state=maintenance in the ovirt_storage_domains ansible module (https://docs.ansible.com/ansible/2.6/modules/ovirt_storage_domains_module), and it's not what we want.

> Remove a storage server connection:
> https://www.rubydoc.info/gems/ovirt-engine-sdk/OvirtSDK4/StorageServerConnectionService#remove-instance_method

This will try to delete a storage connection from the DB and, according to its documentation, "A storage connection can only be deleted if neither storage domain nor LUN disks reference it. The host name or id is optional; providing it disconnects (unmounts) the connection from that host." - which is not our case.

(In reply to Nir Soffer from comment #14)
> This is what we should *not* do. We want to deactivate all storage domains
> only on one host, basically moving the host to maintenance.
For a non-HE host we are calling the ovirt_hosts module with state=stopped, which basically fences the host via IPMI:
https://docs.ansible.com/ansible/2.6/modules/ovirt_hosts_module.html#ovirt-hosts-module

I hope that the engine will disconnect the storage cleanly in that case without explicitly passing through maintenance mode, but in the worst case we just have to set the host to maintenance before fencing. The issue is with the HE host (or hosts, for the hyperconverged case), where we have to manually disconnect on shutdown.
For the engine side, engine developers should provide a solution for deactivating storage domains on the host running hosted engine. This may be very tricky, as the engine tries hard to connect storage domains to all hosts.

If we cannot get an engine-side API that will do what we need, we can solve this on the vdsm side like this.

Flow using vdsm-side disconnection:

1. Using the oVirt SDK, make sure the host does not run any vms or perform storage jobs.

2. Shut down the engine.

At this point the host is connected to multiple storage domains, and the host is likely to be the SPM, since it is the last host.

3. Stop the SPM using StoragePool.spmStop:
https://github.com/oVirt/vdsm/blob/866a984520583d8ba061a77900401ae4078f33d3/lib/vdsm/api/vdsm-api.yml#L9773

This may need to handle the case when the host is not the SPM at this point.

4. Disconnect from all storage domains using StoragePool.disconnect:
https://github.com/oVirt/vdsm/blob/866a984520583d8ba061a77900401ae4078f33d3/lib/vdsm/api/vdsm-api.yml#L9526

The host is still connected to storage servers (e.g. mounts, iscsi sessions), but since we are powering off, we don't care about these.

At this point sanlock has removed all lockspaces on this host, and you can power off the host.
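The vdsm side of this flow could be sketched as vdsm-client invocations roughly like the following. This is a non-runnable sketch: the UUIDs are placeholders, and the exact parameter names are assumptions that should be verified against the vdsm-api.yml schema linked above before use.

```
# Sketch only -- placeholders, parameter names unverified.
# Step numbers refer to the flow above.

# 3. Release the SPM role (may need to tolerate failure if the host
#    is no longer the SPM at this point)
vdsm-client StoragePool spmStop storagepoolID=<pool-uuid>

# 4. Disconnect the host from the pool, clearing its delta leases
vdsm-client StoragePool disconnect storagepoolID=<pool-uuid> \
    hostID=<host-id> scsiKey=<pool-uuid>
```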
Sandro, I suggest moving this to hosted engine, as this is a hosted-engine-specific issue.
Is this issue expected to happen even after the lease expires? Meaning startup after more than 5 minutes of hosts being down.
(In reply to Yaniv Lavi from comment #18)
> Is this issue expected to happen even after the lease expires?
> Meaning startup after more than 5 minutes of hosts being down.

Yes. Sanlock cannot use the host timestamp in storage to tell if the host is dead; it must watch the host for about 140 seconds to be sure that the host is not updating the delta lease. The timeouts are documented here:
https://pagure.io/sanlock/blob/master/f/src/timeouts.h

While sanlock is waiting, the host's sanlock initialization cannot complete, so the engine marks the storage as inactive. After 5 minutes the engine tries to recover the storage and succeeds.

I wonder if we can optimize this by configuring sanlock with a unique host uuid (wanted for other reasons). If sanlock finds that the current host has stale data in storage, it can safely clear the stale data, if we assume that oVirt is responsible for making the host id unique. Sanlock cannot optimize this today, since it uses a random UUID on every run, so there is no way to detect if data on storage belongs to this host or to another host in the cluster that got the same host id.

David, what do you think?
I haven't grasped the details well enough to comment on possible changes, but I do think it's worthwhile to try setting persistent host names. We know that can reduce wait times in some cases, although not all. To try this, edit sanlock.conf and set:

our_host_name = <some unique string identifying the host>

You may need to restart things a couple of times to see a difference, because the effect of this setting is that if a host sees its own host name as the failed lease owner from a previous generation, it can skip some of the waiting.
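As a concrete sketch of the suggestion above, the fragment below assumes the config file path used by the sanlock package; the value is just an example (any string unique to this host works):

```
# /etc/sanlock/sanlock.conf -- assumed default path of the package config.
# Example value; use any string that uniquely identifies this host.
our_host_name = host1.example.com
```

Per David's comment, the effect may only become visible after the daemon has been restarted and has gone through a failure/recovery cycle or two.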
Thanks David! We already have an RFE for configuring the sanlock host name, so let's make this bug depend on bug 1508098.

We have 2 ways to solve this:

- Provide an easy way to do a clean shutdown.
- Configure the sanlock host name, hopefully avoiding the messy startup even after an unclean shutdown (e.g. power failure).

The second solution looks like the best option, since it handles any case.

Sandro, are you going to provide an easy way to do a clean shutdown for hosted engine?
(In reply to Nir Soffer from comment #21)
> Sandro, are you going to provide an easy way to do a clean shutdown for
> hosted engine?

Yes, https://github.com/oVirt/ovirt-ansible-shutdown-env is designed for that.

Quick question: do you think that a

  systemctl stop vdsmd; sanlock client shutdown -f 1

executed on the last host just before its shutdown could be enough to properly remove the lock, without dealing with vdsm commands to fetch the list of the connected SDs, disconnect each of them, stop the SPM and so on?
(In reply to Simone Tiraboschi from comment #22)
> Quick question: do you think that a
>   systemctl stop vdsmd; sanlock client shutdown -f 1
> executed on the last host just before its shutdown could be enough to
> properly remove the lock?

Maybe it can work for data domains, but it will not work for the export domain (using safelease) or local storage (using a local cluster lock). I think you should avoid depending on the vdsm cluster locking implementation details.

I also don't think you need to get the list of connected SDs and disconnect each of them. Instead, I think you should stop the SPM (which will release the SPM lease) and disconnect the entire pool (which will clear the delta leases), see comment 16.
(In reply to Nir Soffer from comment #21)
> Sandro, are you going to provide an easy way to do a clean shutdown for
> hosted engine?

Simone already answered in comment #22.
Nir, Simone, what's the plan here?
Moving out to 4.2.8, as this was not identified as a blocker for 4.2.7.
If this was solved with an automated procedure to power off a host, I think we can close this bug. Improving the sanlock configuration may avoid some of the delays even if a user did not use the recommended procedure for powering off a host, but I don't think we have the capacity to work on this for 4.2.
Re-targeting to 4.3 since there's no capacity for fixing this in 4.2
(In reply to Sandro Bonazzola from comment #28)
> Re-targeting to 4.3 since there's no capacity for fixing this in 4.2

This bug depends on bug #1508098, which is in NEW state without a target release. Reopening.
that's not a reason for reopen. If it offends you then don't look at them...
(In reply to Michal Skrivanek from comment #30)
> that's not a reason for reopen. If it offends you then don't look at them...

How can this bug be verified if it depends on a fix which should be made in bug 1508098? The ON_QA status for this bug is misleading, as QE cannot verify this without the fix for bug 1508098.

Nir, can we verify this bug currently without the bug 1508098 fix? If so, can you please provide a clear scenario?
This is not an engine storage bug but a hosted engine bug.

Hosted engine should do the steps specified in comment 16 when shutting down the last host.

According to Simone, this was solved by an ansible script; see comment 22.

Sandro, can you move this bug to the right component and right status?
(In reply to Avihai from comment #31)
> How can this bug be verified if it depends on a fix which should be made in
> bug 1508098?

We can remove the dependency on bug 1508098. I already wrote in comment 27 about the sanlock configuration change.
(In reply to Nir Soffer from comment #32)
> This is not an engine storage bug but a hosted engine bug.
>
> Hosted engine should do the steps specified in comment 16 when shutting
> down the last host.
>
> According to Simone, this was solved by an ansible script; see comment 22.
>
> Sandro, can you move this bug to the right component and right status?

According to comment 32, as this is a HE bug and not a storage bug, I'm moving this bug to Meital's team, which tests HE with different scenarios.
I tried to power off the HE host with ISO and Export domains attached, and didn't see any of the described errors after the host started; it remained stable for a while. The vdsm log was clean.

Works just fine on these components:
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
rhvm-4.4.0-0.33.master.el8ev.noarch
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 (Ootpa)

Moving to verified.
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.