Description of problem:
Scenario: you have a data center with one host (say, on local storage) with running VMs. The host crashes fatally (disk defect) and can't be restored. You want to remove the whole DC/cluster/host/"running VMs" (they are "running" in the DB, but the host is gone). There is no easy way to do this. I had to tweak the DB in order to achieve it. Even awels on IRC couldn't find a tool to do this (maybe there is something somewhere). If there _is_ a tool to do this in a clean way, please document it better, because even RH employees can't find it ;)

Version-Release number of selected component (if applicable):
latest master?

How reproducible:
always

Steps to Reproduce:
1. The host was in status "non responsive" with 2 "running" VMs on it (actually it wasn't even powered on, as the hard disks got replaced).
2. I wanted to delete everything, but I couldn't put the host into maintenance, because oVirt complains: "Error while executing action: Cannot switch Host to Maintenance mode. Host still has running VMs on it and is in Non Responsive state."
3. So I deleted by hand in the database:

select deletevm('VM1_UUID');
select deletevm('VM2_UUID');
engine=# select vds_id from vds_static where vds_name = 'MYBROKENHOST';
vds_id = $BROKEN_HOST_UUID
delete from vds_statistics where vds_id='$BROKEN_HOST_UUID';
delete from vds_dynamic where vds_id='$BROKEN_HOST_UUID';
delete from vds_static where vds_id='$BROKEN_HOST_UUID';

Then I could force-remove the DC and the cluster.

Actual results:
Long, error-prone, and risky actions to remove stuff which is _gone_.

Expected results:
Provide a tool (maybe even a GUI, but not necessary) to clean up broken hosts, gone VMs, etc.

Additional info:
I encountered this in oVirt 3.3.3, but I think there has been no improvement in this area. I'd like to hear if there is a command-line tool to do this! The component might be wrong (webadmin), but I can't find "ovirt-engine-core", which did exist in the past.

kind regards
Sven
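For illustration only: part of the risk in the manual cleanup from step 3 is that a failure halfway through leaves the host partly deleted. A minimal Python sketch of running the three host deletes as one all-or-nothing transaction follows. It uses an in-memory SQLite database as a stand-in, so the column layout, the helper names (remove_host_rows, make_demo_db) and the sample IDs are assumptions, not the real engine schema. Only the table names and their delete order come from the report, and deletevm() is not modeled.

```python
import sqlite3

# Host tables named in the report, in the order the reporter deleted from
# them (vds_static last).
HOST_TABLES = ("vds_statistics", "vds_dynamic", "vds_static")


def remove_host_rows(conn, host_id):
    """Delete one host's rows from all three tables, all or nothing."""
    deleted = {}
    # "with conn" commits on success and rolls back if any statement raises.
    with conn:
        for table in HOST_TABLES:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE vds_id = ?", (host_id,)
            )
            deleted[table] = cur.rowcount
    return deleted


def make_demo_db():
    """Build a throwaway database (hypothetical, heavily simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vds_static (vds_id TEXT PRIMARY KEY, vds_name TEXT)")
    conn.execute("CREATE TABLE vds_dynamic (vds_id TEXT)")
    conn.execute("CREATE TABLE vds_statistics (vds_id TEXT)")
    hosts = (("broken-uuid", "MYBROKENHOST"), ("other-uuid", "OTHERHOST"))
    for host_id, name in hosts:
        conn.execute("INSERT INTO vds_static VALUES (?, ?)", (host_id, name))
        conn.execute("INSERT INTO vds_dynamic VALUES (?)", (host_id,))
        conn.execute("INSERT INTO vds_statistics VALUES (?)", (host_id,))
    conn.commit()
    return conn
```

Calling remove_host_rows(make_demo_db(), "broken-uuid") removes only the broken host's rows and returns the per-table delete counts, which gives a quick check that nothing else was touched. Against a real engine database the same idea would be a BEGIN ... COMMIT block in psql, ideally after taking a backup.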
Have you tried right-clicking on the host and selecting the "confirm host has been rebooted" option? That should get the host into a state that allows safe removal.
In addition to comment 1: this flow should allow you to clean up easily using the UI. It will clean the DB; any storage on the host (if it still exists) should be removed manually.
Again, this flow relies on the user to confirm that the host is really down, because the engine cannot verify this.

1. Right-click the host and select "confirm host has been rebooted". This will clear the "SPM" status of the host and move all running VMs to down.

2. Assuming storage is not reachable: right-click the Data Center and select "Force remove". This will remove the VMs/templates related to this DC and also the storage domains, and finally the DC itself.

Now you should be left with the cluster and the host:
3. Move the host to maintenance; it can now be removed.
4. Remove the cluster.
Based on the flow in Comment #2, and my comment, closing as NOTABUG.
I'm 99% sure that the option "confirm host has been rebooted" was grayed out, thus could not be selected.

afaik you can only select this option if you are able to bring the host into maintenance state, which was not the case.

I'd like to reopen therefore.

I'll try to reproduce, but this could take some time..
(In reply to Sven Kieske from comment #4)
> I'm 99% sure that the option "confirm host has been rebooted" was grayed
> out, thus could not be selected.
>
> afaik you can only select this option, if you are able to bring the host into
> maintenance state, which was not the case.
>
> I'd like to reopen therefore.
>
> I'll try to reproduce, but this could take some time..

It shouldn't be, so if it was, then it could be a different bug.
Please open a new bug if you manage to reproduce this.
I can now confirm that this works almost as described here:

(In reply to Omer Frenkel from comment #2)
> in addition to comment 1 - this flow should allow you to clean up easily
> using the UI, this will clean the db, any storage on the host (if still
> exists) should be removed manually:
> again this flow relies on the user that the host is really down, because
> engine cannot verify this
>
> 1. right click the host and select "confirm host has been rebooted" - this
> will clean the "SPM" status of the host and will move all running vms to
> down.
>
> 2. assuming storage is not reachable:
> right click the Data Center and select "Force remove" - this will remove
> vm/templates related to this DC and also storage domains, finally the DC
> itself
>

-> I could not remove the DC because of hosts in it which were not in maintenance:
"Error while executing action: Cannot remove Data Center while there are Hosts that are not in Maintenance mode."
--> I put the host into maintenance, and then it worked!

> now you should be left with the cluster and the host:
> 3. move host to maintenance - now can be removed
> 4. remove cluster
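To summarize the order that ended up working, here is a toy Python model of the constraints reported in this thread. It is only a sketch: the class and method names are hypothetical, nothing here calls the engine or its API, and the rule that a cluster cannot be removed while it still contains a host is an assumption that is not stated in the thread. The two other rejections mirror the error messages quoted above.

```python
class CleanupError(Exception):
    """Raised when a step is attempted in a state the engine would reject."""


class DeadHostModel:
    """Toy model: one data center, one cluster, one dead host."""

    def __init__(self, running_vms):
        self.host_status = "non_responsive"
        self.running_vms = running_vms
        self.dc_exists = True
        self.host_exists = True
        self.cluster_exists = True

    def confirm_host_rebooted(self):
        # Per comment 2: clears SPM status and moves all running VMs to down.
        self.running_vms = 0

    def move_host_to_maintenance(self):
        # Mirrors the first quoted error message.
        if self.host_status == "non_responsive" and self.running_vms > 0:
            raise CleanupError("Host still has running VMs and is Non Responsive")
        self.host_status = "maintenance"

    def force_remove_data_center(self):
        # Mirrors the second quoted error message.
        if self.host_exists and self.host_status != "maintenance":
            raise CleanupError("Hosts are not in Maintenance mode")
        self.dc_exists = False

    def remove_host(self):
        if self.host_status != "maintenance":
            raise CleanupError("Host must be in Maintenance mode")
        self.host_exists = False

    def remove_cluster(self):
        # Assumption, not stated in the thread.
        if self.host_exists:
            raise CleanupError("Cluster still contains a host")
        self.cluster_exists = False


def clean_up(model):
    """Run the steps in the order that worked in this thread."""
    model.confirm_host_rebooted()
    model.move_host_to_maintenance()
    model.force_remove_data_center()
    model.remove_host()
    model.remove_cluster()
    return model
```

In this model, calling move_host_to_maintenance() before confirm_host_rebooted(), or force_remove_data_center() before the host is in maintenance, raises CleanupError, which matches the two failures described in the report and in the last comment.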