Description of problem:

Users migrated VM disks to a new storage domain, vmstore2, causing it to become full. The storage domain was then deactivated by RHV:

[root@rhvm ovirt-engine]# zgrep -i vmstore2.*deactivated engine.log*
engine.log-20201218.gz:2020-12-17 08:47:30,387-05 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-500789) [4159363a] EVENT_ID: SYSTEM_DEACTIVATED_STORAGE_DOMAIN(970), Storage Domain vmstore2 (Data Center Default) was deactivated by system because it's not visible by any of the hosts.

To recover, we had to overwrite one of the VM disks to free up some space:

[root@dell-r640-01 ~]# echo 1 > /rhev/data-center/mnt/glusterSD/dell-r640-01.gluster.tamlab.rdu2.redhat.com:_vmstore2/04ad3ba4-4459-4f56-a73f-c07fcaa1617e/images/eedf329f-1f84-4845-928b-9284fbfb363c/862784b8-0512-4531-a3de-8562f23c8535

We also had to override the critical_space_action_blocker:

[root@rhvm ~]# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c "update storage_domain_static set critical_space_action_blocker = '1' where storage_name = 'vmstore2';"
UPDATE 1

We were then able to activate the SD (Storage > Domains > vmstore2 > Data Center > Activate) and migrate VM disks from it to another SD.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
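For anyone hitting the same state, the recovery above can be sketched as follows. This is a minimal illustration, not the supported procedure: the image path and domain name are placeholders, a temp file stands in for the real disk image (truncating a real VM disk destroys its data), and the engine-psql step is left commented out because it modifies the engine DB directly.

```shell
# Stand-in for a disk image under /rhev/data-center/... (hypothetical path)
IMG=/tmp/example-disk.img
dd if=/dev/zero of="$IMG" bs=1M count=16 2>/dev/null   # simulate an existing image

# Step 1: reclaim the space the image occupies (same effect as 'echo 1 > <image>',
# which truncates the file; this DESTROYS the disk contents)
truncate -s 0 "$IMG"
stat -c %s "$IMG"    # prints 0: blocks released back to the domain

# Step 2 (commented out, engine-side): lower the critical space blocker so the
# engine will allow the domain to activate again:
# /usr/share/ovirt-engine/dbscripts/engine-psql.sh -c \
#   "update storage_domain_static set critical_space_action_blocker = '1' where storage_name = 'vmstore2';"
```

After both steps the SD could be activated from the UI and the disks migrated off.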
What is actually the issue you want to report? Deactivating the SD when there is no disk space available is expected/correct behavior. Recovering the SD once free disk space is available should happen automatically - I wasn't able to reproduce a situation where I had to remove anything from the DB (I did, of course, have to free some disk space first). Are you able to reproduce it?
Donald, can you please add some more information on this case? What kind of environment are you using (hyper-converged/hosted engine, etc.)? Do you have the steps to reproduce this case? How is the storage domain configured (create_checkpoint_xml)? And if you have engine and VDSM logs, please attach those to the bug.
Hi Eyal, this is RHHI-V 1.7 (RHV 4.3), HE. You can log in to this internal env using the info in comment 1. Vojtech says he can reproduce it, and I think reproducing it is just a matter of filling an SD. The logs have been overwritten (see c2, c4). My point is that we should not deactivate the SD when it becomes full, because then we can't migrate VM disks off of it. I think there were some warning messages in the GUI that it was getting full, but it was not clear from those that the SD would be deactivated. Are email notifications an option? Don
Raising the insights rule flag to add an insights rule to detect storage domains that have hit the low space indicator. More details about this parameter can be found in bz#1667783.
Created attachment 1789766 [details] screen recording after the steps were performed
Closing this bug as I tried reproducing it but couldn't. Please try this with the latest version, and if you encounter the same issue again, feel free to re-open this bug.
"Specifically, there was 13gb left in a block-based SD. I created a preallocated 12gb disk and it created it without a problem and without warning" - that's the flow we are going to address
QE doesn't have the capacity to verify this bug during 4.5.1.
For QE: The space validation for copying disks was fixed to avoid exceeding the available size. As for not allowing new disks once the SD is below the critical space blocker: that does not seem to be the intended original behavior, as the critical space blocker is meant to block operations while the SD is already in that state, not before.
1. Try to copy a disk to a storage domain that doesn't have enough space for it - the operation should be blocked
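The validation being verified can be sketched in shell. This is an illustrative stand-in for the engine-side check, not the actual oVirt code: the function name, sizes, and the 5 GiB blocker value are assumptions chosen to mirror the 13 GB free / 12 GB preallocated-disk scenario from the earlier comments.

```shell
GIB=$((1024 * 1024 * 1024))

# can_copy_disk DISK_SIZE SD_FREE BLOCKER_GIB
# Echoes "blocked" if the disk wouldn't fit, or if copying it would push the
# domain below its critical_space_action_blocker threshold; else "allowed".
can_copy_disk() {
    local disk_size=$1 sd_free=$2 blocker_gib=$3
    local remaining=$((sd_free - disk_size))
    if [ "$remaining" -lt 0 ]; then
        echo "blocked"      # not enough space at all
    elif [ "$remaining" -lt $((blocker_gib * GIB)) ]; then
        echo "blocked"      # copy would enter the critical space blocker zone
    else
        echo "allowed"
    fi
}

# 12 GiB disk into 13 GiB free with a 5 GiB blocker: only 1 GiB would remain
can_copy_disk $((12 * GIB)) $((13 * GIB)) 5    # prints "blocked"
# 2 GiB disk into 13 GiB free: 11 GiB would remain, above the blocker
can_copy_disk $((2 * GIB)) $((13 * GIB)) 5     # prints "allowed"
```

Before the fix, the first case was allowed and the domain ended up inside the critical space zone; the verification confirms it is now rejected up front.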
Verified. Copying a disk to a storage domain that doesn't have enough space finished with an error and the operation was cancelled. The SD works well afterwards. Versions: engine-4.5.1.2-0.11.el8ev, vdsm-4.50.1.3-1.el8ev.x86_64
This bugzilla is included in oVirt 4.5.1 release, published on June 22nd 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.1 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.