Steps to reproduce:
1. Create a block storage (iSCSI/FCP) domain of 10G.
2. Create 1 preallocated disk of virtualSize = actualSize = 4G.

Expected result without the fix:
- warning in the audit log

Expected result with the fix:
- no warning
Created attachment 998595 [details] server, engine and vdsm logs Added logs
Kevin, what is the value of FreeSpaceCriticalLow? It's set to 5GB by default, so from your test it seems we have 2G < 5G, which is OK. If you have 6G of free space and want to test this bug, you should preallocate a 1G disk.

Allon, I don't understand your comment #2 either; it looks like there shouldn't be a warning at all in this scenario.
(In reply to Vered Volansky from comment #5)
> Kevin, what is the FreeSpaceCriticalLow's value?
> It's set to 5GB by default, so from your test it seems we have 2g<5G, which
> is OK.
> If you have 6G free space and want to test this bug, you should preallocate
> a 1G disk.
>
> Allon, I don't understand your comment#2 as well, looks like there
> shouldn't be a warning at all with this scenario.

In comment #2, the scenario will result in (6GB - epsilon) free space. Since we use ints, that's evaluated as 5, which will produce a warning if we use <= instead of <.
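To illustrate the truncation issue described above, here is a minimal sketch in Python. It is not the engine's actual code: the function names and the bytes-to-GB conversion are assumptions made for illustration. Only the 5GB default and the <= versus < comparison come from this thread.

```python
# Hypothetical sketch; names and unit conversion are illustrative assumptions.
CRITICAL_LOW_GB = 5  # default FreeSpaceCriticalLow value mentioned in this thread

BYTES_PER_GB = 1024 ** 3


def free_gb_as_int(free_bytes: int) -> int:
    """Truncate free space to whole GB, as an integer representation would."""
    return free_bytes // BYTES_PER_GB


def warns_with_le(free_gb: int) -> bool:
    """Comparison using <= (the behaviour described as problematic)."""
    return free_gb <= CRITICAL_LOW_GB


def warns_with_lt(free_gb: int) -> bool:
    """Comparison using < (the behaviour described as the fix)."""
    return free_gb < CRITICAL_LOW_GB


# Just under 6GB free, i.e. (6GB - epsilon), truncates to 5.
almost_six = free_gb_as_int(6 * BYTES_PER_GB - 1)
```

With `almost_six` equal to 5, `warns_with_le` returns True even though nearly 6GB is free, while `warns_with_lt` returns False. A value that truncates to 4 warns under both comparisons.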
Kevin, to conclude: Please verify that no alerts appear when using a domain with 5GB <= freeSpace < 6GB. Any freeSpace < 5GB should yield a warning.
I ran the following scenario:

Created a LUN of 11g on the storage server
Created a Block storage domain using the 11g LUN
Storage domain displays Virtual size as 10G, Free space 6g (2 OVF disks were created on the LUN)

Created a 1g Preallocated block disk - PASSED
Storage domain displays Virtual size as 10G, Free space 5g (No warning is displayed)

Created a 1g Preallocated block disk - PASSED
Storage domain displays Virtual size as 10G, Free space 4g (No warning is displayed)

Created a 1g Preallocated block disk - FAILS
Storage domain displays Virtual size as 10G, Free space 4g (Low disk space error displayed)

THEN

Deleted one of the 2 1g disks previously created - PASSED
Storage domain displays Virtual size as 10G, Free space 5g (No warning is displayed)

Created a 2g Preallocated block disk - PASSED
Storage domain displays Virtual size as 10G, Free space 3g (No warning is displayed) (This is problematic)
Adding logs:

(In reply to Kevin Alon Goldblatt from comment #8)
> I ran the following scenario:
>
> Created a LUN of 11g on the storage server
>
> Created a Block storage domain using the 11g LUN
>
> Storage domain displays Virtual size as 10G, Free space 6g (2 OVF disks were
> created on the LUN)
>
> Created a 1g Preallocated block disk - PASSED
> Storage domain displays Virtual size as 10G, Free space 5g (No warning is
> displayed)
>
> Created a 1g Preallocated block disk - PASSED
> Storage domain displays Virtual size as 10G, Free space 4g (No warning is
> displayed)
>
> Created a 1g Preallocated block disk - FAILS
> Storage domain displays Virtual size as 10G, Free space 4g (Lo disk space
> error displayed)
>

ERROR in ENGINE.LOG.................................................
2015-03-08 17:36:43,457 WARN [org.ovirt.engine.core.bll.AddDiskCommand] (ajp-/127.0.0.1:8702-7) [2318e604] CanDoAction of action AddDisk failed for user admin@internal. Reasons: VAR__ACTION__ADD,VAR__TYPE__VM_DISK,ACTION_TYPE_FAILED_DISK_SPACE_LOW_ON_STORAGE_DOMAIN,$storageName block_11g
2015-03-08 17:38:49,508 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-30) [749faddb] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Critical, Low disk space. nfs2 domain has 4 GB of free space

>
> THEN
>
> Deleted one of the 2 1g disks previously created - PASSED
> Storage domain displays Virtual size as 10G, Free space 5g (No warning is
> displayed)

FROM ENGINE.LOG..........................
2015-03-08 17:39:34,571 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-22) [2ff6b5a0] Correlation ID: 503b611e, Job ID: 0c0d01e1-cfe4-4252-b3a6-dcd646846b3d, Call Stack: null, Custom Event ID: -1, Message: Disk 1g_b was successfully removed from domain block_11g (User admin@internal).
>
>
> Created a 2g Preallocated block disk - PASSED
> Storage domain displays Virtual size as 10G, Free space 3g (No warning is
> displayed) (This is problematic)

FROM ENGINE.LOG...............................
2015-03-08 17:55:51,972 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-7) Correlation ID: 44629d9f, Job ID: 4d4306b9-bb85-4913-8da3-226da28f2393, Call Stack: null, Custom Event ID: -1, Message: The disk '2g' was successfully added.

SEVERAL HOURS LATER, during ProcessOvf_For_StorageDomainCommand, the storage domain is reported as having low disk space.

FROM ENGINE.LOG.................................
2015-03-09 00:11:30,824 INFO [org.ovirt.engine.core.bll.ProcessOvfUpdateForStorageDomainCommand] (DefaultQuartzScheduler_Worker-64) [b525ba2] Lock freed to object EngineLock [exclusiveLocks= key: 88eb14b4-a7fc-40ab-be6b-e2ab7bc31dbc value: STORAGE , sharedLocks= key: 6d96f52d-d791-4f66-83bd-2553ca0f3012 value: OVF_UPDATE ]
2015-03-09 00:23:46,058 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-33) [6d58ef36] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Warning, Low disk space.block_11g domain has 3 GB of free space
Created attachment 999520 [details] server, engine and vdsm logs Adding logs
Works for me. Discussed with Kevin, who verified. The issue with the verification was caused by the anti-flooding mechanism for log messages. The behaviour was OK; a slightly different verification method and monitoring were used. Kevin will fill in and move to verified.
Vered, this BZ implies a (slight) behavior change in RHEV. Please provide doctext that explains the change. Thanks!
Tested the following scenario:

Created a LUN of 11g on the storage server
Created a Block storage domain using the 11g LUN
Storage domain displays Virtual size as 10G, Free space 6g (2 OVF disks were created on the LUN)

Created a 1g Preallocated block disk - PASSED
Storage domain displays Virtual size as 10G, Free space 5g (No warning is displayed)

Created a 1g Preallocated block disk - PASSED
Storage domain displays Virtual size as 10G, Free space 4g - warning displayed as follows:
Warning, Low disk space.block1 domain has 4 GB of free space

The reason I had no warning in the previous scenario is due to the following:
When a storage domain reports low disk space for the first time, the "flooding mechanism" prevents recurring messages from being displayed.

Moving to verified
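As a rough illustration of why a repeated low-space message may not show up, here is a minimal sketch of a flood guard in Python. It is not the engine's mechanism: the class name, the event key, and the quiet interval are all hypothetical, and the real suppression rules are not described in this thread.

```python
# Hypothetical sketch of message flood suppression; not the engine's code.
class FloodGuard:
    """Suppress repeats of the same event key within a quiet interval."""

    def __init__(self, quiet_seconds: float):
        self.quiet_seconds = quiet_seconds  # assumed interval, for illustration
        self._last_emitted = {}  # event key -> time of last emitted message

    def should_emit(self, key, now: float) -> bool:
        last = self._last_emitted.get(key)
        if last is not None and now - last < self.quiet_seconds:
            return False  # same event seen recently, so stay quiet
        self._last_emitted[key] = now
        return True


guard = FloodGuard(quiet_seconds=3600.0)
key = ("block_11g", "LOW_DISK_SPACE")  # hypothetical event key
first = guard.should_emit(key, now=0.0)      # first report is emitted
repeat = guard.should_emit(key, now=600.0)   # repeat inside the interval
later = guard.should_emit(key, now=3600.0)   # interval has elapsed
```

Under these assumptions, a test that checks for a warning shortly after an earlier one on the same domain would see nothing, which matches the verification confusion described above.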
Hi Vered, I have updated the doc text. Please let me know if it is correct or not. Kind regards, Julie
Hi Julie,

I'd like to get rid of the threshold term here, since other changes are on their way since I wrote this doc-text. We DIDN'T change number presentation; we're still using integers. The only change was from <= to <. This solves the issue because the truncation no longer matters.

For example, if before we had 5.5GB free space, this would be truncated to 5, and when compared with 5 using <=, the answer was yes. So for 5.5GB free space we would send an alert, though we meant to do so only under 5GB. Now it's still truncated to 5, but comparison to 5 using < yields false, meaning no alert is generated.

I suggest the following change, not including the </<=/truncation I added before. I'll leave that to you since I don't know how deep you want to dive into this...

Previously, less than or equal to (<=) was used when monitoring storage free space. In addition, integer numbers were used and caused fractions to be truncated. This triggered alerts for low disk space when it shouldn't have. With this update, when checking storage free space, less than (<) is now used. Alerts for low disk space are now generated appropriately.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0888.html