Description of problem: engine.log is flooded with messages of this kind. All values look greater than Integer.MAX_VALUE ~~~ 2021-02-12 04:57:12,979+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [] Failed to parse 'writeOps' value '3930677226' to integer: For input string: "3930677226" 2021-02-12 04:57:19,508+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-38) [] Failed to parse 'readOps' value '8116138202' to integer: For input string: "8116138202" 2021-02-12 04:57:19,974+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-36) [] Failed to parse 'writeOps' value '2771677512' to integer: For input string: "2771677512" 2021-02-12 04:57:19,974+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-36) [] Failed to parse 'writeOps' value '5088569863' to integer: For input string: "5088569863" 2021-02-12 04:57:20,339+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-95) [] Failed to parse 'readOps' value '46185802590' to integer: For input string: "46185802590" 2021-02-12 04:57:21,435+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84) [] Failed to parse 'writeOps' value '3141675359' to integer: For input string: "3141675359" 2021-02-12 04:57:21,436+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-84) [] Failed to parse 'writeOps' value '3628042759' to integer: For input string: "3628042759" 2021-02-12 04:57:21,552+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-67) [] Failed to parse 'writeOps' value '2284978555' to integer: For input string: "2284978555" 2021-02-12 04:57:22,223+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-68) [] Failed to parse 'readOps' value '6345688634' to integer: For input string: "6345688634" 2021-02-12 04:57:22,223+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-68) [] Failed to parse 'readOps' value '6147464429' to integer: For input string: "6147464429" 2021-02-12 04:57:22,224+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-68) [] Failed to parse 'readOps' value '6402747154' to integer: For input string: "6402747154" 2021-02-12 04:57:23,208+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-95) [] Failed to parse 'writeOps' value '3166448596' to integer: For input string: "3166448596" 2021-02-12 04:57:24,869+01 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-18) [] Failed to parse 'readOps' value '2299747062' to integer: For input string: "2299747062" ~~~ $ grep "Failed to parse" /var/log/ovirt-engine/engine.log|wc -l 51968 Some VMs report these high numbers: ~~~ # vdsm-client Host getAllVmStats "disks": { "vdb": { "readLatency": "1025106.39766", "writtenBytes": "22041295298048", "writeOps": "753416638", <----------- "apparentsize": "1071594340352", "readOps": "7324395556", <----------- "writeLatency": "889729.787007", "imageID": "cc25e611-bfdf-46be-8c25-f67a46a7cf1e", "readBytes": "97406204247040", "flushLatency": "223080.341074", "readRate": "12150237.8667", "truesize": "1071594340352", "writeRate": "4725589.33333" }, ~~~ Version-Release number of selected component (if applicable): ovirt-engine-4.4.3.12-0.1.el8ev.noarch How reproducible: Constantly in customer environment with flash storage. I haven't reproduced it locally because it needs a high performance storage. Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
A consequence of the fix for bz 1880424
Please provide a verification scenario, thanks.
I trink the easiest way is to: 1) Create VM with disk 2) Do massive disk actions on the VM, to make sure read or/and writeOps grow over Int value (2147483647). 3) If the value is bigger than 2147483647, and there are no errors logged, it should be fine :)
After advising with Arik, I performed the following on rhv-4.4.6-7 So if no one objects, I will move this bug to 'verified' In order to tweak the values returned by VDSM, Alter the values of stats['writeOps'] AND stats['readOps'] inside of: /usr/lib/python3.6/site-packages/vdsm/virt/vmstats.py As so (just before the 'return' around line 400: stats['readOps'] = str(2147483648) stats['writeOps'] = str(2147483648) return stats By hard-coding any value above 2147483647 we are mimicking writeOps and readOps grow over Int value (2147483647) Restart the VDSMD service Observe the logs Expected: No Errors related to the issue reported in this BZ in the engine + vdsm log after successful restart of the VDSMD Actual: As expected Observe the altered values on the discs 'statistics' with a help of REST API: GET ovirt-engine/api/disks/29bb52d7-34a2-429f-be95-8727bd4f31a8/statistics Expected: New values are written for the disk 'write_ops' and 'read_ops' Actual: As expected [1] [1]: <statistics> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/33b9212b-f9cb-3fd0-b364-248fb61e1272" id="33b9212b-f9cb-3fd0-b364-248fb61e1272"> <name>data.current.read</name> <description>Read data rate</description> <kind>gauge</kind> <type>decimal</type> <unit>bytes_per_second</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/467c9655-64a5-3abd-bff7-bdd69cfae38a" id="467c9655-64a5-3abd-bff7-bdd69cfae38a"> <name>data.current.read_ops</name> <description>Read data ops</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>2147483648</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/2f23addd-4ebd-3d82-a449-c28778bc33eb" id="2f23addd-4ebd-3d82-a449-c28778bc33eb"> <name>data.current.write</name> <description>Write data rate</description> <kind>gauge</kind> <type>decimal</type> <unit>bytes_per_second</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/fc6ba9f6-199a-3c17-b989-489cecbe7b2f" id="fc6ba9f6-199a-3c17-b989-489cecbe7b2f"> <name>data.current.write_ops</name> <description>Write data ops</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>2147483648</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/3a7b3f72-d035-3bb9-b196-e86a4eb34993" id="3a7b3f72-d035-3bb9-b196-e86a4eb34993"> <name>disk.read.latency</name> <description>Read latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/b1e75c7b-cea4-37d2-8459-f7d68efc69a3" id="b1e75c7b-cea4-37d2-8459-f7d68efc69a3"> <name>disk.write.latency</name> <description>Write latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> <statistic href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e/statistics/9c17ad7b-9ef1-3e8d-ad0a-ff8bee3925f0" id="9c17ad7b-9ef1-3e8d-ad0a-ff8bee3925f0"> <name>disk.flush.latency</name> <description>Flush latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/c358ba9b-7978-4065-bbff-5e849400a33e" id="c358ba9b-7978-4065-bbff-5e849400a33e"/> </statistic> </statistics>
(In reply to Ilan Zuckerman from comment #12) > After advising with Arik, I performed the following on rhv-4.4.6-7 > So if no one objects, I will move this bug to 'verified' lgtm
@sradko I repeated the steps as you requested, and i checked the engine DB this time to make sure that the values there were changed as well. The answer is yes. When checking a VM disk with image_id 4007d318-3a5b-4ecd-b093-35ff78d39d46 , we see that the 'write_ops' and 'read_ops' values are updated both, on API and in the DB: DB: engine=# select * from disk_image_dynamic where image_id = '4007d318-3a5b-4ecd-b093-35ff78d39d46'; -[ RECORD 1 ]---------+------------------------------------- image_id | 4007d318-3a5b-4ecd-b093-35ff78d39d46 read_rate | 0 write_rate | 171459 actual_size | 10871635968 read_latency_seconds | 0.000000000 write_latency_seconds | 0.001412015 flush_latency_seconds | 0.000207095 _update_date | 2021-05-03 15:15:05.083915+03 read_ops | 2147483648 write_ops | 2147483648 API: Finding the disk 'image_id' for disk 9869d66b-3a0b-469d-a7ea-1420c4f62ecb GET ovirt-engine/api/disks/ <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"> <actions> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/reduce" rel="reduce"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/copy" rel="copy"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/sparsify" rel="sparsify"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/export" rel="export"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/move" rel="move"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/refreshlun" rel="refreshlun"/> </actions> <name>latest-rhel-guest-image-8.3-infra</name> <description>latest-rhel-guest-image-8.3-infra (c5705fc)</description> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/disksnapshots" rel="disksnapshots"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/permissions" rel="permissions"/> <link href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics" rel="statistics"/> <actual_size>10871635968</actual_size> <alias>latest-rhel-guest-image-8.3-infra</alias> <backup>none</backup> <content_type>data</content_type> <format>cow</format> <image_id>4007d318-3a5b-4ecd-b093-35ff78d39d46</image_id> <propagate_errors>false</propagate_errors> <provisioned_size>10737418240</provisioned_size> <qcow_version>qcow2_v3</qcow_version> <shareable>false</shareable> <sparse>true</sparse> <status>ok</status> <storage_type>image</storage_type> <total_size>10871635968</total_size> <wipe_after_delete>false</wipe_after_delete> <disk_profile href="/ovirt-engine/api/diskprofiles/5c98868a-abf8-414f-bf52-e63e51fbf66a" id="5c98868a-abf8-414f-bf52-e63e51fbf66a"/> <quota href="/ovirt-engine/api/datacenters/14625613-3e8e-439f-b488-1c65925a0465/quotas/c7e0b319-74bb-4514-a522-edb1699b51f8" id="c7e0b319-74bb-4514-a522-edb1699b51f8"/> <storage_domains> <storage_domain href="/ovirt-engine/api/storagedomains/5ae0cd37-688f-49a9-8048-1bdc8f35617a" id="5ae0cd37-688f-49a9-8048-1bdc8f35617a"/> </storage_domains> </disk> Querying the stats for that image ID: GET /api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics This XML file does not appear to have any style information associated with it. The document tree is shown below. <statistics> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/33b9212b-f9cb-3fd0-b364-248fb61e1272" id="33b9212b-f9cb-3fd0-b364-248fb61e1272"> <name>data.current.read</name> <description>Read data rate</description> <kind>gauge</kind> <type>decimal</type> <unit>bytes_per_second</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/467c9655-64a5-3abd-bff7-bdd69cfae38a" id="467c9655-64a5-3abd-bff7-bdd69cfae38a"> <name>data.current.read_ops</name> <description>Read data ops</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>2147483648</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/2f23addd-4ebd-3d82-a449-c28778bc33eb" id="2f23addd-4ebd-3d82-a449-c28778bc33eb"> <name>data.current.write</name> <description>Write data rate</description> <kind>gauge</kind> <type>decimal</type> <unit>bytes_per_second</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/fc6ba9f6-199a-3c17-b989-489cecbe7b2f" id="fc6ba9f6-199a-3c17-b989-489cecbe7b2f"> <name>data.current.write_ops</name> <description>Write data ops</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>2147483648</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/3a7b3f72-d035-3bb9-b196-e86a4eb34993" id="3a7b3f72-d035-3bb9-b196-e86a4eb34993"> <name>disk.read.latency</name> <description>Read latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/b1e75c7b-cea4-37d2-8459-f7d68efc69a3" id="b1e75c7b-cea4-37d2-8459-f7d68efc69a3"> <name>disk.write.latency</name> <description>Write latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> <statistic href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb/statistics/9c17ad7b-9ef1-3e8d-ad0a-ff8bee3925f0" id="9c17ad7b-9ef1-3e8d-ad0a-ff8bee3925f0"> <name>disk.flush.latency</name> <description>Flush latency</description> <kind>gauge</kind> <type>decimal</type> <unit>seconds</unit> <values> <value> <datum>0</datum> </value> </values> <disk href="/ovirt-engine/api/disks/9869d66b-3a0b-469d-a7ea-1420c4f62ecb" id="9869d66b-3a0b-469d-a7ea-1420c4f62ecb"/> </statistic> </statistics>
Since the verification flow was reviewed by dev, and no objections / rejects came out, moving this to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: RHV Manager security update (ovirt-engine) [ovirt-4.4.6]), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2179