Bug 2203132 - NullPointerException when creating a image transfer after a RHV-M reboot
Summary: NullPointerException when creating a image transfer after a RHV-M reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.5.3
Hardware: Unspecified
OS: Unspecified
unspecified
medium
Target Milestone: ovirt-4.5.3-async
: ---
Assignee: Arik
QA Contact: Shir Fishbain
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2023-05-11 09:40 UTC by Juan Orti
Modified: 2023-06-21 19:54 UTC (History)
2 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-21 19:54:24 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-engine pull 850 0 None open Fix NPE when picking host for transfer-image 2023-05-21 06:21:16 UTC
Github oVirt ovirt-engine pull 853 0 None open Backport NPE fixes 2023-05-21 14:04:35 UTC
Red Hat Knowledge Base (Solution) 7012613 0 None None None 2023-05-11 10:06:17 UTC
Red Hat Product Errata RHSA-2023:3771 0 None None None 2023-06-21 19:54:39 UTC

Description Juan Orti 2023-05-11 09:40:08 UTC
Description of problem:
After a reboot of the RHV-M machine and while ovirt-engine was still initializing and connecting to the hosts, a new image transfer was requested and failed with a java.lang.NullPointerException

This caused the image transfer command to be stuck in phase 0 in the image_transfers table and the disk locked.

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.3.7-1.el8ev.noarch

How reproducible:
Happened several times in customer environment. Not reproduced locally

Steps to Reproduce:
I guest the issue will be easier to reproduce with a large number of hosts

1. Reboot RHV-M
2. Right after ovirt-engine starts, and while not all hosts are still connected and Up, create a image transfer

Actual results:

~~~
2023-05-11 03:21:42,550+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] Starting image transfer: ImageTransfer:{id='9ef9defd-c9fc-483f-829d-7810f38b8cf6', phase='Initializing', type='Download', active='false', lastUpdated='Thu May 11 03:21:41 CEST 2023', message='null', vdsId='null', diskId='null', imagedTicketId='null', proxyUri='null', bytesSent='null', bytesTotal='2281701376', clientInactivityTimeout='3600', timeoutPolicy='legacy', imageFormat='COW', transferClientType='Transfer via API', shallow='false'}
2023-05-11 03:21:42,561+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] START, GetVolumeInfoVDSCommand(HostName = host1.example.com, GetVolumeInfoVDSCommandParameters:{hostId='09b256a2-8a6b-4fe0-b825-9b9248ba1c6d', storagePoolId='268c347f-d249-411f-a806-9ec0f50eb418', storageDomainId='4484db8c-4e14-4f86-852b-b2cffe079c02', imageGroupId='7ce86012-dce2-4ca7-9dfb-e93c33f688be', imageId='d1298146-e125-48d5-be67-f62257a3d001'}), log id: 2284cc75
2023-05-11 03:21:42,601+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVolumeInfoVDSCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] FINISH, GetVolumeInfoVDSCommand, return: org.ovirt.engine.core.common.businessentities.storage.DiskImage@84fc36da, log id: 2284cc75
2023-05-11 03:21:42,637+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] Successfully added Download disk 'foo.example.com_Disk1' (disk id: '7ce86012-dce2-4ca7-9dfb-e93c33f688be', image id: 'd1298146-e125-48d5-be67-f62257a3d001') for image transfer '9ef9defd-c9fc-483f-829d-7810f38b8cf6'
2023-05-11 03:21:42,692+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] Command 'org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand' failed: null
2023-05-11 03:21:42,692+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] Exception: java.lang.NullPointerException
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.lambda$checkForActiveVds$2(TransferDiskImageCommand.java:1141)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:176)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
[...]
2023-05-11 03:21:42,758+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] EVENT_ID: TRANSFER_IMAGE_INITIATED(1,031), Image Download with disk foo.example.com_Disk1 was initiated by admin@internal-authz.
2023-05-11 03:21:42,773+02 INFO  [org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand] (default task-4) [fd0145f9-2b02-4369-a2fe-711a77da1681] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[7ce86012-dce2-4ca7-9dfb-e93c33f688be=DISK, efe27b6c-115e-4cc9-bfbe-a3732258061e=VM]'}'
~~~

Image transfer present in the database:

engine=> select command_id, disk_id, imaged_ticket_id, phase, last_updated, proxy_uri, bytes_sent, bytes_total from image_transfers;
              command_id              |               disk_id                | imaged_ticket_id | phase |        last_updated        | proxy_uri | bytes_sent | bytes_total 
--------------------------------------+--------------------------------------+------------------+-------+----------------------------+-----------+------------+-------------
 9ef9defd-c9fc-483f-829d-7810f38b8cf6 | 7ce86012-dce2-4ca7-9dfb-e93c33f688be |                  |     1 | 2023-05-11 01:21:42.639+00 |           |          0 |  2281701376


Volume locked (imagestatus=2)

Expected results:
NullPointerException handled gracefully and transfer cancelled.

Additional info:

Comment 8 Shir Fishbain 2023-06-21 19:18:23 UTC
Verified
After discussing with Arik the verification covered by Automation Regression tests (tier1, tier2, and tier3).

Versions:
ovirt-engine-4.5.3.9-0.zstream.20230605102913.git3bfac2e0eac.el8.noarch
vdsm-4.50.3.8-1.el8ev.x86_64

Comment 10 errata-xmlrpc 2023-06-21 19:54:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Virtualization security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3771


Note You need to log in before you can comment on or make changes to this bug.