Created attachment 1788466 [details]
Logs

Description of problem:
A NullPointerException appears in the engine log when removing disks that were earlier uploaded to NFS storage using transfer_url. The deletion is attempted AFTER taking the checksums of the uploaded disks, which might be what triggers the NPE.

The NPE was not seen when deleting disks uploaded with proxy_url. It was, however, also seen when deleting disks (with proxy) on a block storage type (iSCSI).

Despite the NPE in the engine log, the disk removal itself finishes successfully, and there is no sign of a failure in the engine UI either.

Marking this as a regression because this NPE was not seen earlier. Although it is a regression, I am marking it as 'medium' because it does not appear to affect the disk upload or the disk removal. However, it could affect other areas that we are not yet aware of, and the NPE should not be there if everything works correctly.

Version-Release number of selected component (if applicable):
rhv-4.4.6-10

How reproducible:
100%

Steps to Reproduce:
1. Start a parallel upload of disks ['/root/upload/qcow2_v2_rhel74_ovirt42_guest_disk_1G.qcow2', '/root/upload/qcow2_v3_cow_sparse_disk_1G.qcow2', '/root/upload/test_raw_to_delete.raw', '/root/upload/1G_Fedora-Workstation-Live-x86_64-25-1.3.iso']
2. Check image checksums for local images: ['qcow2_v2_rhel74_ovirt42_guest_disk_1G.qcow2', 'qcow2_v3_cow_sparse_disk_1G.qcow2', 'test_raw_to_delete.raw', '1G_Fedora-Workstation-Live-x86_64-25-1.3.iso']
3. Check disk checksums for the uploaded disks: ['qcow2_v2_rhel74_ovirt42_guest_disk_1G.qcow2', 'qcow2_v3_cow_sparse_disk_1G.qcow2', 'test_raw_to_delete.raw', '1G_Fedora-Workstation-Live-x86_64-25-1.3.iso']

Actual results:
NPE

2021-06-01 13:42:58,077+03 INFO [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-1) [a74c219e-622e-45a0-b05c-c80437802c13] Exception in invoking callback of command TransferDiskImage (922f35a5-a542-4c79-9a0e-0be0d920f84d): NullPointerException:
2021-06-01 13:42:58,077+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-1) [a74c219e-622e-45a0-b05c-c80437802c13] Error invoking callback method 'doPolling' for 'ACTIVE' command '922f35a5-a542-4c79-9a0e-0be0d920f84d'
2021-06-01 13:42:58,077+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-1) [a74c219e-622e-45a0-b05c-c80437802c13] Exception: java.lang.NullPointerException
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.getImageAlias(TransferDiskImageCommand.java:326)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.getTransferDescription(TransferDiskImageCommand.java:1401)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.handleFinishedSuccess(TransferDiskImageCommand.java:904)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.executeStateHandler(TransferDiskImageCommand.java:535)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.proceedCommandExecution(TransferDiskImageCommand.java:492)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommandCallback.doPolling(TransferImageCommandCallback.java:21)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

Expected results:
There should be no NPE.

Additional info:
Attaching engine + vdsm + daemon logs from all involved.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
The steps listed above do not reproduce the bug; the only way to reproduce it was by debugging and adding breakpoints to the transfer thread.

The following scenarios need to be checked:
1. Initiate a disk upload -> as soon as the disk status becomes OK, try to delete it.
2. Initiate a disk download -> as soon as the disk status becomes OK, try to delete it.

No NPE should be raised in either scenario.
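The trace in the description fails in getImageAlias, reached from handleFinishedSuccess via the callback poller. That would be consistent with the disk being deleted before the transfer's final polling step runs, although the report does not confirm this. As an illustration only, the Java sketch below models that ordering. Disk, DiskStore and TransferSketch are hypothetical stand-ins, not ovirt-engine classes, and the guarded lookup is one possible way to tolerate a missing disk, not the actual fix.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a disk entity; not an ovirt-engine type.
final class Disk {
    private final String alias;

    Disk(String alias) {
        this.alias = alias;
    }

    String getAlias() {
        return alias;
    }
}

// Hypothetical in-memory store; find() returns null once a disk is removed,
// similar to a lookup that no longer finds a row.
final class DiskStore {
    private final Map<String, Disk> disks = new ConcurrentHashMap<>();

    void add(String id, Disk disk) {
        disks.put(id, disk);
    }

    void remove(String id) {
        disks.remove(id);
    }

    Disk find(String id) {
        return disks.get(id);
    }
}

// Hypothetical model of a transfer whose final step describes the disk.
final class TransferSketch {
    private final DiskStore store;
    private final String diskId;

    TransferSketch(DiskStore store, String diskId) {
        this.store = store;
        this.diskId = diskId;
    }

    // Unguarded lookup: throws NullPointerException if the disk is already gone.
    String aliasUnguarded() {
        return store.find(diskId).getAlias();
    }

    // Guarded lookup: falls back to the disk ID when the disk (or its alias) is missing.
    String aliasGuarded() {
        return Optional.ofNullable(store.find(diskId))
                .map(Disk::getAlias)
                .orElse(diskId);
    }

    // Stands in for the final polling step that builds a description of the transfer.
    String describeFinished() {
        return "Transfer of disk " + aliasGuarded() + " finished";
    }

    public static void main(String[] args) {
        DiskStore store = new DiskStore();
        store.add("d1", new Disk("test_raw_to_delete"));
        TransferSketch transfer = new TransferSketch(store, "d1");

        // The disk is deleted before the final polling step runs.
        store.remove("d1");

        try {
            transfer.aliasUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup failed with NPE");
        }
        System.out.println(transfer.describeFinished());
    }
}
```

Falling back to the disk ID keeps the final step from failing while still leaving something identifiable in the description; whether the engine should instead skip the description or end the transfer differently is a design decision the sketch does not address.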
Verified.

Versions:
ovirt-engine-4.4.8.4-0.7.el8ev.noarch

Steps to Reproduce:
The same steps as in the bug description, which are covered by the automation test "TestUploadImages".

First, reproduced the NullPointerException on an older version without the fix (ovirt-engine-4.4.8.3-0.10.el8ev.noarch):

2021-08-16 16:18:47,423+03 ERROR [org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-5) [b5864677-df89-4554-9a92-b2ad41ff3164] Exception: java.lang.NullPointerException
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.getImageAlias(TransferDiskImageCommand.java:326)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.getTransferDescription(TransferDiskImageCommand.java:1401)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.handleFinishedSuccess(TransferDiskImageCommand.java:904)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.executeStateHandler(TransferDiskImageCommand.java:535)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferDiskImageCommand.proceedCommandExecution(TransferDiskImageCommand.java:492)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.disk.image.TransferImageCommandCallback.doPolling(TransferImageCommandCallback.java:21)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
    at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
    at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

Then ran the same test twice on the fixed version (ovirt-engine-4.4.8.4-0.7.el8ev.noarch); both runs succeeded with no NPE.
This bugzilla is included in oVirt 4.4.8 release, published on August 19th 2021. Since the problem described in this bug report should be resolved in oVirt 4.4.8 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.