Bug 1875951 - Disk hot-unplug fails on engine side with NPE in setDiskVmElements after unplugging from the VM.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.3.9
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.4
Target Release: 4.4.4
Assignee: Ahmad Khiet
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-09-04 17:24 UTC by Anitha Udgiri
Modified: 2024-03-25 16:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-02 13:57:12 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
ssl_access_log of RHV-M's sosreport (12.83 MB, text/plain)
2020-09-11 16:08 UTC, Jeongtae Kim


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5411901 0 None None None 2020-09-17 01:00:56 UTC
Red Hat Product Errata RHSA-2021:0381 0 None None None 2021-02-02 13:57:41 UTC
oVirt gerrit 111203 0 master MERGED engine: NPE when detach a disk from VM with REST API 2021-02-02 12:40:59 UTC

Description Anitha Udgiri 2020-09-04 17:24:40 UTC
Description of problem:

The customer is running RHVM 4.3.9 and gets an NPE (NullPointerException) in ovirt-engine when some disks are detached (hot-unplugged) from a VM via the REST API.

Version-Release number of selected component (if applicable):

rhvm-4.3.9.4-15.bz1830762.el7.noarch

How reproducible:

Intermittent

Comment 4 Eyal Shenitzky 2020-09-07 14:16:52 UTC
Hi Anitha,

Can you please add the steps to reproduce the issue?

Comment 5 Anitha Udgiri 2020-09-08 16:57:11 UTC
(In reply to Eyal Shenitzky from comment #4)
> Hi Anitha,
> 
> Can you please add the steps to reproduce the issue?

Eyal,
   I have asked the customer for the steps they followed when they saw this error (it is intermittent, per their update). I will update as soon as they respond.

Comment 13 Jeongtae Kim 2020-09-11 16:08:03 UTC
Created attachment 1714584
ssl_access_log of RHV-M's sosreport

Comment 14 Anitha Udgiri 2020-09-14 18:44:30 UTC
(In reply to Eyal Shenitzky from comment #4)
> Hi Anitha,
> 
> Can you please add the steps to reproduce the issue?

Eyal,

Jeongtae Kim has provided you with the details, and Bimal has also provided some steps for reproducing the issue.

I have added another Customer facing the issue.

Let me know if there is anything else that you would like.

Thanks,
Anitha

Comment 17 Germano Veit Michel 2020-09-17 00:37:32 UTC
Ahmad, I've reproduced the NPE at BaseDisk.java:88 in 4.4. 
It seems to be caused by simultaneous access to vm_disk_element.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.1.10-0.1.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create 10 Disks
2. Attach all of them to a VM
3. Detach them in parallel
4. 1 or 2 will fail with NPE at BaseDisk.java:88

Example:
~~~
# cat reproducer.sh 
#!/bin/bash
ENGINE='engine.kvm'
USER='admin'
DOMAIN='internal'
PASS='redhat'
DATA_VM='Ubuntu'
DISK_IDS=(
'3d666ced-288f-4ddb-9c31-a533af191469'
'105d8c0b-566d-47ef-8cda-d7add5253f08'
'22fe589e-0257-4232-bbd7-0ee7d2a8ab20'
'17baf778-b1d9-4061-9537-a0de8113f5e7'
'2245d18d-95bd-4966-b60d-38e82f8d574c'
'dbf65c1a-9dbf-49c6-9bb4-3b461d6d862f'
'a43885df-2855-4262-ab78-dfece6708924'
'a8e17a6f-2725-4aa9-8501-f472f3aa3962'
'7fc44709-89b2-40ee-8418-fda7fc984561'
'8b79e0d5-2997-4c2b-a79c-5678fac4fea1'
)

# Look up the VM's API path by name.
VM_URL=$(curl -k -u "${USER}@${DOMAIN}:${PASS}" -X GET "https://${ENGINE}/ovirt-engine/api/vms?search=name%3D${DATA_VM}" | xmllint --xpath 'string(/vms/vm/@href)' -)

# Attach every disk to the VM, one request at a time.
for ID in "${DISK_IDS[@]}"
do
curl --cacert /etc/pki/ovirt-engine/ca.pem -u "${USER}@${DOMAIN}:${PASS}" -X POST -H "Accept: application/xml" -H "Content-type: application/xml" "https://${ENGINE}${VM_URL}/diskattachments/" --data-binary @- << EOF
<disk_attachment>
  <active>true</active>
  <interface>virtio_scsi</interface>
  <disk id="${ID}"/>
</disk_attachment>
EOF
done

# Detach every disk in parallel: each request runs in the background.
for ID in "${DISK_IDS[@]}"
do
curl -k -u "${USER}@${DOMAIN}:${PASS}" -X DELETE -H "Accept: application/xml" -H "Content-type: application/xml" "https://${ENGINE}${VM_URL}/diskattachments/${ID}?detach_only=true" &
done

# Wait for all background detach requests to finish before exiting.
wait

~~~

The NPEs are the same as in comment #1.

2020-09-17 10:20:39,330+10 ERROR [org.ovirt.engine.core.bll.storage.disk.DetachDiskFromVmCommand] (default task-23) [cd7e5534-0d1d-438f-b609-305da6e004ff] Command 'org.ovirt.engine.core.bll.storage.disk.DetachDiskFromVmCommand' failed: null
2020-09-17 10:20:39,330+10 ERROR [org.ovirt.engine.core.bll.storage.disk.DetachDiskFromVmCommand] (default task-23) [cd7e5534-0d1d-438f-b609-305da6e004ff] Exception: java.lang.NullPointerException
        at org.ovirt.engine.core.common//org.ovirt.engine.core.common.businessentities.storage.BaseDisk.lambda$setDiskVmElements$0(BaseDisk.java:88)
        at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:177)
        at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
        at java.base/java.util.Collections$2.tryAdvance(Collections.java:4747)
        at java.base/java.util.Collections$2.forEachRemaining(Collections.java:4755)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
        at org.ovirt.engine.core.common//org.ovirt.engine.core.common.businessentities.storage.BaseDisk.setDiskVmElements(BaseDisk.java:88)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.VmHandler.updateDisksVmDataForVm(VmHan
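
To follow one of these failures through the rest of the engine log, the correlation ID (the bracketed field just before "Command") can be pulled out for each failed detach. This is only a rough sketch: it assumes the line layout shown in the excerpt above, the function name `failed_detach_ids` is made up for illustration, and the log path is whatever the caller passes in.

```shell
#!/bin/bash
# Hypothetical helper (not part of oVirt): print the unique correlation IDs
# of failed DetachDiskFromVmCommand runs found in the given engine log.
# Assumes the line layout from the excerpt above:
#   <date> <time> ERROR [<class>] (<thread>) [<correlation-id>] Command '...' failed: ...
failed_detach_ids() {
    local log_file=$1
    # Keep only the "Command ... failed" lines, then strip everything
    # except the bracketed field that follows the thread name.
    grep -F "Command 'org.ovirt.engine.core.bll.storage.disk.DetachDiskFromVmCommand' failed" "$log_file" \
        | sed -E 's/^.*\) \[([^]]+)\] Command .*$/\1/' \
        | sort -u
}
```

For example, `failed_detach_ids /var/log/ovirt-engine/engine.log` (assuming the engine log is at that location) prints one ID per failed command, which can then be used with `grep` to collect every log line belonging to that request.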

Comment 20 Ilan Zuckerman 2020-12-08 14:36:00 UTC
Verified on rhv-release-4.4.4-4-001.noarch with the steps and script from comment #17

1. Create 10 Disks
2. Attach all of them to a VM
3. Detach them in parallel

All the detachments succeeded without any ERRORs.
I repeated the steps with the VM up and with the VM down, first using thin disks and then using raw disks.
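
A check of this kind could also be scripted so that each run of the reproducer is followed by a scan of the engine log. The sketch below is an assumption about how one might do that, not the procedure actually used for verification; `detach_log_is_clean` is a hypothetical name, and the failure string is taken from the log lines in comment #17.

```shell
#!/bin/bash
# Hypothetical check (not part of oVirt): return 0 if the given engine log
# holds no failed DetachDiskFromVmCommand entries, 1 if it holds at least
# one, and 2 if the log cannot be read.
detach_log_is_clean() {
    local log_file=$1
    # An unreadable log proves nothing, so report it separately.
    [ -r "$log_file" ] || return 2
    if grep -qF "DetachDiskFromVmCommand' failed" "$log_file"; then
        return 1
    fi
    return 0
}
```

It could be called right after the reproducer finishes, for example `detach_log_is_clean /var/log/ovirt-engine/engine.log && echo "no detach failures"`, assuming the engine log is at that path and was rotated or truncated before the run so that older failures are not counted.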

Comment 24 errata-xmlrpc 2021-02-02 13:57:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Low: RHV-M(ovirt-engine) 4.4.z security, bug fix, enhancement update [ovirt-4.4.4]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0381

Comment 25 meital avital 2022-08-08 08:48:44 UTC
Due to QE capacity, we are not going to cover this issue in our automation

