Bug 1940118 - abrt-hook-ccpp[57365]: Process * (qemu-kvm) of user 107 killed by SIGABRT in live storage migration process
Summary: abrt-hook-ccpp[57365]: Process * (qemu-kvm) of user 107 killed by SIGABRT in ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.40.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.6
Assignee: Ritesh Chikatwar
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On: 1947772
Blocks:
 
Reported: 2021-03-17 15:56 UTC by Mikhail Gusev
Modified: 2023-09-15 01:03 UTC
CC: 20 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-17 13:14:37 UTC
oVirt Team: Gluster
Embargoed:
sbonazzo: ovirt-4.4+


Attachments
Logs and stack trace (11.85 KB, text/plain)
2021-03-17 15:56 UTC, Mikhail Gusev


Links
oVirt gerrit 114184 (master, MERGED): storageServer: Added validation for gluster volume type (last updated 2021-04-15 06:03:42 UTC)

Description Mikhail Gusev 2021-03-17 15:56:16 UTC
Created attachment 1764108 [details]
Logs and stack trace

After starting a disk migration between storage domains (NFS -> GlusterFS), the qemu process was killed by SIGABRT. There are no errors in the vdsm log.
RHV Manager did not notice this, and the status of the virtual machine remained "UP".
The migration process never completes. Uncompleted tasks: "Merging snapshots (<UNKNOWN> into <UNKNOWN>) of disk <UNKNOWN>", "Cloning image's structure", "Migrating Disk", "Removing Snapshot test-vm-01_Disk1 Auto-generated for Live Storage Migration of VM".
Stack trace is in the attachment.

The problem occurs only when live-migrating non-OS disks from an NFS domain to a GlusterFS domain, and only with a Disperse-Distribute gluster volume.
Migration from GlusterFS to NFS, migration of OS disks from an NFS domain to a GlusterFS domain, and migration to a Replica gluster volume all work fine.

I think the problem is related to the following qemu and gluster errors:

Qemu:
Unexpected error in raw_check_lock_bytes() at /builddir/build/BUILD/qemu-5.1.0/block/file-posix.c:911:
2021-03-19T05:08:05.301126Z qemu-kvm: Failed to get shared "write" lock 

Gluster:
[2021-03-19 05:08:05.194444] I [MSGID: 109066] [dht-rename.c:1953:dht_rename] 0-gv0-dht: renaming /75ce2254-35ad-44fa-adeb-e8dfbc4
[2021-03-19 05:08:05.288127] W [MSGID: 122019] [ec-helpers.c:401:ec_loc_gfid_check] 0-gv0-disperse-2: Mismatching GFID's in loc
[2021-03-19 05:08:05.288164] W [MSGID: 109002] [dht-rename.c:1021:dht_rename_links_create_cbk] 0-gv0-dht: link/file /75ce2254-35ad-44fa-adeb-e8dfbc4ba4fa/images/033c73a2-8556-40c2-8d14-7ffbd64dcc89/8795c59c-15c2-479d-8684-3328be686a75.meta on gv0-disperse-2 failed [Input/output error]

Comment 1 RHEL Program Management 2021-03-18 00:54:23 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 2 Mikhail Gusev 2021-03-18 11:39:51 UTC
VDSM log on issue:
2021-03-18 13:49:17,066+0300 INFO  (jsonrpc/2) [api.host] START getStats() from=::ffff:10.195.6.134,51248 (api:48)
2021-03-18 13:49:17,085+0300 INFO  (jsonrpc/2) [vdsm.api] START repoStats(domains=()) from=::ffff:10.195.6.134,51248, task_id=f71b3bf9-0a18-4ac9-be88-e5eacdd5eafb (api:48)
2021-03-18 13:49:17,085+0300 INFO  (jsonrpc/2) [vdsm.api] FINISH repoStats return={'a8ff2546-f8af-4edb-90ca-1bcdb4c262be': {'code': 0, 'lastCheck': '0.7', 'delay': '0.00580853', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}, '75ce2254-35ad-44fa-adeb-e8dfbc4ba4fa': {'code': 0, 'lastCheck': '2.1', 'delay': '0.00342767', 'valid': True, 'version': 5, 'acquired': True, 'actual': True}} from=::ffff:10.195.6.134,51248, task_id=f71b3bf9-0a18-4ac9-be88-e5eacdd5eafb (api:54)
2021-03-18 13:49:17,085+0300 INFO  (jsonrpc/2) [vdsm.api] START multipath_health() from=::ffff:10.195.6.134,51248, task_id=1ead6065-dbd9-4dbe-bc5f-05e8eec74ea8 (api:48)
2021-03-18 13:49:17,085+0300 INFO  (jsonrpc/2) [vdsm.api] FINISH multipath_health return={} from=::ffff:10.195.6.134,51248, task_id=1ead6065-dbd9-4dbe-bc5f-05e8eec74ea8 (api:54)
2021-03-18 13:49:17,089+0300 INFO  (jsonrpc/2) [api.host] FINISH getStats return={'status': {'code': 0, 'message': 'Done'}, 'info': (suppressed)} from=::ffff:10.195.6.134,51248 (api:54)
2021-03-18 13:49:18,396+0300 INFO  (jsonrpc/3) [api.host] START getAllVmStats() from=::1,45388 (api:48)
2021-03-18 13:49:18,397+0300 INFO  (jsonrpc/3) [api.host] FINISH getAllVmStats return={'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::1,45388 (api:54)
2021-03-18 13:49:18,401+0300 INFO  (jsonrpc/1) [api.host] START getAllVmIoTunePolicies() from=::1,45388 (api:48)
2021-03-18 13:49:18,401+0300 INFO  (jsonrpc/1) [api.host] FINISH getAllVmIoTunePolicies return={'status': {'code': 0, 'message': 'Done'}, 'io_tune_policies_dict': {'5255b072-53a8-492d-9cfa-248ce9886b64': {'policy': [], 'current_values': [{'name': 'sda', 'path': '/rhev/data-center/mnt/glusterfs-test.mgmt.tensor.ru:_var_local_rhv-test-storage/a8ff2546-f8af-4edb-90ca-1bcdb4c262be/images/b5ed61dc-8008-4ba7-ac19-7077a636db3d/1d07b75d-40ce-4be9-a7d6-0fed5dd12aff', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}, {'name': 'sdb', 'path': '/rhev/data-center/mnt/glusterSD/sds0101:_gv0/75ce2254-35ad-44fa-adeb-e8dfbc4ba4fa/images/d71cf39d-9b0e-49bb-88c6-3a3a515c18e4/9612572d-1a85-410c-a888-992184d08284', 'ioTune': {'total_bytes_sec': 0, 'read_bytes_sec': 0, 'write_bytes_sec': 0, 'total_iops_sec': 0, 'write_iops_sec': 0, 'read_iops_sec': 0}}]}}} from=::1,45388 (api:54)
2021-03-18 13:49:25,700+0300 WARN  (periodic/0) [virt.periodic.VmDispatcher] could not run <class 'vdsm.virt.periodic.BlockjobMonitor'> on ['5255b072-53a8-492d-9cfa-248ce9886b64'] (periodic:289)
2021-03-18 13:49:25,702+0300 INFO  (qgapoller/0) [vds] Failed to get guest info for vm=5255b072-53a8-492d-9cfa-248ce9886b64, error: Guest agent is not responding: QEMU guest agent is not connected (qemuguestagent:414)
2021-03-18 13:49:26,224+0300 INFO  (jsonrpc/4) [api.host] START getAllVmStats() from=::ffff:10.195.6.134,51248 (api:48)
2021-03-18 13:49:26,224+0300 WARN  (jsonrpc/4) [virt.vmstats] Missing stat: 'net.0.rx.errs' for vm 5255b072-53a8-492d-9cfa-248ce9886b64 (vmstats:490)
2021-03-18 13:49:26,224+0300 WARN  (jsonrpc/4) [virt.vmstats] Missing stat: 'net.0.rx.bytes' for vm 5255b072-53a8-492d-9cfa-248ce9886b64 (vmstats:490)
2021-03-18 13:49:26,225+0300 INFO  (jsonrpc/4) [api.host] FINISH getAllVmStats return={'status': {'code': 0, 'message': 'Done'}, 'statsList': (suppressed)} from=::ffff:10.195.6.134,51248 (api:54)

Comment 4 Michal Skrivanek 2021-03-22 11:12:04 UTC
For RHV assistance, please open a support case.

Moving to upstream.

Comment 5 Eyal Shenitzky 2021-03-22 11:15:39 UTC
Hi Mikhail,

Can you please add the engine and VDSM versions?

Comment 6 Eyal Shenitzky 2021-03-22 15:15:50 UTC
Nir, can you please have a look?

Comment 7 Nir Soffer 2021-03-22 15:35:21 UTC
A qemu abort is either a qemu bug or an issue with storage.

Kevin, should we move the bug to qemu?

Comment 8 Nir Soffer 2021-03-22 15:42:06 UTC
(In reply to Mikhail Gusev from comment #0)
> After starting a disk migration between storage domains (NFS -> GlusterFS),
> the qemu process was killed by SIGABRT. There are no errors in the vdsm log.

Based on the rest of the description, I guess you mean live storage migration
(moving a disk between storage domains while the VM is running).

> RHV Manager did not notice this, and the status of the virtual machine remained "UP".
> The migration process never completes. Uncompleted tasks: "Merging snapshots
> (<UNKNOWN> into <UNKNOWN>) of disk <UNKNOWN>", "Cloning image's structure",
> "Migrating Disk", "Removing Snapshot test-vm-01_Disk1 Auto-generated for Live
> Storage Migration of VM".

This sounds like an engine bug; it would be better to file a separate engine bug for this.

> Stack trace is in the attachment.
> 
> The problem occurs only when live-migrating non-OS disks from an NFS domain
> to a GlusterFS domain, and only with a Disperse-Distribute gluster volume.
> Migration from GlusterFS to NFS, migration of OS disks from an NFS domain to
> a GlusterFS domain, and migration to a Replica gluster volume all work fine.

I don't know if we support Disperse-Distribute gluster disks. I know that 
replica 3 or replica 2 + arbiter are supported configurations.

It looks like we have 2 issues:
- qemu aborts - should be handled by qemu developers; it may be caused
  by a gluster issue, but qemu should not abort because of runtime errors.
- engine not handling the flow - should be handled by ovirt-engine

I don't see an issue in vdsm.

Comment 9 Nir Soffer 2021-03-22 15:47:45 UTC
Regarding the attached logs - please provide full vdsm and engine
logs, and the complete core dump. The traceback from the abrt report
is not useful.

Comment 10 Kevin Wolf 2021-03-22 17:00:41 UTC
(In reply to Nir Soffer from comment #7)
> Kevin, should we move the bug to qemu?

If QEMU aborts, there is a QEMU bug involved. Whether or not there is an additional bug in the management layer is something you would have to tell.

Specifically, this looks like completion of the mirror job tries to remove the mirror-top node and this fails with some locking problem. In theory, removing this node shouldn't require any permissions that weren't already taken before, so there shouldn't even be an attempt to lock something. A reproducer on the QEMU level or at least a core dump would be helpful if you're moving the bug to qemu-kvm.

Comment 11 Hanna Czenczek 2021-03-23 17:07:17 UTC
I can reproduce something like the given stack trace as follows:

$ qemu-img create -f raw src.img 64M
Formatting 'src.img', fmt=raw size=67108864
$ qemu-img create -f raw tgt.img 64M
Formatting 'tgt.img', fmt=raw size=67108864

$ qemu-system-x86_64 \
    -blockdev file,node-name=source,filename=src.img \
    -blockdev file,node-name=target,filename=tgt.img \
    -device virtio-blk,drive=source \
    -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 5}, "package": ""}, "capabilities": ["oob"]}}

{"execute":"qmp_capabilities"}
{"return": {}}

{"execute":"blockdev-mirror","arguments":{"job-id":"mirror","device":"source","target":"target","sync":"full","filter-node-name":"mirror-top"}}
{"timestamp": {"seconds": 1616518637, "microseconds": 489079}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "mirror"}}
{"timestamp": {"seconds": 1616518637, "microseconds": 489180}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "mirror"}}
{"return": {}}
{"timestamp": {"seconds": 1616518637, "microseconds": 490552}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "mirror"}}
{"timestamp": {"seconds": 1616518637, "microseconds": 490594}, "event": "BLOCK_JOB_READY", "data": {"device": "mirror", "len": 67108864, "offset": 67108864, "speed": 0, "type": "mirror"}}


Then, in another shell:

$ qemu-io -f raw src.img
>


Back to QMP:

{"execute":"block-job-cancel","arguments":{"device":"mirror"}}
{"return": {}}
{"timestamp": {"seconds": 1616518653, "microseconds": 675887}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "mirror"}}
{"timestamp": {"seconds": 1616518653, "microseconds": 675926}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "mirror"}}
Unexpected error in raw_check_lock_bytes() at ../qemu-5.2.0/block/file-posix.c:908:
qemu-system-x86_64: Failed to get shared "write" lock
[1]    110029 abort (core dumped)  qemu-system-x86_64 -blockdev file,node-name=source,filename=src.img -blockdev


The stack trace is the same mirror_exit_common() -> bdrv_replace_node() -> bdrv_check_perm() -> raw_check_lock_bytes() chain as in comment 0.

The problem in this case is that the mirror filter node doesn’t take proper permissions, so the image file can freely be used by other qemu processes while the job is running.  If they still access the file when the job completes, the VM will crash because it wants its permissions back.

I don’t know whether that’s cause for the crash described in comment 0.  It seems plausible.  But it would require some other process to open the image file while the mirror job is running, which I would have thought to be unlikely.  On the other hand, the stack trace only leaves the conclusion that exactly that is happening: Some other process holds locks on the file when mirror completes (or raw_check_lock_bytes() wouldn’t give an error).


I think applying https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00933.html would fix at least what I’ve described here, because it makes the mirror job retain all permissions it needs while it’s running, so you cannot do the “launch qemu-io concurrently in another shell” step.  (qemu-io will give an error that it can’t take the locks it needs.)

Max

Comment 12 Eyal Shenitzky 2021-04-05 14:15:19 UTC
Do we have a QEMU bug that we can track using this bug?

Comment 13 Kevin Wolf 2021-04-06 09:50:57 UTC
We should probably create a separate QEMU bug report then to backport the fix that Max mentioned.

However, if a second process accessing the image at the same time is the reason, we have a management layer bug as well because that shouldn't happen. Once we fix QEMU, trying to access the image from the second process will fail because locking will prevent this. So we should probably use this bug for the RHV part.

The next step would then be verifying that the image is indeed opened twice by RHV, and finding out why this happens.
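As a rough way to check whether the image really is opened by a second process while the mirror job runs, one could scan /proc on the hypervisor host. This is only a hypothetical helper for debugging, roughly equivalent to running lsof on the image path; it is not part of any RHV tooling:

import os

def processes_with_file_open(path):
    """Return the pids of processes that currently have `path` open.

    Rough equivalent of `lsof <path>`; needs enough privileges to read
    other processes' /proc/<pid>/fd entries (e.g. run as root on the
    hypervisor host).
    """
    target = os.path.realpath(path)
    pids = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = "/proc/{}/fd".format(pid)
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or access denied
        for fd in fds:
            try:
                if os.path.realpath(os.path.join(fd_dir, fd)) == target:
                    pids.append(int(pid))
                    break
            except OSError:
                continue
    return pids

Seeing any pid other than the VM's own qemu-kvm process in that list around mirror completion would support the theory that a second opener is involved.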

Comment 14 Nir Soffer 2021-04-06 10:25:47 UTC
(In reply to Kevin Wolf from comment #13)
> The next step would then be verifying that the image is indeed opened twice
> by RHV, and finding out why this happens.

We never access active images with "qemu-img convert" or qemu-nbd.

What happens in live storage migration:

1. ovirt creates a live snapshot

    src: A <- B (top)

2. ovirt creates the target image chain on the destination storage

    dst: A <- B (top)

3. vdsm starts blockCopy in libvirt for the active image
4. qemu starts mirroring data from the top image to the destination

    qemu: mirror src:B to dst:B

    libvirt should make sure dst:A is not opened by qemu since
    this image is used by vdsm at this point.

5. vdsm copies image A on src to image A on dst

    vdsm: copy src:A to dst:A with qemu-img convert (sketched below)

6. when copying A finishes and mirroring B is ready, vdsm asks libvirt
   to pivot from src:B to dst:B

   at this point qemu should open dst:A.

So at no point do we have qemu-img convert writing to an image opened by qemu.
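For illustration, a minimal sketch of what step 5 amounts to. This is not vdsm's actual code; the helper name and the raw/raw formats are assumptions made for the example:

import subprocess

def copy_base_volume(src_path, dst_path, src_format="raw", dst_format="raw"):
    # Hypothetical helper: copy the read-only base volume src:A into the
    # pre-created dst:A while qemu mirrors the active layer B.
    subprocess.run(
        ["qemu-img", "convert",
         "-n",                        # target already created in step 2
         "-f", src_format,            # source format
         "-O", dst_format,            # destination format
         "-t", "none", "-T", "none",  # bypass host page cache on both sides
         src_path, dst_path],
        check=True)

The point relevant to this bug is that such a copy only ever writes to dst:A, which qemu is not supposed to have open until the pivot in step 6.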

I think this issue is most likely related to gluster Disperse-Distribute
volumes. I don't know if they are supported and tested. Replica 3 volumes
and Replica 2 + arbiter are supported configurations.

Comment 15 Kevin Wolf 2021-04-06 11:13:15 UTC
You mean a gluster bug related to file locking? So that maybe it would fail an attempt to lock the image even though the image isn't actually locked by anyone? Or it doesn't unlock the file when we drop the lock and then the old lock prevents us from taking a new one?

If there is nobody else who could be taking a lock, I guess a bug in the file locking mechanism itself is the only other option.

Comment 16 Nir Soffer 2021-04-06 11:16:55 UTC
I think this issue is likely an unexpected behavior when using gluster
Disperse-Distribute volumes, so it should be handled by the gluster team.

Comment 17 Nir Soffer 2021-04-06 11:26:04 UTC
Update from Sahina:
> We do not support Disperse volumes in oVirt for storing VM images.

So the question is how vdsm allowed working with an unsupported volume
type. We may need to improve validation when connecting to a gluster
storage domain.

In the past we added code to warn about an unsupported replica count; we
probably need to warn about or reject volumes with unsupported types.
https://github.com/oVirt/vdsm/blob/6688ebfc0aaec8ea50d54a749d42950386dcda49/lib/vdsm/storage/storageServer.py#L288
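For illustration, a minimal sketch of such a validation. This is not the merged patch (gerrit 114184); the function name and the "volumeType" key are assumptions modeled on the existing replica-count warning:

SUPPORTED_VOLUME_TYPES = frozenset({"REPLICATE", "DISTRIBUTED_REPLICATE"})

def validate_volume_type(log, volume_name, volume_info):
    # Hypothetical sketch, not vdsm code. Assumes volume_info is the dict
    # returned by the gluster volume info query and that it carries a
    # "volumeType" key such as "REPLICATE" or "DISPERSE".
    vol_type = volume_info.get("volumeType", "UNKNOWN")
    if vol_type not in SUPPORTED_VOLUME_TYPES:
        # Warn rather than fail so existing setups keep working while the
        # unsupported configuration is flagged in the vdsm log.
        log.warning(
            "Unsupported volume type, volume: %r, volume type: %r. "
            "Please use the replicate type. To recover, migrate the "
            "existing storage to a supported type.",
            volume_name, vol_type)

Comment 28 below shows a warning along these lines being logged by ovirt-4.4.6.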

Comment 18 Sahina Bose 2021-04-06 11:28:34 UTC
Ritesh, can you implement this validation?

Comment 19 Nir Soffer 2021-04-06 11:31:28 UTC
Kevin, can you file the qemu bug related to comment 11?
I don't understand the issue well enough to file a good bug report myself.

Comment 20 Ritesh Chikatwar 2021-04-08 07:49:57 UTC
I raised a PR to handle this in vdsm:

https://github.com/oVirt/vdsm/pull/22

@Nir, can you review?

Comment 21 Eyal Shenitzky 2021-04-08 07:58:05 UTC
(In reply to Ritesh Chikatwar from comment #20)
> I raised a PR to handle this in vdsm:
> 
> https://github.com/oVirt/vdsm/pull/22
> 
> @Nir, can you review?

Please submit the patch to - https://gerrit.ovirt.org/

The GitHub repository is a mirror of Gerrit and shouldn't be used for enhancements.

Comment 22 Ritesh Chikatwar 2021-04-08 10:49:32 UTC
(In reply to Eyal Shenitzky from comment #21)
> (In reply to Ritesh Chikatwar from comment #20)
> > I raised a PR to handle this in vdsm:
> > 
> > https://github.com/oVirt/vdsm/pull/22
> > 
> > @Nir, can you review?
> 
> Please submit the patch to - https://gerrit.ovirt.org/
I did.
> 
> The GitHub repository is a mirror of Gerrit and shouldn't be used for
> enhancements.

Comment 23 Hanna Czenczek 2021-04-09 08:47:14 UTC
(In reply to Eyal Shenitzky from comment #12)
> Do we have a QEMU bug that we can track using this bug?

As was said, we should probably open a new bug.  So:

(In reply to Nir Soffer from comment #19)
> Kevin, can you file the qemu bug related to comment 11?

I’ve created https://bugzilla.redhat.com/show_bug.cgi?id=1947772.

Comment 24 Evelina Shames 2021-04-26 10:23:07 UTC
Can you please elaborate on this issue?
What exactly is the issue? What is the fix? How can I reproduce the issue and test the fix?
Thanks

Comment 28 SATHEESARAN 2021-05-17 11:29:02 UTC
Using a disperse gluster volume as a storage domain is still allowed in ovirt-4.4.6.
The warning message is logged in /var/log/vdsm.log as expected.

<snip>
2021-05-17 11:23:38,299+0000 WARN  (jsonrpc/7) [storage.StorageServer.MountConnection] Unsupported volume type, volume: 'dispvol', volume type: 'DISPERSE'. Please use the replicate type.To recover existing migrate it to supported type. (storageServer:306)
</snip>

The change to disallow storage domain creation with a disperse gluster volume is tracked in
bug https://bugzilla.redhat.com/show_bug.cgi?id=1951894.

Comment 29 Red Hat Bugzilla 2023-09-15 01:03:34 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

