Bug 601643 - 2.2.z - VDSM: possible race in 'cont' command
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Hypervisor
Classification: Retired
Component: vdsm
Version: 5.5-2.2
Hardware: All Linux
Priority: high Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Dan Kenigsberg
QA Contact: yeylon@redhat.com
Whiteboard: Storage
Keywords:
Depends On:
Blocks: 603792
Reported: 2010-06-08 07:43 EDT by Yaniv Kaul
Modified: 2016-04-18 02:33 EDT

See Also:
Fixed In Version: vdsm22-4.5-62.8
Doc Type: Bug Fix
Doc Text:
A race condition could arise when the allocated storage space was exceeded.
Last Closed: 2010-07-29 14:50:25 EDT

Attachments
vdsm log (1.05 MB, application/x-gzip)
2010-06-08 07:43 EDT, Yaniv Kaul
patchset to sync lvextend, cont, and migration. (17.03 KB, patch)
2010-06-29 16:21 EDT, Dan Kenigsberg


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0537 | normal | SHIPPED_LIVE | vdsm22 bug fix update | 2010-07-29 14:49:15 EDT

Description Yaniv Kaul 2010-06-08 07:43:22 EDT
Description of problem:
As part of a bigger issue with storage problems, there appears to be a race around the 'cont' command in VDSM.
My scenario was a virtio-blk VM that was paused because the storage domain ran out of space. Multiple 'abnormal vm stop device' errors appeared (with virtio-blk there may be multiple concurrent writes), and these caused multiple lvextend commands. More severe is the fact that, due to the race / lack of a lock, the VM was 'cont'-ed several times, only to hit more no-space errors, which caused still more lvextend commands.
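
To make the missing lock concrete, here is a minimal sketch of how a burst of no-space stop events could be coalesced into one extend followed by one 'cont'. It is an illustration only, not the VDSM code or the eventual patch; the class `ExtendCoordinator` and the `extend` and `cont` callables are hypothetical stand-ins for whatever issues lvextend and resumes the VM.

```python
import threading


class ExtendCoordinator:
    """Hypothetical helper (not VDSM code): coalesces a burst of
    no-space stop events into a single extend plus a single resume."""

    def __init__(self, extend, cont):
        self._extend = extend          # stand-in for issuing lvextend
        self._cont = cont              # stand-in for the 'cont' command
        self._lock = threading.Lock()  # serializes event handling and resume
        self._needs_extend = False     # set by any stop event since last resume

    def on_enospc(self):
        """Record a stop event; many events collapse into one flag."""
        with self._lock:
            self._needs_extend = True

    def resume_if_needed(self):
        """Extend and resume once per burst. Returns True if work was done."""
        with self._lock:
            if not self._needs_extend:
                return False
            self._needs_extend = False
            self._extend()
            self._cont()
            return True


if __name__ == "__main__":
    calls = []
    coord = ExtendCoordinator(lambda: calls.append("extend"),
                              lambda: calls.append("cont"))
    for _ in range(7):                 # seven stop events, as in one burst of the log
        coord.on_enospc()
    coord.resume_if_needed()
    coord.resume_if_needed()           # second call is a no-op
    print(calls)
```

Holding the lock across both the extend and the resume is the design choice that matters here: a second 'cont' cannot be issued while an extend is still in progress.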

Version-Release number of selected component (if applicable):
vdsm22-4.5-62.el5rhev

How reproducible:


Steps to Reproduce:
1. Run a VM with virtio-blk - make sure you run out of storage
Actual results:


Expected results:


Additional info:

QMon-128247::INFO::2010-06-08 11:13:16,531::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,533::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,534::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,536::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,538::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,540::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:16,543::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
eb5151d1-16ec-4f1a-8c56-6e31e0349838::DEBUG::2010-06-08 11:13:17,942::vm::1144::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::cont
QMon-128247::INFO::2010-06-08 11:13:17,982::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,983::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,984::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,986::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,987::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,988::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:17,989::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
162d5b30-cdae-4693-93d5-4f34598a2d45::DEBUG::2010-06-08 11:13:20,538::vm::1144::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::cont
QMon-128247::INFO::2010-06-08 11:13:20,576::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,578::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,579::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,580::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,581::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,582::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
QMon-128247::INFO::2010-06-08 11:13:20,583::vm::958::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::abnormal vm stop device virtio0 error No space left on device
4d89cdcb-653a-4f5f-a137-e1c31e8bd2eb::DEBUG::2010-06-08 11:13:20,808::vm::1144::vds.vmlog.b3e72085-2405-4612-b979-398ecfed05b8::cont
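
The pattern in the excerpt above (a burst of stop events, then a 'cont', repeated) can be counted mechanically. The following is a rough sketch that assumes every log line uses the seven `::`-separated fields seen here (thread, level, timestamp, module, line number, logger, message); the function name `stop_bursts` is made up for illustration.

```python
def stop_bursts(lines):
    """Return the number of 'abnormal vm stop' events seen before each 'cont'.

    Assumes the '::'-separated layout of the excerpt:
    thread::level::timestamp::module::lineno::logger::message
    """
    bursts = []
    current = 0
    for line in lines:
        fields = line.strip().split("::", 6)
        if len(fields) != 7:
            continue                   # skip lines that do not match the layout
        message = fields[6]
        if message.startswith("abnormal vm stop"):
            current += 1
        elif message == "cont":
            bursts.append(current)
            current = 0
    return bursts


if __name__ == "__main__":
    import sys
    # Usage (path is an example): python stop_bursts.py /path/to/vdsm.log
    with open(sys.argv[1]) as log:
        print(stop_bursts(log))
```

Run over the excerpt in this description, it would report three bursts of seven stop events each, one burst per 'cont'.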
Comment 1 Yaniv Kaul 2010-06-08 07:43:49 EDT
Created attachment 422153
vdsm log
Comment 3 Haim 2010-06-20 11:51:08 EDT
This happened to me as well while testing the lvextend scenario, and in the end it caused image corruption.
I get lots of the errors below.

Result: qcow image corruption if the VM is migrated during an lvextend operation.

QMon-398::INFO::2010-06-20 17:38:58,362::vm::958::vds.vmlog.c19186a9-1279-4214-9411-ef580d85f170::abnormal vm stop device ide0-hd0 error Invalid argument
QMon-398::INFO::2010-06-20 17:40:35,946::vm::958::vds.vmlog.c19186a9-1279-4214-9411-ef580d85f170::abnormal vm stop device ide0-hd0 error No space left on device

Running qemu-img check on the VM image revealed lots of errors. Dan is aware of it.

Scenario: create a VM with a 25G thinly provisioned disk on a storage domain that is smaller (15G).

1) 2 hosts on iSCSI storage
2) VM (live CD) running on the non-SPM host
3) Start dd on the guest: dd if=/dev/zero of=/dev/sda bs=1M
4) Wait for several lvextend messages

Currently testing danken's new fix.
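
For anyone repeating the image check in this scenario, here is a small sketch of running it from a script. It assumes the tool meant above is qemu-img and that the image is qcow2; the helper names and the example path are hypothetical, and the meaning of the return code should be checked against the installed qemu-img version.

```python
import subprocess


def build_check_command(image_path, fmt="qcow2"):
    """Build the argument list for 'qemu-img check' on one image."""
    return ["qemu-img", "check", "-f", fmt, image_path]


def check_image(image_path, fmt="qcow2"):
    """Run the check and return (return code, combined output).

    The caller decides how to interpret the return code.
    """
    proc = subprocess.run(build_check_command(image_path, fmt),
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          universal_newlines=True)
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Example path only; replace with the volume backing the VM disk.
    code, output = check_image("/path/to/vm-disk.qcow2")
    print(code)
    print(output)
```

The check should be run only while the VM is not writing to the image.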
Comment 4 Yaniv Kaul 2010-06-20 14:24:44 EDT
(In reply to comment #3)

What KVM version have you used?
Comment 5 Haim 2010-06-20 15:51:13 EDT
kvm-83-164.el5 on silver-vdsc and kvm-83-164.el5_5.9 on cyan-vdsc. 
What KVM version do you have?
Comment 6 Alan Pevec 2010-06-20 17:22:01 EDT
(In reply to comment #5)
> kvm-83-164.el5 on silver-vdsc and kvm-83-164.el5_5.9 on cyan-vdsc. 
> what KVM version you have ?    

Please use the latest, kvm-83-164.el5_5.12.
Comment 7 Haim 2010-06-21 11:58:07 EDT
Moved to the latest kvm-83-164.el5_5.12 and continuing testing there.
Comment 8 Bat-hen Nagel 2010-06-29 07:42:22 EDT
Verified sm75
Comment 9 Dan Kenigsberg 2010-06-29 12:12:53 EDT
(In reply to comment #8)
> Verified sm75    

Please try to give a more detailed description of the verification process.
This bug could not have been verified, since it is not solved yet.

See you tomorrow.
Comment 11 Dan Kenigsberg 2010-06-29 16:21:07 EDT
Created attachment 427786
patchset to sync lvextend, cont, and migration.
Comment 12 Jaromir Hradilek 2010-07-11 09:57:58 EDT
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
A race condition could arise when the allocated storage space was exceeded.
Comment 13 Bat-hen Nagel 2010-07-13 10:20:42 EDT
Verified using Haim's steps.
vdsm22-4.5-62.8.el5_5rhev2_2
Comment 15 errata-xmlrpc 2010-07-29 14:50:25 EDT
An advisory has been issued which should help resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0537.html
