The idea is to add a bdrv_pause/bdrv_resume pair, where:

- bdrv_pause checks whether there is an operation blocker for PAUSE; if one is present, it fails.
- Otherwise, bdrv_pause invokes a notifier list if this is the outermost call; nested calls do nothing.
- bdrv_resume does the same, but does not need a blocker.
- Dataplane registers pause/resume notifiers for its BlockBackend, which respectively disconnect/reconnect the ioeventfd.
- Callers of bdrv_drain/bdrv_drain_all are a good place to add pause/resume. Because bdrv_pause_all() makes little sense (e.g. when doing something related to the CPU, we do not care about pausing block job targets), bdrv_drain_all() should in many cases be replaced by bdrv_drain().
- For transactions, bdrv_drain_all() should probably be replaced by bdrv_drain() invocations in the prepare callback, so that the callback can also invoke bdrv_pause() before bdrv_drain().

Optionally:

- Block jobs can register pause/resume notifiers for their targets, which pause/resume the job, or they can block PAUSE on their targets. In particular, drive-backup should block PAUSE on its target; everything else can register notifiers.

A rough sketch of the proposed API is included after the IRC log below.

-------------------------------------------------------------------------------

IRC conversation:

<bonzini> stefanha: iirc we never implemented "stop ioeventfd while the main context is acquiring an additional aiocontext", have we?
<stefanha> bonzini: The necessary AioContexts need to be acquired for the entire duration of the transaction,
<bonzini> stefanha: yes, but the VM can still write in the meanwhile
<stefanha> bonzini: I think AioContext acquire/release is oversimplifying the problem,
<stefanha> there are actually multiple mechanisms we need.
<stefanha> bonzini: In this case I think a device callback might make sense,
<bonzini> stefanha: yes, like a pause/restart notifier on AioContext
<stefanha> so the monitor can tell the device associated with a BlockBackend to stop submitting requests.
<bonzini> stefanha: exactly, pausing between acquire and bdrv_drain, and restarting before release
<stefanha> Yes
<stefanha> bonzini: there are two different concepts here:
<stefanha> 1. AioContext acquire/release is about thread-safety
<stefanha> 2. Device pause/resume is about quiescing I/O from the guest
<stefanha> I still think #1 is useful
<bonzini> stefanha: yes it definitely is. the important point for me is that thread-safety can be made as fine-grained as you want
<bonzini> stefanha: but pause/resume cannot be made fine-grained
<bonzini> stefanha: so far we had coarse-grained critical sections, so we could coalesce the two
<bonzini> stefanha: (could, but didn't since we never implemented #2 :D)
<stefanha> bonzini: We need to be careful about pause/resume
<stefanha> There are also timers and blockjobs
<stefanha> both could cause additional I/O
<stefanha> So even if we stop the device from submitting requests, we also need to pause blockjobs and maybe also timers.
<stefanha> We should do the same for QMP 'transaction'.
<bonzini> stefanha: it's in qemu-devel, i don't recall you merged it
<bonzini> stefanha: timers should be included in bdrv_drain_all, or disabled by other notifiers
<bonzini> stefanha: block jobs are interesting
<stefanha> e.g. drive-backup
<stefanha> It hooks into the write callback
<stefanha> The write request is paused until drive-backup has stashed away the data,
<stefanha> but we might hit a deadlock there if the blockjob is paused...
<stefanha> This is hard
* stefanha goes to write fizzbuzz instead
<bonzini> stefanha: perhaps pause/resume should have blockers too
<bonzini> stefanha: drive-backup can block pause/resume on the destination
<stefanha> bonzini: Yes, we can use op blockers
<bonzini> stefanha: and transaction can limit the pause/resume to things it has to operate on
<bonzini> stefanha: but streaming, for example, is compatible with pause/resume, because its writes do not change disk contents
<stefanha> So it's more nuanced than just 'writes are forbidden'
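To make the proposal above more concrete, here is a minimal C sketch of what the pause/resume pair could look like. It assumes a new BLOCK_OP_TYPE_PAUSE op-blocker type and new pause_count/pause_notifiers/resume_notifiers fields on BlockDriverState; none of these exist in QEMU as written here. Only the notifier-list and op-blocker helpers (notifier_list_notify, bdrv_op_is_blocked, bdrv_op_block) are existing APIs; everything else is hypothetical.

/* Hypothetical sketch only -- bdrv_pause()/bdrv_resume() and
 * BLOCK_OP_TYPE_PAUSE do not exist in QEMU; pause_count,
 * pause_notifiers and resume_notifiers are assumed additions
 * to BlockDriverState. */

#include "qemu/osdep.h"
#include "qemu/notify.h"
#include "block/block_int.h"

int bdrv_pause(BlockDriverState *bs, Error **errp)
{
    /* A job such as drive-backup would block PAUSE on its target,
     * e.g. with bdrv_op_block(target, BLOCK_OP_TYPE_PAUSE, reason). */
    if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_PAUSE, errp)) {
        return -EBUSY;
    }

    /* Only the outermost call fires the notifiers; nested calls
     * just bump the counter. */
    if (bs->pause_count++ == 0) {
        notifier_list_notify(&bs->pause_notifiers, bs);
    }
    return 0;
}

void bdrv_resume(BlockDriverState *bs)
{
    /* No blocker check on resume; only the outermost resume notifies. */
    assert(bs->pause_count > 0);
    if (--bs->pause_count == 0) {
        notifier_list_notify(&bs->resume_notifiers, bs);
    }
}

/* A dataplane-style user would register notifiers like these, whose
 * bodies disconnect/reconnect the ioeventfd so the guest stops
 * submitting requests while paused (illustrative names, not the real
 * virtio-blk dataplane code). */
static void my_dataplane_pause(Notifier *n, void *opaque)
{
    /* stop the ioeventfd handler / virtqueue processing here */
}

static void my_dataplane_resume(Notifier *n, void *opaque)
{
    /* restart the ioeventfd handler here */
}

Under these assumptions, a transaction's prepare callback would bracket bdrv_drain() with bdrv_pause()/bdrv_resume(), and the op blocker gives drive-backup a way to veto pausing its target, as discussed in the IRC log above.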
Fix included in qemu-kvm-rhev-2.3.0-24.el7
Hi Paolo, From KVM QE POV, could you please tell me how to verify this bug? Thanks.
KVM QE tested: Do three live snapshots in a transaction, and make the third one fail. qemu-kvm-rhev-2.3.0-23.el7.x86_64 and qemu-kvm-rhev-2.3.0-24.el7.x86_64 give the same result: all the live snapshots fail.

Versions:
kernel: 3.10.0-318.el7.x86_64
qemu-kvm-rhev:
(1) qemu-kvm-rhev-2.3.0-23.el7.x86_64
(2) qemu-kvm-rhev-2.3.0-24.el7.x86_64

Steps:

1. Boot the guest with two data disks, all disks using dataplane:

# /usr/libexec/qemu-kvm -name rhel6.7 -machine pc-i440fx-rhel7.2.0,accel=kvm \
    -cpu SandyBridge -m 2G,slots=256,maxmem=40G -numa node \
    -smp 4,sockets=2,cores=2,threads=1 \
    -uuid 82b1a01e-5f6c-4f5f-8d27-3855a74e6b6b \
    -netdev tap,id=hostnet0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=12:54:00:5c:88:6d \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 \
    -spice port=5900,addr=0.0.0.0,disable-ticketing,image-compression=off,seamless-migration=on \
    -monitor stdio \
    -serial unix:/tmp/monitor,server,nowait \
    -qmp tcp:0:5555,server,nowait \
    -object iothread,id=iothread0 \
    -drive file=/home/rhel6.7_virtio.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0,iothread=iothread0 \
    -object iothread,id=iothread1 \
    -drive file=/home/data1.qcow2,format=qcow2,if=none,id=drive-virtio-blk1,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive-virtio-blk1,id=virtio-blk1,iothread=iothread1 \
    -object iothread,id=iothread2 \
    -drive file=/home/data2.qcow2,format=qcow2,if=none,id=drive-virtio-blk2,werror=stop,rerror=stop \
    -device virtio-blk-pci,drive=drive-virtio-blk2,id=virtio-blk2,iothread=iothread2

2. Do 3 live snapshots in a transaction, and make the third one fail (invalid path):

{ "execute": "transaction", "arguments": { "actions": [
    { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive-virtio-blk0", "snapshot-file": "/home/snapshotA.qcow2", "format": "qcow2" } },
    { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive-virtio-blk1", "snapshot-file": "/home/snapshotB.qcow2", "mode": "absolute-paths", "format": "qcow2" } },
    { 'type': 'blockdev-snapshot-sync', 'data' : { "device": "drive-virtio-blk2", "snapshot-file": "/homex/snapshotC.qcow2", "format": "qcow2" } }
] } }

3. All 3 live snapshots failed:

{"error": {"class": "GenericError", "desc": "Could not create file: No such file or directory"}}

# ls /home/
snapshotA.qcow2  snapshotB.qcow2 ...

(qemu) info block
drive-virtio-blk0: /home/rhel6.7_virtio.qcow2 (qcow2)
    Cache mode: writeback
drive-virtio-blk1: /home/data1.qcow2 (qcow2)
    Cache mode: writeback
drive-virtio-blk2: /home/data2.qcow2 (qcow2)
    Cache mode: writeback
...
According to comment 6 and comment 7, setting this issue to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2546.html