Bug 612089
| Summary: | virtio-blk: Possible loss of flushes after werror/rerror stop | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Kevin Wolf <kwolf> |
| Component: | qemu-kvm | Assignee: | Kevin Wolf <kwolf> |
| Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 6.0 | CC: | chellwig, juzhang, mkenneth, tburke, virt-maint |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2010-07-16 08:57:28 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
**Description** (Kevin Wolf, 2010-07-07 10:01:18 UTC)
Analysis:

If we stop the VM on an error in virtio_blk_handle_rw_error, we put the request we stop on in front of the s->rq queue, which we then serialize in virtio_blk_save. Once the VM is restarted, we deserialize the requests again in virtio_blk_load and finally resume them in virtio_blk_dma_restart_bh. We do this for all requests, including flush requests. So we definitely do handle flushes on VM abort/restart.

Now the question is whether we handle the requests (flush or not) in the right order:

- In virtio_blk_handle_rw_error we add the newly failing request to the front of the queue, effectively reversing the order of the requests.
- In virtio_blk_save we serialize the requests in the order found in the queue.
- In virtio_blk_load each deserialized request is added to the front of the s->rq queue, effectively reversing the order in which it was serialized.
- virtio_blk_dma_restart_bh then handles the requests in order.

So I think we are fine, but the way the s->rq queue is manipulated smells like this happened more by accident than by intention.

(In reply to comment #2)
> If we stop the VM on an error in virtio_blk_handle_rw_error, we put the request
> we stop on in front of the s->rq queue, which we then serialize in
> virtio_blk_save.
>
> Once the VM is restarted, we deserialize the requests again in virtio_blk_load
> and finally resume them in virtio_blk_dma_restart_bh. We do this for all
> requests, including flush requests.
>
> So we definitely do handle flushes on VM abort/restart.

Well, yes and no. If a flush fails, we put it into the queue, that's right. But that's not the whole truth: the flush may succeed while a prior write request fails. That should not be a problem as long as we signal the failure back to the guest, but when we restart the write request ourselves, the write succeeds from the guest's point of view even though it has moved across the barrier. So I think we need to flush once again after the restarted write request.
I think this is actually a pretty common case with RHEV: writes won't succeed because of ENOSPC, but a flush doesn't need more space and still works.

(In reply to comment #3)
> Well, yes and no. If a flush fails, we put it into the queue, that's right.
> But that's not the whole truth: the flush may succeed while a prior write
> request fails. That should not be a problem as long as we signal the failure
> back to the guest, but when we restart the write request ourselves, the write
> succeeds from the guest's point of view even though it has moved across the
> barrier. So I think we need to flush once again after the restarted write
> request.

I/O ordering is a host problem. Linux guests will never start a flush request before the queue has been drained, to avoid dealing with the complex problem of out-of-order completions for cache flushes. While I can't verify this for other guests, I can't see any other way to deal with the typical error scenarios.

As Christoph explained on IRC, the guest not only drains the queue before flushes, it has to in order to get the right semantics even during normal operation. Therefore this is not a problem; closing.