Bug 1382139

Summary: Write cache loss on shutdown of virtual machines with virtual devices (e.g. ZVOL)
Product: [Community] Virtualization Tools Reporter: Ruben Kelevra <ruben>
Component: libvirt Assignee: Libvirt Maintainers <libvirt-maint>
Status: NEW --- QA Contact:
Severity: urgent Docs Contact:
Priority: unspecified    
Version: unspecified CC: libvirt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Ruben Kelevra 2016-10-05 20:48:50 UTC
Description of problem:
I think I first ran into this bug a year or two ago on CentOS 7, while using LVM on an mdadm RAID 1. Since LVM had several other non-reproducible bugs, I changed the setup to raw files on ZFS.

I noticed the issue during system updates. My virtual machine runs Arch Linux, which uses pacman as its package manager. Pacman does not issue any sync commands, which gives it a huge speed advantage. But when you shut down the system, the underlying filesystem must be in a completely clean state, otherwise the system will not boot again.

Since my change to raw files on ZFS, everything has worked fine here. I also got into the habit of running a manual sync after pacman updates, and I never hit the issue again.
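The manual workaround (run the update, then force a flush) could be scripted roughly as follows. This is only a sketch: the helper name run_then_sync is hypothetical, and the pacman invocation in the usage note is an assumed example rather than something taken from this report. os.sync() only asks the guest kernel to write out its dirty buffers; whether the host side then persists them is exactly what this bug is about.

```python
import os
import subprocess


def run_then_sync(cmd):
    """Run cmd (a list of arguments), then flush dirty buffers to disk.

    Hypothetical helper illustrating the manual workaround.
    Returns the exit status of the command.
    """
    result = subprocess.run(cmd)
    # os.sync() wraps sync(2): it asks the kernel to write out all dirty
    # data and metadata. It is called whether or not the command succeeded.
    os.sync()
    return result.returncode
```

Inside the guest this might be called as run_then_sync(["pacman", "-Syu"]), run as root.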

Now I have built another setup and used ZFS's built-in virtual block devices (ZVOLs) for the virtual test machine. I did a kernel update, and the system completely destroyed itself on reboot.

Since recovery is pretty easy on Arch Linux, I reinstalled all installed packages while booted from a LiveCD, and found that the broken files were all 0 bytes in size.
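A check like the one described above can be automated. The following is a minimal sketch, not the procedure actually used for this report: the function name find_empty_files and the mount point in the usage note are assumptions. It walks a directory tree and lists regular files of size 0.

```python
import os


def find_empty_files(root):
    """Return the sorted paths of regular files under root that are 0 bytes."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

From a LiveCD with the guest root mounted at, say, /mnt (an assumed path), find_empty_files("/mnt/usr") would give a candidate list. Some files are legitimately empty, so the result still needs to be compared against what the packages should contain.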

So the metadata journaling worked fine and recorded the (still empty) files, but the data that was meant to be written asynchronously was never written at all.

When I set up a virtual machine, I use the "Hypervisor-Standard" options, as they are displayed in virt-manager. I would guess that these are "Writethrough caching" and "threads".

I haven't tried any setting other than "Hypervisor-Standard".
I haven't set any non-standard options on the zvol or on the mirrored zpool.
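To take the guesswork out of which cache and I/O modes are configured, the domain XML (as printed by virsh dumpxml) can be inspected. The sketch below assumes the usual libvirt layout, where each disk has a driver element with optional cache and io attributes, and that leaving them unset is what virt-manager shows as the hypervisor default. The sample XML, including the zvol path, is made up for illustration and is not taken from the affected machine.

```python
import xml.etree.ElementTree as ET

# Illustrative domain XML fragment (hypothetical, not from this report).
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/zvol/tank/testvm'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disk_cache_settings(domain_xml):
    """Return (target, cache, io) per disk; None means the attribute is unset."""
    root = ET.fromstring(domain_xml)
    settings = []
    for disk in root.findall("./devices/disk"):
        driver = disk.find("driver")
        target = disk.find("target")
        settings.append((
            target.get("dev") if target is not None else None,
            driver.get("cache") if driver is not None else None,
            driver.get("io") if driver is not None else None,
        ))
    return settings
```

A result of None for cache or io means the attribute is not set in the XML, so the hypervisor picks the mode. Which mode that is depends on the QEMU version and is not determined by this script.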

Version-Release number of selected component (if applicable):

I haven't changed anything on the hypervisor kernel, which is 4.7.6.
I haven't changed anything on the guest-os kernel, which is 4.7.6 as well.
Libvirt: 2.2.0
Qemu: 2.7.0