Bug 1359325
| Field | Value |
|---|---|
| Summary | 37 Petabyte, and corrupted, qcow2 following qemu-system-x86 dumped core |
| Product | [Fedora] Fedora |
| Component | qemu |
| Version | 24 |
| Status | CLOSED NOTABUG |
| Severity | unspecified |
| Priority | unspecified |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Reporter | Chris Murphy <bugzilla> |
| Assignee | Fedora Virtualization Maintainers <virt-maint> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | amit.shah, berrange, cfergeau, crobinso, dwmw2, extras-qa, itamar, pbonzini, rjones, virt-maint |
| Clone Of | 1359324 |
| Bug Depends On | 1359324 |
| Last Closed | 2016-07-26 21:29:42 UTC |
Description
Chris Murphy
2016-07-22 20:30:13 UTC
Created attachment 1182953 [details]
filefrag output
It gets suspicious at line 9753, between fragments 9746 and 9747.
Filesystem type is: 9123683e
File size of uefi_opensuseleap42.2a3-1.qcow2 is 40537894204538880 (9896946827280 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
[...snip...]
9746: 13207824.. 13212799: 153115392.. 153120367: 4976:
9747: 2279017021440..2279017021455: 150018353.. 150018368: 16: 153120368:
OK so it's a sparse file with a rather large gap near the end.
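To put numbers on that gap, here is a rough sketch that only redoes the arithmetic from the filefrag lines quoted above. The constants are copied from that output; the variable and function names are made up for illustration, and the 4096-byte block size is the one filefrag reported.

```python
# Arithmetic on the filefrag output quoted above (values copied verbatim).
BLOCK_SIZE = 4096                      # bytes per block, as reported by filefrag

file_size = 40537894204538880          # apparent file size in bytes
size_in_blocks = file_size // BLOCK_SIZE

# Extent 9746 ends at logical block 13212799;
# extent 9747 starts at logical block 2279017021440.
prev_last_block = 13212799
next_first_block = 2279017021440


def hole_blocks(prev_last, next_first):
    """Number of unallocated logical blocks between two extents."""
    return next_first - (prev_last + 1)


gap_blocks = hole_blocks(prev_last_block, next_first_block)
gap_bytes = gap_blocks * BLOCK_SIZE

PIB = 2 ** 50
print("file size: %d blocks, %.3f PiB" % (size_in_blocks, file_size / PIB))
print("end of extent 9746: %.2f GiB" % ((prev_last_block + 1) * BLOCK_SIZE / 2 ** 30))
print("hole before extent 9747: %d blocks, %.3f PiB" % (gap_blocks, gap_bytes / PIB))
```

The end of extent 9746 lands close to the size a roughly 50G image would be expected to reach, so everything past that point is the suspicious part.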
kernel-4.6.4-301.fc24.x86_64
btrfs-progs-4.6.1-1.fc25.x86_64

The filesystem mounts, reads, and writes without Btrfs or device errors, before, during, and after the qemu crash. Scrub comes up clean. Offline btrfs check has no complaints.

Mount options:

/dev/sda5 on / type btrfs (rw,noatime,seclabel,ssd,space_cache,subvolid=583,subvol=/root24w)

There are four subvolumes (two are read-only snapshots). Anyway, if it's a Btrfs problem, it's not in the usual category that includes noisy+scary Btrfs messages.

Asked upstream anyway:
http://article.gmane.org/gmane.comp.file-systems.btrfs/58557

Created attachment 1183144 [details]
bz1359325.pl

I wasn't able to reproduce this. My reproducer (attached) is still running, however, so I'll see how it goes.

Here is what I did to try to recreate this situation in a reproducible manner.

(1) Created a 200GB logical volume on a Fedora 24 host:

$ sudo lvcreate -L 200G -n bz1359325-btrfs vg
Logical volume "bz1359325-btrfs" created.

(2) Formatted it with btrfs, all default options:

$ sudo mkfs.btrfs /dev/vg/bz1359325-btrfs
btrfs-progs v4.5.2
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               c941bb1b-f8b0-482a-977a-1df47b5225bb
Node size:          16384
Sector size:        4096
Filesystem size:    200.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.01GiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1   200.00GiB  /dev/vg/bz1359325-btrfs

(3) Mounted it:

$ mkdir /tmp/mnt
$ sudo mount /dev/vg/bz1359325-btrfs /tmp/mnt

(4) Created two qcow2 nocow files:

$ sudo qemu-img create -f qcow2 -o nocow=on /tmp/mnt/disk1.qcow2 50G
Formatting '/tmp/mnt/disk1.qcow2', fmt=qcow2 size=53687091200 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 nocow=on
$ sudo qemu-img create -f qcow2 -o nocow=on /tmp/mnt/disk2.qcow2 50G
Formatting '/tmp/mnt/disk2.qcow2', fmt=qcow2 size=53687091200 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16 nocow=on

(5) Used the attached guestfs script to format these disks and hammer them with writes. Note that this uses virtio-scsi, not virtio-blk.

$ sudo /tmp/bz1359325.pl

So bug 1359324 determined that the corruption is due to user error causing qemu to write to the same backing file. Ending up with a 37 Petabyte sparse file certainly sounds strange, but maybe we just chalk it up to the unpredictability of simultaneously writing to a single disk image? Of course, the qemu crash in bug 1359324 should be investigated.

I've tried reproducing this, but the image never grows beyond the size specified to qemu-img create. For now this could be closed with insufficient data, I guess.
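For anyone repeating the reproduction attempt, the condition being watched for (an image whose apparent size runs far past the size given to qemu-img create) can be checked with a small script. This is only a sketch under stated assumptions: the function names are hypothetical, the 25% slack allowed for qcow2 metadata is an arbitrary guess rather than anything qemu defines, and `st_blocks` is assumed to be in 512-byte units as on Linux.

```python
import os


def sparse_report(path):
    """Return (apparent_size, allocated_bytes) for a file.

    Assumes a Linux-style stat where st_blocks counts 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def looks_runaway(path, virtual_size, slack=1.25):
    """Heuristic: True if the apparent size exceeds virtual_size * slack.

    `slack` is an illustrative allowance for qcow2 metadata overhead,
    not a value taken from qemu.
    """
    apparent, _allocated = sparse_report(path)
    return apparent > virtual_size * slack
```

For the images created in step (4), the check would be something like `looks_runaway("/tmp/mnt/disk1.qcow2", 53687091200)`, using the size printed by qemu-img create.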