Bug 969109

Summary: guest crashes after transferring a large file using drag'n'drop
Product: [Fedora] Fedora
Reporter: Zbigniew Jędrzejewski-Szmek <zbyszek>
Component: qemu
Assignee: Fedora Virtualization Maintainers <virt-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: amit.shah, berrange, cfergeau, crobinso, dwmw2, hdegoede, itamar, marcandre.lureau, pbonzini, rjones, sandmann, scottt.tw, virt-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: qemu-1.4.2-3.fc19
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-06-15 03:08:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Zbigniew Jędrzejewski-Szmek 2013-05-30 16:39:42 UTC
Description of problem:
Running testday-20130530-i686.iso for the Fedora spice test day. The guest is FC19 i686, host is FC19 x86_64. Drag'n'drop works fine for small files (some PDFs a few MB), but the guest hangs after dragging an ISO image.

Tried five times: 
archlinux-2013.04.01-dual.iso 530579456 bytes
testday-20130530-i686.iso 770703360 bytes ×4

With the smaller ISO, a "files" dialog was actually shown, with the ISO displayed as 530MB (iirc), so it seems that the file is actually transferred, and the hang happens afterwards.

With the bigger ISO, the hang was "immediate".

In both cases, the result is that the VM display first "freezes" showing the desktop, but unresponsive, and after a few seconds blacks out. I think that the VM is not entirely dead, because the CPU monitor in virt-manager shows some activity, but I wasn't able to get any response.

Interestingly, after a hard reset of the VM, the contents of the desktop are shown in flashes during reboot. This might be an unrelated bug, since I would not expect the contents of video RAM to persist across reboots.

With the fourth try, I connected through ssh from the host to monitor the logs. Actually no crash this time, but the transfer failed:

May 30 12:32:52 localhost spice-vdagentd[491]: chunk port 2065 out of range
May 30 12:32:52 localhost spice-vdagentd[491]: AIIEEE lost spice client connection, reconnecting
May 30 12:32:52 localhost spice-vdagent[2318]: file-xfer: Removing task 5 and file /home/liveuser/Downloads/testday-20130530-i686.iso due to error
May 30 12:32:53 localhost kernel: [drm] client monitors list will be truncated: 1 < 16

Trying again I get a hard crash as before. The ssh session is also dead, which means that it's not just a display problem.

Version-Release number of selected component (if applicable):

host:
qemu-kvm-1.4.2-2.fc19.x86_64
spice-server-0.12.3-1.fc19.x86_64 

guest:
spice-vdagent-0.14.0-2-fc19.i686
spice-server-0.12.2-5.fc19.i686
xorg-x11-drv-qxl-0.1.1-0.8-20130514git77a1594.fc19.i686

How reproducible:
so far 100% — the transfer either fails, or kills the guest

Comment 1 Marc-Andre Lureau 2013-05-30 16:48:05 UTC
Doesn't look client-related, moving to Hans, who has worked on file-xfer more than I have.

Comment 2 Zbigniew Jędrzejewski-Szmek 2013-05-30 21:13:32 UTC
OK, I played around with the VM a bit more, and it seems that it'll hang if I fill the filesystem using any means (dd if=/dev/zero of=... also works). Most probably the crash has nothing to do with spice. Sorry for the misleading bug report.

I'll try to investigate what exactly is happening, but not today, I've had enough testing for the moment.
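
A minimal sketch of the "fill the filesystem by any means" reproduction mentioned above, written in Python instead of dd. The function name fill_with_zeros and its max_bytes cap are assumptions added for illustration; they are not part of the original report. With max_bytes=None the loop writes zeros until the filesystem reports ENOSPC, which is roughly what dd if=/dev/zero does.

```python
import errno


def fill_with_zeros(path, chunk_size=1 << 20, max_bytes=None):
    """Write zero bytes to `path` until ENOSPC or until `max_bytes` is reached.

    Returns the number of bytes actually written. `max_bytes` is a
    hypothetical safety cap; pass None to fill the filesystem completely.
    """
    written = 0
    chunk = bytes(chunk_size)  # a buffer of zeros
    # buffering=0 gives a raw file, so write() reports the real byte count
    with open(path, "wb", buffering=0) as f:
        while max_bytes is None or written < max_bytes:
            n = chunk_size if max_bytes is None else min(chunk_size, max_bytes - written)
            try:
                written += f.write(chunk[:n])
            except OSError as e:
                if e.errno == errno.ENOSPC:
                    break  # filesystem is full, which is the condition under test
                raise
    return written
```

Running this inside the guest with a path on the live overlay and no cap should approximate the condition described in this comment; whether it triggers the same hang is not something this sketch can confirm.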

Comment 3 Hans de Goede 2013-05-31 11:36:14 UTC
Hi,

So it seems that we have 2 different issues here. 1 is the client and the agent losing sync, as shown by this message:

> May 30 12:32:52 localhost spice-vdagentd[491]: chunk port 2065 out of range
> May 30 12:32:52 localhost spice-vdagentd[491]: AIIEEE lost spice client
> connection, reconnecting
> May 30 12:32:52 localhost spice-vdagent[2318]: file-xfer: Removing task 5
> and file /home/liveuser/Downloads/testday-20130530-i686.iso due to error

This means the file drag and drop will fail, but it does not explain the guest hang / crash. I spent quite some time debugging the above issue a while back, as I also hit it in my own testing, and it is fixed upstream by this commit:
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=75c439bc65c07d76f5e74c734ed5432bc6114a3b

I did not think about adding this fix to the Fedora packages, since it will only trigger when using file xfers, but since we support file xfers now in Fedora, we should add this fix to at least F-19 qemu, and it certainly can't hurt for F-18 either. Changing component to qemu.
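
To illustrate how losing sync can produce a message like "chunk port 2065 out of range", here is a small Python sketch. It assumes the agent stream is framed as chunks with an 8-byte header of two little-endian 32-bit fields (port, size) and that the only valid ports are 1 (client) and 2 (server); these details are assumptions for the sake of the example, as are all names in the code. The misaligned read is contrived and is not a reproduction of the actual qemu bug.

```python
import struct

# Assumed port values and header layout (illustrative, see lead-in)
VDP_CLIENT_PORT = 1
VDP_SERVER_PORT = 2
CHUNK_HEADER = struct.Struct("<II")  # port, size


def parse_chunk_header(buf, offset=0):
    """Decode a chunk header at `offset` and reject unknown ports."""
    port, size = CHUNK_HEADER.unpack_from(buf, offset)
    if port not in (VDP_CLIENT_PORT, VDP_SERVER_PORT):
        raise ValueError(f"chunk port {port} out of range")
    return port, size


# One well-formed chunk: header followed by 8 bytes of file data.
# The data bytes are chosen so that misreading them as a header yields 2065.
payload = struct.pack("<II", 2065, 0)
stream = CHUNK_HEADER.pack(VDP_CLIENT_PORT, len(payload)) + payload

# Reader aligned on the chunk boundary: header decodes cleanly.
aligned = parse_chunk_header(stream, 0)

# Reader that is 8 bytes off: file data is misread as a header.
try:
    parse_chunk_header(stream, 8)
    misaligned_error = None
except ValueError as exc:
    misaligned_error = str(exc)
```

Once the reader is off by even a few bytes, every subsequent "header" is really file data, so the only safe recovery is to drop the connection and resynchronise, which matches the "lost spice client connection, reconnecting" line in the log.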

(In reply to Zbigniew Jędrzejewski-Szmek from comment #2)
> OK, I played around with the VM a bit more, and it seems that it'll hang if
> I fill the filesystem using any means (dd if=/dev/zero of=... also works).
> Most probably the crash has nothing to do with spice. Sorry for the
> misleading bug report.

Well it was a useful bug report, see above. Please file a new bug for tracking the hang on file-system filling issue (once you've more details). Could it be that you're using qcow2 and the host fs is getting full ?

Regards,

Hans

Comment 4 Zbigniew Jędrzejewski-Szmek 2013-05-31 21:01:36 UTC
(In reply to Hans de Goede from comment #3)
> Could it be that you're using qcow2 and the host fs is getting full ?
I'm running from the LiveCD, so I don't think the disk image matters. Anyway,
the disk image has raw format, and the host fs has plenty of space.

> Well it was a useful bug report, see above. Please file a new bug for
> tracking the hang on file-system filling issue (once you've more details).
I get the same behaviour for both the i686 and x86_64 variants, but I can't
get kdump to work, so it's unlikely that I'll be able to provide more details.
I'll file a bug if something useful turns up.

Comment 5 Zbigniew Jędrzejewski-Szmek 2013-05-31 21:48:06 UTC
On the serial console (unfortunately the top has scrolled off):
[  444.400054] Aborting journal on device dm-0-8.
[  444.402344] Buffer I/O error on device dm-0, logical block 13757
[  444.403658] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[  444.405691] Buffer I/O error on device dm-0, logical block 0
[  444.406702] EXT4-fs error (device dm-0): __ext4_journal_start_sb:60: Detected aborted journal
[  444.408079] EXT4-fs (dm-0): Remounting filesystem read-only
[  444.408955] EXT4-fs (dm-0): previous I/O error to superblock detected
[  444.409966] Buffer I/O error on device dm-0, logical block 0
[  470.339068] Buffer I/O error on device dm-0, logical block 524295
[  470.340092] Buffer I/O error on device dm-0, logical block 524296

dm-0 is /dev/mapper/live-rw
dm-1 is /dev/mapper/live-osimg-min

Seems like a bug :)
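
A small Python sketch for summarising a serial console capture like the one above: it collects the failing logical blocks per device and decodes the JBD2 error number. The function name failed_blocks and the regular expression are assumptions made for this example; the sample lines are copied from the log in this comment.

```python
import errno
import re

SAMPLE = """\
[  444.402344] Buffer I/O error on device dm-0, logical block 13757
[  444.403658] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[  444.405691] Buffer I/O error on device dm-0, logical block 0
[  444.409966] Buffer I/O error on device dm-0, logical block 0
[  470.339068] Buffer I/O error on device dm-0, logical block 524295
[  470.340092] Buffer I/O error on device dm-0, logical block 524296
"""

BUFFER_ERR = re.compile(r"Buffer I/O error on device ([\w-]+), logical block (\d+)")
JBD2_ERR = re.compile(r"JBD2: Error (-?\d+) detected")


def failed_blocks(log_text):
    """Return {device: sorted list of distinct failing logical blocks}."""
    blocks = {}
    for dev, block in BUFFER_ERR.findall(log_text):
        blocks.setdefault(dev, set()).add(int(block))
    return {dev: sorted(found) for dev, found in blocks.items()}


def jbd2_error_names(log_text):
    """Map each JBD2 error number in the log to its errno symbol, if known."""
    return [errno.errorcode.get(abs(int(n)), "unknown") for n in JBD2_ERR.findall(log_text)]
```

On the sample, the JBD2 "Error -5" decodes to EIO, a generic I/O error, and the failing blocks on dm-0 include block 0, which is consistent with the "previous I/O error to superblock" message.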

Comment 6 Fedora Update System 2013-06-11 21:35:48 UTC
qemu-1.4.2-3.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/qemu-1.4.2-3.fc19

Comment 7 Fedora Update System 2013-06-12 19:10:35 UTC
Package qemu-1.4.2-3.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing qemu-1.4.2-3.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-10680/qemu-1.4.2-3.fc19
then log in and leave karma (feedback).

Comment 8 Fedora Update System 2013-06-15 03:08:39 UTC
qemu-1.4.2-3.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.