Description of problem:
Occasionally I boot Fedora 21 Live and spice-vdagentd.service is not running. But the problem is not that it crashed: it hasn't even been started! spice-vdagentd.target was not started either. I have no idea why this happens, and it's very hard to reproduce. But when it does happen, it's inconvenient, because my shared clipboard doesn't work.

Usually, the service status looks like this:

[liveuser@localhost ~]$ systemctl status spice-vdagentd.service
● spice-vdagentd.service - Agent daemon for Spice guests
   Loaded: loaded (/usr/lib/systemd/system/spice-vdagentd.service; enabled)
   Active: active (running) since Mon 2014-10-27 10:23:49 EDT; 1h 17min ago
  Process: 828 ExecStart=/usr/sbin/spice-vdagentd $SPICE_VDAGENTD_EXTRA_ARGS (code=exited, status=0/SUCCESS)
  Process: 778 ExecStartPre=/bin/rm -f /var/run/spice-vdagentd/spice-vdagent-sock (code=exited, status=0/SUCCESS)
 Main PID: 832 (spice-vdagentd)
   CGroup: /system.slice/spice-vdagentd.service
           └─832 /usr/sbin/spice-vdagentd

Oct 27 10:23:56 localhost spice-vdagentd[832]: opening vdagent virtio channel

[liveuser@localhost ~]$ systemctl status spice-vdagentd.target
● spice-vdagentd.target - Agent daemon for Spice guests
   Loaded: loaded (/usr/lib/systemd/system/spice-vdagentd.target; static)
   Active: active since Mon 2014-10-27 10:23:49 EDT; 1h 17min ago

But when this bug occurs, it looks like this:

[liveuser@localhost ~]$ systemctl status spice-vdagentd.service
● spice-vdagentd.service - Agent daemon for Spice guests
   Loaded: loaded (/usr/lib/systemd/system/spice-vdagentd.service; enabled)
   Active: inactive (dead)

[liveuser@localhost ~]$ systemctl status spice-vdagentd.target
● spice-vdagentd.target - Agent daemon for Spice guests
   Loaded: loaded (/usr/lib/systemd/system/spice-vdagentd.target; static)
   Active: inactive (dead)

Below, I attach two sets of logs: one "OK", one "fail". The "OK" one was taken from a VM where spice-vdagentd started properly. The "fail" one was taken from a VM where the described bug occurred and spice-vdagentd was not running. The VMs are almost identical; I attach their libvirt configuration files as well. It was easier to have two VMs side by side when gathering logs, but of course I could have used a single VM to gather everything.

Version-Release number of selected component (if applicable):
spice-glib-0.25-6.fc21.x86_64
spice-gtk3-0.25-6.fc21.x86_64
spice-server-0.12.5-5.fc21.x86_64
spice-vdagent-0.15.0-3.fc21.x86_64
systemd-216-5.fc21.x86_64

Logs were collected from Fedora-Live-Workstation-x86_64-21_Beta-1.iso (Beta RC1):
http://dl.fedoraproject.org/pub/alt/stage/21_Beta_RC1/Workstation/x86_64/iso/

Host system: Fedora 20
libvirt-1.1.3.6-1.fc20.x86_64
libvirt-client-1.1.3.6-1.fc20.x86_64
libvirt-daemon-1.1.3.6-1.fc20.x86_64
virt-manager-1.0.1-4.fc20.noarch

How reproducible:
Very rarely.

Steps to Reproduce:
1. Run the Fedora 21 Live image in virt-manager and have bad luck.
2. See that your shared clipboard doesn't work.
3. Check that spice-vdagentd.service is enabled but "inactive (dead)", and that journalctl has no mention of the service ever being started.

Additional info:
I have saved a VM snapshot from when spice-vdagentd failed to start, so hopefully I can retrieve more information if necessary. What is very hard to do is to reboot the VM (e.g. with systemd debug info enabled) and hit this bug again; it really occurs quite rarely. (But not rarely enough to not annoy me ;))
Created attachment 953563 [details] journal.fail
Created attachment 953564 [details] journal.ok
Created attachment 953565 [details] status-all.fail
Created attachment 953566 [details] status-all.ok
Created attachment 953567 [details] vdagentd-rev-deps.fail
Created attachment 953568 [details] vdagentd-rev-deps.ok
Created attachment 953569 [details] rpm-qa.fail
Created attachment 953570 [details] rpm-qa.ok
Created attachment 953571 [details] lspci.fail
Created attachment 953572 [details] lspci.ok
Created attachment 953574 [details] libvirt xml config for "fail" VM
Created attachment 953575 [details] libvirt xml config for "ok" VM
I find comment 5 and comment 6 very interesting: in the OK case there was sys-devices-pci0000:00-0000:00:06.0-virtio1-virtio\x2dports-vport1p2.device, which was required by spice-vdagentd.target; in the fail case there is no such device listed, even though it exists in the system (see the status-all files). Might this be the source of the problem?
spice-vdagentd.target is supposed to be started by udev through 70-spice-vdagentd.rules:

ACTION=="add", SUBSYSTEM=="virtio-ports", ENV{DEVLINKS}=="/dev/virtio-ports/com.redhat.spice.0", ENV{SYSTEMD_WANTS}="spice-vdagentd.target"

I'd tend to assume this rule fires, as the symlink seems to be there. Maybe adding -dd to the spice-vdagentd command line in the service file would give us more info about that failure.
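One way to check whether the rule's match conditions are actually met on an affected boot (a sketch, not something already tried in this report; the file name and logger tag below are made up) would be a companion debug rule that mirrors the same matches but only logs to the journal:

```
# /etc/udev/rules.d/71-spice-vdagentd-debug.rules (hypothetical debug rule)
# Same match conditions as 70-spice-vdagentd.rules, but instead of setting
# SYSTEMD_WANTS it records that the "add" event matched.
ACTION=="add", SUBSYSTEM=="virtio-ports", ENV{DEVLINKS}=="/dev/virtio-ports/com.redhat.spice.0", RUN+="/usr/bin/logger -t spice-udev-debug 'add event matched for com.redhat.spice.0'"
```

On a boot where the bug occurs, an empty "journalctl -t spice-udev-debug" would point at the rule (or the uevent) rather than at spice-vdagentd itself.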
When I restore the snapshot (a running VM where spice-vdagentd seems to not have started) and start the service, there is no error and no output, even when I add "-dd" beforehand:

[liveuser@localhost ~]$ sudo vi /usr/lib/systemd/system/spice-vdagentd.service
[liveuser@localhost ~]$ sudo systemctl start spice-vdagentd.service
[liveuser@localhost ~]$ sudo systemctl status spice-vdagentd.service
● spice-vdagentd.service - Agent daemon for Spice guests
   Loaded: loaded (/usr/lib/systemd/system/spice-vdagentd.service; enabled)
   Active: active (running) since Mon 2014-10-27 11:50:14 EDT; 4s ago
  Process: 2789 ExecStart=/usr/sbin/spice-vdagentd $SPICE_VDAGENTD_EXTRA_ARGS -dd (code=exited, status=0/SUCCESS)
  Process: 2787 ExecStartPre=/bin/rm -f /var/run/spice-vdagentd/spice-vdagent-sock (code=exited, status=0/SUCCESS)
 Main PID: 2791 (spice-vdagentd)
   CGroup: /system.slice/spice-vdagentd.service
           └─2791 /usr/sbin/spice-vdagentd -dd

No output at all. When I execute the command directly, it behaves the same.

PS: Either way, I don't get a functional shared clipboard. I don't know whether that's expected (and I have to restart X or something), or whether it's a different bug.

Anyway, I'm not 100% sure, but I think spice-vdagentd really was not triggered at all originally. If it had crashed, systemd would tell us so. Is it possible that it exited with 0 and printed no output? In that case systemd could report it as "inactive (dead)". If there's a way to distinguish an exited service with no output from a service that hasn't been started at all, I'd love to know; that would help us debug it further.
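As a side note on the edit above: since the shipped ExecStart already expands $SPICE_VDAGENTD_EXTRA_ARGS (visible in the status output), the -dd flag could also be injected without touching the packaged unit under /usr/lib, via a drop-in (a sketch; the drop-in path follows the usual systemd convention and assumes nothing else, e.g. an EnvironmentFile, overrides the variable):

```
# /etc/systemd/system/spice-vdagentd.service.d/debug.conf
# (hypothetical drop-in, e.g. created with "systemctl edit spice-vdagentd.service")
[Service]
Environment=SPICE_VDAGENTD_EXTRA_ARGS=-dd
```

Followed by "systemctl daemon-reload". Unlike editing the shipped unit file, this survives package updates.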
(In reply to Christophe Fergeau from comment #14)
> spice-vdagentd.target is supposed to be started by udev through
> 70-spice-vdagentd.rules :
> ACTION=="add", SUBSYSTEM=="virtio-ports",
> ENV{DEVLINKS}=="/dev/virtio-ports/com.redhat.spice.0",
> ENV{SYSTEMD_WANTS}="spice-vdagentd.target"

When looking at this more closely, doesn't comment 5 indicate that this udev rule has in fact *not* been run? Compare to comment 6.
status-all.fail has:

● dev-virtio\x2dports-com.redhat.spice.0.device - /dev/virtio-ports/com.redhat.spice.0
   Follow: unit currently follows state of sys-devices-pci0000:00-0000:00:06.0-virtio1-virtio\x2dports-vport1p2.device
   Loaded: loaded
   Active: active (plugged) since Mon 2014-10-27 09:43:36 EDT; 1h 58min ago
   Device: /sys/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2

so I would assume that this means the udev rule has created the /dev/virtio-ports/com.redhat.spice.0 symlink (?). I'm not very familiar with systemd/udev/..., and spice-vdagentd failing to start seems more likely to me than a udev issue :) But I agree that nothing rules out udev, or an issue with the udev rule, especially if systemd would return a different status, or give us some logs, for 'starting vdagentd failed' vs. 'did not try to start vdagentd'.
Yes, that symlink is there:

[root@localhost ~]# ll /dev/virtio-ports/
total 0
lrwxrwxrwx. 1 root root 11 Oct 27 09:43 com.redhat.spice.0 -> ../vport1p2
lrwxrwxrwx. 1 root root 11 Oct 27 09:43 org.qemu.guest_agent.0 -> ../vport1p1

Reassigning to systemd, then. systemd devs, could you please help us debug this issue? The question is: how can we find out whether spice-vdagentd.service was started and exited without any output logged, or whether it hasn't been started at all? And if the latter, why hasn't it been started? We need to find this out *from a running system*; there's no option to reboot it (and e.g. enable systemd debug mode). Thanks!
I don't think there's a way to distinguish a service which exited with a success status from a service which has never run. For services without an error status, systemd unloads the unit completely from memory, so no "state" is kept.

I think the only way spice-vdagentd.service would run is if it were pulled in by spice-vdagentd.target, and spice-vdagentd.target is only started by the udev rule. So if it is not there in the reverse dependencies, I don't expect it to have run.

What does 'udevadm info /dev/virtio-ports/com.redhat.spice.0' say?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #19)
> I don't think there's a way to distinguish a service which exited returning
> success status from a service which has never run. For services without
> error status systemd unloads the unit completely from memory, so no "state"
> is kept.

Thanks for the info. I tried to reboot the machine about 20 times and haven't hit the bug again, so there's no way I could reproduce it with systemd debug mode enabled.

> I think that the only way that spice-vdagentd.service would be if it pulled
> in by spice-vdagentd.target, and spice-vdagentd.target is only started by
> the udev rule. So if it is not there in reverse dependencies, I don't expect
> it to have run.

This confirms my suspicion that even the .target hasn't been run, and therefore the .service hasn't been run. So now we need to find out why sys-devices-pci0000:00-0000:00:06.0-virtio1-virtio\x2dports-vport1p2.device didn't want spice-vdagentd.target to start.

> What does 'udevadm info /dev/virtio-ports/com.redhat.spice.0' say?

[root@localhost ~]# udevadm info /dev/virtio-ports/com.redhat.spice.0
P: /devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
N: vport1p2
S: virtio-ports/com.redhat.spice.0
E: DEVLINKS=/dev/virtio-ports/com.redhat.spice.0
E: DEVNAME=/dev/vport1p2
E: DEVPATH=/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
E: MAJOR=249
E: MINOR=2
E: SUBSYSTEM=virtio-ports
E: TAGS=:systemd:
E: USEC_INITIALIZED=13311
(In reply to Kamil Páral from comment #20)
> > What does 'udevadm info /dev/virtio-ports/com.redhat.spice.0' say?
>
> [root@localhost ~]# udevadm info /dev/virtio-ports/com.redhat.spice.0
> P: /devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
> N: vport1p2
> S: virtio-ports/com.redhat.spice.0
> E: DEVLINKS=/dev/virtio-ports/com.redhat.spice.0
> E: DEVNAME=/dev/vport1p2
> E: DEVPATH=/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
> E: MAJOR=249
> E: MINOR=2
> E: SUBSYSTEM=virtio-ports
> E: TAGS=:systemd:
> E: USEC_INITIALIZED=13311

To confirm: this is for the affected boot?
(In reply to Zbigniew Jędrzejewski-Szmek from comment #21)
> To confirm: this is for the affected boot?

Yes. I have a live VM snapshot that I can restore and gather data from. I hope the snapshot restore function does not affect the guest system in any way (triggering udev resets or something), but I cannot vouch for that. In any case, the systemd status of all those services is still the same.
Zbigniew, do you have some advice what to try next, or perhaps whom to CC into this ticket? Thanks.
Today I've seen this bug twice. It's not an exceptionally rare issue.
Kamil, can you provide the output of udevadm info /dev/virtio-ports/com.redhat.spice.0 from a boot when the bug does not occur?
(In reply to Michal Sekletar from comment #25)
> Kamil, can you provide output of udevadm info
> /dev/virtio-ports/com.redhat.spice.0 from boot when bug does not occur?

This is from an unaffected boot:

[root@localhost ~]# udevadm info /dev/virtio-ports/com.redhat.spice.0
P: /devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
N: vport1p2
S: virtio-ports/com.redhat.spice.0
E: DEVLINKS=/dev/virtio-ports/com.redhat.spice.0
E: DEVNAME=/dev/vport1p2
E: DEVPATH=/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
E: MAJOR=249
E: MINOR=2
E: SUBSYSTEM=virtio-ports
E: TAGS=:systemd:
E: USEC_INITIALIZED=63111
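Putting the two dumps side by side shows that udev saw an identical device in both boots. A small offline comparison (a sketch; both property dumps are copied verbatim from the fail and ok outputs in this report):

```shell
# Diff the `udevadm info` dumps quoted in this report: "fail" boot vs "ok" boot.
fail='P: /devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
N: vport1p2
S: virtio-ports/com.redhat.spice.0
E: DEVLINKS=/dev/virtio-ports/com.redhat.spice.0
E: DEVNAME=/dev/vport1p2
E: DEVPATH=/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
E: MAJOR=249
E: MINOR=2
E: SUBSYSTEM=virtio-ports
E: TAGS=:systemd:
E: USEC_INITIALIZED=13311'
ok='P: /devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
N: vport1p2
S: virtio-ports/com.redhat.spice.0
E: DEVLINKS=/dev/virtio-ports/com.redhat.spice.0
E: DEVNAME=/dev/vport1p2
E: DEVPATH=/devices/pci0000:00/0000:00:06.0/virtio1/virtio-ports/vport1p2
E: MAJOR=249
E: MINOR=2
E: SUBSYSTEM=virtio-ports
E: TAGS=:systemd:
E: USEC_INITIALIZED=63111'

# Write both dumps to temp files and diff them; the only differing line
# is USEC_INITIALIZED (the initialization timestamp).
f=$(mktemp); o=$(mktemp)
printf '%s\n' "$fail" > "$f"
printf '%s\n' "$ok" > "$o"
out=$(diff "$f" "$o" || true)
printf '%s\n' "$out"
rm -f "$f" "$o"
```

In other words, nothing in the device properties themselves distinguishes the boot where the target was started from the boot where it wasn't.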
*** Bug 1158665 has been marked as a duplicate of this bug. ***
I see this occasionally when booting F21 Workstation media, even the final one. Putting back needinfo for Lennart, which I removed by accident. Lennart, any help here? I still have the snapshot of the running Live system. Thanks.
Test compilation: http://goo.gl/Gm4ffO
iso/Rawhide-Xfce-Live-*.iso

Result: spice-vdagent(d) passed 101%, with both the LiveCD/DVD and an installed system. Testing was done intensively and in quantity, on a Fedora 22 host (qemu-2.3.1-7.fc22.x86_64) and a Rawhide host (qemu-2.4.0.1-2.fc24.x86_64).

I even improved the service a little for my needs with the so-called "Lureau-Goede" combination. See:
http://lists.freedesktop.org/archives/spice-devel/2015-October/022516.html
Ref. http://cgit.freedesktop.org/spice/linux/vd_agent/log/data/spice-vdagentd.service
I saw exactly this one time recently with an F23 live.
This message is a reminder that Fedora 23 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 23. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '23'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 23 reached end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to this bug being closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Fedora 23 changed to end-of-life (EOL) status on 2016-12-20. Fedora 23 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result, we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora, please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release.

If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 1000 days.