Bug 1331577

Summary: hypervvssd, hypervfcopyd, hypervkvpd cause a timeout and notice when started on non-microsoft systems
Product: [Fedora] Fedora Reporter: Zbigniew Jędrzejewski-Szmek <zbyszek>
Component: hyperv-daemons Assignee: Vitaly Kuznetsov <vkuznets>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rawhide CC: antoine, bughunt, cristian.ciupitu, eliteknipser, icarohoff, jeremy9856, johannbg, kevin, lnykryn, mmuzila, msekleta, muadda, samuel-rhbugs, s, systemd-maint, thozza, vkuznets, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: hyperv-daemons-0-0.15.20160728git.fc26 hyperv-daemons-0-0.15.20160728git.fc24 hyperv-daemons-0-0.15.20160728git.fc25 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-25 13:53:59 UTC Type: Bug
Attachments:
Description                                     Flags
patch to implement option b) from comment #1    none
patch to implement option b) from comment #1    none
patch to implement option b) from comment #1    none

Description Zbigniew Jędrzejewski-Szmek 2016-04-28 20:49:06 UTC
Description of problem:
Quoting from discussion on fedora-devel:

> > ID: 14899   Test: x86_64 Workstation-live-iso base_selinux
> > URL: https://openqa.fedoraproject.org/tests/14899
> 
> Probably not worth worrying about.

Actually that's a major screw up, and it affects all systems where
hypervvssd, hypervfcopyd, or hypervkvpd are installed. The reason that
it's not always visible is that systemd cuts off the output to the
console when starting getty. So depending on the ordering of things, this
error will not be visible on the console, but it's still there in the logs:

$ systemctl status hypervvssd
● hypervvssd.service - Hyper-V VSS daemon
   Loaded: loaded (/usr/lib/systemd/system/hypervvssd.service; enabled; vendor preset: enabled)
   Active: inactive (dead)

Apr 28 16:09:35 rawhide systemd[1]: Dependency failed for Hyper-V VSS daemon.
Apr 28 16:09:35 rawhide systemd[1]: hypervvssd.service: Job hypervvssd.service/start failed...

$ uname -p
x86_64

hypervvssd.service and friends are enabled by default in presets
[https://bugzilla.redhat.com/show_bug.cgi?id=1279322], and pulled in by
default in comps (in @guest-desktop-agents, which is in all desktops).
  
> This is a weird one: systemd decided we were Hyper-V for some reason?
Not really. hypervvssd.service has
ConditionVirtualization=microsoft
BindsTo=sys-devices-virtual-misc-vmbus\x21hv_vss.device
After=sys-devices-virtual-misc-vmbus\x21hv_vss.device
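but Condition* is checked just before the unit is started, that is, only
after systemd has waited for the device named in BindsTo=/After=. On a
non-Hyper-V machine that device never appears, so a timeout occurs and a
notice is logged. So everything happens according to the plan, but it's the
plan that is wrong, so to speak.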
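
The device unit named in BindsTo= is just a sysfs path written in systemd's path escaping ("/" becomes "-", other special characters become \xNN, so "!" is \x21). A minimal sketch of the reverse mapping, assuming only that escaping scheme; device_unit_to_path is a hypothetical helper written for illustration, and on a real system `systemd-escape --unescape --path` should do the same for the part before ".device":

```shell
#!/bin/sh
# Hypothetical helper: turn a systemd device unit name back into the sysfs
# path it stands for. Assumes the usual systemd path escaping.
device_unit_to_path() {
    printf '%s\n' "$1" | awk '
    {
        name = $0
        sub(/\.device$/, "", name)   # drop the unit type suffix
        gsub(/-/, "/", name)         # "-" separates path components
        out = "/"
        hexdigits = "0123456789abcdef"
        # Decode each \xNN escape into the byte it represents.
        while (match(name, /\\x[0-9a-fA-F][0-9a-fA-F]/)) {
            hex = tolower(substr(name, RSTART + 2, 2))
            hi = index(hexdigits, substr(hex, 1, 1)) - 1
            lo = index(hexdigits, substr(hex, 2, 1)) - 1
            out = out substr(name, 1, RSTART - 1) sprintf("%c", hi * 16 + lo)
            name = substr(name, RSTART + RLENGTH)
        }
        print out name
    }'
}

device_unit_to_path 'sys-devices-virtual-misc-vmbus\x21hv_vss.device'
```

This is the path systemd waits for when the unit is pulled in on a machine that is not a Hyper-V guest.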

Some different solution is needed. Systemd does not provide a nice way to
silently skip a unit if a Condition is not satisfied. After describing this
in detail, I think the solution should be on the systemd side: maybe we should
evaluate conditions like ConditionArchitecture= and ConditionVirtualization=
immediately, because they cannot change. This would be a change of semantics.

Version-Release number of selected component (if applicable):
systemd-229-14.fc25.x86_64
hypervvssd-0-0.14.20150702git.fc24.x86_64
fedora-release-25-0.8.noarch

Steps to Reproduce:
1. Boot a machine that is not a Hyper-V guest, or run 'systemctl status hypervvssd' on it

Actual results:
Timeout waiting for /sys/devices/virtual/misc/vmbus!hv_vss

Expected results:
Daemon is silently skipped when conditions are not satisfied.

Comment 1 Lennart Poettering 2016-05-02 13:25:29 UTC
Well, this appears to be a misunderstanding of what systemd unit conditions are and are supposed to be. The idea is that they do not affect the dependency tree, i.e. if a unit "foo" pulls in another unit "bar", and "foo" fails a condition, this has no effect on whether "bar" is pulled in; it will still be pulled in.

So, this is really a misuse of the condition stuff. I can see multiple ways out:

a) drop the whole ConditionVirtualization= stuff. Instead pull in microsoft-virtualization.target or so via a udev rule that sets SYSTEMD_WANTS on the hv_vss device. Then change your services to plug into that target instead of multi-user.target, via WantedBy=. See systemd.device(5) for details about SYSTEMD_WANTS in udev rules.

or:

b) the same as above, but don't bother with defining the "microsoft-virtualization" target stuff, and instead pull in the services directly from the udev rule. This is fine if there's really no reason why one would ever want to turn off these services.

or:

c) drop the ConditionVirtualization= stuff, instead write a small generator that pulls in the microsoft services if systemd-detect-virt returns "microsoft". The generator can be a shell script if you like. See systemd.generator(7) for details on how to write generators.
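
For option c), a generator can be as small as the following sketch. It assumes the unit files live in /usr/lib/systemd/system and that the services should hang off multi-user.target; generate_hyperv_wants and the UNIT_DIR override are hypothetical names introduced here, not part of any shipped package:

```shell
#!/bin/sh
# Sketch of option c): enable the Hyper-V daemons for this boot only when the
# machine is actually a Hyper-V guest.

# Where the packaged unit files live (assumed; override for testing).
UNIT_DIR=${UNIT_DIR:-/usr/lib/systemd/system}
SERVICES="hypervkvpd.service hypervvssd.service hypervfcopyd.service"

# $1 = virtualization type, as printed by systemd-detect-virt
# $2 = generator output directory
generate_hyperv_wants() {
    # Anything other than Hyper-V: generate nothing, so nothing is pulled in.
    [ "$1" = "microsoft" ] || return 0
    wants_dir="$2/multi-user.target.wants"
    mkdir -p "$wants_dir" || return 1
    for svc in $SERVICES; do
        # Same kind of symlink that enabling a WantedBy= unit would create.
        ln -sf "$UNIT_DIR/$svc" "$wants_dir/$svc" || return 1
    done
}

# systemd runs generators with three output directories as arguments; the
# first one is the normal-priority directory.
if [ "$#" -ge 1 ]; then
    generate_hyperv_wants "$(systemd-detect-virt 2>/dev/null)" "$1"
fi
```

Because the symlinks are created only when systemd-detect-virt reports "microsoft", nothing waits for the vmbus devices on other machines. See systemd.generator(7) for where generators are installed and how they are invoked.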

Comment 2 Zbigniew Jędrzejewski-Szmek 2016-05-09 01:26:52 UTC
Created attachment 1155125 [details]
patch to implement option b) from comment #1

Option b) seems best. It is simplest and was already implemented, so it is enough to remove the preset stuff to fix this bug.

(I think those three services should be started "unconditionally", when the right hardware is present, and the packages are installed. This is recommended by the packaging guidelines https://fedoraproject.org/wiki/Packaging:Systemd#Hardware_activation and I don't see any advantage to making them conditional. If necessary, it is always possible to uninstall the package or mask the service.)
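
Hardware activation of this kind comes down to one udev rule per daemon that tags the device for systemd and sets SYSTEMD_WANTS. A rough sketch that prints such rules follows. The SUBSYSTEM and KERNEL match keys are assumptions derived from the /sys/devices/virtual/misc/vmbus!hv_vss path in this report (udev is assumed to present the "!" as "/" in the kernel name), and print_hyperv_udev_rules is a hypothetical helper, not the contents of the attached patch:

```shell
#!/bin/sh
# Print one udev rule per Hyper-V daemon (illustrative only).
# TAG+="systemd" makes systemd track the device; SYSTEMD_WANTS asks it to
# start the named service when the device shows up (see systemd.device(5)).
print_hyperv_udev_rules() {
    for pair in hv_kvp:hypervkvpd hv_vss:hypervvssd hv_fcopy:hypervfcopyd; do
        dev=${pair%%:*}   # kernel device name under vmbus/ (assumed)
        svc=${pair##*:}   # daemon started for that device
        printf 'SUBSYSTEM=="misc", KERNEL=="vmbus/%s", TAG+="systemd", ENV{SYSTEMD_WANTS}+="%s.service"\n' \
            "$dev" "$svc"
    done
}

print_hyperv_udev_rules
```

With rules like these the services start only when the matching device exists, so no preset entry and no ConditionVirtualization= is needed.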

Comment 3 Zbigniew Jędrzejewski-Szmek 2016-05-09 01:46:08 UTC
https://pagure.io/fedora-release/pull-request/45 (master)
https://pagure.io/fedora-release/pull-request/46 (f24)

Comment 4 Zbigniew Jędrzejewski-Szmek 2016-05-09 01:48:47 UTC
Created attachment 1155129 [details]
patch to implement option b) from comment #1

v2 of the patch, which also disables those daemons on existing installations, so that the bug is fixed for people who installed previously as well

Comment 5 Zbigniew Jędrzejewski-Szmek 2016-07-27 00:33:08 UTC
Created attachment 1184454 [details]
patch to implement option b) from comment #1

v3: fix typo reported by Juha Leppänen <juha_efku>

Comment 6 Zbigniew Jędrzejewski-Szmek 2016-07-27 00:33:51 UTC
Oops, this wasn't assigned properly.

Comment 7 Vitaly Kuznetsov 2016-07-27 09:08:42 UTC
While I see no point in having these daemons installed on non-Hyper-V guests, I agree it makes sense to drop the ConditionVirtualization= stuff and remove them from dependencies, switching to udev-only activation. I'll try to experiment starting with your patch, thanks!

Comment 8 Vitaly Kuznetsov 2016-07-28 13:45:20 UTC
I see no obvious flaws in the suggested solution. I tested it on both Hyper-V and non-Hyper-V installs and tested the upgrade; it works as expected. Here is a koji build:

http://koji.fedoraproject.org/koji/taskinfo?taskID=15049187

I'll test it a bit more and, if no one is against the idea, push it to rawhide.

Comment 9 Ícaro Hoff 2016-08-08 13:27:23 UTC
As of today, with my system up to date with f24 upstream, I still encounter the same problem.
It usually makes my computer hang for 1m28s-1m34s when the service is enabled.

Even after disabling them through systemd:
sudo systemctl disable hypervvssd.service \
hypervfcopyd.service \
hypervkvpd.service

the status still reports:
vendor preset: enabled

The preset is still forcing them to be enabled:
cat /usr/lib/systemd/system-preset/90-default.preset | grep yper
# Hyper-V guest support daemons
enable hypervvssd.service
enable hypervfcopyd.service
enable hypervkvpd.service

I'm running the latest systemd from f24 branch: systemd-229-9.fc24.x86_64

Having a udev capable kernel makes this implementation of Hyper-V deprecated and pointless as it targets all devices globally instead of Hyper-V only.
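
The "vendor preset" field in the status output reflects the policy in the preset files, not whether the unit is currently enabled. A simplified sketch of how one preset file decides the policy for a unit, assuming the "enable NAME" / "disable NAME" format shown in the listing above with the first matching line winning; preset_policy is a hypothetical helper and it ignores the merging of several preset files that systemd performs:

```shell
#!/bin/sh
# Hypothetical helper: print the preset policy ("enable" or "disable") that a
# single preset file assigns to a unit. Returns non-zero if no line matches.
# $1 = preset file, $2 = unit name
preset_policy() {
    while read -r action pattern rest; do
        # Skip comments, blank lines and anything that is not a directive.
        case $action in
            enable|disable) ;;
            *) continue ;;
        esac
        # Unquoted $pattern lets the shell treat it as a glob, so wildcard
        # entries in the preset file match as well.
        case $2 in
            $pattern) printf '%s\n' "$action"; return 0 ;;
        esac
    done < "$1"
    return 1
}
```

For example, `preset_policy /usr/lib/systemd/system-preset/90-default.preset hypervvssd.service` would print "enable" for the listing quoted in this comment.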

Comment 10 Vitaly Kuznetsov 2016-08-08 13:39:41 UTC
No changes were made to F24 so far. Did you test packages from http://koji.fedoraproject.org/koji/taskinfo?taskID=15049187 ?

It should be possible to install them on F24. We'll ask to have the hyperv services removed from the default preset in the future, but with the packages from the scratch build (above) it shouldn't matter.

Comment 11 Zbigniew Jędrzejewski-Szmek 2016-08-08 14:36:47 UTC
My PR to disable the presets for F24 seems to have been lost. I opened https://pagure.io/fedora-release/issue/55 to have it reinstated.

Comment 12 jeremy9856 2016-08-18 01:07:42 UTC
I think I have this problem on a fully up to date F24

août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_vss.device: Job sys-devices-virtual-misc-vmbus\x21hv_vss.device/start failed with result 'timeout'.
août 18 03:03:41 Desktop systemd[1]: hypervvssd.service: Job hypervvssd.service/start failed with result 'dependency'.
août 18 03:03:41 Desktop systemd[1]: Dependency failed for Hyper-V VSS daemon.
août 18 03:03:41 Desktop systemd[1]: Timed out waiting for device sys-devices-virtual-misc-vmbus\x21hv_vss.device.
août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_vss.device: Job sys-devices-virtual-misc-vmbus\x21hv_vss.device/start timed out.
août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_fcopy.device: Job sys-devices-virtual-misc-vmbus\x21hv_fcopy.device/start failed with result 'timeout'.
août 18 03:03:41 Desktop systemd[1]: hypervfcopyd.service: Job hypervfcopyd.service/start failed with result 'dependency'.
août 18 03:03:41 Desktop systemd[1]: Dependency failed for Hyper-V FCOPY daemon.
août 18 03:03:41 Desktop systemd[1]: Timed out waiting for device sys-devices-virtual-misc-vmbus\x21hv_fcopy.device.
août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_fcopy.device: Job sys-devices-virtual-misc-vmbus\x21hv_fcopy.device/start timed out.
août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start failed with result 'timeout'.
août 18 03:03:41 Desktop systemd[1]: hypervkvpd.service: Job hypervkvpd.service/start failed with result 'dependency'.
août 18 03:03:41 Desktop systemd[1]: Dependency failed for Hyper-V KVP daemon.
août 18 03:03:41 Desktop systemd[1]: Timed out waiting for device sys-devices-virtual-misc-vmbus\x21hv_kvp.device.
août 18 03:03:41 Desktop systemd[1]: sys-devices-virtual-misc-vmbus\x21hv_kvp.device: Job sys-devices-virtual-misc-vmbus\x21hv_kvp.device/start timed out.

Can it be fixed in F24, please? It seems to make the boot way longer according to systemd-analyze:

Startup finished in 2.724s (kernel) + 784ms (initrd) + 1min 30.192s (userspace) = 1min 33.702s

Thanks !

Comment 13 Vitaly Kuznetsov 2016-08-18 09:00:42 UTC
(In reply to jeremy9856 from comment #12)
> I think I have this problem on a fully up to date F24
> 
...
> 
> Can it be fixed in F24 please ? It seem to make the boot way longer
> according to systemd-analyze
> 
> Startup finished in 2.724s (kernel) + 784ms (initrd) + 1min 30.192s
> (userspace) = 1min 33.702s

The fix is currently in rawhide; I'll prepare updates for F24 and the already-branched F25.

Comment 14 jeremy9856 2016-08-18 09:32:30 UTC
Great ! Thank you.

Comment 15 Fedora Update System 2016-08-19 14:38:11 UTC
hyperv-daemons-0-0.15.20160728git.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-191d231dca

Comment 16 Fedora Update System 2016-08-19 14:51:43 UTC
hyperv-daemons-0-0.15.20160728git.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-90c7f9018e

Comment 17 Fedora Update System 2016-08-19 16:50:13 UTC
hyperv-daemons-0-0.15.20160728git.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-191d231dca

Comment 18 Fedora Update System 2016-08-19 23:21:27 UTC
hyperv-daemons-0-0.15.20160728git.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-90c7f9018e

Comment 19 Fedora Update System 2016-08-25 13:53:51 UTC
hyperv-daemons-0-0.15.20160728git.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 20 jeremy9856 2016-08-29 00:10:11 UTC
Thanks for the update ! It's much better now.

Startup finished in 2.719s (kernel) + 826ms (initrd) + 8.565s (userspace) = 12.111s

Comment 21 Fedora Update System 2016-09-27 00:40:07 UTC
hyperv-daemons-0-0.15.20160728git.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 22 Michael Beck 2017-01-11 08:53:20 UTC
Hello. This still happens on my CentOS 7.
Version 0.29.20160216git.el7
uname -r
3.10.0-514.2.2.el7.x86_64
only epel is "installed"


KVP runs, vssd and daemon do not.

I will try with udev now.

Comment 23 Michael Beck 2017-01-11 08:59:04 UTC
Sorry, I don't think it is the same bug. My CentOS does indeed run as a VM on Hyper-V.

I'm so sorry

Comment 24 Vitaly Kuznetsov 2017-01-11 10:40:22 UTC
(In reply to Michael Beck from comment #22)
> Hello. This still happens on my CentOS 7.
> Version 0.29.20160216git.el7
> uname -r
> 3.10.0-514.2.2.el7.x86_64
> only epel is "installed"
> 
> 
> KVP runs, vssd and daemon do not.
> 
> I will try with udev now.

(In reply to Michael Beck from comment #23)
> Sorry, I don't think it is the same bug. My CentOS runs indeed as VM on
> HyperV.
> 

If vssd and fcopy don't run, please check that you have these services enabled for your guest in Hyper-V. Please also check that it's not SELinux which prevents the daemons from starting. But, anyway, this sounds like a different issue, so feel free to open a BZ.