Bug 996085 - systemd-vdsmd is misguided
Summary: systemd-vdsmd is misguided
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.4.0
Assignee: Yaniv Bronhaim
QA Contact: sefi litmanovich
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2013-08-12 11:36 UTC by Tomasz Torcz
Modified: 2014-03-31 12:27 UTC (History)

Fixed In Version: ovirt-3.4.0-alpha1
Clone Of:
Environment:
Last Closed: 2014-03-31 12:27:50 UTC
oVirt Team: ---
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 15578 | 0 | None | None | None | Never

Description Tomasz Torcz 2013-08-12 11:36:22 UTC
Description of problem:
/usr/lib/systemd/systemd-vdsmd looks like a SysV init script masquerading as part of a systemd unit file. It mostly ignores systemd features, works around them, and sometimes breaks as a result.

1) There are a bunch of mkdir and chown calls in the mk_*() functions. Those should probably be converted to tmpfiles.d snippets.

2) The script invokes "chkconfig" to modify the state of services. First, the script has "systemd" in its name, so it shouldn't touch chkconfig but use "systemctl enable/disable" instead. Second, a script used for starting a service shouldn't modify system configuration that way; the admin does not expect it.

2) If there are "conflicting services", this should be encoded in the vdsmd.service unit file. Systemd provides Conflicts= for a reason.

3) "/sbin/service status" is used a few times. First: this is systemd, so use "systemctl". Second: use systemctl with the proper commands, "is-enabled" or "is-active", depending on what you are checking for. And third: drop this checking altogether and use proper Requires=, After= etc. in the unit file.

4) More of 3): because "status" displays the log file, and pulling logs from rotating storage is slow (tens of seconds per service), "systemctl start vdsmd.service" often fails with a timeout. Every "service FOO status" takes 15-20 seconds, and there are a couple of them in systemd-vdsmd, so the script cannot finish in the default time allotted.
(You can increase the timeout in the unit file definition, but really, use proper Wants=, Requires= etc.)

5) libvirt_should_use_upstart()... the script has "systemd" in the name, the answer should be obvious.

6) test_lo() - systemd guarantees loopback to be configured, just drop it.

7) There are some parts that modify the libvirt configuration. I think they should NOT be run every time vdsmd is started. Could you split them out into a separate unit file? You can probably do something similar to how SSH key generation is split into its own unit.
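
Point 7 could be sketched as a one-shot unit that runs before vdsmd.service, in the spirit of the SSH key generation unit mentioned above. This is only an illustration under stated assumptions: the unit name vdsm-libvirt-configure.service, the stamp file path, and the helper script path are all hypothetical and do not come from vdsm.

```ini
# /usr/lib/systemd/system/vdsm-libvirt-configure.service
# Hypothetical unit name; not part of vdsm.

[Unit]
Description=One-time libvirt configuration for vdsm (illustrative sketch)
# Skip this unit once the stamp file exists (hypothetical path).
ConditionPathExists=!/var/lib/vdsm/libvirt-configured
# Finish before the services that read the configuration are started.
Before=libvirtd.service vdsmd.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Hypothetical helper: edits the libvirt configuration, then creates
# the stamp file checked by ConditionPathExists= above.
ExecStart=/usr/libexec/vdsm/configure-libvirt
```

Under these assumptions, vdsmd.service would only need Wants=vdsm-libvirt-configure.service to pull the unit in; the Before= line already handles ordering, and the condition keeps the configuration step from running on every start.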

Version-Release number of selected component (if applicable):
vdsm-4.10.3-18.fc19.x86_64

Comment 1 Yaniv Bronhaim 2013-08-12 17:17:56 UTC
Patch [1] covers part of this work. Added to external trackers.

[1] http://gerrit.ovirt.org/#/c/15578

Comment 3 Sandro Bonazzola 2014-01-13 13:56:57 UTC
oVirt 3.4.0 alpha has been released including the fix for this issue.

Comment 4 sefi litmanovich 2014-03-10 13:41:30 UTC
Please provide reproduction steps for verification.

Comment 5 Yaniv Bronhaim 2014-03-16 11:01:45 UTC
Separating the systemd script from vdsm's old SysV scripts means we now use the vdsmd.service file to declare the daemon's details. To verify:
1. Verify that all pre-start scripts run as expected (see them in vdsmd_init_common.sh).
2. Verify that the service is set to start at boot.
3. Check that the conflicting services are taken down when vdsmd starts (libvirt-guests.service ksmtuned.service).
4. Likewise, check that the required services are brought up (multipathd.service libvirtd.service time-sync.target iscsid.service rpcbind.service supervdsmd.service sanlock.service).
5. Check the user/group of the vdsm process; it should be vdsm:kvm.
6. Check that the service restarts after being killed.
7. Check that coredump works if enabled.
8. If /etc/sysconfig/vdsm is present, check that it is used for environment variables.
9. Check the nice level of the vdsm process (state it in the verification details).

I verified all of these myself, and I think they should be enough.
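
The items in the checklist above map fairly directly onto unit-file directives. The following is a minimal sketch of what such a vdsmd.service could look like, not a copy of the shipped file. The service names, the vdsm:kvm user and group, the vdsmd_init_common.sh script name, and /etc/sysconfig/vdsm come from this report, and the nice value is the one reported in the verification comment. The Description text, the After= ordering, the ExecStartPre and ExecStart paths and arguments, PermissionsStartOnly=, and the WantedBy= target are assumptions made for illustration.

```ini
# vdsmd.service (illustrative sketch, not the shipped unit file)

[Unit]
Description=vdsm daemon (sketch)
# Item 4: services that must be brought up together with vdsmd.
Requires=multipathd.service libvirtd.service time-sync.target
Requires=iscsid.service rpcbind.service supervdsmd.service sanlock.service
# Requires= does not imply ordering, so After= is added here (assumption).
After=multipathd.service libvirtd.service time-sync.target
After=iscsid.service rpcbind.service supervdsmd.service sanlock.service
# Item 3: services that are stopped when vdsmd starts.
Conflicts=libvirt-guests.service ksmtuned.service

[Service]
Type=simple
# Item 8: optional environment file; the leading "-" tolerates a missing file.
EnvironmentFile=-/etc/sysconfig/vdsm
# Item 1: pre-start tasks. Install path and argument are assumptions.
ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh --pre-start
# Assumed daemon path.
ExecStart=/usr/share/vdsm/vdsm
# Item 5: run the daemon as vdsm:kvm.
User=vdsm
Group=kvm
# Assumption: pre-start tasks need root, so only ExecStart= drops privileges.
PermissionsStartOnly=true
# Item 6: restart the daemon after it is killed.
Restart=always
# Item 7: allow core dumps.
LimitCORE=infinity
# Item 9: nice level.
Nice=-20

[Install]
# Item 2: start at boot once the unit is enabled (assumed target).
WantedBy=multi-user.target
```

With a file along these lines, most checklist items can be inspected with "systemctl show vdsmd.service" and "systemctl is-enabled vdsmd.service" rather than by reading a start script.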

Comment 6 sefi litmanovich 2014-03-19 09:28:51 UTC
Verified using vdsm-4.14.2-0.4.el6ev.x86_64.

Verification steps, as listed above by Yaniv:

1. Upon restarting the vdsmd service, all pre-start scripts ran successfully.
2. The service is set to start on startup.
3. Conflicting services are stopped.
4. Required services are up.
5. The user/group of the vdsm process is vdsm:kvm.
6. The service restarted after being killed.
7. With coredump enabled, it works and a dump file is created in /var/log/core/.
8. The nice level of the vdsm process is -20.

Comment 7 Sandro Bonazzola 2014-03-31 12:27:50 UTC
this is an automated message: moving to Closed CURRENT RELEASE since oVirt 3.4.0 has been released

