Bug 1314515 - design of skynet daemon and systemd service is wrong
design of skynet daemon and systemd service is wrong
Status: VERIFIED
Product: Red Hat Storage Console
Classification: Red Hat
Component: agent (Show other bugs)
2
Unspecified Unspecified
unspecified Severity high
: ---
: 2
Assigned To: Timothy Asir
Martin Bukatovic
: TestBlocker
Depends On:
Blocks: 1326788 1326854
  Show dependency treegraph
 
Reported: 2016-03-03 14:51 EST by Martin Bukatovic
Modified: 2016-08-08 15:33 EDT (History)
4 users (show)

See Also:
Fixed In Version: rhscon-core-0.0.34-1.el7scon.x86_64 rhscon-ceph-0.0.33-1.el7scon.x86_64 rhscon-ui-0.0.47-1.el7scon.noarch
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Martin Bukatovic 2016-03-03 14:51:45 EST
Description of problem
======================

Design of skynet daemon and systemd service is wrong. This may lead to other
issues and subtle errors. 

Version-Release number of selected component
============================================

rhscon-agent-0.0.3-3.el7.noarch

Actual results
==============

Skynet daemon is not implemented as a proper systemd service:

~~~
# rpm -qlv rhscon-agent | grep systemd
-rwxr-xr-x    1 root    root                      244 Feb 29 18:21 /usr/lib/systemd/system/systemd-skynetd.service
# cat /usr/lib/systemd/system/systemd-skynetd.service
[Unit]
Description=SkyRing Node Eventing Daemon to push signals to SkyRing

[Service]
Type=simple
PIDFile=/var/run/skynetd.pid
ExecReload=/usr/bin/skynetd reload
ExecStart=/usr/bin/skynetd start
ExecStop=/usr/bin/skynetd stop

KillMode=process
[root@mbukatov-usm2-node1 ~]#
~~~

This approach is completely wrong. Instead of relying on systemd with
daemonization and service management, skynetd does
it on it's own, eg. PID management is a hack (see additional information
below for details).

The idea of systemd is that systemd handles service management for you so that
you don't need to reimplement it yourself.

Some other minor issues includes (this is not a full list):

 * the name of the service should not contain a 'systemd-' prefix, this
   makes sense for systemd components only. The correct name is just 
   skynet.service or skynetd.service
 * systemd ini file should not be executable

Expected results
================

Skynet service is implemented so that systemd handles all service related tasks
(daemonization, handling of pid or lock files ...).

Skynet service must follow *systemd packaging guidelines*:

https://fedoraproject.org/wiki/Packaging:Systemd

Additional info
===============

I have discussed this issue with Branislav Blaskovic who has an experience
in this area. His comments are:

> Well as I wrote before, I am afraid that the situation here is, that
> skynetd is handling PID file on its own and systemd should handle it.
> So there could be some issues.
> 
> I read the code just briefly, but I think the situation could be like
> this: (I can be also wrong) :-)
> 
> 1. systemd launches 'skynetd start' which returns PID 123.
> 2. 'skynetd start' throws another process (pid 456) and dies.
> 3. systemd wants to write PID 123 to PIDFile BUT skynetd is using the
> same pidfile
>        |-- there could be maybe race condition or something similar
> 
> It's working for you maybe because systemd writes 123 to PIDFIle and
> then skynetd rewrites it to 456, so when systemd is checking status it
> sees correct PID number. But it's not the one which systemd wrote to
> it.
> 
> So yes, it could basically works, but it would be better to rewrite
> skynetd so systemd can handle it.
Comment 2 Ken Dreyer (Red Hat) 2016-06-16 10:51:08 EDT
What is the version of the rhscon-agent package that fixes this bug?
Comment 3 Martin Bukatovic 2016-08-08 15:33:25 EDT
Checking with: rhscon-agent-0.0.18-1.el7scon.noarch

And skynetd service has a proper systemd service unit file now:

* name of the service ('skynetd') no longer contains "systemd-" prefix
* start, stop and restart is possible
* enable/disable of the service works
* restart on failure works
* checking logs via journal works

Note You need to log in before you can comment on or make changes to this bug.