Bug 820448

Summary: systemd-notify doesn't work, since it lives too short
Product: [Fedora] Fedora
Reporter: Derek Higgins <derekh>
Component: openstack-keystone
Assignee: Alan Pevec (Fedora) <apevec>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 19
CC: apevec, apevec, bfilippov, breu, btotty, cfeist, itamar, Jan.van.Eldik, johannbg, jonathansteffan, jose.castro.leon, jpazdziora, markmc, metherid, mschmidt, notting, p, plautrba, rbryant, rmetrich, systemd-maint, t
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Clone Of:
: 982376
Last Closed: 2014-02-11 20:14:16 UTC
Type: Bug
Bug Blocks: 812219

Description Derek Higgins 2012-05-10 00:03:04 UTC
Using a test systemd service unit with
Type=notify

and a test script that contains
systemd-notify --ready
the notification sometimes succeeds and sometimes fails with the following message in /var/log/messages:
May 10 00:37:31 laptop systemd[1]: Cannot find unit for notify message of PID 3495.

When the command is successful,
> sudo systemctl start test1.service

returns almost immediately; when it fails, it times out after 90 seconds.

Below are the files I am using, along with the package versions:

$ rpm -qa | grep -i -e systemd
systemd-units-37-19.fc16.x86_64
systemd-37-19.fc16.x86_64
systemd-sysv-37-19.fc16.x86_64

$ getenforce 
Permissive

$ cat /lib/systemd/system/test1.service 
[Unit]
Description=Tests

[Service]
User=derekh
Type=notify
ExecStart=/tmp/testservice
NotifyAccess=all

$ cat /tmp/testservice
#!/bin/bash

sleep 1

systemd-notify --ready

echo Sleeping
sleep 300
echo Done

Comment 1 Michal Schmidt 2012-05-10 08:36:44 UTC
(In reply to comment #0)
> May 10 00:37:31 laptop systemd[1]: Cannot find unit for notify message of PID
> 3495.

systemd-notify sends a message to $NOTIFY_SOCKET and then exits.

When systemd receives the notification, the systemd-notify process may have already exited and been reaped by bash.

Ideally the cgroups membership information would be delivered with the message over the socket.

From http://0pointer.de/blog/projects/plumbers-wishlist-3.html:

AF_UNIX:

* An auxiliary meta data message for AF_UNIX called SCM_CGROUPS (or something like that), i.e. a way to attach sender cgroup membership to messages sent via AF_UNIX. This is useful in case services such as syslog shall be shared among various containers (or service cgroups), and the syslog implementation needs to be able to distinguish the sending cgroup in order to separate the logs on disk. Of course stm SCM_CREDENTIALS can be used to look up the PID of the sender followed by a check in /proc/$PID/cgroup, but that is necessarily racy, and actually a very real race in real life.


As an alternative fix, the notification could be made synchronous.
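
The race described above can be illustrated with a small sketch. It is not systemd's actual code; it only mimics the idea of resolving a sender PID to a cgroup by reading /proc/<pid>/cgroup. The function names `parse_cgroup` and `cgroup_of` and the sample path in the comment are hypothetical, chosen for illustration.

```python
def parse_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into {controllers: path}.

    Each line has the form "hierarchy-id:controllers:path", for example
    "1:name=systemd:/system/test1.service" (illustrative path only).
    """
    result = {}
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3:
            _hierarchy_id, controllers, path = parts
            result[controllers] = path
    return result


def cgroup_of(pid):
    """Return the cgroup mapping for `pid`, or None if it cannot be read.

    If the sender has already exited and been reaped, /proc/<pid> is gone
    and the lookup fails. That is the window in which a receiver working
    from the PID alone can no longer tell which unit sent the message.
    """
    try:
        with open("/proc/%d/cgroup" % pid) as f:
            return parse_cgroup(f.read())
    except OSError:
        # Covers a missing /proc entry as well as any other read failure.
        return None
```

A short-lived helper such as systemd-notify can land in the `None` case: by the time the receiver looks at /proc, the PID from the message credentials may no longer exist.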

Comment 2 Alan Pevec 2012-06-27 17:41:54 UTC
(In reply to comment #1)
> systemd-notify sends a message to $NOTIFY_SOCKET and then exits.
> 
> When systemd receives the notification, the systemd-notify process may have
> already exited and been reaped by bash.

So, as a work-around, the daemon process itself should send READY=1 to $NOTIFY_SOCKET
instead of forking the "systemd-notify" command.
We need this for OpenStack daemons; here's _untested_ code in Python
(modeled on the sd_notify implementation in sd-daemon.c):

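import os
import socket

# Send READY=1 from the daemon process itself, not from a short-lived helper.
s = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
e = os.getenv('NOTIFY_SOCKET')
s.connect(e)
s.sendall(b"READY=1")  # bytes literal works on both Python 2 and Python 3
s.close()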

Comment 3 Alan Pevec 2012-09-18 14:43:15 UTC
The script in comment 2 doesn't work with systemd-188-3.fc18, where NOTIFY_SOCKET is an abstract namespace socket:
http://cgit.freedesktop.org/systemd/systemd/commit/?id=29252e9e5bad3b0bcfc45d9bc761aee4b0ece1da

It needs special handling if the notification socket name starts with @: convert to bytes and replace the leading '@' with a NUL byte (\0).

Comment 4 Alan Pevec 2012-09-18 22:36:01 UTC
Patch for example script in comment 2:

 e = os.getenv('NOTIFY_SOCKET')
+if e.startswith('@'):
+    # abstract namespace socket
+    e = '\0%s' % e[1:]
 s.connect(e)
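
Putting comment 2 and this patch together, a self-contained version might look like the sketch below. It is a minimal illustration, not the code that was merged upstream: the function name `sd_notify`, the optional `env` parameter, and the boolean return value are assumptions made for this example.

```python
import os
import socket


def sd_notify(state, env=None):
    """Send `state` (e.g. "READY=1") to the socket named by NOTIFY_SOCKET.

    Returns False if NOTIFY_SOCKET is unset (not started by systemd with
    Type=notify), True once the datagram has been sent.
    `env` is a hypothetical hook for testing; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # Abstract namespace socket: the leading '@' stands for a NUL byte.
        path = "\0" + path[1:]
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(state.encode("utf-8"), path)
    finally:
        sock.close()
    return True
```

The daemon would call `sd_notify("READY=1")` from its own long-lived process once initialization is complete, so the sending PID still exists when systemd looks it up.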

Comment 5 Fedora End Of Life 2013-04-03 17:44:40 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 19 development cycle.
Changing version to '19'.

(As we did not run this process for some time, it could affect also pre-Fedora 19 development
cycle bugs. We are very sorry. It will help us with cleanup during Fedora 19 End Of Life. Thank you.)

More information and reason for this action is here:
https://fedoraproject.org/wiki/BugZappers/HouseKeeping/Fedora19

Comment 6 Alan Pevec 2014-02-11 20:14:16 UTC
This has been merged upstream and included in Fedora openstack-keystone since 2012.2 release.

Comment 7 Theodore Cowan 2016-08-17 16:54:35 UTC
A work-around with Python:

```
python -c "import systemd.daemon, time; systemd.daemon.notify('READY=1'); time.sleep(5)"
```

Used in a Node.js application:

```
import { exec } from 'child_process'
exec('python -c "import systemd.daemon, time; systemd.daemon.notify(\'READY=1\'); time.sleep(5)"')
```