Bug 695736

Summary: Providing native systemd file
Product: [Fedora] Fedora
Reporter: Jóhann B. Guðmundsson <johannbg>
Component: 389-ds-base
Assignee: Rich Megginson <rmeggins>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: edewata, lpoetter, nhosoi, nkinder, rmeggins
Target Milestone: ---
Keywords: screened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 389-ds-base-1.2.10-0.4.a4.fc16
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-10-25 03:38:34 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 698755, 735013
Bug Blocks: 713562
Attachments (description, flags):
- Native systemd service file for dirsrv snmp daemon (flags: none)
- dirsrv template v0.1 (flags: none)
- info about failures (flags: none)
- dirsrv target v0.1 (flags: none)
- dirsrv template v0.2 (flags: none)
- /proc/PID/environ without using systemd (flags: none)
- /proc/PID/environ using systemd (flags: none)
- /etc/sysconfig/dirsrv (flags: none)
- /etc/sysconfig/dirsrv-f15x8664 (flags: none)
- 0002-Bug-695741-Providing-native-systemd-file-for-upcomin.patch (flags: none)
- 0002-Bug-695736-Providing-native-systemd-file-for-upcomin.patch (flags: nkinder: review+)
- 389-ds-base.spec.patch (flags: nkinder: review+)
- strace -f -o systemctl.strace systemctl restart dirsrv.service (flags: none)
- 0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch (flags: none)
- spec file changes (flags: nkinder: review+)
- 0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch (flags: nkinder: review+)

Description Jóhann B. Guðmundsson 2011-04-12 14:45:29 UTC
Created attachment 491503 [details]
Native systemd service file for dirsrv snmp daemon

Description of problem:

The attached file is a native systemd service file for the upcoming F15 feature [1].

Please read [2] on how to package and install systemd service files.

To learn more about systemd daemons, see [3].

To view the old SysV script and the new systemd unit side by side for your component, see [4].

If you have any questions, don't hesitate to ask them on this bug report.

[1] http://fedoraproject.org/wiki/Features/systemd

[2] https://fedoraproject.org/wiki/Systemd_Packaging_Draft

[3] http://0pointer.de/public/systemd-man/daemon.html

[4] https://fedoraproject.org/wiki/User:Johannbg/QA/Systemd/compatability

Comment 1 Jóhann B. Guðmundsson 2011-04-12 14:47:55 UTC
It will be some time before I submit the dirsrv service, since I need to figure out the best way to deal with that mess.

Comment 2 Rich Megginson 2011-04-19 14:39:00 UTC
for the snmp file:

Do I call it dirsrv-snmp.service?
Does it go in /lib/systemd/system? (e.g. in the spec file should I have

%files
...
/lib/systemd/system/dirsrv-snmp.service

?
Should I use some sort of macro instead of /lib/systemd/system?

Comment 3 Jóhann B. Guðmundsson 2011-04-19 15:12:34 UTC
You name it what it used to be called, with a .service suffix, so for the 389 services it would be

dirsrv.service
dirsrv-admin.service
dirsrv-snmp.service

See 

https://fedoraproject.org/wiki/TomCallaway/Systemd_Revised_Draft

And 

https://fedoraproject.org/wiki/User:Toshio/Systemd_scriptlet_options

Note the old sysv initscript should be removed and subpackage.

You can take a look at how others have packaged systemd services like for example sssd

http://pkgs.fedoraproject.org/gitweb/?p=sssd.git;a=blob_plain;f=sssd.spec;hb=d895a5f72c49210793ec02ffc768106178521c3e

Comment 4 Rich Megginson 2011-04-19 20:28:53 UTC
For dirsrv - we could have instance creation (the setup-ds-admin.pl or setup-ds.pl scripts) create the systemd .service file for that instance e.g.
when you create slapd-instname it creates /lib/systemd/system/dirsrv-instname.service - I think we can live with using
service dirsrv-instname start
rather than
service dirsrv start instname

But what about
service dirsrv start
?  Is there any way to have a systemd service operate on a group of related services?

Comment 5 Jóhann B. Guðmundsson 2011-04-19 21:39:40 UTC
Let's use that option as a last resort; I think we can handle this in a nicer way.

Let's start by creating a clean template that runs a single service with a single instance.

Basically, what I need to know is what's needed to start a service, something like..

[Unit]
Description=389 Directory Server.
After=syslog.target network.target

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/dirsrv-localhost
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-localhost -i /var/run/dirsrv/slapd-localhost.pid -w /var/run/dirsrv/slapd-localhost.startpid

[Install]
WantedBy=multi-user.target

Once we have created a clean working template we can see if we can't use that template in a smarter way.

Comment 6 Rich Megginson 2011-04-19 21:54:06 UTC
That's it.  That will start the single instance "localhost"

Comment 7 Jóhann B. Guðmundsson 2011-04-20 20:36:44 UTC
Created attachment 493618 [details]
dirsrv template v0.1

This needs to be tested with a working fds setup on F15 

Copy the attached file into /lib/systemd/system

Then link it under the relevant name; for multiple slapd instances that would be

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/multi-user.target.wants/dirsrv

And

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/multi-user.target.wants/dirsrv

then run systemctl daemon-reload

and 

systemctl start dirsrv

systemctl start dirsrv

To see if the template works correctly

Comment 8 Rich Megginson 2011-04-20 23:42:49 UTC
Created attachment 493665 [details]
info about failures

systemctl start dirsrv is not working.  The error messages don't give much to go on.

Comment 9 Jóhann B. Guðmundsson 2011-04-21 12:36:03 UTC
Apr 20 17:29:20 f15x8664 systemd[1]: Failed to load environment files: No such file or directory

Apr 20 17:29:20 f15x8664 systemd[1]: dirsrv failed to run 'start' task: No such file or directory

which tells us that 

EnvironmentFile=/etc/sysconfig/dirsrv-%i is not working 

That is either

A)

because that file is missing,

or

B)

because EnvironmentFile=/etc/sysconfig/dirsrv-%i itself is not being handled correctly ( which I will need to ping Lennart about ).

Could you set EnvironmentFile=/etc/sysconfig/dirsrv ( or create the file if it's missing ) and see if the service then starts normally?
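
A quick way to rule out case A is to check for the per-instance file directly. The following is only a minimal sketch in shell: the function name and the optional directory argument are made up for illustration, and it assumes the /etc/sysconfig/dirsrv-<instance> naming used in this comment.

```shell
# Hypothetical helper: does the EnvironmentFile for one instance exist?
# $1 = instance name (what %i would stand for)
# $2 = directory to look in (defaults to /etc/sysconfig; the override is
#      an assumption added so the check can be tried in a scratch directory)
check_dirsrv_envfile() {
    inst=$1
    dir=${2:-/etc/sysconfig}
    if [ -f "$dir/dirsrv-$inst" ]; then
        echo "found $dir/dirsrv-$inst"
    else
        echo "missing $dir/dirsrv-$inst"
        return 1
    fi
}
```

For the host in the log lines above it would be called as check_dirsrv_envfile f15x8664. If the file is reported as found and the unit still fails, the file itself is not the problem.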

Comment 10 Rich Megginson 2011-04-21 14:07:25 UTC
The file was there, so tried B).
Changed it to say
EnvironmentFile=/etc/sysconfig/dirsrv-f15x8664
and then
systemctl start dirsrv
is working.  However,
systemctl status dirsrv
shows the Process: ExecStart with %i in the paths for the config dir and pid files.

Comment 11 Jóhann B. Guðmundsson 2011-04-21 14:51:25 UTC
Ok I'll ping Lennart about that as well and file bugs for both of these. 

I'm going to create dirsrv.target and modify the dirsrv@.service template to bind it to that target, so that users can run systemctl start/stop dirsrv.target to start/stop all the dirsrv@$foo.services ( the /sbin/service command does not support targets, only services, so users have to use the systemctl command ).

The setup-ds-admin.pl or setup-ds.pl scripts should then link the dirsrv@.service into the dirsrv.target.wants directory when setting up the ds.

Using the setup you mentioned here, the setup script would have performed..

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv

and reloaded the systemd daemon ( systemctl daemon-reload ).

Are there any checks that should be performed before the service is started or on restart/reload?

I also noticed that the pid files were owned by nobody.

We can add to the dirsrv template 

User=dirsrv
Group=dirsrv 

To have it owned by the dirsrv user ( that is, if it exists )

Comment 12 Rich Megginson 2011-04-21 15:51:34 UTC
(In reply to comment #11)
> Ok I'll ping Lennart about that as well and file bugs for both of these. 
> 
> I'm going to create dirsrv.target and make modifications to dirsrv@.service
> template to bind it to that target so that user can run systemctl start/stop
> dirsrv.target to start all the dirsrv@$foo.services ( the /sbin/service command
> does not support starting targets only services so users have to use systemctl
> command )

So this means users cannot use service any more?  That's going to cause headaches for QE, and packages that depend on 389 like freeipa and dogtag.
 
> Now the the setup-ds-admin.pl or setup-ds.pl scripts should then link the
> dirsrv@.service into the dirsrv.target.wants directory when setting up the ds.
> 
> Using your setup mentioned here then the setup script would have performed..
> 
> ln -s /lib/systemd/system/dirsrv\@.service
> /etc/systemd/system/dirsrv.target.wants/dirsrv
> 
> and reloaded the systemd daemon ( systemd daemon-reload ) 

Ok.  Can I try this with the existing dirsrv@.service from https://bugzilla.redhat.com/attachment.cgi?id=493618
?

> Are there any check that should be performed before the service is started or
> on restart/reload? 

Not right now.
 
> I also noticed that the pid files where owned by nobody

In fact they can be owned by nobody, dirsrv, pkiuser, etc.

We differ from (apparently every) other service in that at setup time we allow you to choose the userid for the daemon - this userid then owns the config files, pid files, log files, etc.

> 
> We can add to the dirsrv template 
> 
> User=dirsrv
> Group=dirsrv 
> 
> To have it owned by dirsrv user ( that if it exists )

Can't add that to the dirsrv template since each instance may be owned by a different userid.  Is this necessary?

Comment 13 Jóhann B. Guðmundsson 2011-04-21 16:15:45 UTC
Created attachment 493913 [details]
dirsrv target v0.1

To test 

Copy the file to /lib/systemd/system

mkdir /etc/systemd/system/dirsrv.target.wants 

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv

systemctl daemon-reload

systemctl start dirsrv.target ( should start the dirsrv )
systemctl stop dirsrv.target ( should stop the dirsrv )
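
The linking step can be wrapped in a small function, which is roughly what a setup script would have to do per instance. This is a sketch only: the function name and the root prefix argument are hypothetical, and it assumes the link is named dirsrv@<instance>.service, the naming pattern for instantiated template units.

```shell
# Hypothetical helper: hook one instance of the dirsrv@.service template
# into dirsrv.target.
# $1 = root prefix ("" on a real system, a scratch directory for a dry run)
# $2 = instance name
enable_dirsrv_instance() {
    root=$1
    inst=$2
    wants="$root/etc/systemd/system/dirsrv.target.wants"
    mkdir -p "$wants" || return 1
    # The link name carries the instance name; the link target is always
    # the shared template file.
    ln -sfn /lib/systemd/system/dirsrv@.service "$wants/dirsrv@$inst.service"
}
```

On a real system this would be followed by systemctl daemon-reload and systemctl start dirsrv.target, as in the steps above.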

Comment 14 Jóhann B. Guðmundsson 2011-04-21 16:17:06 UTC
Created attachment 493915 [details]
dirsrv template v0.2

The template that should go with the dirsrv.target

Comment 15 Rich Megginson 2011-04-21 16:34:25 UTC
do I need to remove the links I set up in https://bugzilla.redhat.com/show_bug.cgi?id=695736#c7 ?

Comment 16 Jóhann B. Guðmundsson 2011-04-21 16:35:32 UTC
(In reply to comment #12)
> (In reply to comment #11)
 > So this means users cannot use service any more?  That's going to cause
> headaches for QE, and packages that depend on 389 like freeipa and dogtag.

It won't work for targets and probably not for @ services either ( systemd specific ), but it will work for regular services like dirsrv-admin.service and dirsrv-snmp.service, so service dirsrv-admin start will start the dirsrv-admin.service.

Any package should be updated for the new init system ( systemd ) anyway and use its commands, and users should get accustomed to using the systemctl command instead of the service and chkconfig ones.
 
> Ok.  Can I try this with the existing dirsrv@.service from
> https://bugzilla.redhat.com/attachment.cgi?id=493618
> ?

Nope, you will have to use the dirsrv@.service v0.2 one.

> Can't add that to the dirsrv template since each instance may be owned by a
> different userid.  Is this necessary?

Nope, not at all.

Comment 17 Jóhann B. Guðmundsson 2011-04-21 16:44:27 UTC
(In reply to comment #15)
> do I need to remove the links I set up in
> https://bugzilla.redhat.com/show_bug.cgi?id=695736#c7 ?

Yeah, you should remove it, since we are going to be using dirsrv.target.wants from now on and it might cause a conflict at boot for you.

dirsrv.target will be started when multi-user.target starts and, if working correctly, all the dirsrv@.services along with it, so you would end up starting the same service twice since it's linked both in multi-user.target.wants and in dirsrv.target.wants.
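
Removing the stale link could look like the sketch below. The function name and the root prefix are hypothetical, and the link name is passed in by the caller because it depends on how the link was created in comment 7.

```shell
# Hypothetical helper: remove an old link from multi-user.target.wants so
# the instance is only pulled in through dirsrv.target.wants.
# $1 = root prefix ("" on a real system), $2 = name of the link to remove
drop_multi_user_link() {
    root=$1
    name=$2
    old="$root/etc/systemd/system/multi-user.target.wants/$name"
    if [ -L "$old" ]; then
        rm -f "$old" && echo "removed $old"
    else
        echo "nothing to remove at $old"
    fi
}
```

A systemctl daemon-reload afterwards makes systemd pick up the change.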

Comment 18 Rich Megginson 2011-04-21 18:56:27 UTC
Ok.  This is working
systemctl op dirsrv.target works on all instances
systemctl op dirsrv works on that service

For the spec file - is there a %{_systemdetcsystemdsystem} macro or is it ok to hardcode it as %{_sysconfdir}/systemd/system ?

Comment 19 Rich Megginson 2011-04-21 21:52:20 UTC
Is there a bug for getting the 
EnvironmentFile=/etc/sysconfig/dirsrv-%i
working correctly?

Is it possible to have more than one EnvironmentFile?  When a server starts up, it needs to read from the file that applies to all instances
/etc/sysconfig/dirsrv
then from the file that applies to the specific instance
/etc/sysconfig/dirsrv-%i
?

Comment 20 Jóhann B. Guðmundsson 2011-04-21 22:57:22 UTC
(In reply to comment #19)
> Is there a bug for getting the 
> EnvironmentFile=/etc/sysconfig/dirsrv-%i
> working correctly?

Yup, I pinged Lennart on irc, and just in case he's like me ( forgets everything if it isn't in writing and he isn't regularly nagged about it ) I filed bug #698755.

> 
> Is it possible to have more than one EnvironmentFile?  When a server starts up,
> it needs to read from the file that applies to all instances
> /etc/sysconfig/dirsrv
> then from the file that applies to the specific instance
> /etc/sysconfig/dirsrv-%i
> ?

Hum... 

Test defining it twice, as in for the f15x8664.service it would be..

EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-f15x8664

That might work

Comment 21 Rich Megginson 2011-04-21 23:13:39 UTC
Created attachment 494022 [details]
/proc/PID/environ without using systemd

it looks like an environment table:
VAR=VALUE
and nothing else
It reads both /etc/sysconfig/dirsrv (STARTPID_TIME) and /etc/sysconfig/dirsrv-f15x8664 (DS_CONFIG_DIR)

Comment 22 Rich Megginson 2011-04-21 23:16:57 UTC
Created attachment 494023 [details]
/proc/PID/environ using systemd

it reads in both /etc/sysconfig/dirsrv and /etc/sysconfig/dirsrv-f15x8664 - but it doesn't look like it processed them correctly

afaik, a /etc/sysconfig/file does not have to be strictly VAR=VALUE - it can use any valid bourne shell syntax e.g. the source (". filename") command is used to read in the file.  I'm not sure what systemd is doing - looks like it reads in the files verbatim then just does string replacement of any $VAR name it finds.  IMHO, it should be processing the files as bourne shell source (". filename") does.

Comment 23 Jóhann B. Guðmundsson 2011-04-21 23:59:46 UTC
Perhaps not in the old system, but I'm pretty sure /etc/sysconfig/file does have to be strictly VAR=VALUE in systemd, since you specifically have to invoke bash if you want bash ( and related behavior ) with systemd, as in $foo=/bin/bash $bar. Afaik invoking bash is only supported in Exec$foo=, and Lennart does not like people doing that, since it's a clear indication that something is seriously broken and that the daemon and/or the relevant code should be fixed. I'm pretty sure you will have a hard time convincing him to change this.

Could you attach /etc/sysconfig/dirsrv and /etc/sysconfig/dirsrv-f15x8664 so I can see what they look like and what they are actually doing?
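
To see how far an existing sysconfig file is from strict VAR=VALUE, a rough first pass can be done with grep. This is a sketch with a made-up function name; it deliberately also flags any line containing a semicolon (such as an assignment followed by an export), which will produce false positives for values that legitimately contain one.

```shell
# Hypothetical helper: list lines of a sysconfig file that are not plain
# VAR=VALUE. Blank lines and # comments are ignored. Everything else that
# is not a single assignment is printed with its line number.
# Exit status is that of grep: non-zero when nothing was flagged.
find_non_assignments() {
    grep -nEv '^[[:space:]]*(#.*)?$|^[[:space:]]*[A-Za-z_][A-Za-z0-9_]*=[^;]*$' "$1"
}
```

Running it against /etc/sysconfig/dirsrv and /etc/sysconfig/dirsrv-f15x8664 would show which lines rely on shell behavior.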

Comment 24 Rich Megginson 2011-04-22 00:46:54 UTC
Created attachment 494027 [details]
/etc/sysconfig/dirsrv

389 supports multiple platforms, many of which support something like environment files.  On all of these platforms, the environment files are read into the process by the use of the bourne shell source (".") command, which allows any valid bourne shell syntax.  We try to keep platform dependencies to a minimum to maximize portability.  So I would rather not have to have multiple sysconfig files unless absolutely necessary.

I'm also surprised (or maybe not, given how much trouble having multiple instances has been) that no one else has run into this - surely 389 is not the only project that has bourne shell code (and not just simple var=value) in the sysconfig files?

Comment 25 Rich Megginson 2011-04-22 00:47:29 UTC
Created attachment 494028 [details]
/etc/sysconfig/dirsrv-f15x8664

Comment 26 Rich Megginson 2011-04-22 01:28:21 UTC
Ok - found the docs which explain EnvironmentFile - http://0pointer.de/public/systemd-man/systemd.exec.html
it wants VAR=VALUE and nothing else.  Can I use VAR as a value for other VAR settings?  For example, is this supported?

INSTANCENAME=f15x8664
CONFIG_DIR=/etc/dirsrv/slapd-$INSTANCENAME
SERVER_DIR=/usr/lib64/dirsrv/slapd-$INSTANCENAME

?
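
The practical difference between the two readings of such a file can be shown in plain shell. The sketch below does not show what systemd itself does; it only contrasts sourcing the file with a reader that takes each value verbatim. Both function names are made up for illustration.

```shell
# Hypothetical helper: read KEY from a VAR=VALUE file literally, with no
# expansion of $NAME references inside the value.
literal_value() {
    key=$1
    file=$2
    # Print what follows "KEY=" on the last matching line, verbatim.
    sed -n "s/^$key=//p" "$file" | tail -n 1
}

# Hypothetical helper: same lookup, but after letting the shell source the
# file, so earlier variables are expanded inside later values.
# The file path should contain a slash so that "." does not search PATH.
sourced_value() {
    key=$1
    file=$2
    ( . "$file" && eval "printf '%s\n' \"\$$key\"" )
}
```

With the three lines above in a file, sourced_value CONFIG_DIR yields /etc/dirsrv/slapd-f15x8664, while literal_value CONFIG_DIR yields the string with $INSTANCENAME left in place.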

Comment 27 Jóhann B. Guðmundsson 2011-04-22 02:02:35 UTC
To my knowledge this would not work, but looking at the dirsrv-$foo file I think what you are looking for is Environment=

As in having this in the template:

Environment=CONFIG_DIR=/etc/dirsrv/slapd-%i
Environment=SERVER_DIR=/usr/lib64/dirsrv/slapd-%i

and then I guess you can deprecate the /etc/sysconfig/dirsrv-$foo file.
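
To preview what such template lines turn into for a given instance, the %i substitution can be imitated with sed. This is an illustration only: systemd performs the real substitution itself, the function name is made up, only %i is handled, and instance names containing / or & would confuse this sed expression.

```shell
# Hypothetical helper: replace %i in template lines read from stdin with
# the given instance name, for previewing only.
expand_instance() {
    inst=$1
    sed "s/%i/$inst/g"
}
```

For example, piping the CONFIG_DIR line above through expand_instance f15x8664 gives Environment=CONFIG_DIR=/etc/dirsrv/slapd-f15x8664.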

Comment 28 Rich Megginson 2011-04-25 17:13:26 UTC
(In reply to comment #27)
> To my knowledge I do belive this would not work but looking at the dirsrv-$foo
> file I think what you are looking for is Environment=
> 
> As in having in the template. 
> 
> Environment=CONFIG_DIR=/etc/dirsrv/slapd-%i
> Environment=SERVER_DIR=/usr/lib64/dirsrv/slapd-%i
> 
> and then I guess you can deprecate /etc/sysconfig/dirsrv-$foo file. . .

No.  I still want to give users the ability to add items to this file.  I think editing /etc/sysconfig/dirsrv-foo is easier than editing /lib/systemd/system/dirsrv@.service && systemctl --system daemon-reload

Which brings up another point - how do users increase the number of file descriptors available to the directory server process?  The usual way is to do

ulimit -n 8192

for example, in the /etc/sysconfig/dirsrv or dirsrv-foo.  How can I do this with systemd?

Comment 29 Jóhann B. Guðmundsson 2011-04-25 19:25:38 UTC
(In reply to comment #28)
> Which brings up another point - how do users increase the number of file
> descriptors available to the directory server process?  The usual way is to do
> 
> ulimit -n 8192
> 
> for example, in the /etc/sysconfig/dirsrv or dirsrv-foo.  How can I do this
> with systemd?

You have LimitNOFILE= which is ulimit -n 

So you would put in the service file

LimitNOFILE=8192

See systemd.exec for various Limit$foo= options which control various resource limits for executed processes.
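
For admins who already have ulimit -n lines in their sysconfig files, the equivalent service file setting can be derived mechanically. A minimal sketch, with a made-up function name; it only recognizes the simple form ulimit -n <number> on a line of its own.

```shell
# Hypothetical helper: print a LimitNOFILE= line for every plain
# "ulimit -n N" line found in the given sysconfig file.
ulimit_to_limitnofile() {
    sed -n 's/^[[:space:]]*ulimit[[:space:]]\{1,\}-n[[:space:]]\{1,\}\([0-9]\{1,\}\)[[:space:]]*$/LimitNOFILE=\1/p' "$1"
}
```

The printed line would then go into the [Service] section of the unit, and the ulimit line would be removed from the sysconfig file.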

Comment 30 Rich Megginson 2011-04-26 00:56:46 UTC
Created attachment 494776 [details]
0002-Bug-695741-Providing-native-systemd-file-for-upcomin.patch

Comment 31 Rich Megginson 2011-04-26 00:57:20 UTC
Created attachment 494777 [details]
0002-Bug-695736-Providing-native-systemd-file-for-upcomin.patch

Comment 32 Rich Megginson 2011-04-26 01:04:45 UTC
Created attachment 494779 [details]
389-ds-base.spec.patch

Comment 33 Jóhann B. Guðmundsson 2011-04-26 11:16:26 UTC
I should mention that you have to make sure that the path expansion in the service files works correctly.

systemd needs absolute paths to commands and files in its service files or it will refuse to start, something that Joe Orton, the Apache maintainer, found out and fixed here: https://pkgs.fedoraproject.org/gitweb/?p=httpd.git;a=commitdiff;h=df147d55d0e1710a308096f170d9c4980ff32191

( Just something to check after a package has been built: that the native systemd service file contains full paths )
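
That post-build check can be scripted. The sketch below uses a made-up function name and looks only at Exec*= lines; it allows the "-" and "@" prefixes in front of the path and therefore also catches build placeholders that were left unexpanded.

```shell
# Hypothetical helper: print Exec*= lines of a unit file whose command
# does not start with an absolute path (optionally prefixed by - or @).
# Exit status is non-zero when no offending line was found.
check_exec_paths() {
    grep -nE '^Exec[A-Za-z]*=' "$1" | grep -vE '^[0-9]+:Exec[A-Za-z]*=[-@]*/'
}
```

No output means every Exec line in the installed unit starts with a full path.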

I should mention for documentation purposes that /lib/systemd is only for packages and /etc/systemd is for admins to do any custom stuff; admins editing files in /lib/systemd is considered bad practice.

So when editing any of the dirsrv service files ( or any other service file ) they should first copy it into /etc/systemd/system

For dirsrv-admin.service

# cp /lib/systemd/system/dirsrv-admin.service /etc/systemd/system/
# vim /etc/systemd/system/dirsrv-admin.service
# systemctl daemon-reload 
# systemctl start dirsrv-admin.service

For dirsrv-snmp.service
 
# cp /lib/systemd/system/dirsrv-snmp.service /etc/systemd/system/
# vim /etc/systemd/system/dirsrv-snmp.service
# systemctl daemon-reload 
# systemctl (re)start dirsrv-snmp.service

And for the dirsrv@.service template: since it is a template, it requires a few more steps than simply copying the file to the /etc/systemd/system/ directory, editing it and reloading systemd.

# cp /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv\@.service
# mkdir -p /etc/systemd/system/dirsrv.target.wants
# vim /etc/systemd/system/dirsrv\@.service
# ln -s /etc/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv@$foo.service
# systemctl daemon-reload 
# systemctl (re)start dirsrv.target

Now, if the relevant setup scripts just work in the /etc/systemd/system directory as opposed to its /lib/systemd/system counterpart, then all documentation can simply refer admins to the /etc/systemd/system directory to edit dirsrv-admin.service, dirsrv-snmp.service and dirsrv\@.service, and to reload the systemd daemon along with (re)starting the service ( and the dirsrv.target for the dirsrv@.service ).
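
The copy-before-edit rule for the plain service files can be captured in a helper that never overwrites an admin's existing copy. This is a sketch: the function name and the two directory arguments are hypothetical, with defaults matching the paths used in this comment.

```shell
# Hypothetical helper: copy a packaged unit into the admin directory
# unless an admin copy already exists there.
# $1 = unit file name, $2 = package dir, $3 = admin dir
override_unit() {
    unit=$1
    lib=${2:-/lib/systemd/system}
    etc=${3:-/etc/systemd/system}
    if [ ! -f "$lib/$unit" ]; then
        echo "no packaged unit: $lib/$unit" >&2
        return 1
    fi
    if [ -e "$etc/$unit" ]; then
        echo "keeping existing $etc/$unit"
    else
        mkdir -p "$etc" && cp "$lib/$unit" "$etc/$unit" && echo "copied to $etc/$unit"
    fi
}
```

After editing the copy, systemctl daemon-reload and a (re)start of the service are still needed, as in the steps above.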

Comment 34 Jóhann B. Guðmundsson 2011-04-26 11:22:21 UTC
(In reply to comment #10)
> systemctl status dirsrv
> shows the Process: ExecStart with %i in the paths for the config dir and pid
> files.

Note that Lennart closed this bug as won't fix ( see bug 698761 for details )

Comment 35 Rich Megginson 2011-04-26 14:30:40 UTC
(In reply to comment #33)
> I should mention that you have to make sure that the path expansion in service
> files work correctly. 
> 
> Systemd needs absolute paths to commands and files in it's service files or it
> will refuse to start something that Joe Orton the Apache maintainer well found
> out and fixed here
> https://pkgs.fedoraproject.org/gitweb/?p=httpd.git;a=commitdiff;h=df147d55d0e1710a308096f170d9c4980ff32191
> 
> ( Just something to mention and check for after a package has been built as in
> the native systemd service file contain full paths ) 

Are you talking about things in the upstream patch like this:
ExecStart=@sbindir@/ns-slapd -D @instconfigdir@/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

If so, then yes - paths like @sbindir@ etc. are replaced during the build by absolute paths.  On a Fedora/RHEL system, this will be expanded to /usr/sbin/ns-slapd and so on.
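
The effect of that build-time replacement can be imitated with sed. This is only an illustration of the substitution, not the project's actual build rule; the function name is made up and the two values are passed in by the caller.

```shell
# Hypothetical helper: replace the @sbindir@ and @instconfigdir@
# placeholders in lines read from stdin.
# $1 = value for sbindir, $2 = value for instconfigdir
expand_build_paths() {
    sbindir=$1
    instconfigdir=$2
    sed -e "s|@sbindir@|$sbindir|g" -e "s|@instconfigdir@|$instconfigdir|g"
}
```

Feeding it the ExecStart line above with /usr/sbin and /etc/dirsrv gives a line starting with ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i, with the %i and $PIDDIR parts left untouched for systemd.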

Comment 36 Bill Nottingham 2011-04-26 17:35:52 UTC
Moving systemd service RFEs to rawhide.

At this point, it is not appropriate in the Fedora 15 cycle to add these. Furthermore, at this point, we are still finalizing the packaging guidelines to handle SysV -> systemd upgrades.

We therefore request:
- wait until there are packaging guidelines (this will be announced on the devel list). This ensures that upgrades will work smoothly and we/you won't have to do multiple sets of changes.
- work on these sorts of changes for Fedora 16 where necessary, not Fedora 15, as we're trying to fix things for release.
- do *not* change a service from SysV to systemd in an existing release (such as Fedora 15), as this is the sort of behavior change that goes against our update policy, documented as https://fedoraproject.org/wiki/Updates_Policy

Comment 37 Rich Megginson 2011-04-27 01:20:55 UTC
So for f15, I should just put

SYSTEMCTL_SKIP_REDIRECT=1 ; export SYSTEMCTL_SKIP_REDIRECT

at the beginning of the regular sysv init script and save the other changes
I've made for f16.  Works for me.

Comment 39 Jóhann B. Guðmundsson 2011-07-13 11:56:35 UTC
Time to start looking at this one again, since bug 698755 has been fixed in git and will therefore be solved with the next systemd build.

Comment 40 Rich Megginson 2011-07-13 14:41:45 UTC
(In reply to comment #39)
> Time to start looking at this one again since bug 698755 has been fixed in git
> hence it will be solved with next systemd build

We have this work scheduled for next month - is that too late?

Comment 41 Jóhann B. Guðmundsson 2011-07-13 14:59:43 UTC
The sooner this hits the street, the more you expose it to testing and the sooner you catch any potential Before= and After= ordering requirements etc., and can just fix what needs fixing via an update.

Strictly speaking, this needs to be resolved and packaged no later than 2011-09-06 ( Beta Change Deadline ) or it will miss F16 and get pushed to F17, which in turn could disrupt any additional sysv related cleanups/changes that might take place during the F17 cycle ( the aim here is to convert all sysv legacy scripts to native systemd units during this release cycle ).

Comment 42 Rich Megginson 2011-08-29 21:25:00 UTC
systemctl restart dirsrv
works
systemctl restart dirsrv.target
hangs - it shuts down the instance, but it does not restart it.  This is what I have so far:

/lib/systemd/system/dirsrv.target:
[Unit]
Description=389 Directory Server
After=syslog.target network.target

[Install]
WantedBy=multi-user.target

/lib/systemd/system/dirsrv@.service:
[Unit]
Description=389 Directory Server %i.
BindTo=dirsrv.target
After=dirsrv.target

[Service]
Type=forking
Environment=PIDDIR=/var/run/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-%i
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

ls /etc/systemd/system/dirsrv.target.wants
dirsrv -> /lib/systemd/system/dirsrv@.service

Comment 43 Rich Megginson 2011-08-29 21:28:16 UTC
If I do 
systemctl stop dirsrv.target
then
systemctl start dirsrv.target
everything works.  It is only the restart command that is the problem.
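
Until restart works, the stop-then-start sequence can be wrapped as a stopgap. A sketch with a made-up function name; the SYSTEMCTL variable is an assumption added so the sequence can be dry-run with echo instead of the real command.

```shell
# Hypothetical stopgap: emulate "restart" with an explicit stop followed
# by a start. SYSTEMCTL may be set to another command for a dry run.
restart_by_stop_start() {
    unit=$1
    ${SYSTEMCTL:-systemctl} stop "$unit" && ${SYSTEMCTL:-systemctl} start "$unit"
}
```

Called as restart_by_stop_start dirsrv.target, it only issues the start if the stop succeeded.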

Comment 44 Jóhann B. Guðmundsson 2011-08-29 21:56:02 UTC
Hum, I wonder if it's because it's a target and not a service.

What happens if you test it with a service as in 

[Unit]
Description=389 Directory Server
After=syslog.target network.target

[Service]
Type=oneshot
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target

Then create a dirsrv.service.wants directory, link into that and restart the service ( services can have a wants directory too ).

If I recall correctly, something like the above dummy service is what I had thought of to keep backwards compatibility with the service command. Of course you adjust the template like so...

[Unit]
Description=389 Directory Server %i.
BindTo=dirsrv.service
After=dirsrv.service

[Service]
Type=forking
Environment=PIDDIR=/var/run/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-%i
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

Then users could use service dirsrv start/stop/restart

But the reason could also be that we have specified a pidfile, as in PIDFile=/var/run/dirsrv/slapd-%i.pid, in the [Service] section of the template file.

Comment 45 Rich Megginson 2011-08-30 02:54:36 UTC
(In reply to comment #44)
> Hum wondering if it's because it's target not a service 
> 
> What happens if you test it with a service as in 
> 
> [Unit]
> Description=389 Directory Server
> After=syslog.target network.target
> 
> [Service]
> Type=oneshot
> ExecStart=/bin/true
> 
> [Install]
> WantedBy=multi-user.target

So instead of /lib/systemd/system/dirsrv.target I would have /lib/systemd/system/dirsrv.service containing the above?

> 
> Then create dirsrv.service.wants directory and link into that and restart the
> service ( services can have wants directory too ) 
> 
> If I can recall correctly something like the above dummy service I had though
> of to keep backwards compatibility with service command ofcourse you adjust the
> template as so...
> 
> [Unit]
> Description=389 Directory Server %i.
> BindTo=dirsrv.service
> After=dirsrv.service
> 
> [Service]
> Type=forking
> Environment=PIDDIR=/var/run/dirsrv
> EnvironmentFile=/etc/sysconfig/dirsrv
> EnvironmentFile=/etc/sysconfig/dirsrv-%i
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w
> $PIDDIR/slapd-%i.startpid

So the above would go into /lib/systemd/system/dirsrv@.service - just replace dirsrv.target with dirsrv.service

> 
> Then users could use service dirsrv start/stop/restart
> 
> But the reason could also be because we have specified an pidfile as in
> PIDFile=/var/run/dirsrv/slapd-%i.pid in the service section of the template
> file

No - PIDFile is not specified anywhere - PIDDIR is but that's not the same thing.

Comment 46 Rich Megginson 2011-08-30 03:02:29 UTC
Ok.  I made the above changes and did
systemctl daemon-reload

same behavior - systemctl stop/start dirsrv.service works fine - restart just hangs

service dirsrv stop/start work fine
service dirsrv restart just hangs

The service command won't work because there is no way to control individual instances (afaict):
service dirsrv stop localhost - error
service dirsrv@localhost stop - error
service dirsrv stop - error

Comment 47 Jóhann B. Guðmundsson 2011-08-30 10:33:04 UTC
(In reply to comment #46)
> Ok.  I made the above changes and did
> systemctl daemon-reload
> 
> same behavior - systemctl stop/start dirsrv.service works fine - restart just
> hangs
> 
> service dirsrv stop/start work fine
> service dirsrv restart just hangs

You can try adding to the [Service] section of the template

StandardOutput=syslog
StandardError=syslog

And check /var/log/messages to see if it captures why it's failing. If nothing is there, then I guess strace or enabling debugging output is next on the list.

> The service command won't work because there is no way to control individual
> instances (afaict):
> service dirsrv stop localhost - error
> service dirsrv@localhost stop - error
> service dirsrv stop - error

You will need to use systemctl for anything other than service dirsrv start/stop/restart; the methods you are using there are unsupported by the service command afaik.

Comment 48 Rich Megginson 2011-08-30 14:51:11 UTC
Created attachment 520619 [details]
strace -f -o systemctl.strace systemctl restart dirsrv.service

This is the output of
strace -f -o systemctl.strace systemctl restart dirsrv.service

Note that systemctl is hung at line 299:

2078  poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 2, -1) = 1 ([{fd=5, revents=POLLIN}])

I interrupted it with Ctrl-C after waiting several minutes.

Comment 49 Rich Megginson 2011-08-30 15:09:22 UTC
I tried setting in the dirsrv.service and dirsrv@.service files:
[Service]
LogLevel=debug

then 

systemctl daemon-reload

but systemctl complained that 'LogLevel' is not a valid lvalue in the [Service] section.

I tried editing /etc/systemd/system.conf and uncommenting the LogLevel directive and set it to debug then

systemctl daemon-reload

but there was no extra output in /var/log/messages or dmesg

Then I rebooted the system - there was a lot of extra information in /var/log/messages

Then I tried

systemctl start dirsrv.service
followed by 
systemctl restart dirsrv.service

there was no extra output - still hangs

What now?

Comment 50 Jóhann B. Guðmundsson 2011-08-30 15:43:45 UTC
Hum, you should add this to the [Service] section of the template file ( see man systemd.exec ):

SyslogLevel=debug 
StandardOutput=syslog
StandardError=syslog

Do you have selinux enabled by any chance?

Comment 51 Rich Megginson 2011-08-30 21:25:08 UTC
I've provided 389-ds-base systemd enabled builds for F16 at http://rmeggins.fedorapeople.org/rpms/

You'll need the perldap package for F16 - I've also built this in F16 updates-testing if you'd rather grab it from there

After installing the packages, run

setup-ds.pl

you can use localhost as the hostname - just ignore the warnings - it wants a FQDN but localhost should work fine for testing systemd

setup-ds.pl will create the symlink
/etc/systemd/system/dirsrv.target.wants/dirsrv -> /lib/systemd/system/dirsrv@.service

Comment 52 Jóhann B. Guðmundsson 2011-08-31 00:24:32 UTC
Mocking around with dirsrv here during the evening I have come across several issues first of all the /var/lock/dirsrv directory's are missing after reboot note that tmpfiles only cover /run or /var/run not /var/lock afaik, 

Secondly, after a restart the startpid file becomes owned by root.nobody, and only the main PID gets killed while ns-slapd just happily keeps on running.
(Note that this may be because dirsrv.service is not being run as nobody. I have not gotten the daemon running again, due to permission errors, to find out, and it's getting a bit late.)

I continuously hit ns-slapd refusing to start due to permission errors: [31/Aug/2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port 389 failed: Netscape Portable Runtime error -5966 (Access Denied.)

Why? /var/run and /var/lock/dirsrv and their subdirectories are all owned by nobody.

I've also hit an issue where slapd thinks some other server is running, which, when I dug deeper, also seems to be permission related:

"Shutting down due to possible conflicts with other slapd processes"

So what's actually happening there in the background with all those permission checks etc.?

If you can map that out for me, then it becomes a question of whether we can't just let systemd handle that, which should make it a bit less error prone and give you guys a chance to reduce the code a bit.

Comment 53 Rich Megginson 2011-08-31 01:20:08 UTC
(In reply to comment #52)
> Mocking around with dirsrv here during the evening I have come across several
> issues first of all the /var/lock/dirsrv directory's are missing after reboot
> note that tmpfiles only cover /run or /var/run not /var/lock afaik, 

Hmm - before we used tmpfiles.d for /var/lock/dirsrv, it used to fail upon reboot; with tmpfiles.d it works. I'm not sure what has changed, but if you can point me to the docs that say which tmpfiles.d entries we need to specify, I will be happy to amend the code.

> 
> Secondly after restart the startpid file become owned by root.nobody and only
> the mainpid gets killed while ns-slapd just happily keeps on running.
> ( Note that this can be due to the dirsrv.service not being run as nobody I
> just have not gotten the daemon running again due to permission erro's to find
> out and it's getting a bit late )

Hmm - the problem that I see is that this is not correct:
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

it should be
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid -w ${PIDDIR}/slapd-%i.startpid

Without the braces, the ps -ef|grep slapd output showed just -i and -w with no pid file arguments, and there were no pid files in /var/run/dirsrv. With the braces, the ps -ef output is correct and the files are created with the correct ownership.
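
For reference, the systemd.service man page describes the difference: ${FOO} may be used as part of a word and always yields exactly one argument, while $FOO is meant to stand as a separate word and is split at whitespace. A rough way to spot the problematic form is sketched below. The find_unbraced helper and the scratch file are made up for illustration, and the regular expression is only an approximation.

```shell
#!/bin/sh
# Hypothetical helper: print lines in which an unbraced $VAR is glued to
# other characters (e.g. "$PIDDIR/slapd-%i.pid").
find_unbraced() {
    # "$" + variable name, immediately followed by a non-space,
    # non-identifier character. "${VAR}" does not match.
    grep -nE '\$[A-Za-z_][A-Za-z0-9_]*[^[:space:]A-Za-z0-9_]' "$1"
}

# Sample lines only (not a valid unit): the two ExecStart variants
# quoted in this comment.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid -w ${PIDDIR}/slapd-%i.startpid
EOF

# Only the first (unbraced) line should be reported.
find_unbraced "$sample"
```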

> 
> I continuously hit ns-slap refusing to start due to permission errors [31/Aug
> /2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port
> 389 failed: Netscape Portable Runtime error -5966 (Access Denied.) 

Are you starting it using
systemctl start dirsrv
while running as root?  Is there another directory server running?  Is something else listening on port 389?

> 
> ? the whole /var/run and /var/lock/dirsrv and subdirectorys are owned by nobody 

The default user is "nobody" - /var/run/dirsrv and /var/lock/dirsrv are owned by nobody.

> 
> I've also hit issue where slap thinks some other server is running which seem
> to when dug deeper also be permission related? 
> 
> "Shutting down due to possible conflicts with other slapd processes"
> 
> So what's actually happening there in the background with all those permissions
> checks etc. 

"all those permissions checks" is just "can I bind to port 389?" - this will either fail due to 
1) cannot bind to port 389 if not started as root (note the server will drop permissions soon after binding as root)
2) cannot bind to port 389 if another process is already bound to port 389
3) selinux prevents binding to port 389 (although this is allowed by the ldap/dirsrv policy in the base os selinux policy, so this should not happen for port 389)
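
The three cases above can be checked by hand with something like the following. This is only a triage sketch: needs_root is a hypothetical helper, it assumes the usual rule that ports below 1024 are privileged (file capabilities could change that), and it assumes ss and getenforce are installed.

```shell
#!/bin/sh
# Hypothetical helper: true if the port is in the privileged range.
needs_root() {
    [ "$1" -lt 1024 ]
}

port=389

# 1) Privileged port, but not running as root?
if needs_root "$port" && [ "$(id -u)" -ne 0 ]; then
    echo "port $port is privileged and this shell is not root"
fi

# 2) Is something already listening on the port?
if command -v ss >/dev/null 2>&1; then
    ss -ltn | awk -v p=":$port" '$4 ~ p"$" { print "already listening: " $4 }'
fi

# 3) Is SELinux enforcing?
if command -v getenforce >/dev/null 2>&1; then
    echo "SELinux mode: $(getenforce)"
fi
```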

> 
> If you can map that out for me than it becomes a question if we cant just let
> systemd handle that which should make it a bit less error prone and give you
> guys a chance to reduce the code a bit?

"reduce code a bit" == "write and debug a lot of code on several different platforms, some of which support systemd and some of which do not, in the short period of time we have to get a version of dirsrv in F16 that fully supports systemd"

I'd rather spend my time in the short term helping you figure out why
systemctl restart dirsrv.target
is not working.

Comment 54 Rich Megginson 2011-08-31 01:24:17 UTC
To put it another way - I'd rather get dirsrv working as is, since really the only thing preventing us from providing systemd support is this restart issue.  I'm sure we can work out the permissions/ownership/conflict problems without having to resort to a rewrite of the socket code in dirsrv.

Comment 55 Jóhann B. Guðmundsson 2011-08-31 08:48:00 UTC
(In reply to comment #53)
> (In reply to comment #52)
> > Mocking around with dirsrv here during the evening I have come across several
> > issues first of all the /var/lock/dirsrv directory's are missing after reboot
> > note that tmpfiles only cover /run or /var/run not /var/lock afaik, 
> 
> Hmm - well before we used tmpfiles.d for /var/lock/dirsrv it used to fail upon
> reboot - with it it works - not sure what has changed - but if you can point me
> in the direction of the docs that say which tmpfiles.d we need to specify I
> will be happy to amend the code.

man tmpfiles.d only mentions /run 

> > 
> > Secondly after restart the startpid file become owned by root.nobody and only
> > the mainpid gets killed while ns-slapd just happily keeps on running.
> > ( Note that this can be due to the dirsrv.service not being run as nobody I
> > just have not gotten the daemon running again due to permission erro's to find
> > out and it's getting a bit late )
> 
> Hmm - the problem that I see is that this is not correct:
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w
> $PIDDIR/slapd-%i.startpid
> 
> it should be
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid
> -w ${PIDDIR}/slapd-%i.startpid
> 
> Without the braces, the ps -ef|grep slapd output would show just -i and -w and
> no pid files in /var/run/dirsrv - adding the braces shows the correct ps -ef
> output and the correct files with the correct ownerships

Yes, I forgot to mention that I had noticed that and used the full path instead of the PIDDIR variable.

> 
> > 
> > I continuously hit ns-slap refusing to start due to permission errors [31/Aug
> > /2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port
> > 389 failed: Netscape Portable Runtime error -5966 (Access Denied.) 
> 
> you are starting it using
> systemctl start dirsrv
> running as root?  Is there another directory server running?  Something else
> listening to port 389?

Nope, the port is free. However, I had added User=nobody and Group=nobody to the template unit because I hit some other permission error. If I recall correctly, the debug output said something about the lock file being owned by pid 0 or uid 0.

So, to the [Service] section of the dirsrv@.service template, I added:

User=nobody
Group=nobody 

I gave full paths and turned on debugging (-d 1):

ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid

I increased the timeout so systemd would not kill the service during startup, since starting in debug mode takes longer than the default timeout:

TimeoutSec=5m

And finally, to catch anything that might be spewed to the console, I added:

StandardError=syslog

(We should create a "how to debug 389 Directory Server" wiki page with the above info.)
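
Pulled together, the debugging changes described above amount to roughly the following [Service] fragment. The sketch only writes it to a scratch file for review; merging it into the template and the rest of the unit file are left out, and nothing here is meant as a recommended production setup.

```shell
#!/bin/sh
# Collect the debugging settings from this comment into one fragment,
# written to a scratch file rather than the live unit.
frag=$(mktemp)
trap 'rm -f "$frag"' EXIT

# Quoted 'EOF' keeps %i literal so systemd can substitute the instance name.
cat > "$frag" <<'EOF'
[Service]
User=nobody
Group=nobody
ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid
TimeoutSec=5m
StandardError=syslog
EOF

cat "$frag"

# After merging these lines into the template, systemd has to re-read it:
#   systemctl daemon-reload
```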

> > ? the whole /var/run and /var/lock/dirsrv and subdirectorys are owned by nobody 
> 
> The default user is "nobody" - /var/run/dirsrv and /var/lock/dirsrv are owned
> by nobody.
> 
> > 
> > I've also hit issue where slap thinks some other server is running which seem
> > to when dug deeper also be permission related? 
> > 
> > "Shutting down due to possible conflicts with other slapd processes"
> > 
> > So what's actually happening there in the background with all those permissions
> > checks etc. 
> 
> "all those permissions checks" is just "can I bind to port 389?" - this will
> either fail due to 
> 1) cannot bind to port 389 if not started as root (note the server will drop
> permissions soon after binding as root)
> 2) cannot bind to port 389 if another process is already bound to port 389
> 3) selinux prevents binding to port 389 (although this is allowed by the
> ldap/dirsrv policy in the base os selinux policy, so this should not happen for
> port 389)
> 
> > 
> > If you can map that out for me than it becomes a question if we cant just let
> > systemd handle that which should make it a bit less error prone and give you
> > guys a chance to reduce the code a bit?
> 
> "reduce code a bit" == "write and debug a lot of code on several different
> platforms, some of which support systemd and some of which do not, in the short
> period of time we have to get a version of dirsrv in F16 that fully supports
> systemd"
> 
> I'd rather spend my time in the short term helping you figure out why
> systemctl restart dirsrv.target
> is not working.

Understood

I was just pointing out that we could probably take care of those checks via Condition$foo and ExecStartPre/ExecStartPost/ExecStopPost, to ensure things are correct, directories exist, etc.

Anyway, my next test was to add the following to the dummy dirsrv.service:

Before=dirsrv

And 

User=nobody
Group=nobody 

This was to see whether the startpid file kept the correct ownership (nobody.nobody as opposed to root.nobody). But I kept hitting the directory server either refusing to start because of permission errors, or thinking another instance was running, which was not the case. Anyway, I ran out of time (it was getting close to 01:00) and will continue to poke at this when I get home from work. In the meantime, I will see if I can get some clarification from Lennart on what the expected behaviour is when restarting a unit which has other units bound/required to it.

Comment 56 Rich Megginson 2011-08-31 14:31:01 UTC
(In reply to comment #55)
> Nope the port is free however I had added User=nobody and Group=nobody to the
> template unit because I hit some other permission error if I can recall
> correctly the debug output said something about the lock file being owned by
> pid 0 or uid 0

Ok.  I'm just not sure what's going on here.  Running setup-ds.pl should create /var/lock/dirsrv/slapd-name with 0770 and nobody:nobody.  It also creates /etc/tmpfiles.d/dirsrv-name.conf with
/var/run/dirsrv
/var/lock/dirsrv
/var/lock/dirsrv/slapd-name
all with 0770 nobody:nobody
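
For reference, a tmpfiles.d line has the form "Type Path Mode UID GID Age", so a file covering those three directories might look roughly like the one generated below. This is a sketch written to a scratch directory, not necessarily what setup-ds.pl writes, and slapd-name is the placeholder instance name used above.

```shell
#!/bin/sh
# Write an illustrative tmpfiles.d file into a scratch directory
# (the real one would live in /etc/tmpfiles.d/).
scratch=$(mktemp -d)
trap 'rm -rf "$scratch"' EXIT
conf="$scratch/dirsrv-name.conf"

# "d" = create the directory if it is missing; "-" = no age-based cleanup.
for dir in /var/run/dirsrv /var/lock/dirsrv /var/lock/dirsrv/slapd-name; do
    printf 'd %s 0770 nobody nobody -\n' "$dir"
done > "$conf"

cat "$conf"
```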

So try this - start with a clean system
remove-ds.pl -i slapd-name
then
yum erase 389-ds-base-libs
then
rm -rf /etc/dirsrv /etc/sysconfig/dirsrv* /etc/tmpfiles.d/dirsrv* /var/*/dirsrv /usr/*/dirsrv

Then install the 389-ds-base package
Check the existence, ownership, and permissions of /var/lock/dirsrv
Then run setup-ds.pl
Then check /var/lock/dirsrv again
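
One way to do the two checks, as a small sketch. check_dir is a hypothetical helper, and it relies on the -c format option of GNU stat.

```shell
#!/bin/sh
# Print "owner:group mode path" for a directory, or note that it is missing.
check_dir() {
    if [ -d "$1" ]; then
        stat -c '%U:%G %a %n' "$1"
    else
        echo "missing $1"
    fi
}

check_dir /var/lock/dirsrv
check_dir /var/lock/dirsrv/slapd-name
```

Running it once after installing the package and again after setup-ds.pl shows whether the directories appear with 0770 and nobody:nobody.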

> 
> So to the dirsrv@.template I added to the [Service] section
> 
> User=nobody
> Group=nobody 

I can't add this to the template because each instance may run as a different user.

> 
> Gave full paths and turned on debugging ( -d 1 )
> 
> ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i
> /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid

Note that when you turn on debugging:
Using -d with any value will tell ns-slapd to not daemonize - it will remain attached to the controlling process.
Using -d 1 will cause ns-slapd to take forever to start up.

> 
> Increased the timeout so systemd would not kill the service in startup since we
> are starting it in debug and that start up time exceeds the default thus
> systemd will kill the service 
> 
> TimeoutSec=5m
> 
> And finally added to catch anything that might be spewed to the console
> 
> StandardError=syslog
> 
> ( We should create a how to debug 389 directory server wiki page with the above
> info  ) 

Ok.

> 
> Anyway my next test was to add to the dummy dirsrv.service  
> 
> Before=dirsrv
> 
> And 
> 
> User=nobody
> Group=nobody 
> 
> To see if the startpid keept the correct ownership ( nobody.nobody as oppose to
> root.nobody ) when I kept hitting the directory server either refusing to start
> because of permission errors or it thought another instance was running which
> was not the case anyway I ran out of time ( Time was getting close to 01:00 )
> and will continue to poke this when I get home from work but in the meantime I
> see if I cant get some clarification from Lennart on what's the expected
> behaviour when restarting a unit which has another units bound/required to it.

Comment 57 Rich Megginson 2011-09-01 23:37:13 UTC
Created attachment 521113 [details]
0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch

"parameterized" the "group" name dirsrv.target and rebased to the latest code in master

Comment 58 Rich Megginson 2011-09-01 23:38:15 UTC
Created attachment 521114 [details]
spec file changes

"parameterized" the "group" name dirsrv.target - rebased on top of the latest code

Comment 59 Rich Megginson 2011-09-02 02:00:49 UTC
Created attachment 521127 [details]
0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch

You now have to explicitly enable systemd support: configure will not enable systemd even if pkg-config says systemd is supported. You must pass in --with-systemdsystemunitdir and --with-systemdsystemconfdir.
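
A sketch of how the two options might be filled in. It asks pkg-config for the directories that systemd.pc advertises and falls back to the paths mentioned in comment 51. The fallback values are assumptions, and the script only prints the configure command rather than running it.

```shell
#!/bin/sh
# Ask pkg-config for systemd's directories; fall back to assumed defaults
# if pkg-config or systemd.pc is not available.
unitdir=$(pkg-config --variable=systemdsystemunitdir systemd 2>/dev/null)
confdir=$(pkg-config --variable=systemdsystemconfdir systemd 2>/dev/null)
unitdir=${unitdir:-/lib/systemd/system}
confdir=${confdir:-/etc/systemd/system}

# Only print the command; run it from the source tree when ready.
echo ./configure \
    "--with-systemdsystemunitdir=$unitdir" \
    "--with-systemdsystemconfdir=$confdir"
```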

Comment 60 Rich Megginson 2011-09-02 21:26:11 UTC
To ssh://git.fedorahosted.org/git/389/ds.git
   b5f77c6..144c607  master -> master
commit 144c607fa22e058a9ab3d343d0706432e94d5a63
Author: Rich Megginson <rmeggins>
Date:   Thu Apr 21 15:49:13 2011 -0600
    Reviewed by: nhosoi, nkinder (Thanks!)
    Branch: master
    Fix Description: Since we support multiple instances of directory server,
    create a dirsrv.target, and have the instances "want" that target.  There
    is a service template file dirsrv@.service that supports replaceable
    parameters which are instance specific.  When a new instance is created,
    we create a symlink called dirsrv@$instance.service which links to the
    template file.  systemd fills in the %i with the correct instance name.
    The service command will not work.  You have to use the systemctl command:
    systemctl stop dirsrv - single instance
    systemctl stop dirsrv.target - all instances
    There are still some outstanding issues with systemd:
    * systemctl restart dirsrv.target - will hang after shutting down the
    instances
    When using systemd, have to use the systemctl start command in startServer
    or other systemd commands like status, restart, stop will not work
    Note: the "group" name dirsrv.target is flexible - just change the
    --with-systemdgroupname=NAME when running configure
    Platforms tested: Fedora 16 x86_64
    Flag Day: yes
    Doc impact: yes
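
The per-instance symlink described in the commit message can be sketched as follows. The example builds everything in a scratch tree; the dirsrv.target.wants location is taken from comment 51, and the instance name "example" is hypothetical.

```shell
#!/bin/sh
# Build a scratch copy of the layout; the real directories would be
# /lib/systemd/system and /etc/systemd/system.
root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
unitdir="$root/lib/systemd/system"
confdir="$root/etc/systemd/system"
instance=example   # hypothetical instance name

mkdir -p "$unitdir" "$confdir/dirsrv.target.wants"
: > "$unitdir/dirsrv@.service"   # empty stand-in for the template file

# The link name carries the instance name; systemd substitutes it for %i.
ln -s "$unitdir/dirsrv@.service" \
      "$confdir/dirsrv.target.wants/dirsrv@$instance.service"

ls -l "$confdir/dirsrv.target.wants"
```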

Comment 61 Fedora Update System 2011-09-21 18:57:10 UTC
389-ds-base-1.2.10-0.1.a1.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.1.a1.fc16

Comment 62 Fedora Update System 2011-09-24 20:51:01 UTC
Package 389-ds-base-1.2.10-0.1.a1.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing 389-ds-base-1.2.10-0.1.a1.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.1.a1.fc16
then log in and leave karma (feedback).

Comment 63 Fedora Update System 2011-10-05 21:55:17 UTC
389-ds-base-1.2.10-0.2.a2.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.2.a2.fc16

Comment 64 Fedora Update System 2011-10-08 03:24:30 UTC
389-ds-base-1.2.10-0.4.a4.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.4.a4.fc16

Comment 65 Fedora Update System 2011-10-19 15:25:12 UTC
389-ds-base-1.2.10-0.4.a4.fc16,sssd-1.6.2-2.fc16,freeipa-2.1.3-4.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.4.a4.fc16,sssd-1.6.2-2.fc16,freeipa-2.1.3-4.fc16

Comment 66 Fedora Update System 2011-10-25 03:38:34 UTC
389-ds-base-1.2.10-0.4.a4.fc16, freeipa-2.1.3-4.fc16, selinux-policy-3.10.0-46.fc16, sssd-1.6.2-4.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.