Created attachment 491503 [details]
Native systemd service file for dirsrv snmp daemon

Description of problem:

The attached file is a native systemd service file for the upcoming F15 feature [1].

Please read [2] on how to package and install systemd service files. To learn more about systemd daemons, see [3]. To compare the old SysV scripts with the new systemd units side by side for your component, see [4].

If you have any questions, don't hesitate to ask them on this bug report.

1. http://fedoraproject.org/wiki/Features/systemd
2. https://fedoraproject.org/wiki/Systemd_Packaging_Draft
3. http://0pointer.de/public/systemd-man/daemon.html
4. https://fedoraproject.org/wiki/User:Johannbg/QA/Systemd/compatability
It will be some time before I submit the dirsrv service file, since I need to figure out the best way to deal with that mess.
For the snmp file:
Do I call it dirsrv-snmp.service?
Does it go in /lib/systemd/system? E.g. in the spec file should I have
%files
...
/lib/systemd/system/dirsrv-snmp.service
?
Should I use some sort of macro instead of /lib/systemd/system?
You call it what it used to be called, with a .service ending, so for the 389 services it would be

dirsrv.service
dirsrv-admin.service
dirsrv-snmp.service

See
https://fedoraproject.org/wiki/TomCallaway/Systemd_Revised_Draft
and
https://fedoraproject.org/wiki/User:Toshio/Systemd_scriptlet_options

Note that the old sysv initscript should be removed and subpackaged.

You can take a look at how others have packaged systemd services, for example sssd:
http://pkgs.fedoraproject.org/gitweb/?p=sssd.git;a=blob_plain;f=sssd.spec;hb=d895a5f72c49210793ec02ffc768106178521c3e
For dirsrv - we could have instance creation (the setup-ds-admin.pl or setup-ds.pl scripts) create the systemd .service file for that instance, e.g. when you create slapd-instname it creates /lib/systemd/system/dirsrv-instname.service. I think we can live with using
service dirsrv-instname start
rather than
service dirsrv start instname

But what about
service dirsrv start
? Is there any way to have a systemd service operate on a group of related services?
Let's use that option as a last resort; I think we can handle this in a nicer way.

Let's start by creating a clean template that runs a single service with a single instance. Basically what I need to know is what's needed to start a service, something like:

[Unit]
Description=389 Directory Server.
After=syslog.target network.target

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/dirsrv-localhost
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-localhost -i /var/run/dirsrv/slapd-localhost.pid -w /var/run/dirsrv/slapd-localhost.startpid

[Install]
WantedBy=multi-user.target

Once we have a clean working template we can see if we can't use that template in a smarter way.
That's it. That will start the single instance "localhost"
Created attachment 493618 [details]
dirsrv template v0.1

This needs to be tested with a working fds setup on F15.

Copy the attached file into /lib/systemd/system, then link it into the relevant location. For multiple slapd instances that would be

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/multi-user.target.wants/dirsrv

and

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/multi-user.target.wants/dirsrv

Then run

systemctl daemon-reload

and

systemctl start dirsrv
systemctl start dirsrv

to see if the template works correctly.
Created attachment 493665 [details]
info about failures

systemctl start dirsrv
is not working. The error messages don't give much to go on.
Apr 20 17:29:20 f15x8664 systemd[1]: Failed to load environment files: No such file or directory
Apr 20 17:29:20 f15x8664 systemd[1]: dirsrv failed to run 'start' task: No such file or directory

This tells us that EnvironmentFile=/etc/sysconfig/dirsrv-%i is not working, which is either

A) because that file is missing, or
B) because of the %i in EnvironmentFile=/etc/sysconfig/dirsrv-%i (which I will need to ping Lennart about).

Could you set EnvironmentFile=/etc/sysconfig/dirsrv (or create the file if it's missing) and see if the service then starts normally?
The file was there, so tried B). Changed it to say EnvironmentFile=/etc/sysconfig/dirsrv-f15x8664 and then systemctl start dirsrv is working. However, systemctl status dirsrv shows the Process: ExecStart with %i in the paths for the config dir and pid files.
OK, I'll ping Lennart about that as well and file bugs for both of these.

I'm going to create dirsrv.target and modify the dirsrv@.service template to bind it to that target, so that users can run systemctl start/stop dirsrv.target to start/stop all the dirsrv@$foo.services (the /sbin/service command does not support targets, only services, so users have to use the systemctl command).

The setup-ds-admin.pl or setup-ds.pl scripts should then link dirsrv@.service into the dirsrv.target.wants directory when setting up the ds.

Using the setup you mentioned here, the setup script would have performed

ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv

and reloaded the systemd daemon (systemctl daemon-reload).

Are there any checks that should be performed before the service is started, or on restart/reload?

I also noticed that the pid files were owned by nobody.

We can add to the dirsrv template

User=dirsrv
Group=dirsrv

to have them owned by the dirsrv user (that is, if it exists).
(In reply to comment #11)
> Ok I'll ping Lennart about that as well and file bugs for both of these.
>
> I'm going to create dirsrv.target and make modifications to dirsrv@.service
> template to bind it to that target so that user can run systemctl start/stop
> dirsrv.target to start all the dirsrv@$foo.services ( the /sbin/service command
> does not support starting targets only services so users have to use systemctl
> command )

So this means users cannot use service any more? That's going to cause headaches for QE, and for packages that depend on 389 like freeipa and dogtag.

> Now the the setup-ds-admin.pl or setup-ds.pl scripts should then link the
> dirsrv@.service into the dirsrv.target.wants directory when setting up the ds.
>
> Using your setup mentioned here then the setup script would have performed..
>
> ln -s /lib/systemd/system/dirsrv\@.service
> /etc/systemd/system/dirsrv.target.wants/dirsrv
>
> and reloaded the systemd daemon ( systemd daemon-reload )

Ok. Can I try this with the existing dirsrv@.service from https://bugzilla.redhat.com/attachment.cgi?id=493618 ?

> Are there any check that should be performed before the service is started or
> on restart/reload?

Not right now.

> I also noticed that the pid files where owned by nobody

In fact they can be owned by nobody, dirsrv, pkiuser, etc. We are different from (apparently every) other service in that at setup time we allow you to choose the userid for the daemon - this userid then owns the config files, pid files, log files, etc.

>
> We can add to the dirsrv template
>
> User=dirsrv
> Group=dirsrv
>
> To have it owned by dirsrv user ( that if it exists )

Can't add that to the dirsrv template since each instance may be owned by a different userid. Is this necessary?
Created attachment 493913 [details]
dirsrv target v0.1

To test, copy the file to /lib/systemd/system, then:

mkdir /etc/systemd/system/dirsrv.target.wants
ln -s /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv
systemctl daemon-reload
systemctl start dirsrv.target ( should start the dirsrv )
systemctl stop dirsrv.target ( should stop the dirsrv )
Created attachment 493915 [details] dirsrv template v0.2 The template that should go with the dirsrv.target
do I need to remove the links I set up in https://bugzilla.redhat.com/show_bug.cgi?id=695736#c7 ?
(In reply to comment #12)
> (In reply to comment #11)
> So this means users cannot use service any more? That's going to cause
> headaches for QE, and packages that depend on 389 like freeipa and dogtag.

It won't work for targets, and probably not for @service units either (they are systemd specific), but it will work for regular services like dirsrv-admin.service and dirsrv-snmp.service, so

service dirsrv-admin start

will start dirsrv-admin.service.

Any package should be updated to reflect the new init system anyway and use its commands, and users will get accustomed to using the systemctl command instead of the service and chkconfig ones.

> Ok. Can I try this with the existing dirsrv@.service from
> https://bugzilla.redhat.com/attachment.cgi?id=493618
> ?

Nope, you will have to use the dirsrv@.service v0.2 one.

> Can't add that to the dirsrv template since each instance may be owned by a
> different userid. Is this necessary?

Nope, not at all.
(In reply to comment #15)
> do I need to remove the links I set up in
> https://bugzilla.redhat.com/show_bug.cgi?id=695736#c7 ?

Yeah, you should remove them, since we are going to be using dirsrv.target.wants from now on and they might cause a conflict at boot for you.

dirsrv.target will be started when multi-user.target starts and, if it is working correctly, all the dirsrv@.services along with it. You would therefore end up starting the same service twice, since it is linked in both multi-user.target.wants and dirsrv.target.wants.
Ok. This is working:
systemctl op dirsrv.target works on all instances
systemctl op dirsrv works on that service

For the spec file - is there a %{_systemdetcsystemdsystem} macro or is it ok to hardcode it as %{_sysconfdir}/systemd/system ?
Is there a bug for getting the EnvironmentFile=/etc/sysconfig/dirsrv-%i working correctly? Is it possible to have more than one EnvironmentFile? When a server starts up, it needs to read from the file that applies to all instances /etc/sysconfig/dirsrv then from the file that applies to the specific instance /etc/sysconfig/dirsrv-%i ?
(In reply to comment #19)
> Is there a bug for getting the
> EnvironmentFile=/etc/sysconfig/dirsrv-%i
> working correctly?

Yup, I pinged Lennart on irc, and just in case he's like me (forgets everything if it isn't in writing and he isn't regularly nagged about it) I filed bug #698755.

>
> Is it possible to have more than one EnvironmentFile? When a server starts up,
> it needs to read from the file that applies to all instances
> /etc/sysconfig/dirsrv
> then from the file that applies to the specific instance
> /etc/sysconfig/dirsrv-%i
> ?

Hum... Try defining it twice, as in for the f15x8664 service it would be

EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-f15x8664

That might work.
Created attachment 494022 [details]
/proc/PID/environ without using systemd

It looks like an environment table: VAR=VALUE and nothing else.
It reads both /etc/sysconfig/dirsrv (STARTPID_TIME) and /etc/sysconfig/dirsrv-f15x8664 (DS_CONFIG_DIR).
Created attachment 494023 [details]
/proc/PID/environ using systemd

It reads in both /etc/sysconfig/dirsrv and /etc/sysconfig/dirsrv-f15x8664 - but it doesn't look like it processed them correctly.

afaik, an /etc/sysconfig file does not have to be strictly VAR=VALUE - it can use any valid bourne shell syntax, e.g. the source (". filename") command is used to read in the file. I'm not sure what systemd is doing - it looks like it reads in the files verbatim and then just does string replacement of any $VAR name it finds. IMHO, it should be processing the files as bourne shell source (". filename") does.
Perhaps not in the old system, but I'm pretty sure an /etc/sysconfig file does have to be strictly VAR=VALUE with systemd, since you specifically have to invoke bash if you want bash (and related behavior) with systemd, as in $foo=/bin/bash $bar. Afaik invoking bash is only supported in Exec$foo=, and Lennart does not like people doing that, since it's a clear indication that something is seriously broken and that the daemon and/or the relevant code should be fixed. I'm pretty sure you will have a hard time convincing him to change this.

Could you attach /etc/sysconfig/dirsrv and /etc/sysconfig/dirsrv-f15x8664 so I can see what they look like and what they're actually doing?
Created attachment 494027 [details] /etc/sysconfig/dirsrv 389 supports multiple platforms, many of which support something like environment files. On all of these platforms, the environment files are read into the process by the use of the bourne shell source (".") command, which allows any valid bourne shell syntax. We try to keep platform dependencies to a minimum to maximize portability. So I would rather not have to have multiple sysconfig files unless absolutely necessary. I'm also surprised (or maybe not, given how much trouble having multiple instances has been) that no one else has run into this - surely 389 is not the only project that has bourne shell code (and not just simple var=value) in the sysconfig files?
Created attachment 494028 [details] /etc/sysconfig/dirsrv-f15x8664
Ok - found the docs which explain EnvironmentFile - http://0pointer.de/public/systemd-man/systemd.exec.html - it wants VAR=VALUE and nothing else.

Can I use VAR as a value for other VAR settings? For example, is this supported?

INSTANCENAME=f15x8664
CONFIG_DIR=/etc/dirsrv/slapd-$INSTANCENAME
SERVER_DIR=/usr/lib64/dirsrv/slapd-$INSTANCENAME
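The difference behind this question can be illustrated without systemd at all. The following is a minimal sketch, assuming a scratch file in a temporary directory; the literal KEY=VALUE loop is only a stand-in for a reader that performs no shell processing, and is not systemd's actual parser.

```shell
#!/bin/sh
# Sketch: compare shell sourcing with a literal KEY=VALUE read of the same file.
# The scratch file is hypothetical; its contents mirror the example in the comment above.
tmpdir=$(mktemp -d)
envfile="$tmpdir/dirsrv-example"

cat > "$envfile" <<'EOF'
INSTANCENAME=f15x8664
CONFIG_DIR=/etc/dirsrv/slapd-$INSTANCENAME
EOF

# 1) Bourne shell sourcing: $INSTANCENAME is expanded while the file is read.
sourced=$(. "$envfile"; printf '%s' "$CONFIG_DIR")

# 2) Literal read: split each line at the first '=' and keep the value verbatim.
literal=$(
    while IFS='=' read -r key value; do
        [ "$key" = CONFIG_DIR ] && printf '%s' "$value"
    done < "$envfile"
)

echo "sourced: $sourced"
echo "literal: $literal"
rm -rf "$tmpdir"
```

The sourced value comes out with the instance name filled in, while the literal value still contains the text $INSTANCENAME, which is the situation a strict VAR=VALUE reader leaves you in.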
To my knowledge this would not work, but looking at the dirsrv-$foo file I think what you are looking for is Environment=, as in having in the template

Environment=CONFIG_DIR=/etc/dirsrv/slapd-%i
Environment=SERVER_DIR=/usr/lib64/dirsrv/slapd-%i

and then I guess you can deprecate the /etc/sysconfig/dirsrv-$foo file...
(In reply to comment #27)
> To my knowledge I do belive this would not work but looking at the dirsrv-$foo
> file I think what you are looking for is Environment=
>
> As in having in the template.
>
> Environment=CONFIG_DIR=/etc/dirsrv/slapd-%i
> Environment=SERVER_DIR=/usr/lib64/dirsrv/slapd-%i
>
> and then I guess you can deprecate /etc/sysconfig/dirsrv-$foo file. . .

No. I still want to give users the ability to add items to this file. I think editing /etc/sysconfig/dirsrv-foo is easier than editing /lib/systemd/system/dirsrv@.service && systemctl --system daemon-reload

Which brings up another point - how do users increase the number of file descriptors available to the directory server process? The usual way is to do

ulimit -n 8192

for example, in /etc/sysconfig/dirsrv or dirsrv-foo. How can I do this with systemd?
(In reply to comment #28)
> Which brings up another point - how do users increase the number of file
> descriptors available to the directory server process? The usual way is to do
>
> ulimit -n 8192
>
> for example, in the /etc/sysconfig/dirsrv or dirsrv-foo. How can I do this
> with systemd?

You have LimitNOFILE=, which is ulimit -n, so you would put in the service file

LimitNOFILE=8192

See systemd.exec for the various Limit$foo= options, which control resource limits for executed processes.
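For instances whose sysconfig files already carry a ulimit -n line, the value has to be carried over to the unit file by hand. A minimal sketch of that translation, assuming a hypothetical helper name (nofile_directive) and a sysconfig file that uses the plain "ulimit -n N" form; it only prints the directive and does not touch any unit file.

```shell
#!/bin/sh
# Sketch: print the LimitNOFILE= line matching a "ulimit -n N" line in a sysconfig file.
# nofile_directive is a hypothetical helper, not part of 389 or systemd.
nofile_directive() {
    # $1: path to a sysconfig-style file; only the "ulimit -n <number>" form is recognised.
    sed -n 's/^[[:space:]]*ulimit[[:space:]]\{1,\}-n[[:space:]]\{1,\}\([0-9]\{1,\}\).*$/LimitNOFILE=\1/p' "$1"
}

# Scratch input standing in for /etc/sysconfig/dirsrv.
tmpfile=$(mktemp)
printf '%s\n' '# raise the fd limit' 'ulimit -n 8192' > "$tmpfile"

directive=$(nofile_directive "$tmpfile")
echo "$directive"    # prints LimitNOFILE=8192
rm -f "$tmpfile"
```

The printed line would then go into the [Service] section of the unit, as described in the reply above.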
Created attachment 494776 [details] 0002-Bug-695741-Providing-native-systemd-file-for-upcomin.patch
Created attachment 494777 [details] 0002-Bug-695736-Providing-native-systemd-file-for-upcomin.patch
Created attachment 494779 [details] 389-ds-base.spec.patch
I should mention that you have to make sure that the path expansion in service files works correctly.

Systemd needs absolute paths to commands and files in its service files or it will refuse to start, something that Joe Orton, the Apache maintainer, found out and fixed here:
https://pkgs.fedoraproject.org/gitweb/?p=httpd.git;a=commitdiff;h=df147d55d0e1710a308096f170d9c4980ff32191

(Just something to mention and check for after a package has been built, i.e. that the native systemd service file contains full paths.)

I should also mention for documentation purposes that /lib/systemd is only for packages and /etc/systemd is for admins to do any custom stuff; admins editing files in /lib/systemd is considered bad practice.

So when editing any of the dirsrv service files (or any other service file) they should first copy it into /etc/systemd/system.

For dirsrv-admin.service:

# cp /lib/systemd/system/dirsrv-admin.service /etc/systemd/system/
# vim /etc/systemd/system/dirsrv-admin.service
# systemctl daemon-reload
# systemctl start dirsrv-admin.service

For dirsrv-snmp.service:

# cp /lib/systemd/system/dirsrv-snmp.service /etc/systemd/system/
# vim /etc/systemd/system/dirsrv-snmp.service
# systemctl daemon-reload
# systemctl (re)start dirsrv-snmp.service

And for the dirsrv@.service template: since it is a template, it requires a few more steps than simply copying the file to the /etc/systemd/system/ directory, editing it and reloading systemd:

# cp /lib/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv\@.service
# mkdir -p /etc/systemd/system/dirsrv.target.wants
# vim /etc/systemd/system/dirsrv\@.service
# ln -s /etc/systemd/system/dirsrv\@.service /etc/systemd/system/dirsrv.target.wants/dirsrv@$foo.service
# systemctl daemon-reload
# systemctl (re)start dirsrv.target

Now the relevant setup scripts should just be working in the /etc/systemd/system directory as opposed to its /lib/systemd/system counterpart; then all documentation can just refer admins to the /etc/systemd/system directory to edit dirsrv-admin.service, dirsrv-snmp.service and dirsrv\@.service, and to reload the systemd daemon and (re)start the service (and dirsrv.target for dirsrv@.service).
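The template steps in this comment can be sketched as a loop that a setup script might run, creating one symlink per instance. This is a minimal sketch, assuming a scratch directory in place of /etc/systemd/system and two made-up instance names (inst1, inst2); on a real system the links would be followed by systemctl daemon-reload.

```shell
#!/bin/sh
# Sketch: link a template unit into dirsrv.target.wants once per instance.
# SYSTEMD_ETC is a scratch directory standing in for /etc/systemd/system,
# and inst1/inst2 are made-up instance names used only for illustration.
SYSTEMD_ETC=$(mktemp -d)
template="$SYSTEMD_ETC/dirsrv@.service"
wants="$SYSTEMD_ETC/dirsrv.target.wants"

: > "$template"      # empty placeholder for the copied template
mkdir -p "$wants"

for inst in inst1 inst2; do
    # The link name carries the instance name; systemd substitutes it for %i.
    ln -s "$template" "$wants/dirsrv@${inst}.service"
done

ls "$wants"
```

Keeping the instance name in the link name rather than in the file contents is what lets a single template file serve every instance.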
(In reply to comment #10)
> systemctl status dirsrv
> shows the Process: ExecStart with %i in the paths for the config dir and pid
> files.

Note that Lennart closed this bug as WONTFIX (see bug 698761 for details).
(In reply to comment #33)
> I should mention that you have to make sure that the path expansion in service
> files work correctly.
>
> Systemd needs absolute paths to commands and files in it's service files or it
> will refuse to start something that Joe Orton the Apache maintainer well found
> out and fixed here
> https://pkgs.fedoraproject.org/gitweb/?p=httpd.git;a=commitdiff;h=df147d55d0e1710a308096f170d9c4980ff32191
>
> ( Just something to mention and check for after a package has been built as in
> the native systemd service file contain full paths )

Are you talking about things in the upstream patch like this:

ExecStart=@sbindir@/ns-slapd -D @instconfigdir@/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

If so, then yes - paths like @sbindir@ etc. are replaced during the build by absolute paths. On a Fedora/RHEL system, this will be expanded to /usr/sbin/ns-slapd and so on.
Moving systemd service RFEs to rawhide. At this point, it is not appropriate in the Fedora 15 cycle to add these. Furthermore, at this point, we are still finalizing the packaging guidelines to handle SysV -> systemd upgrades.

We therefore request:

- wait until there are packaging guidelines (this will be announced on the devel list). This ensures that upgrades will work smoothly and we/you won't have to do multiple sets of changes.
- work on these sorts of changes for Fedora 16 where necessary, not Fedora 15, as we're trying to fix things for release.
- do *not* change a service from SysV to systemd in an existing release (such as Fedora 15), as this is the sort of behavior change that goes against our update policy, documented at https://fedoraproject.org/wiki/Updates_Policy
So for f15, I should just put

SYSTEMCTL_SKIP_REDIRECT=1 ; export SYSTEMCTL_SKIP_REDIRECT

at the beginning of the regular sysv init script and save the other changes I've made for f16. Works for me.
https://fedoraproject.org/wiki/Packaging:Guidelines:Systemd https://fedoraproject.org/wiki/Packaging:ScriptletSnippets#Systemd https://fedoraproject.org/wiki/Packaging:Tmpfiles.d
Time to start looking at this one again, since bug 698755 has been fixed in git and will therefore be solved with the next systemd build.
(In reply to comment #39) > Time to start looking at this one again since bug 698755 has been fixed in git > hence it will be solved with next systemd build We have this work scheduled for next month - is that too late?
The sooner this hits the street, the more you expose it to testing and the more likely you are to catch any Before= and After= ordering requirements etc., and then just fix what needs fixing via an update.

Strictly speaking, this needs to be resolved and packaged no later than 2011-09-06 (Beta Change Deadline) or it will miss F16 and get pushed to F17, which in turn could disrupt any additional sysv related cleanups/changes that might take place during the F17 cycle (the aim here is to convert all sysv legacy scripts to native systemd units during this release cycle).
systemctl restart dirsrv works
systemctl restart dirsrv.target hangs - it shuts down the instance, but it does not restart it.

This is what I have so far:

/lib/systemd/system/dirsrv.target:

[Unit]
Description=389 Directory Server
After=syslog.target network.target

[Install]
WantedBy=multi-user.target

/lib/systemd/system/dirsrv@.service:

[Unit]
Description=389 Directory Server %i.
BindTo=dirsrv.target
After=dirsrv.target

[Service]
Type=forking
Environment=PIDDIR=/var/run/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-%i
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

ls /etc/systemd/system/dirsrv.target.wants
dirsrv -> /lib/systemd/system/dirsrv@.service
If I do systemctl stop dirsrv.target then systemctl start dirsrv.target everything works. It is only the restart command that is the problem.
Hum, wondering if it's because it's a target and not a service.

What happens if you test it with a service, as in

[Unit]
Description=389 Directory Server
After=syslog.target network.target

[Service]
Type=oneshot
ExecStart=/bin/true

[Install]
WantedBy=multi-user.target

Then create a dirsrv.service.wants directory, link into that, and restart the service (services can have a wants directory too).

If I recall correctly, something like the above dummy service is what I had thought of to keep backwards compatibility with the service command. Of course you adjust the template like so:

[Unit]
Description=389 Directory Server %i.
BindTo=dirsrv.service
After=dirsrv.service

[Service]
Type=forking
Environment=PIDDIR=/var/run/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv
EnvironmentFile=/etc/sysconfig/dirsrv-%i
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

Then users could use service dirsrv start/stop/restart.

But the reason could also be that we have specified a pidfile, as in PIDFile=/var/run/dirsrv/slapd-%i.pid, in the [Service] section of the template file.
(In reply to comment #44)
> Hum wondering if it's because it's target not a service
>
> What happens if you test it with a service as in
>
> [Unit]
> Description=389 Directory Server
> After=syslog.target network.target
>
> [Service]
> Type=oneshot
> ExecStart=/bin/true
>
> [Install]
> WantedBy=multi-user.target

So instead of /lib/systemd/system/dirsrv.target I would have /lib/systemd/system/dirsrv.service containing the above?

>
> Then create dirsrv.service.wants directory and link into that and restart the
> service ( services can have wants directory too )
>
> If I can recall correctly something like the above dummy service I had though
> of to keep backwards compatibility with service command ofcourse you adjust the
> template as so...
>
> [Unit]
> Description=389 Directory Server %i.
> BindTo=dirsrv.service
> After=dirsrv.service
>
> [Service]
> Type=forking
> Environment=PIDDIR=/var/run/dirsrv
> EnvironmentFile=/etc/sysconfig/dirsrv
> EnvironmentFile=/etc/sysconfig/dirsrv-%i
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w
> $PIDDIR/slapd-%i.startpid

So the above would go into /lib/systemd/system/dirsrv@.service - just replace dirsrv.target with dirsrv.service

>
> Then users could use service dirsrv start/stop/restart
>
> But the reason could also be because we have specified an pidfile as in
> PIDFile=/var/run/dirsrv/slapd-%i.pid in the service section of the template
> file

No - PIDFile is not specified anywhere - PIDDIR is but that's not the same thing.
Ok. I made the above changes and did
systemctl daemon-reload

Same behavior - systemctl stop/start dirsrv.service works fine - restart just hangs

service dirsrv stop/start work fine
service dirsrv restart just hangs

The service command won't work because there is no way to control individual instances (afaict):
service dirsrv stop localhost - error
service dirsrv@localhost stop - error
service dirsrv stop - error
(In reply to comment #46)
> Ok. I made the above changes and did
> systemctl daemon-reload
>
> same behavior - systemctl stop/start dirsrv.service works fine - restart just
> hangs
>
> service dirsrv stop/start work fine
> service dirsrv restart just hangs

You can try adding to the [Service] section of the template

StandardOutput=syslog
StandardError=syslog

and check whether /var/log/messages captures why it's failing. If nothing is there, then I guess strace or enabling debugging output is next on the list.

> The service command won't work because there is no way to control individual
> instances (afaict):
> service dirsrv stop localhost - error
> service dirsrv@localhost stop - error
> service dirsrv stop - error

You will need to use systemctl for anything other than service dirsrv start/stop/restart; the methods that you are using there are unsupported by the service command afaik.
Created attachment 520619 [details]
strace -f -o systemctl.strace systemctl restart dirsrv.service

This is the output of
strace -f -o systemctl.strace systemctl restart dirsrv.service

Note that systemctl is hung at line 299:
2078 poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 2, -1) = 1 ([{fd=5, revents=POLLIN}])

I interrupted it with Ctrl-C after waiting several minutes.
I tried setting in the dirsrv.service and dirsrv@.service files:

[Service]
LogLevel=debug

then systemctl daemon-reload, but systemctl complained that 'LogLevel' is not a valid lvalue in the [Service] section.

I tried editing /etc/systemd/system.conf, uncommenting the LogLevel directive and setting it to debug, then systemctl daemon-reload, but there was no extra output in /var/log/messages or dmesg.

Then I rebooted the system - there was a lot of extra information in /var/log/messages.

Then I tried
systemctl start dirsrv.service
followed by
systemctl restart dirsrv.service
There was no extra output - it still hangs.

What now?
Hum, you should add to the [Service] section of the template file (see man systemd.exec)

SyslogLevel=debug
StandardOutput=syslog
StandardError=syslog

Do you have selinux enabled by any chance?
I've provided 389-ds-base systemd enabled builds for F16 at http://rmeggins.fedorapeople.org/rpms/

You'll need the perldap package for F16 - I've also built this in F16 updates-testing if you'd rather grab it from there.

After installing the packages, run setup-ds.pl. You can use localhost as the hostname - just ignore the warnings - it wants a FQDN but localhost should work fine for testing systemd.

setup-ds.pl will create the symlink /etc/systemd/system/dirsrv.target.wants/dirsrv -> /lib/systemd/system/dirsrv@.service
Mucking around with dirsrv here during the evening I have come across several issues.

First of all, the /var/lock/dirsrv directories are missing after reboot; note that tmpfiles only covers /run or /var/run, not /var/lock, afaik.

Secondly, after restart the startpid file becomes owned by root.nobody, and only the main pid gets killed while ns-slapd just happily keeps on running. (Note that this may be due to dirsrv.service not being run as nobody; I just have not gotten the daemon running again, due to permission errors, to find out, and it's getting a bit late.)

I continuously hit ns-slapd refusing to start due to permission errors:

[31/Aug/2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port 389 failed: Netscape Portable Runtime error -5966 (Access Denied.)

? The whole /var/run and /var/lock/dirsrv and subdirectories are owned by nobody.

I've also hit an issue where slapd thinks some other server is running, which, when I dug deeper, also seems to be permission related?

"Shutting down due to possible conflicts with other slapd processes"

So what's actually happening there in the background with all those permission checks etc.? If you can map that out for me, then it becomes a question of whether we can't just let systemd handle that, which should make it a bit less error prone and give you guys a chance to reduce the code a bit.
(In reply to comment #52)
> Mocking around with dirsrv here during the evening I have come across several
> issues first of all the /var/lock/dirsrv directory's are missing after reboot
> note that tmpfiles only cover /run or /var/run not /var/lock afaik,

Hmm - well, before we used tmpfiles.d for /var/lock/dirsrv it used to fail upon reboot - with it, it works - not sure what has changed - but if you can point me in the direction of the docs that say which tmpfiles.d we need to specify I will be happy to amend the code.

>
> Secondly after restart the startpid file become owned by root.nobody and only
> the mainpid gets killed while ns-slapd just happily keeps on running.
> ( Note that this can be due to the dirsrv.service not being run as nobody I
> just have not gotten the daemon running again due to permission erro's to find
> out and it's getting a bit late )

Hmm - the problem that I see is that this is not correct:

ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid

it should be

ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid -w ${PIDDIR}/slapd-%i.startpid

Without the braces, the ps -ef|grep slapd output would show just -i and -w and no pid files in /var/run/dirsrv - adding the braces shows the correct ps -ef output and the correct files with the correct ownerships.

>
> I continuously hit ns-slap refusing to start due to permission errors [31/Aug
> /2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port
> 389 failed: Netscape Portable Runtime error -5966 (Access Denied.)

Are you starting it using
systemctl start dirsrv
running as root? Is there another directory server running? Something else listening to port 389?

>
> ? the whole /var/run and /var/lock/dirsrv and subdirectorys are owned by nobody

The default user is "nobody" - /var/run/dirsrv and /var/lock/dirsrv are owned by nobody.

>
> I've also hit issue where slap thinks some other server is running which seem
> to when dug deeper also be permission related?
>
> "Shutting down due to possible conflicts with other slapd processes"
>
> So what's actually happening there in the background with all those permissions
> checks etc.

"all those permissions checks" is just "can I bind to port 389?" - this will fail due to one of
1) cannot bind to port 389 if not started as root (note the server will drop permissions soon after binding as root)
2) cannot bind to port 389 if another process is already bound to port 389
3) selinux prevents binding to port 389 (although this is allowed by the ldap/dirsrv policy in the base os selinux policy, so this should not happen for port 389)

>
> If you can map that out for me than it becomes a question if we cant just let
> systemd handle that which should make it a bit less error prone and give you
> guys a chance to reduce the code a bit?

"reduce code a bit" == "write and debug a lot of code on several different platforms, some of which support systemd and some of which do not, in the short period of time we have to get a version of dirsrv in F16 that fully supports systemd"

I'd rather spend my time in the short term helping you figure out why
systemctl restart dirsrv.target
is not working.
To put it another way - I'd rather get dirsrv working as is, since really the only thing preventing us from providing systemd support is this restart issue. I'm sure we can work out the permissions/ownership/conflict problems without having to resort to a rewrite of the socket code in dirsrv.
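The unbraced $PIDDIR form corrected in the previous comment is easy to miss when reviewing a unit file. A minimal sketch of a check for it, assuming a hypothetical helper name (find_unbraced) and treating any $NAME that is immediately followed by more characters of the same word as suspect; it is a plain grep and knows nothing else about how systemd parses command lines.

```shell
#!/bin/sh
# Sketch: flag Exec*= lines where $NAME is glued to the rest of a word, e.g. $PIDDIR/slapd.pid.
# find_unbraced is a hypothetical helper; it prints matching lines with their line numbers.
find_unbraced() {
    grep -nE '^Exec[A-Za-z]*=.*\$[A-Za-z_][A-Za-z0-9_]*[^[:space:]A-Za-z0-9_]' "$1"
}

# Sample text only: the two ExecStart variants quoted above, not a valid unit as a whole.
unit=$(mktemp)
cat > "$unit" <<'EOF'
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w $PIDDIR/slapd-%i.startpid
ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid -w ${PIDDIR}/slapd-%i.startpid
EOF

flagged=$(find_unbraced "$unit")
echo "$flagged"
rm -f "$unit"
```

Only the first line is reported, because in the second every variable reference starts with ${ and so never matches the pattern.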
(In reply to comment #53)
> (In reply to comment #52)
> > Mocking around with dirsrv here during the evening I have come across several
> > issues first of all the /var/lock/dirsrv directory's are missing after reboot
> > note that tmpfiles only cover /run or /var/run not /var/lock afaik,
>
> Hmm - well before we used tmpfiles.d for /var/lock/dirsrv it used to fail upon
> reboot - with it it works - not sure what has changed - but if you can point me
> in the direction of the docs that say which tmpfiles.d we need to specify I
> will be happy to amend the code.

man tmpfiles.d only mentions /run.

> > Secondly after restart the startpid file become owned by root.nobody and only
> > the mainpid gets killed while ns-slapd just happily keeps on running.
> > ( Note that this can be due to the dirsrv.service not being run as nobody I
> > just have not gotten the daemon running again due to permission erro's to find
> > out and it's getting a bit late )
>
> Hmm - the problem that I see is that this is not correct:
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i $PIDDIR/slapd-%i.pid -w
> $PIDDIR/slapd-%i.startpid
>
> it should be
> ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i ${PIDDIR}/slapd-%i.pid
> -w ${PIDDIR}/slapd-%i.startpid
>
> Without the braces, the ps -ef|grep slapd output would show just -i and -w and
> no pid files in /var/run/dirsrv - adding the braces shows the correct ps -ef
> output and the correct files with the correct ownerships

Yes, I forgot to mention that I had noticed that and used the full path instead of the PIDDIR variable.

> > I continuously hit ns-slap refusing to start due to permission errors
> > [31/Aug/2011:00:00:56 +0000] createprlistensockets - PR_Bind() on All Interfaces port
> > 389 failed: Netscape Portable Runtime error -5966 (Access Denied.)
>
> you are starting it using
> systemctl start dirsrv
> running as root? Is there another directory server running? Something else
> listening to port 389?

No, the port is free. However, I had added User=nobody and Group=nobody to the template unit because I hit another permission error; if I recall correctly, the debug output said something about the lock file being owned by pid 0 or uid 0.

So in the dirsrv@.template I added the following to the [Service] section:

User=nobody
Group=nobody

I gave full paths and turned on debugging ( -d 1 ):

ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid

I increased the timeout, because startup in debug mode takes longer than the default timeout and systemd would otherwise kill the service during startup:

TimeoutSec=5m

And finally, to catch anything that might be spewed to the console, I added:

StandardError=syslog

( We should create a "how to debug 389 directory server" wiki page with the above info )

> > ? the whole /var/run and /var/lock/dirsrv and subdirectorys are owned by nobody
>
> The default user is "nobody" - /var/run/dirsrv and /var/lock/dirsrv are owned
> by nobody.
>
> > I've also hit issue where slap thinks some other server is running which seem
> > to when dug deeper also be permission related?
> >
> > "Shutting down due to possible conflicts with other slapd processes"
> >
> > So what's actually happening there in the background with all those permissions
> > checks etc.
>
> "all those permissions checks" is just "can I bind to port 389?" - this will
> either fail due to
> 1) cannot bind to port 389 if not started as root (note the server will drop
> permissions soon after binding as root)
> 2) cannot bind to port 389 if another process is already bound to port 389
> 3) selinux prevents binding to port 389 (although this is allowed by the
> ldap/dirsrv policy in the base os selinux policy, so this should not happen for
> port 389)
>
> > If you can map that out for me than it becomes a question if we cant just let
> > systemd handle that which should make it a bit less error prone and give you
> > guys a chance to reduce the code a bit?
>
> "reduce code a bit" == "write and debug a lot of code on several different
> platforms, some of which support systemd and some of which do not, in the short
> period of time we have to get a version of dirsrv in F16 that fully supports
> systemd"
>
> I'd rather spend my time in the short term helping you figure out why
> systemctl restart dirsrv.target
> is not working.

Understood. I was just pointing out that we could probably take care of those checks via Condition$foo and ExecStartPre/ExecStartPost/ExecStopPost to ensure things are correct (directories exist, etc.).

Anyway, my next test was to add to the dummy dirsrv.service

Before=dirsrv

and

User=nobody
Group=nobody

to see if the startpid kept the correct ownership ( nobody.nobody as opposed to root.nobody ). While doing that I kept hitting the directory server either refusing to start because of permission errors, or thinking another instance was running, which was not the case. I ran out of time ( it was getting close to 01:00 ) and will continue to poke at this when I get home from work. In the meantime I will see if I can get some clarification from Lennart on what the expected behaviour is when restarting a unit which has other units bound/required to it.
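The debugging changes described in this comment can be collected in one place. The following is only a sketch, not a tested configuration: it writes the [Service] settings listed above to a scratch file so they can be reviewed before being merged by hand into the real template. The scratch path and the `frag` variable are made up for illustration; the unit settings themselves are the ones quoted above.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location; nothing here touches /lib/systemd/system.
frag="$(mktemp -d)/dirsrv-debug.service.fragment"

# Quoted heredoc so that %i is written literally for systemd to expand.
cat > "$frag" <<'EOF'
[Service]
# Run as the default dirsrv user.
User=nobody
Group=nobody
# Full paths instead of the PIDDIR variable, with debugging enabled (-d 1).
ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid
# Debug startup is slow; keep systemd from killing the service early.
TimeoutSec=5m
# Send anything written to stderr to syslog.
StandardError=syslog
EOF

# Show the fragment for review.
cat "$frag"
```

Writing to a scratch file rather than editing the installed unit keeps the experiment easy to throw away once the permission problem is understood.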
(In reply to comment #55)
> Nope the port is free however I had added User=nobody and Group=nobody to the
> template unit because I hit some other permission error if I can recall
> correctly the debug output said something about the lock file being owned by
> pid 0 or uid 0

Ok. I'm just not sure what's going on here. Running setup-ds.pl should create /var/lock/dirsrv/slapd-name with 0770 and nobody:nobody. It also creates /etc/tmpfiles.d/dirsrv-name.conf with
/var/run/dirsrv
/var/lock/dirsrv
/var/lock/dirsrv/slapd-name
all with 0770 nobody:nobody

So try this - start with a clean system
remove-ds.pl -i slapd-name
then
yum erase 389-ds-base-libs
then
rm -rf /etc/dirsrv /etc/sysconfig/dirsrv* /etc/tmpfiles.d/dirsrv* /var/*/dirsrv /usr/*/dirsrv
Then install the 389-ds-base package
Check for existence, ownership, and permissions on /var/lock/dirsrv
Then run setup-ds.pl
Then check /var/lock/dirsrv again

> So to the dirsrv@.template I added to the [Service] section
>
> User=nobody
> Group=nobody

I can't add this to the template because each instance may run as a different user.

> Gave full paths and turned on debugging ( -d 1 )
>
> ExecStart=/usr/sbin/ns-slapd -d 1 -D /etc/dirsrv/slapd-%i -i
> /var/run/dirsrv/slapd-%i.pid -w /var/run/dirsrv/slapd-%i.startpid

Note that when you turn on debugging: Using -d with any value will tell ns-slapd to not daemonize - it will remain attached to the controlling process. Using -d 1 will cause ns-slapd to take forever to start up.

> Increased the timeout so systemd would not kill the service in startup since we
> are starting it in debug and that start up time exceeds the default thus
> systemd will kill the service
>
> TimeoutSec=5m
>
> And finally added to catch anything that might be spewed to the console
>
> StandardError=syslog
>
> ( We should create a how to debug 389 directory server wiki page with the above
> info )

Ok.

> Anyway my next test was to add to the dummy dirsrv.service
>
> Before=dirsrv
>
> And
>
> User=nobody
> Group=nobody
>
> To see if the startpid keept the correct ownership ( nobody.nobody as oppose to
> root.nobody ) when I kept hitting the directory server either refusing to start
> because of permission errors or it thought another instance was running which
> was not the case anyway I ran out of time ( Time was getting close to 01:00 )
> and will continue to poke this when I get home from work but in the meantime I
> see if I cant get some clarification from Lennart on what's the expected
> behaviour when restarting a unit which has another units bound/required to it.
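The "check ownership and permissions" step in the procedure earlier in this comment could be scripted roughly as follows. This is a sketch that assumes GNU `stat` (as shipped on Fedora); `check_dir` is a helper name made up here, and the values to compare against (0770, nobody:nobody) are the ones stated in the comment.

```shell
#!/bin/sh

# Print "<path> <owner>:<group> <octal mode>", or report that the
# directory does not exist. (check_dir is a hypothetical helper.)
check_dir() {
    if [ -d "$1" ]; then
        stat -c '%n %U:%G %a' "$1"
    else
        echo "$1 missing"
    fi
}

# Directories named in the comment; expected: nobody:nobody 770.
for d in /var/run/dirsrv /var/lock/dirsrv; do
    check_dir "$d"
done
```

Running it once after installing the package and again after setup-ds.pl gives two snapshots that can be compared directly.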
Created attachment 521113 [details] 0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch "parameterized" the "group" name dirsrv.target and rebased to the latest code in master
Created attachment 521114 [details] spec file changes "parameterized" the "group" name dirsrv.target - rebased on top of the latest code
Created attachment 521127 [details] 0001-Bug-695736-Providing-native-systemd-file-for-upcomin.patch

systemd support now has to be enabled explicitly: configure will not enable it even if pkg-config says systemd is supported. You must pass in --with-systemdsystemunitdir and --with-systemdsystemconfdir.
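One way to build the configure invocation is sketched below. It asks pkg-config for the two directories and falls back to default paths when pkg-config has nothing to say; the fallback paths, and the idea of taking the values from systemd's pkg-config file at all, are assumptions made for illustration and not something the patch prescribes. The script only prints the command.

```shell
#!/bin/sh
set -eu

# Ask pkg-config for systemd's directories; tolerate pkg-config or the
# systemd .pc file being absent.
unitdir=$(pkg-config --variable=systemdsystemunitdir systemd 2>/dev/null || true)
confdir=$(pkg-config --variable=systemdsystemconfdir systemd 2>/dev/null || true)

# Assumed fallbacks, used only when pkg-config returned nothing.
unitdir=${unitdir:-/lib/systemd/system}
confdir=${confdir:-/etc/systemd/system}

# Print the command instead of running it; run it from the source tree.
configure_cmd="./configure --with-systemdsystemunitdir=$unitdir --with-systemdsystemconfdir=$confdir"
echo "$configure_cmd"
```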
To ssh://git.fedorahosted.org/git/389/ds.git
   b5f77c6..144c607  master -> master
commit 144c607fa22e058a9ab3d343d0706432e94d5a63
Author: Rich Megginson <rmeggins>
Date: Thu Apr 21 15:49:13 2011 -0600

Reviewed by: nhosoi, nkinder (Thanks!)
Branch: master
Fix Description: Since we support multiple instances of directory server, create a dirsrv.target, and have the instances "want" that target. There is a service template file dirsrv@.service that supports replaceable parameters which are instance specific. When a new instance is created, we create a symlink called dirsrv@$instance.service which links to the template file. systemd fills in the %i with the correct instance name.
The service command will not work. You have to use the systemctl command:
systemctl stop dirsrv - single instance
systemctl stop dirsrv.target - all instances
There are still some outstanding issues with systemd:
* systemctl restart dirsrv.target - will hang after shutting down the instances
When using systemd, have to use the systemctl start command in startServer or other systemd commands like status, restart, stop will not work
Note: the "group" name dirsrv.target is flexible - just change the --with-systemdgroupname=NAME when running configure
Platforms tested: Fedora 16 x86_64
Flag Day: yes
Doc impact: yes
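The template-plus-symlink layout described in the commit message can be tried out in a scratch directory. The sketch below is illustrative only: the directory, the instance name `example`, and the placeholder template contents are all made up, and the real dirsrv@.service ships with 389-ds-base. Where the real symlink lives is not stated in the commit message, so the location used here is also an assumption.

```shell
#!/bin/sh
set -eu

# Scratch directory standing in for a systemd unit directory (assumption).
unitdir=$(mktemp -d)
instance=example   # hypothetical instance name

# Placeholder template; the quoted heredoc keeps %i literal for systemd.
cat > "$unitdir/dirsrv@.service" <<'EOF'
[Unit]
Description=placeholder template for instance %i
EOF

# Per-instance unit: a symlink to the template, as the commit describes.
ln -s "dirsrv@.service" "$unitdir/dirsrv@$instance.service"

link_target=$(readlink "$unitdir/dirsrv@$instance.service")
echo "dirsrv@$instance.service -> $link_target"

# On a real system the units would then be driven with systemctl, e.g.:
#   systemctl stop dirsrv@example.service   # one instance (name assumed)
#   systemctl stop dirsrv.target            # all instances
```

Because every instance unit is just a link to the same template, a change to the template applies to all instances at once, and systemd substitutes the part after the @ for %i.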
389-ds-base-1.2.10-0.1.a1.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.1.a1.fc16
Package 389-ds-base-1.2.10-0.1.a1.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing 389-ds-base-1.2.10-0.1.a1.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.1.a1.fc16
then log in and leave karma (feedback).
389-ds-base-1.2.10-0.2.a2.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.2.a2.fc16
389-ds-base-1.2.10-0.4.a4.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.4.a4.fc16
389-ds-base-1.2.10-0.4.a4.fc16,sssd-1.6.2-2.fc16,freeipa-2.1.3-4.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/389-ds-base-1.2.10-0.4.a4.fc16,sssd-1.6.2-2.fc16,freeipa-2.1.3-4.fc16
389-ds-base-1.2.10-0.4.a4.fc16, freeipa-2.1.3-4.fc16, selinux-policy-3.10.0-46.fc16, sssd-1.6.2-4.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.