1246238 – Puppet agent doesn't start cleanly in F22

Summary: Puppet agent doesn't start cleanly in F22

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	puppet
Sub Component:
Version:	22
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Assignee:	Lukas Zapletal
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-07-23 19:02 UTC by Erik Logtenberg
Modified:	2015-09-09 00:22 UTC
CC List:	13 users
Fixed In Version:	4.1.0-4.fc22
Clone Of:
Environment:
Last Closed:	2015-09-09 00:22:49 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
Proposed fix/workaround (803 bytes, patch) 2015-07-27 15:32 UTC, Erik Logtenberg

Description Erik Logtenberg 2015-07-23 19:02:47 UTC

Starting puppet agent with systemctl takes a long time, and after a timeout systemctl reports a failure. In the meantime, puppet agent has started just fine, but now I can't use systemctl to stop puppet agent either.

# systemctl start puppet
(... takes some time...)
Job for puppet.service failed. See "systemctl status puppet.service" and "journalctl -xe" for details.

In the process list I see the following immediately after systemctl start puppet:

root     \_ -bash
root         \_ systemctl start puppet
root             \_ /usr/bin/systemd-tty-ask-password-agent --watch
root             \_ /usr/bin/pkttyagent --notify-fd 5 --fallback
root    /bin/sh /usr/bin/start-puppet-agent agent --no-daemonize
root     \_ /usr/bin/ruby-mri /usr/bin/puppet agent --no-daemonize
root         \_ puppet agent: applying configuration

By the time I get the error message, this is reduced to:

root     \_ -bash
root  /usr/bin/ruby-mri /usr/bin/puppet agent --no-daemonize

Puppet now appears to run just fine, but systemctl stop puppet has no effect (no error message either).

Version-Release number of selected component (if applicable):

puppet-4.1.0-1.fc22.noarch (on x86_64)

Comment 1 Lukas Zapletal 2015-07-24 09:24:17 UTC

Hey, I believe this was fixed in a75fde8e5170f718b8bc1cea4e61e06942d4b8d3. Can you test the latest build?

http://koji.fedoraproject.org/koji/buildinfo?buildID=670499

Comment 2 Lukas Zapletal 2015-07-24 09:28:26 UTC

I am pretty sure this fixed it; here is a new build for F22. Can you please test it? http://koji.fedoraproject.org/koji/taskinfo?taskID=10461459

Comment 3 Fedora Update System 2015-07-24 09:36:25 UTC

puppet-4.1.0-2.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/puppet-4.1.0-2.fc22

Comment 4 Erik Logtenberg 2015-07-24 21:42:42 UTC

Unfortunately, no, it does not. In fact, it makes it worse.

I don't see the /usr/bin/pkttyagent anymore, but still this one:

root \_ /usr/bin/systemd-tty-ask-password-agent --watch

In the meantime, puppet agent is starting just fine:

root /usr/bin/ruby-mri /usr/bin/puppet agent --no-daemonize

But as soon as systemctl says that it didn't work, it kills the running puppet agent! Error message:

Job for puppet.service failed. See "systemctl status puppet.service" and "journalctl -xe" for details.

Nothing useful in the log file, except that you now see systemd killing puppet agent, which was otherwise running just fine:

jul 24 23:35:17 systemd[1]: puppet.service start operation timed out. Terminating.
jul 24 23:35:17 puppet-agent[3256]: Caught TERM; storing stop
jul 24 23:35:18 puppet-agent[3256]: Processing stop
jul 24 23:35:18 systemd[1]: Failed to start Puppet agent.
-- Subject: Unit puppet.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit puppet.service has failed.
--
-- The result is failed.
jul 24 23:35:18 systemd[1]: Unit puppet.service entered failed state.
jul 24 23:35:18 systemd[1]: puppet.service failed.

Just curious: was the change to ExecStart supposed to get rid of the systemd-tty-ask-password-agent? If not, how do we get rid of it? The failure to start this service seems to be caused by this password agent, which doesn't seem to serve any purpose in this context. Puppet itself runs just fine until it's killed again after about a minute, apparently after the password agent timed out.

Comment 5 Erik Logtenberg 2015-07-27 12:10:36 UTC

The unit file is wrong. If you change it to the following, it works fine:

[Unit]
Description=Puppet agent
Wants=basic.target
After=basic.target network.target puppetmaster.service

[Service]
EnvironmentFile=-/etc/sysconfig/puppetagent
EnvironmentFile=-/etc/sysconfig/puppet
Type=forking
ExecStart=/usr/bin/start-puppet-agent agent $PUPPET_EXTRA_OPTS
KillMode=process
PIDFile=/run/puppet/agent.pid

[Install]
WantedBy=multi-user.target



Thanks.

Comment 6 Lukas Zapletal 2015-07-27 15:22:57 UTC

Your report should go upstream, because we use the upstream systemd unit; our only change is that we spawn puppet via a shell script. It looks like puppet agent does use a fork call, so this should be fixed upstream first.

https://bugzilla.redhat.com/show_bug.cgi?id=1246238
diff --git a/ext/systemd/puppet.service b/ext/systemd/puppet.service
index 7050c9f..7dec513 100644
--- a/ext/systemd/puppet.service
+++ b/ext/systemd/puppet.service
@@ -1,15 +1,15 @@
 [Unit]
 Description=Puppet agent
 Wants=basic.target
-After=basic.target network.target
+After=basic.target network.target puppetmaster.service

This should work on systems without puppetmaster I think.
 
 [Service]
 EnvironmentFile=-/etc/sysconfig/puppetagent
 EnvironmentFile=-/etc/sysconfig/puppet
-EnvironmentFile=-/etc/default/puppet

This should be kept ^^^

-ExecReload=/bin/kill -USR1 $MAINPID

This must be kept as well ^^^

+Type=forking
+ExecStart=/usr/bin/start-puppet-agent agent $PUPPET_EXTRA_OPTS
 KillMode=process
+PIDFile=/run/puppet/agent.pid

The rest is okay; double-check that the PID file is at the correct location.

After this is merged upstream, we can perhaps backport it.
 [Install]
 WantedBy=multi-user.target

Comment 7 Erik Logtenberg 2015-07-27 15:32:27 UTC

Created attachment 1056661
Proposed fix/workaround

It turns out that just removing --no-daemonize is enough to make things work. This is the patch against current Fedora puppet.spec I use to get things running.

Note that this patch also implements the fix/workaround for bug #1229703.

Comment 8 Lukas Zapletal 2015-07-27 15:39:42 UTC

Again, this must be opened against upstream. We do not carry the systemd unit in Fedora.

Comment 9 Erik Logtenberg 2015-07-28 07:41:06 UTC

Look, there is nothing for upstream to fix. The upstream systemd unit file works fine.

It is the Fedora .spec file that adds "Type=forking", right here:

sed -i 's|^ExecStart=/opt/puppetlabs/puppet/bin/puppet|Type=forking\nExecStart=/usr/bin/start-puppet-agent|' \
   %{buildroot}%{_unitdir}/puppet.service

Well, if you want the service to be of the forking kind, then you had better make sure that it actually forks. In other words: remove the --no-daemonize.

If you want the --no-daemonize argument to stay, then don't add "Type=forking".

It works either way. However, the combination that Fedora currently ships, with both "Type=forking" and --no-daemonize, is what does not work.
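
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of puppet or the Fedora package: check_unit is a hypothetical helper that reports whether a unit file declares Type=forking while its ExecStart line still passes --no-daemonize.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not shipped anywhere):
# prints "mismatch" and returns 1 when the unit declares Type=forking but its
# ExecStart keeps the process in the foreground with --no-daemonize.
# Otherwise prints "consistent" and returns 0.
check_unit() {
    unit_file=$1
    if grep -q '^Type=forking' "$unit_file" &&
       grep -q '^ExecStart=.*--no-daemonize' "$unit_file"; then
        echo "mismatch"
        return 1
    fi
    echo "consistent"
}
```

It could be run as check_unit /usr/lib/systemd/system/puppet.service, assuming that is where the package installs the unit. It only looks at these two settings and says nothing about whether the rest of the unit is correct.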

Comment 10 Lukas Zapletal 2015-07-28 12:11:49 UTC

Erik,

I hadn't realized that we actually add Type=forking. This makes sense, absolutely. Can you confirm that removing Type=forking (and keeping --no-daemonize) really does work? I ask because, due to SELinux, we use a wrapper which forks puppet (via an exec call).

I already started a discussion in one of my upstream PRs suggesting forking to be default for upstream. This is a different story I guess.

Comment 11 Erik Logtenberg 2015-07-28 13:30:03 UTC

Yes, removing Type=forking works. By the way, you say the wrapper script forks puppet, but the exec call doesn't fork; it replaces the current process. It works because it does -not- fork.
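
The point about exec can be seen with a small experiment. This sketch is illustrative only and is not the actual /usr/bin/start-puppet-agent wrapper, whose contents are not shown in this report; pids_with_exec and pids_without_exec are hypothetical helper names.

```shell
#!/bin/sh
# Illustration only: compare a wrapper that uses exec with one that does not.

# The outer shell prints its PID, then exec replaces it with a second shell,
# which prints its own PID. Both lines show the same number.
pids_with_exec() {
    sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Without exec the second shell runs as a child process, so the PIDs differ.
# The trailing ":" gives the outer shell more work to do afterwards, which
# stops shells such as bash from implicitly exec'ing the last command.
pids_without_exec() {
    sh -c 'echo $$; sh -c "echo \$\$"; :'
}

pids_with_exec
pids_without_exec
```

Because the process started by systemd keeps its PID across exec, a wrapper that ends in exec leaves puppet running as the process systemd started, which is what a non-forking service type expects.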

Comment 12 Lukas Zapletal 2015-07-28 15:33:15 UTC

Yeah, it was actually forking and I made the change to exec. Anyway, here is another build (rawhide); it looks fine on my system now.

http://koji.fedoraproject.org/koji/taskinfo?taskID=10511424

https://kojipkgs.fedoraproject.org//work/tasks/1424/10511424/puppet-4.1.0-4.fc24.noarch.rpm

Can you test please? Then I am going to backport into 22.

Comment 13 Lukas Zapletal 2015-07-30 10:58:56 UTC

Works fine on F22 (SELinux must be permissive though; that is a different bug).

Comment 14 Fedora Update System 2015-07-30 10:59:21 UTC

puppet-4.1.0-3.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/puppet-4.1.0-3.fc22

Comment 15 Erik Logtenberg 2015-07-30 12:38:00 UTC

Sorry, I haven't gotten around to testing your fix yet. I am running self-built puppet packages that contain fixes and workarounds for several other outstanding bugs (both upstream and downstream), especially with regard to the puppet/F22 combination.
Anyway thanks for this fix, it brings us one step closer to being able to run puppet on F22 reliably :)

Comment 16 Lukas Zapletal 2015-07-30 14:44:27 UTC

Thanks for help, well Puppet is such a moving target... ;-)

Comment 17 Fedora Update System 2015-07-31 07:53:44 UTC

Package puppet-4.1.0-3.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing puppet-4.1.0-3.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-12407/puppet-4.1.0-3.fc22
then log in and leave karma (feedback).

Comment 18 Fedora Update System 2015-08-07 12:30:50 UTC

puppet-4.1.0-4.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/puppet-4.1.0-4.fc22

Comment 19 Fedora Update System 2015-09-09 00:22:44 UTC

puppet-4.1.0-4.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
