Bug 1378929 - [3.3.0.32] Restarting systemd-journald causes master controllers to die
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.3.0
Platform: x86_64 Linux
Priority: unspecified
Severity: medium
Target Release: 3.3.1
Assigned To: Devan Goodwin
QA Contact: Mike Fiedler
Duplicates: 1380570
Reported: 2016-09-23 10:17 EDT by Mike Fiedler
Modified: 2016-12-01 01:48 EST
Last Closed: 2016-11-15 14:09:34 EST
Type: Bug
Attachments
- Logs from fresh recreate at Fri Sep 23 10:25:29 EDT 2016 (2.90 MB, application/x-gzip), attached 2016-09-23 10:31 EDT by Mike Fiedler


External Trackers
- Red Hat Knowledge Base (Solution) 2749571, last updated 2016-11-04 02:15 EDT
- Red Hat Product Errata RHSA-2016:2778 (SHIPPED_LIVE): Moderate: atomic-openshift-utils security and bug fix update, last updated 2016-11-15 19:08:29 EST
Description Mike Fiedler 2016-09-23 10:17:08 EDT
Description of problem:

On 3.3.0.32 on a multi-master cluster, restarting systemd-journald on all nodes causes the master controllers to die.    The systemd units stop and the processes exit.   Logs show no panics or errors and there are no cores on the system.

We were restarting systemd-journald to pick up a fix for:  https://access.redhat.com/solutions/2117311

Version-Release number of selected component (if applicable):  3.3.0.32


How reproducible: always


Config:

All on AWS us-west-2b
3 combined master/etcd
1 master load balancer
2 infra nodes (router/registry)
5 application nodes

Steps to Reproduce:
1. Verify cluster is running 
2. In a loop, ssh to all nodes and systemctl restart systemd-journald
3. Watch master controller processes on all masters
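
A minimal sketch of step 2, assuming hypothetical hostnames and root ssh access (neither is specified in this report):

```shell
#!/usr/bin/env bash
# Sketch of the reproduction loop: restart systemd-journald on every node.
# NODES is a hypothetical placeholder list, not the actual cluster hostnames.
NODES="master1 master2 master3 infra1 infra2"
for node in $NODES; do
    echo "restarting systemd-journald on $node"
    # Uncomment to run against a real cluster:
    # ssh "root@$node" 'systemctl restart systemd-journald'
done
```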

Actual results:

All master controller processes stop

Expected results:

Cluster carries on normally

Additional info:

Contact @mffiedler for cluster access
Comment 1 Mike Fiedler 2016-09-23 10:31 EDT
Created attachment 1204192 [details]
Logs from fresh recreate at Fri Sep 23 10:25:29 EDT 2016

Added logs from all masters for fresh recreate.   Restart occurred at Fri Sep 23 10:25:29 EDT 2016



There are several recreates in there.   The first encounter was Sep 22 around 19:30.   At Sep 22 19:37 you can see the cluster go into a really bad state as all the lists started failing and nodes realized the cluster was hosed.
Comment 2 Mike Fiedler 2016-09-23 10:34:05 EDT
systemd was updated earlier.   The problem occurs on every restart of systemd, not just for the original update.
Comment 3 Mike Fiedler 2016-09-23 11:01:02 EDT
root@ip-172-31-51-247: ~ # systemctl status atomic-openshift-master-controllers                                                                                                                                                                                     [1/1860]
● atomic-openshift-master-controllers.service - Atomic OpenShift Master Controllers
   Loaded: loaded (/usr/lib/systemd/system/atomic-openshift-master-controllers.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Fri 2016-09-23 10:56:21 EDT; 26s ago
     Docs: https://github.com/openshift/origin
  Process: 94375 ExecStart=/usr/bin/openshift start master controllers --config=${CONFIG_FILE} $OPTIONS (code=killed, signal=PIPE)
 Main PID: 94375 (code=killed, signal=PIPE)
Comment 4 Mike Fiedler 2016-09-23 11:17:07 EDT
Systemd is 219-30.el7 

To be clear,  even though systemd-journald was restarted everywhere, this problem can be reproduced by just restarting it on the combined master/etcd.
Comment 5 Timothy St. Clair 2016-09-23 12:42:48 EDT
So the logs don't show the window around 10:56 for a Controllers event? No OOM-ing either. It looks like the api-server is still executing after the event.

$ grep Started messages.master1 | grep systemd
Sep 23 10:20:26 ip-172-31-51-245 systemd: Started Atomic OpenShift Master Controllers.
Sep 23 10:25:28 ip-172-31-51-245 systemd: Started Session 224 of user root.
Sep 23 10:25:28 ip-172-31-51-245 systemd: Started Flush Journal to Persistent Storage.
Sep 23 10:26:28 ip-172-31-51-245 systemd: Started Atomic OpenShift Node.
Sep 23 10:26:51 ip-172-31-51-245 systemd: Started Session 225 of user root.

$ grep systemd messages.master1 | grep -i Openshift
Sep 23 10:20:26 ip-172-31-51-245 systemd: Starting Atomic OpenShift Master Controllers...
Sep 23 10:20:26 ip-172-31-51-245 systemd: Started Atomic OpenShift Master Controllers.
Sep 23 10:26:27 ip-172-31-51-245 systemd: atomic-openshift-node.service holdoff time over, scheduling restart.
Sep 23 10:26:27 ip-172-31-51-245 systemd: Starting Atomic OpenShift Node...
Sep 23 10:26:28 ip-172-31-51-245 atomic-openshift-node: I0923 10:26:28.451318   87612 factory.go:54] Registering systemd factory
Sep 23 10:26:28 ip-172-31-51-245 atomic-openshift-node: I0923 10:26:28.453859   87612 oomparser.go:185] oomparser using systemd
Sep 23 10:26:28 ip-172-31-51-245 systemd: Started Atomic OpenShift Node.
Comment 6 Seth Jennings 2016-09-23 13:06:54 EDT
I was able to recreate once with a "systemctl daemon-reexec".  This is what the %post in systemd RPM does.

Let me first say that I don't think this is new.

https://bugzilla.redhat.com/show_bug.cgi?id=1317031#c20

I think that systemd upgrades have, at least in recent history, always taken out the master-controller.

It appears that the master-controller has a pipe open and a daemon-reexec or systemd-journald restart breaks it. At some point in the future, the master-controller tries to access the pipe and gets a SIGPIPE.
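
The SIGPIPE mechanics can be seen in isolation with a standalone pipeline demo (not the OpenShift code path): once the reader exits, the writer's next write is answered with SIGPIPE, which shells report as exit status 128 + 13 = 141.

```shell
#!/usr/bin/env bash
# Standalone SIGPIPE demo: 'yes' writes forever, 'head' exits after one line,
# so the next write by 'yes' hits a broken pipe and the process dies on SIGPIPE.
yes | head -n 1 > /dev/null
writer_status=${PIPESTATUS[0]}
echo "writer exit status: $writer_status"   # 141 = 128 + SIGPIPE (signal 13)
```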

The service unit does have:

Restart=on-failure
RestartSec=5s

However, SIGPIPE is considered a "clean" exit signal, so the service is not restarted (search for SIGPIPE here):
https://www.freedesktop.org/software/systemd/man/systemd.service.html

So the process is not restarted by systemd.
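
A drop-in override forcing restarts regardless of exit classification would look roughly like this; Restart=always is the setting the eventual fix applied (comment 27), though the drop-in path here is only an illustrative sketch:

```ini
# Hypothetical path: /etc/systemd/system/atomic-openshift-master-controllers.service.d/restart.conf
[Service]
Restart=always
RestartSec=5s
```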

systemd v219 and later has a mechanism to maintain file descriptors across daemon restart (thanks to Derek for finding this):
https://github.com/systemd/systemd/blob/master/NEWS#L2016

However, the service file must set FileDescriptorStoreMax to use it.

While it is included in the journald service file upstream for v219:
https://github.com/systemd/systemd/blob/v219/units/systemd-journald.service.in

It is not in the RHEL 7.2 service file; it is patched out in our rpm:

SOURCES]$ grep FileDescriptorStoreMax *
0125-Revert-journald-allow-restarting-journald-without-lo.patch:-FileDescriptorStoreMax=1024

Including that is likely the fix for this.
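
As a sketch, re-enabling the fd store on journald would be a drop-in along these lines (hypothetical path; 1024 is the value visible in the patched unit above):

```ini
# Hypothetical path: /etc/systemd/system/systemd-journald.service.d/fdstore.conf
[Service]
FileDescriptorStoreMax=1024
```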
Comment 7 Mike Fiedler 2016-09-23 13:18:56 EDT
Installed 3.2 on an identical cluster config and ran the same scenario.   The problem was not reproducible.
Comment 8 Derek Carr 2016-09-23 14:40:38 EDT
Mike has since confirmed that it was reproduced on 3.2 using an identical cluster configuration, but it took time for the problem to manifest.

I am lowering the priority and moving this to UPCOMING_RELEASE.

The proper fix is to leverage the FileDescriptorStoreMax feature introduced in systemd 219 but currently disabled in the RHEL unit. I am moving this bug to the productization team to evaluate the risk of enabling the systemd feature so this problem does not occur.
Comment 9 Derek Carr 2016-09-23 14:43:55 EDT
See https://bugzilla.redhat.com/show_bug.cgi?id=1378929#c6 on the host change needed to prevent this from occurring.
Comment 10 Mike Fiedler 2016-09-23 15:00:40 EDT
Confirming comment 9.   The cluster mentioned in comment 7 did reproduce this issue when it was put under load during the restart of systemd.
Comment 12 Kenjiro Nakayama 2016-10-03 02:02:38 EDT
Can you increase the severity to high? Customers commonly tune /etc/systemd/journald.conf and restart the journald service in production environments. (One customer has actually hit this issue.)

Also, I would appreciate it if you could add this issue to the known-issues list in the documentation: https://docs.openshift.com/container-platform/3.3/release_notes/ocp_3_3_release_notes.html#ocp-33-known-issues
Comment 15 Scott Dodson 2016-10-04 11:06:39 EDT
*** Bug 1380570 has been marked as a duplicate of this bug. ***
Comment 27 Gaoyun Pei 2016-11-03 06:08:18 EDT
Set up a multi-master OCP 3.3 cluster with openshift-ansible-3.3.44-1.git.0.48e34f6.el7.noarch.rpm and checked the atomic-openshift-master-controllers.service configuration file; the service is using Restart=always now.

Ran a script to restart systemd-journald continuously on the combined master/etcd hosts; after a while, the atomic-openshift-master-controllers.service units on all masters were still running well.

[root@ip-172-18-15-196 ~]# systemctl status atomic-openshift-master-controllers
● atomic-openshift-master-controllers.service - Atomic OpenShift Master Controllers
   Loaded: loaded (/usr/lib/systemd/system/atomic-openshift-master-controllers.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2016-11-03 05:24:12 EDT; 30min ago
     Docs: https://github.com/openshift/origin
 Main PID: 41694 (openshift)
   Memory: 50.3M
   CGroup: /system.slice/atomic-openshift-master-controllers.service
           └─41694 /usr/bin/openshift start master controllers --config=/etc/origin/master/master-config.yaml --loglevel=5 --listen=https://0.0.0.0:8444
Comment 28 Takayoshi Kimura 2016-11-03 20:58:02 EDT
I think we need to verify the status of atomic-openshift-master-api as well, because in the duplicate bug BZ1380570 and a customer support case, the -api service also stopped unexpectedly due to this issue.
Comment 29 Gaoyun Pei 2016-11-04 02:35:02 EDT
Also verified that the atomic-openshift-master-api service was running on all masters. When systemd-journald was restarted on a master, atomic-openshift-master-api was also restarted; after that, the service kept running.

@Mike, so is this working well for you now?
Comment 30 Devan Goodwin 2016-11-04 07:48:11 EDT
The api service has Restart=always for containerized deployments, but *not* for rpm deployments. I suspect this means it could happen there as well; it is a bit hard to reproduce reliably, as I think it requires logging messages to be sent at the right times.

I have applied the same fix for API service in this PR:

https://github.com/openshift/openshift-ansible/pull/2718
Comment 34 Mike Fiedler 2016-11-07 08:08:41 EST
Verified on container install (3.3.1.4) that restarting systemd-journald caused both master-api and master-controllers containers to exit and both were immediately restarted.
Comment 35 Mike Fiedler 2016-11-07 22:02:34 EST
Still have to check master-api on rpm install.   Moving back to ON_QA temporarily.
Comment 36 Johnny Liu 2016-11-07 22:45:24 EST
BTW, do we also need this fix for 3.4?
Comment 37 Mike Fiedler 2016-11-08 03:27:21 EST
On 3.3.1.4, verified 

rpm:  atomic-openshift-master-controllers and atomic-openshift-master-api are restarted by systemd.

containerized:  atomic-openshift-master-controllers and atomic-openshift-master-api are restarted by systemd.

Commands do fail during the restart interval, but the services restart.

Re: comment 36 the change seems to be on openshift-ansible master.   Is there anyplace it is missing?
Comment 38 Johnny Liu 2016-11-08 05:25:05 EST
For comment 36, I was using an older playbook to install my env. Sorry for my fault. Pls ignore it.



According to comment 37, move this bug to "verified".
Comment 39 Kenjiro Nakayama 2016-11-11 03:23:12 EST
Since this fix is delivered via Ansible, current users will not get it automatically; they need to make the change manually themselves, don't they?
On top of that, even with this fix in place, the OpenShift master services will always restart when systemd-journald is restarted.
I still think this issue should be listed as a known issue; is that not necessary?
Comment 40 Devan Goodwin 2016-11-14 10:41:33 EST
It's probably a good idea to recommend it; however, journald restarts should be relatively rare (I think?), and personally I'd be rebooting that host after one.
Comment 41 Kenjiro Nakayama 2016-11-14 10:52:27 EST
OK, I see. I may be worrying too much. If Engineering team says so, it is no problem for me.
Comment 43 errata-xmlrpc 2016-11-15 14:09:34 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:2778
