Bug 1693559 - sd-bus: deal with cookie overruns
Summary: sd-bus: deal with cookie overruns
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: systemd
Version: 7.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Target Release: ---
Assignee: Jan Synacek
QA Contact: Frantisek Sumsal
Duplicates: 1716581 1719004 (view as bug list)
Depends On:
Blocks: 1186913 1688348 1720699
Reported: 2019-03-28 07:51 UTC by wbs9399
Modified: 2019-09-06 19:14 UTC (History)

Fixed In Version: systemd-219-65.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1694999 1720699 (view as bug list)
Last Closed: 2019-08-06 12:43:57 UTC
Target Upstream Version:


System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2019:2091 | None | None | None | 2019-08-06 12:44:19 UTC
Red Hat Knowledge Base (Solution) | 4214501 | Troubleshoot | None | Containers may be stuck in "Terminating" state when using OpenShift or Docker | 2019-06-14 10:58:52 UTC

Description wbs9399 2019-03-28 07:51:27 UTC
Yesterday one node of my Kubernetes cluster became NotReady. ps -ef showed that some docker-runc processes had been running for many days:

root 26579 1303 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true events --stats c29996ea9566f16616505e7118315635582714308564ba0d9a70f8fb8cf73f0a
root 27841 2913 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true kill --all 8561b78c9cb19c0d883e30eafc8ff41ddf3007043985271386ffdbafc24d4376 SIGKILL
root 28293 1303 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true delete 25660e4c1f66593ec33ae57823def641a4c4a9ae1a7c6840afd081961b66e66e

After some investigation, I found that docker-runc hangs when calling systemd.UseSystemd. Below is the stack.

In fact, no D-Bus method call sent to org.freedesktop.systemd1 got a response. For example, the command below would wait forever:

dbus-send --system --dest=org.freedesktop.systemd1 --type=method_call --print-reply /org/freedesktop/systemd1 org.freedesktop.DBus.Introspectable.Introspect
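A hang like this can be checked for without blocking a terminal by running the same command under a timeout. The following is a minimal sketch, not part of the original report: the function name `probe` and the 5-second default are assumptions chosen for illustration, and the command list simply repeats the dbus-send invocation above.

```python
import subprocess

# The exact command from the report; it needs dbus-send and a system bus.
INTROSPECT_CMD = [
    "dbus-send", "--system", "--dest=org.freedesktop.systemd1",
    "--type=method_call", "--print-reply",
    "/org/freedesktop/systemd1",
    "org.freedesktop.DBus.Introspectable.Introspect",
]


def probe(cmd, timeout_s=5.0):
    """Run cmd and classify the outcome as 'ok', 'error', or 'timeout'.

    The name and the default timeout are illustrative assumptions.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed if it runs longer
        )
    except subprocess.TimeoutExpired:
        # No reply within the limit: the symptom described in this bug.
        return "timeout"
    except FileNotFoundError:
        # The binary (for example dbus-send) is not installed.
        return "error"
    return "ok" if result.returncode == 0 else "error"
```

Calling `probe(INTROSPECT_CMD)` on an affected node would be expected to return "timeout", while a healthy systemd should answer well within the limit.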

Also there were many systemd errors in /var/log/messages:
Jan 4 11:56:31 host-k8s-node001 systemd: Failed to propagate agent release message: Operation not supported

busctl tree reported: Failed to introspect object / of service org.freedesktop.systemd1: Connection timed out

Resolved by restarting systemd: systemctl daemon-reexec

For more stack info, see: https://github.com/opencontainers/runc/issues/1959

Comment 2 wbs9399 2019-03-28 08:03:02 UTC
This issue is fixed by https://github.com/systemd/systemd/pull/11818 in upstream systemd.
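As a rough illustration of what a cookie overrun means: the D-Bus wire protocol carries each message's serial (which sd-bus calls the cookie) as a nonzero 32-bit unsigned integer, so a long-lived connection that sends enough messages eventually exhausts the range. The sketch below is a toy allocator and not systemd's code; the class and method names are hypothetical. It shows one possible way to wrap around while skipping zero and any serial that is still awaiting a reply.

```python
MAX_SERIAL = 0xFFFFFFFF  # largest value a 32-bit unsigned serial can hold


class SerialAllocator:
    """Toy allocator for nonzero 32-bit serials (hypothetical, illustration only)."""

    def __init__(self, start=0):
        self.last = start      # most recently issued serial
        self.pending = set()   # serials whose replies are still outstanding

    def next_serial(self):
        # Walk forward from the last serial, wrapping from MAX_SERIAL to 1
        # (never 0) and skipping serials that are still in use.
        candidate = self.last
        for _ in range(MAX_SERIAL):
            candidate = 1 if candidate >= MAX_SERIAL else candidate + 1
            if candidate not in self.pending:
                self.last = candidate
                self.pending.add(candidate)
                return candidate
        raise RuntimeError("no free serial")

    def reply_received(self, serial):
        # Once the reply has arrived the serial may be reused after a wrap.
        self.pending.discard(serial)
```

For example, an allocator started at 0xFFFFFFFE hands out 0xFFFFFFFF and then 1, rather than 0 or a serial that still has a reply outstanding.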

Will the systemd shipped in RHEL cherry-pick this fix? And which version will resolve this?

Comment 3 Jan Synacek 2019-04-02 09:03:19 UTC

Comment 5 Lukáš Nykrýn 2019-04-02 10:30:26 UTC
fix merged to staging branch -> https://github.com/lnykryn/systemd-rhel/pull/322 -> post

Comment 6 wbs9399 2019-04-15 08:09:39 UTC
@Lukáš Nykrýn When will this fix be included in a systemd release?

Comment 7 Jan Synacek 2019-04-15 08:28:33 UTC
We don't give any release dates. The fix is currently scheduled to be released in 7.7.

Comment 9 Michal Sekletar 2019-06-14 10:58:52 UTC
*** Bug 1719004 has been marked as a duplicate of this bug. ***

Comment 16 Jerry 2019-07-10 12:24:53 UTC
(In reply to Jan Synacek from comment #7)
> We don't give any release dates. The fix is currently scheduled to be
> released in 7.7.

Is there any possibility that this fix can/will be deployed as a patch for RHEL 7.6?

Comment 17 Filip Krska 2019-07-10 14:40:31 UTC
Yep, 7.6 Z-Stream request is tracked by public Bug 1720699, currently ON_QA...

Comment 20 errata-xmlrpc 2019-08-06 12:43:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.


Comment 21 Wayah Wurzbacher 2019-08-08 20:29:07 UTC
*** Bug 1716581 has been marked as a duplicate of this bug. ***
