1389159 – RHCS 2 daemons can not dump core because PR_SET_DUMPABLE is set to 0 after setuid call

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	RADOS
Sub Component:
Version:	2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	2.2
Assignee:	Brad Hubbard
QA Contact:	Vidushi Mishra
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-10-27 03:34 UTC by Brad Hubbard
Modified:	2017-07-30 15:15 UTC
CC List:	7 users
Fixed In Version:	RHEL: ceph-10.2.5-7.el7cp Ubuntu: ceph_10.2.5-3redhat1xenial
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-03-14 15:46:06 UTC
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Ceph Project Bug Tracker	17650	None	None	None	2016-10-27 03:34:30 UTC
Red Hat Bugzilla	1423417	high	CLOSED	RGW daemons can not dump core because PR_SET_DUMPABLE is set to 0 after setuid call	2021-02-22 00:41:40 UTC
Red Hat Bugzilla	1427116	unspecified	CLOSED	RHCS 2 core dumps not getting generated as a ceph user on Ubuntu likely due clrearing away of the PR_SET_DUMPABLE flag ...	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHBA-2017:0514	normal	SHIPPED_LIVE	Red Hat Ceph Storage 2.2 bug fix and enhancement update	2017-03-21 07:24:26 UTC

Internal Links: 1423417 1427116

Description Brad Hubbard 2016-10-27 03:34:30 UTC

Description of problem:
When a ceph-* daemon drops privileges via setuid, core dumps are no longer
generated because the process's dumpable flag (PR_SET_DUMPABLE) is cleared.
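
The mechanism can be illustrated with a minimal Python sketch (Linux only). Per prctl(2), the kernel resets a process's dumpable attribute to the value of fs.suid_dumpable (0 by default) when its effective user ID changes, and the process can switch it back on with prctl(PR_SET_DUMPABLE, 1). The helper names below are hypothetical and nothing here is taken from the Ceph patch; the daemons are C++ and would call prctl() directly.

```python
import ctypes
import os

# Option numbers from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4

# Handle to the C library already loaded into this process.
_libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option, arg2=0):
    """Call prctl(2) and raise OSError on failure."""
    res = _libc.prctl(
        ctypes.c_int(option),
        ctypes.c_ulong(arg2),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
        ctypes.c_ulong(0),
    )
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return res


def get_dumpable():
    """Return the dumpable flag of the calling process."""
    return _prctl(PR_GET_DUMPABLE)


def set_dumpable(value):
    """Set the dumpable flag of the calling process (0 or 1)."""
    _prctl(PR_SET_DUMPABLE, value)


def drop_privileges_keep_dumpable(uid, gid):
    """Hypothetical helper: drop to uid/gid, then re-enable core dumps.

    Must be called as root. Without the final set_dumpable(1) the flag
    would stay at the fs.suid_dumpable value after the UID change.
    """
    os.setgid(gid)
    os.setuid(uid)
    set_dumpable(1)
```

Calling get_dumpable() before and after os.setuid() in a process started as root should show the flag dropping to the fs.suid_dumpable value, which is the state that makes PTRACE_ATTACH and core dumps fail.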

Version-Release number of selected component (if applicable):
ceph-10.2.2-38.el7cp.x86_64

How reproducible:
100%

Steps to Reproduce:

Set DefaultLimitCORE=infinity in /etc/systemd/system.conf.

The cleared flag also causes EPERM errors for anything that calls PTRACE_ATTACH:

$ sudo su - ceph

$ strace -p 49596
strace: attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted

$ gstack 49596
$

$ gcore 49596
ptrace: Operation not permitted.
You can't do that without a process to debug.
The program is not being run.
gcore: failed to create core.49596

$ sudo systemd-coredumpctl
No coredumps found.


After installing an rpm that includes Patrick's patch:

$ strace -p 48740
Process 48740 attached
futex(0x7f040334d9d0, FUTEX_WAIT, 48767, NULL^CProcess 48740 detached
 <detached ...>

$ gstack 48740|tail -5
#0  0x00007f04158edef7 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f04179a16e0 in Thread::join(void**) ()
#2  0x00007f0417a7dd62 in DispatchQueue::wait() ()
#3  0x00007f0417995f2b in SimpleMessenger::wait() ()
#4  0x00007f04172b3d66 in main ()

$ gcore 48740
...
Saved corefile core.48740

$ sudo kill -SIGSEGV 48740
$ sudo systemd-coredumpctl list
TIME                            PID   UID   GID SIG PRESENT EXE
Wed 2016-10-26 23:00:54 EDT   48740   167   167  11 * /usr/bin/ceph-osd

Comment 1 Brad Hubbard 2016-10-27 03:38:33 UTC

(In reply to Brad Hubbard from comment #0)
> 
> $ sudo systemd-coredumpctl
> No coredumps found.

Should read...

$ sudo kill -SIGSEGV 49596

$ sudo systemd-coredumpctl
No coredumps found.

Comment 5 Brad Hubbard 2016-12-30 23:34:59 UTC

https://github.com/ceph/ceph/pull/11736

Comment 6 Brad Hubbard 2016-12-31 23:17:48 UTC

Harald Klein came up with the following workaround.

"The following steps should enable core dump functionality if you did not adjust any settings in that regard already:

1) backup /lib/systemd/system/ceph-osd@.service
2) edit /lib/systemd/system/ceph-osd@.service and add the following in the [Service] section:

LimitCORE=infinity

3) adjust sysctl:

# sysctl -w fs.suid_dumpable=2
# sysctl -w kernel.core_uses_pid=1
# sysctl -w kernel.core_pattern=/tmp/core-%e-sig%s-user%u-group%g-pid%p-time%t

4) do a systemctl daemon-reload
5) verify that max core file size is unlimited, e.g. for osd id 1 in my test env:

# ps auxw | grep ceph-osd
ceph      2420  0.9  1.6 1234124 407088 ?      Ssl  07:37   0:03 /usr/bin/ceph-osd -f --cluster ceph --id 1 --setuser ceph --setgroup ceph
# cat /proc/2420/limits | grep core
Max core file size        unlimited            unlimited            bytes

6) the current settings should lead to a coredump being written as in the following example when the osd process segfaults:

# ls -ltr /tmp/core*
-rw-------. 1 root ceph 1048887296 Dec 30 07:48 /tmp/core-ceph-osd-sig11-user167-group167-pid2714-time1483102135"
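
The checks in steps 3 and 5 of the workaround could be scripted roughly as follows. This is a sketch that assumes the standard Linux /proc layout; the function names are hypothetical, and the PID must be that of a running ceph-osd process (2420 in the example above).

```python
from pathlib import Path

CORE_LIMIT_LABEL = "Max core file size"


def parse_core_limit(limits_text):
    """Return (soft, hard) core size limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith(CORE_LIMIT_LABEL):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(CORE_LIMIT_LABEL):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max core file size' line found")


def read_sysctl(name):
    """Read a sysctl such as 'fs.suid_dumpable' through /proc/sys."""
    return Path("/proc/sys", *name.split(".")).read_text().strip()


def core_dump_settings(pid):
    """Collect the settings the workaround adjusts, for one process."""
    soft, hard = parse_core_limit(Path(f"/proc/{pid}/limits").read_text())
    return {
        "core_soft_limit": soft,
        "core_hard_limit": hard,
        "fs.suid_dumpable": read_sysctl("fs.suid_dumpable"),
        "kernel.core_uses_pid": read_sysctl("kernel.core_uses_pid"),
        "kernel.core_pattern": read_sysctl("kernel.core_pattern"),
    }
```

With the workaround applied, core_dump_settings(2420) would be expected to report "unlimited" for both limits and "2" for fs.suid_dumpable. Reading another user's /proc/<pid>/limits may require root.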

Comment 16 errata-xmlrpc 2017-03-14 15:46:06 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0514.html
