Bug 1899162 - RFE: let the admin configure the coredump naming
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: systemd
Version: 8.3
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: systemd maint
QA Contact: Frantisek Sumsal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-11-18 16:12 UTC by Renaud Métrich
Modified: 2023-08-14 11:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Story
Target Upstream Version:
Embargoed:


Links
System: Red Hat Knowledge Base (Solution)
ID: 5585291
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2020-11-19 07:42:06 UTC

Description Renaud Métrich 2020-11-18 16:12:25 UTC
Description of problem:

On RHEL8, the coredump naming in /var/lib/systemd/coredump is hardcoded to "core.<COMM>.<UID>.<bootid>.<PID>.<TS>000000"

This should be enhanced to let the admin name the coredumps as needed, typically by adding the <HOSTNAME> (hostname or container name), which is already available when systemd-coredump is executed through kernel.core_pattern.

This would help a lot when analyzing OCP issues, for example.


Version-Release number of selected component (if applicable):

systemd-239, but apparently also upstream (at least judging from reading the code)


How reproducible:

ALWAYS

Steps to Reproduce:
1. Create a miniroot to execute "bash"

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# yum -y install --installroot=/tmp/test --releasever=/ bash
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

2. Spawn a container and crash bash in the container

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# systemd-nspawn -D /tmp/test bash
Spawning container test on /tmp/test.
Press ^] three times within 1s to kill container.

bash-4.4# ulimit -c unlimited
bash-4.4# function foo {
foo
}
bash-4.4# foo
Container test terminated by signal SEGV.
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

3. Check the core filename (it doesn't show the container name, which would be useful)

Actual results:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# ls /var/lib/systemd/coredump
core.bash.0.33efdaea7a3c4cce86184cbbb6c28368.2343.1605715274000000
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
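
For reference, the fixed layout shown above can be split back into its fields with plain shell. This is only a sketch: `parse_core_name` is a hypothetical helper, not part of systemd, and it assumes the "core.<COMM>.<UID>.<bootid>.<PID>.<TS>" layout described in this report, optionally followed by a compression suffix. As noted later in this bug, the file name is not a fixed API, so this may break with other systemd versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a systemd-coredump file name into fields.
# Assumes core.<COMM>.<UID>.<bootid>.<PID>.<TS>[.zst|.xz|.lz4].
parse_core_name() {
    local name="${1##*/}"                      # drop any directory part
    case "$name" in
        *.zst|*.xz|*.lz4) name="${name%.*}" ;; # drop compression suffix
    esac
    name="${name#core.}"
    # Peel fields off the right, because COMM itself may contain dots.
    local ts="${name##*.}";   name="${name%.*}"
    local pid="${name##*.}";  name="${name%.*}"
    local boot="${name##*.}"; name="${name%.*}"
    local uid="${name##*.}";  name="${name%.*}"
    printf 'comm=%s uid=%s bootid=%s pid=%s timestamp=%s\n' \
        "$name" "$uid" "$boot" "$pid" "$ts"
}

parse_core_name core.bash.0.33efdaea7a3c4cce86184cbbb6c28368.2343.1605715274000000
# prints: comm=bash uid=0 bootid=33efdaea7a3c4cce86184cbbb6c28368 pid=2343 timestamp=1605715274000000
```

None of the recovered fields identifies the container, which is the gap this RFE is about.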

Expected results:

Some custom naming based on available properties

Comment 1 David Tardon 2020-11-19 09:28:33 UTC
AFAIK hostname is available in metadata, so what you want should already be possible with coredumpctl, like this:

# coredumpctl list _HOSTNAME=<hostname>

Comment 2 Renaud Métrich 2020-11-23 07:59:39 UTC
This doesn't work. coredumpctl always shows Hostname as the name of the host ("vm-rhel8" in my case), even though the %h passed is "test" (name of my container).

Additionally, even if this were possible, it wouldn't be convenient, since admins would have to use coredumpctl commands to find what they are interested in.

Comment 3 David Tardon 2020-11-23 09:46:27 UTC
(In reply to Renaud Métrich from comment #2)
> This doesn't work. coredumpctl always shows Hostname as the name of the host
> ("vm-rhel8" in my case), even though the %h passed is "test" (name of my
> container).

Yeah, I didn't check... The right field is COREDUMP_HOSTNAME.

> Additionally, even though this could be a possibility, it wouldn't be
> convenient since admins would have to use coredumpctl commands to find out
> what they are interested in.

IMHO that's what they should do anyway. It allows matching on all available metadata, and it is more reliable than parsing coredump filenames with ad hoc globs or regexes.
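
A minimal sketch of such a match, assuming the COREDUMP_HOSTNAME field mentioned above is present in the journal entry. `cores_for_host` is a hypothetical wrapper name; it only passes a journal field match to `coredumpctl list`, which accepts MATCH arguments.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: list coredumps recorded for a given host or
# container name by matching the COREDUMP_HOSTNAME journal field.
cores_for_host() {
    coredumpctl list "COREDUMP_HOSTNAME=$1"
}

# Example for the container from this report (run as root):
#   cores_for_host test
```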

Comment 4 Renaud Métrich 2020-11-23 09:57:48 UTC
Indeed this works.
However, this isn't "reliable" since coredumpctl relies on the journal, so:
- if the journal rotates / vacuums, you may not have the information
- if the journal is not persistent, after reboot you lose the information

Comment 5 Zbigniew Jędrzejewski-Szmek 2020-12-01 10:28:40 UTC
To summarize the discussion during the upstream meeting today:
The file name is supposed to be unique and somewhat informative, but the details are not fixed API.
It does include the machine id, so files from different containers are somewhat segregated.

Two "official" interfaces exist:
1. the journal entry which has all the metadata
2. the extended attributes on the file

$ sudo getfattr --absolute-names -d /var/lib/systemd/coredump/core.systemd.0.b07239dbd2264cc2bf9b070929ead2a7.697927.1606814763000000.zst
# file: var/lib/systemd/coredump/core.systemd.0.b07239dbd2264cc2bf9b070929ead2a7.697927.1606814763000000.zst
user.coredump.comm="systemd"
user.coredump.exe="/usr/lib/systemd/systemd"
user.coredump.gid="0"
user.coredump.hostname="rawhide"
user.coredump.pid="697927"
user.coredump.rlimit="18446744073709551615"
user.coredump.signal="11"
user.coredump.timestamp="1606814763000000"
user.coredump.uid="0"

The attributes obviously stay behind even if the journal entry is removed.
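
As a rough sketch of how the attributes could be used without the journal, the helpers below pull a single value out of `getfattr -d` output and print the hostname attribute next to each core file. `coredump_attr` and `list_cores_with_host` are hypothetical names, and the parsing assumes values are printed as plain quoted text as in the output above (getfattr may encode values that are not printable text differently).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `getfattr -d` output on stdin and print the
# value of user.coredump.<name>, e.g. "hostname" or "pid".
coredump_attr() {
    sed -n "s/^user\.coredump\.$1=\"\(.*\)\"\$/\1/p"
}

# Hypothetical helper: print "<hostname><TAB><file>" for each core file
# in a directory (default: /var/lib/systemd/coredump).
list_cores_with_host() {
    local dir="${1:-/var/lib/systemd/coredump}" f host
    for f in "$dir"/core.*; do
        [ -e "$f" ] || continue    # glob matched nothing
        host=$(getfattr --absolute-names -d "$f" 2>/dev/null | coredump_attr hostname)
        printf '%s\t%s\n' "${host:-unknown}" "$f"
    done
}
```

Running `list_cores_with_host | grep '^test'` as root would then narrow the list to dumps from the "test" container, assuming the hostname attribute was set when the dump was written.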

Comment 12 Christoph Obexer 2022-05-12 09:20:54 UTC
A list of all the required information for coredump analysis:
 * Kubernetes namespace
 * Pod name
 * Container name
 * Executable name
 * Signal
 * Timestamp
 * User
 * Group
 * Image used to run the binary
 * SELinux information
 * possibly more

Putting all of these in the filename is not going to work...

How about we instead get as much of that metadata as possible into an appropriate place like the system journal and/or the core dump file attributes or a file next to the dump?

Pros:
 * less stuff to configure
 * fewer things to coordinate across teams
 * more relevant information available
 * defined API

Cons:
 * None

Comment 13 Renaud Métrich 2022-05-12 11:10:21 UTC
Metadata is already stored as extended attributes.

