Description of problem:
On systems with a 64k page size, systemd-coredump fails when the coredump is smaller than 1 MiB, logging:
Failed to get COMM, falling back to the command line: No such process
Failed to get EXE, ignoring: No such process
See https://github.com/systemd/systemd/issues/17119
for details.
On Power9 systems, it's causing "BUG: soft lockup - CPU#4 stuck for 153s!" kernel errors. See BZ1896983 for details.
Version-Release number of selected component (if applicable):
RHEL-8.4.0-20210409.0 - RHEL-8.4 beta compose
kernel 4.18.0-304.el8.ppc64le
systemd-239-45.el8.ppc64le
How reproducible:
Always on systems with pagesize 64k with coredumps smaller than 1MiB. Error messages in dmesg:
Failed to get COMM, falling back to the command line: No such process
Failed to get EXE, ignoring: No such process
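The two conditions named above can be checked before attempting a reproduction. The following is a minimal sketch, not part of the original reproducer: the function name `matches_bug_conditions` is hypothetical, and "1 MiB" is assumed to mean 1048576 bytes.

```shell
# Hypothetical helper: prints "yes" when both conditions from this report hold
# (64k page size and a core smaller than 1 MiB), otherwise "no".
# Assumption: "1 MiB" means 1048576 bytes.
matches_bug_conditions() {
    pagesize=$1      # page size in bytes, e.g. from `getconf PAGESIZE`
    core_bytes=$2    # size of the core in bytes
    if [ "$pagesize" -eq 65536 ] && [ "$core_bytes" -lt 1048576 ]; then
        echo yes
    else
        echo no
    fi
}
```

For example, `matches_bug_conditions "$(getconf PAGESIZE)" 500000` reports whether a 500000-byte core on the current machine would fall into the affected range.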
Steps to Reproduce:
1) git -c http.sslVerify=false clone https://gitlab.cee.redhat.com/kernel-performance/sched/scheduler-benchmarks.git See [1] below for troubleshooting.
2) cd scheduler-benchmarks/Scripts/
3) git checkout devel
4) cd 2021-Apr-15-coredump_BZ1896983/
5) ./run.sh
Actual results:
$ ./run.sh
PAGESIZE 65536
./run.sh: line 7: 791638 Segmentation fault (core dumped) ./coredump
[26505.992198] systemd-coredump[311374]: Failed to get COMM, falling back to the command line: No such process
[26505.992225] systemd-coredump[311374]: Failed to get EXE, ignoring: No such process
[27280.783527] coredump[311839]: segfault (11) at 0 nip 100005b8 lr 7fffa7704078 code 1 in coredump[10000000+10000]
[27280.783574] coredump[311839]: code: 60000000 60420000 3c401002 38427f00 4bffff40 fbe1fff8 f821ffc1 7c3f0b78
[27280.783616] coredump[311839]: code: 39200000 f93f0020 e93f0020 39400000 <91490000> 39200000 7d234b78 383f0040
[27280.796108] systemd-coredump[311840]: Failed to get COMM, falling back to the command line: No such process
[27280.796144] systemd-coredump[311840]: Failed to get EXE, ignoring: No such process
[30708.383015] coredump[791638]: segfault (11) at 0 nip 100005b8 lr 7fff98ef4078 code 1 in coredump[10000000+10000]
[30708.383053] coredump[791638]: code: 60000000 60420000 3c401002 38427f00 4bffff40 fbe1fff8 f821ffc1 7c3f0b78
[30708.383069] coredump[791638]: code: 39200000 f93f0020 e93f0020 39400000 <91490000> 39200000 7d234b78 383f0040
Expected results:
The coredump is generated successfully, and these two errors do NOT appear in dmesg:
[27280.796108] systemd-coredump[311840]: Failed to get COMM, falling back to the command line: No such process
[27280.796144] systemd-coredump[311840]: Failed to get EXE, ignoring: No such process
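One way to check for the two messages is to count matching lines in the kernel log. This is a rough sketch under assumptions: the function name `count_coredump_errors` is hypothetical, and the pattern is derived only from the log lines quoted in this report.

```shell
# Hypothetical helper: reads log text on stdin and prints the number of
# lines containing either systemd-coredump failure quoted in this report.
# `|| true` keeps the exit status at 0 when grep finds no match.
count_coredump_errors() {
    grep -cE 'systemd-coredump\[[0-9]+\]: Failed to get (COMM|EXE)' || true
}
```

It could be run as `dmesg | count_coredump_errors` before and after `./run.sh`; on a fixed system the count should not increase. Entries from earlier runs may still be present in the log, so compare counts rather than expecting zero.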
Additional info:
The issue is fixed upstream. See
https://github.com/systemd/systemd/issues/17119
[1] When git clone fails with
`fatal: unable to access 'https://gitlab.cee.redhat.com/kernel-performance/pbench.git/': SSL certificate problem: self signed certificate in certificate chain`
you can fix the problem by installing the Red Hat certificate:
```
curl --insecure --output /etc/pki/ca-trust/source/anchors/RH-IT-Root-CA.crt https://password.corp.redhat.com/RH-IT-Root-CA.crt
update-ca-trust extract
```
(In reply to Jiri Hladky from comment #3)
> we are hitting this issue on Power10 LPARs in IBM YZ. What is the plan to
> fix this in RHEL?
It is going to be fixed in 8.5.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (systemd bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4469