Bug 1714117
| Summary: | F30, user sessions fail with: systemd[PID]: PAM failed: Cannot allocate memory | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Francis Montagnac <francis.montagnac> | ||||||||||||
| Component: | systemd | Assignee: | systemd-maint | ||||||||||||
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||||
| Severity: | high | Docs Contact: | |||||||||||||
| Priority: | unspecified | ||||||||||||||
| Version: | 30 | CC: | jonied, lnykryn, msekleta, ssahani, s, systemd-maint, zbyszek | ||||||||||||
| Target Milestone: | --- | ||||||||||||||
| Target Release: | --- | ||||||||||||||
| Hardware: | x86_64 | ||||||||||||||
| OS: | Linux | ||||||||||||||
| Whiteboard: | |||||||||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||||||||
| Doc Text: | Story Points: | --- | |||||||||||||
| Clone Of: | Environment: | ||||||||||||||
| Last Closed: | 2019-08-05 20:59:32 UTC | Type: | Bug | ||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||
| Documentation: | --- | CRM: | |||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||
| Embargoed: | |||||||||||||||
| Attachments: |
|
||||||||||||||
Created attachment 1573780 [details]
the journal with mem=16384M
Created attachment 1573784 [details]
Top ten processes for vsz
Hi.
This is perhaps related to:
Bug 1680654 - pam_systemd should stay optional in RHEL-8 (revert 1643928)
https://bugzilla.redhat.com/show_bug.cgi?id=1680654
and I have setup authselect with pam_systemd required, as it did
initially for F29,
I will test that next Monday.
Created attachment 1576591 [details]
ltrace of pid 22591
Tested this morning: put pam_systemd.so optional in /etc/pam.d/password-auth and /etc/pam.d/system-auth The bug remains, but the login is now, as expected, possible. Attempting a "systemctl start user" shows also this bug. I managed to ltrace that, with the following debuginfo RPMs installed: systemd-debuginfo.x86_64 241-8.git9ef65cb.fc30 systemd-debugsource.x86_64 241-8.git9ef65cb.fc30 systemd-libs-debuginfo.x86_64 241-8.git9ef65cb.fc30 systemd-pam-debuginfo.x86_64 241-8.git9ef65cb.fc30 See ths ltrace (ltrace -f -tt -S -s 1000 -p 22591) attached. Any other way to track down this bug welcome. Part of the journal during this test: grep -w 22591 debug-journalctl-b-2019-06-03 Jun 03 07:56:54 kermit systemd[1]: user: Forked /usr/lib/systemd/systemd as 22591 Jun 03 07:57:28 kermit audit[22591]: USER_ACCT pid=22591 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_unix,pam_permit acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Jun 03 07:57:28 kermit systemd[22591]: pam_unix(systemd-user:session): session opened for user fm by (uid=0) Jun 03 07:57:28 kermit audit[22591]: USER_START pid=22591 uid=0 auid=1005 ses=10 msg='op=PAM:session_open grantors=pam_selinux,pam_selinux,pam_loginuid,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Jun 03 07:57:28 kermit systemd[22591]: Failed to fork: Cannot allocate memory Jun 03 07:57:28 kermit systemd[22591]: PAM failed: Cannot allocate memory Jun 03 07:57:28 kermit systemd[22591]: pam_unix(systemd-user:session): session closed for user fm Jun 03 07:57:28 kermit audit[22591]: USER_END pid=22591 uid=0 auid=1005 ses=10 msg='op=PAM:session_close grantors=pam_selinux,pam_selinux,pam_loginuid,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Jun 03 07:57:28 kermit systemd[22591]: user: Failed to set up PAM session: Cannot allocate memory Jun 03 07:57:28 kermit systemd[22591]: user: Failed at step PAM spawning /usr/lib/systemd/systemd: Cannot allocate memory Jun 03 07:57:30 kermit systemd[1]: Received SIGCHLD from PID 22591 ((systemd)). Jun 03 07:57:30 kermit systemd[1]: Child 22591 ((systemd)) died (code=exited, status=224/PAM) Jun 03 07:57:30 kermit systemd[1]: user: Child 22591 belongs to user. Created attachment 1586579 [details]
Traces on a VMware VM
Hi.
Sorry to say that, but I am surprised that this bug hasn't been taken
for a month :-(
I managed to reproduce this bug also on a VMware VM:
OK with 16 G of memory
FAIL with 64
See the file attached.
In addition, the bug doesn't show up if one kept only files for the
group entry in /etc/nsswitch.conf:
group: files
instead of:
group: files nis systemd
Finally an ltrace shows a suspect calls to mmap during the call of
initgroups.
See also the file attached.
Any advice to circumvent that greatly appreciated.
Thanks
*** This bug has been marked as a duplicate of bug 1705641 *** |
Created attachment 1573778 [details] the journal of the normal boot that shows this problem Description of problem: Unable to login (graphical, textual, ssh) except as root The journal shows: systemd[1]: Started Getty on tty2. systemd[1864]: pam_unix(systemd-user:session): session opened for user fm by (uid=0) systemd[1864]: PAM failed: Cannot allocate memory I add time, from a second textual console as root, to see the "systemd --user" process reach more than 40 GB of memory before dying Version-Release number of selected component (if applicable): systemd-241-8.git9ef65cb.fc30.x86_64 How reproducible: always Steps to Reproduce: 1. boot the machine - in multi-user.target is sufficient 2. try to login as a normal user Actual results: After une few seconds you fall back to the login prompt Expected results: Being able to login Additional info: Tries that do not solve this problem: - re-installing kernel, systemd - downgrading systemd to systemd-239-12.git8bca462.fc29 - fresh install in F30 Tries that solve this problem: - Adding mem=8192M or mem=16384M to the kernel command line Really weird. This machine has 64 G of memory - Commenting pam_systemd.so from PAM (but this is not a solution) Before upgrading (then re-installing) this machine, I successfully upgraded/installed 4 other ones (2 VMware VMs and 2 physical machines) in the same manner. I administrate a park of more than 400 machines running Fedora for years. I am thus stuck upgrading to F30. This particular machine is my main station that runs fine since past november: fresh install in F28, upgraded to F29, then to F30 were this problem showed up. A re-install did not solved this problem I put systemd and pam_systemd in debug mode. Find please as attachments: journal-FAIL-2019-05-27.txt.gz the journal of the normal boot that show this problem journal-mem-16384M-OK-2019-05-27.txt.gz the journal with mem=16384M