Bug 1714117 - F30, user sessions fail with: systemd[PID]: PAM failed: Cannot allocate memory
Summary: F30, user sessions fail with: systemd[PID]: PAM failed: Cannot allocate memory
Keywords:
Status: CLOSED DUPLICATE of bug 1705641
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 30
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: systemd-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-05-27 07:13 UTC by Francis Montagnac
Modified: 2019-11-04 15:07 UTC (History)
7 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2019-08-05 20:59:32 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
the journal of the normal boot that shows this problem (180.02 KB, application/gzip)
2019-05-27 07:13 UTC, Francis Montagnac
no flags Details
the journal with mem=16384M (209.27 KB, application/gzip)
2019-05-27 07:17 UTC, Francis Montagnac
no flags Details
Top ten processes for vsz (809 bytes, text/plain)
2019-05-27 07:33 UTC, Francis Montagnac
no flags Details
ltrace of pid 22591 (74.45 KB, text/plain)
2019-06-03 10:46 UTC, Francis Montagnac
no flags Details
Traces on a VMware VM (6.62 KB, text/plain)
2019-07-02 08:47 UTC, Francis Montagnac
no flags Details

Description Francis Montagnac 2019-05-27 07:13:53 UTC
Created attachment 1573778 [details]
the journal of the normal boot that shows this problem

Description of problem:

  Unable to login (graphical, textual, ssh) except as root

  The journal shows:

      systemd[1]: Started Getty on tty2.
      systemd[1864]: pam_unix(systemd-user:session): session opened for user fm by (uid=0)
      systemd[1864]: PAM failed: Cannot allocate memory

  I add time, from a second textual console as root, to see the
  "systemd --user" process reach more than 40 GB of memory before
  dying

Version-Release number of selected component (if applicable):

    systemd-241-8.git9ef65cb.fc30.x86_64

How reproducible: always

Steps to Reproduce:
  1. boot the machine - in multi-user.target is sufficient
  2. try to login as a normal user

Actual results:

  After une few seconds you fall back to the login prompt

Expected results:

  Being able to login

Additional info:

  Tries that do not solve this problem:

   - re-installing kernel, systemd
   - downgrading systemd to systemd-239-12.git8bca462.fc29
   - fresh install in F30

  Tries that solve this problem:

    - Adding mem=8192M or mem=16384M to the kernel command line
      Really weird.
      This machine has 64 G of memory
    - Commenting pam_systemd.so from PAM (but this is not a solution)

  Before upgrading (then re-installing) this machine, I successfully
  upgraded/installed 4 other ones (2 VMware VMs and 2 physical
  machines) in the same manner.

  I administrate a park of more than 400 machines running Fedora for
  years. I am thus stuck upgrading to F30.

  This particular machine is my main station that runs fine since past
  november: fresh install in F28, upgraded to F29, then to F30 were
  this problem showed up. A re-install did not solved this problem

  I put systemd and pam_systemd in debug mode. Find please as
  attachments:

    journal-FAIL-2019-05-27.txt.gz
       the journal of the normal boot that show this problem 

    journal-mem-16384M-OK-2019-05-27.txt.gz
       the journal with mem=16384M

Comment 1 Francis Montagnac 2019-05-27 07:17:28 UTC
Created attachment 1573780 [details]
the journal with mem=16384M

Comment 2 Francis Montagnac 2019-05-27 07:33:30 UTC
Created attachment 1573784 [details]
Top ten processes for vsz

Comment 3 Francis Montagnac 2019-06-01 06:44:42 UTC
Hi.

This is perhaps related to:

    Bug 1680654 - pam_systemd should stay optional in RHEL-8 (revert 1643928) 
    https://bugzilla.redhat.com/show_bug.cgi?id=1680654

and I have setup authselect with pam_systemd required, as it did
initially for F29,

I will test that next Monday.

Comment 4 Francis Montagnac 2019-06-03 10:46:54 UTC
Created attachment 1576591 [details]
ltrace of pid 22591

Comment 5 Francis Montagnac 2019-06-03 10:48:16 UTC
Tested this morning: put pam_systemd.so optional in
/etc/pam.d/password-auth and /etc/pam.d/system-auth

The bug remains, but the login is now, as expected, possible.

Attempting a "systemctl start user" shows also this bug.

I managed to ltrace that, with the following debuginfo RPMs installed:

systemd-debuginfo.x86_64       241-8.git9ef65cb.fc30
systemd-debugsource.x86_64     241-8.git9ef65cb.fc30
systemd-libs-debuginfo.x86_64  241-8.git9ef65cb.fc30
systemd-pam-debuginfo.x86_64   241-8.git9ef65cb.fc30

See ths ltrace (ltrace -f -tt -S -s 1000 -p 22591) attached.

Any other way to track down this bug welcome.

Part of the journal during this test:

grep -w 22591 debug-journalctl-b-2019-06-03
Jun 03 07:56:54 kermit systemd[1]: user: Forked /usr/lib/systemd/systemd as 22591
Jun 03 07:57:28 kermit audit[22591]: USER_ACCT pid=22591 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_unix,pam_permit acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 03 07:57:28 kermit systemd[22591]: pam_unix(systemd-user:session): session opened for user fm by (uid=0)
Jun 03 07:57:28 kermit audit[22591]: USER_START pid=22591 uid=0 auid=1005 ses=10 msg='op=PAM:session_open grantors=pam_selinux,pam_selinux,pam_loginuid,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 03 07:57:28 kermit systemd[22591]: Failed to fork: Cannot allocate memory
Jun 03 07:57:28 kermit systemd[22591]: PAM failed: Cannot allocate memory
Jun 03 07:57:28 kermit systemd[22591]: pam_unix(systemd-user:session): session closed for user fm
Jun 03 07:57:28 kermit audit[22591]: USER_END pid=22591 uid=0 auid=1005 ses=10 msg='op=PAM:session_close grantors=pam_selinux,pam_selinux,pam_loginuid,pam_keyinit,pam_limits,pam_systemd,pam_unix acct="fm" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jun 03 07:57:28 kermit systemd[22591]: user: Failed to set up PAM session: Cannot allocate memory
Jun 03 07:57:28 kermit systemd[22591]: user: Failed at step PAM spawning /usr/lib/systemd/systemd: Cannot allocate memory
Jun 03 07:57:30 kermit systemd[1]: Received SIGCHLD from PID 22591 ((systemd)).
Jun 03 07:57:30 kermit systemd[1]: Child 22591 ((systemd)) died (code=exited, status=224/PAM)
Jun 03 07:57:30 kermit systemd[1]: user: Child 22591 belongs to user.

Comment 6 Francis Montagnac 2019-07-02 08:47:48 UTC
Created attachment 1586579 [details]
Traces on a VMware VM

Hi.

Sorry to say that, but I am surprised that this bug hasn't been taken
for a month :-(

I managed to reproduce this bug also on a VMware VM:

   OK with 16 G of memory
   FAIL with 64

See the file attached.

In addition, the bug doesn't show up if one kept only files for the
group entry in /etc/nsswitch.conf:

  group:      files

instead of:

  group:      files nis systemd

Finally an ltrace shows a suspect calls to mmap during the call of
initgroups.

See also the file attached.

Any advice to circumvent that greatly appreciated.

Thanks

Comment 7 Zbigniew Jędrzejewski-Szmek 2019-08-05 20:59:32 UTC

*** This bug has been marked as a duplicate of bug 1705641 ***


Note You need to log in before you can comment on or make changes to this bug.