Bug 1942891

Summary: glibc seems to have no mechanism to avoid growing /run/utmp, resulting in login slowness and increased CPU consumption
Product: Red Hat Enterprise Linux 8
Reporter: jtougne
Component: glibc
Assignee: glibc team <glibc-bugzilla>
Status: CLOSED NOTABUG
QA Contact: qe-baseos-tools-bugs
Severity: high
Priority: high
Version: 8.2
CC: aegorenk, akarlsso, ashankar, bwelterl, codonell, dj, fweimer, jbreitwe, jorton, mnewsome, omejzlik, pfrankli, rmetrich, sipoyare
Target Milestone: rc
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2021-04-14 19:23:08 UTC
Type: Bug
Attachments: vsftpd configuration

Description jtougne 2021-03-25 10:03:51 UTC
Description of problem:

Hi,
As seen in customer case 02896756, glibc seems to have no mechanism to avoid growing /run/utmp, resulting in login slowness and increased CPU consumption.


This could be a real issue even on "normal" systems not currently experiencing it, since on a long-lived system /run/utmp would keep growing, resulting in higher CPU consumption (and login slowness) over time.


Here is a technical extract:


> So the file /run/utmp is huge and slow to process, because sshd (through the PAM stack, hence pam_unix) reads it in 384-byte entries.
> In the strace, it timed out after processing ~105000 entries.
> 
> The related PAM code is shown below (backtrace taken when PAM opens /run/utmp):
> -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
> (gdb) bt
> #0  __GI___open64_nocancel (file=0x7fedd7159630 <default_file_name> "/var/run/utmp", oflag=524288)
>     at ../sysdeps/unix/sysv/linux/open64_nocancel.c:45
> #1  0x00007fedd7103cb1 in __libc_setutent () at ../login/utmp_file.c:146
> #2  0x00007fedd710348c in __setutent () at getutent_r.c:35
> #3  0x00007fedd91a7f53 in pam_modutil_getlogin (pamh=pamh@entry=0x562d8eec7400) at pam_modutil_getlogin.c:51
> #4  0x00007fedd488535c in pam_sm_open_session (pamh=0x562d8eec7400, flags=<optimized out>, argc=<optimized out>, 
>     argv=<optimized out>) at pam_unix_sess.c:91
> #5  0x00007fedd91a17b4 in _pam_dispatch_aux (use_c



Version-Release number of selected component (if applicable):
RHEL8.2
glibc.x86_64  2.28-101.el8  

How reproducible:
Have a huge /run/utmp file (50-300MB)

Steps to Reproduce:

Actual results:
/run/utmp keeps growing; only a system reboot reduces the file size.
When the file is too big, the system consumes a lot of CPU and logins time out. The customer also reported that no sftp and/or ftp connections could be made.
The command "who" never completes or takes very long.

Expected results:
The system should not be impacted by a large /run/utmp. It should be able to recover from that on its own, for example by rotating the file when it gets too big.



Additional info:

Comment 2 Carlos O'Donell 2021-03-25 21:01:12 UTC
In glibc-2.28-101 we did significant work to improve both the performance and correctness of several of the related APIs.

The work was done as part of bug 1749439 (https://bugzilla.redhat.com/show_bug.cgi?id=1749439), which was driven by an observation similar to yours, using vsftpd (https://bugzilla.redhat.com/show_bug.cgi?id=1688848) on RHEL 7.

If you ran the system before the glibc-2.28-101 update, you might have many stale utmp entries that cause performance problems as they are iterated over.

Has the customer tried clearing the utmp entries and then ensuring that glibc-2.28-101 is in use to avoid the stale entries being generated?

I'm CC'ing the vsftpd team because last time we ran into this we were looking for a more stable reproducer.

Comment 3 Renaud Métrich 2021-03-26 06:59:14 UTC
Hello,

The customer is running glibc-2.28-101.el8.x86_64.
I have a local reproducer using vsftpd; it shows that /run/utmp grows every time an FTP session opens.
The file never shrinks, which causes performance issues over time, as the large number of entries delays everything that goes through the PAM stack and the pam_unix module: sshd, vsftpd, sudo, su, ...

Reproducer below:

1. Install vsftpd, lftp and strace

  # yum -y install vsftpd lftp strace

2. Configure vsftpd (file attached for convenience)

/etc/vsftpd/vsftpd.conf:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
idle_session_timeout=180
chroot_local_user=YES
chroot_list_enable=NO
listen=YES
listen_ipv6=NO
pam_service_name=vsftpd
session_support=YES
dual_log_enable=YES
hide_ids=NO
setproctitle_enable=NO
text_userdb_names=NO
max_per_ip=500
force_dot_files=YES
allow_writeable_chroot=YES
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

  # systemctl start vsftpd

3. Create a user for vsftpd

  # useradd ftpuser1
  # echo "redhat" | passwd --stdin ftpuser1

4. Connect to the FTP service multiple times

  # SEQ=0; while [ $SEQ -ne 4000 ]; do let SEQ++; lftp -u ftpuser1,redhat localhost -e 'ls;quit' >/dev/null; done


Result:

- /run/utmp grows indefinitely

  # utmpdump /run/utmp | wc -l
  Utmp dump of /run/utmp
  4000

- logins / PAM services get delayed over time, as setutent() parses the file completely

  # strace -fttTvyy -o sshd.strace -s 1024 -p $(cat /run/sshd.pid) &
  # ssh localhost true
  # grep -c "/run/utmp" sshd.strace 
  4007
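As a side note, a quick way to gauge the file's growth without scanning it is to derive the record count from the file size, assuming the 384-byte per-entry size seen in the strace above (this helper is hypothetical, not part of the reproducer):

```shell
# Hypothetical helper: estimate the number of utmp records from the
# file size, assuming 384-byte on-disk entries, instead of iterating
# the whole file with utmpdump.
utmp_entries() {
    f=${1:-/run/utmp}
    echo $(( $(stat -c %s "$f") / 384 ))
}
```

`utmp_entries /run/utmp` returns immediately even when `utmpdump /run/utmp | wc -l` would take minutes on a huge file.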

Comment 4 Renaud Métrich 2021-03-26 06:59:36 UTC
Created attachment 1766501 [details]
vsftpd configuration

Comment 5 jtougne 2021-03-26 07:00:20 UTC
Hi, FYI the /run/utmp is massive due to vsftpd, see BZ 1942597:

- We did a utmpdump and confirmed the above: vsftpd is the culprit (together with the parameter session_support=yes):

utmpdump /run/utmp 
[8] [92690] [    ] [        ] [vsftpd:92690] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
[8] [92725] [    ] [        ] [vsftpd:92725] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
[8] [92788] [    ] [        ] [vsftpd:92788] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
[8] [92796] [    ] [        ] [vsftpd:92796] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
[8] [92812] [    ] [        ] [vsftpd:92812] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
[8] [92844] [    ] [        ] [vsftpd:92844] [                    ] [0.0.0.0        ] [1970-01-01T00:00:00,000000+00:00]
... (thousands and thousands of similar lines; note also the incorrect timestamps)

We also did a grep -v to exclude vsftpd entries, and there are hardly any lines not related to vsftpd.


Also, we did not run that system before the glibc-2.28-101 update: we deployed the server on RHEL8.2 a few weeks ago and have not performed any update since then. It already had glibc.x86_64 2.28-101.el8.
One workaround the customer applied was to restart the server, which reduced the size of /run/utmp (and reduced the problem). At the same time, they doubled the CPU specs of the VM.
What we see after that is the issue still happening (the server slowly eating more CPU), but less quickly. We have graphs to prove it.

So it's not about stale entries at all.

On another note, the customer will propose this workaround, as root: "> /run/utmp" (to truncate the file).

Comment 6 Florian Weimer 2021-03-26 08:08:29 UTC
I think this issue is very different from the previous bugs we had in this area.

Looking at the utmpdump output, the order of the columns is:

type, PID, ID, user, line, host, address, time

So “vsftpd:92690” is the “line” column of the first row. Apparently, it contains the PID (“92690” is also the value of the second column). vsftpd uses pututxline to update the utmp file, which requires a match on the “line” column. Since the PID is part of the column value, over time a row for every possible PID in the system is added to the file. Given the simple file structure, this results in very poor performance.

I don't think this behavior has changed. I suspect it wasn't visible before because the previous system had a much lower kernel.pid_max value, essentially capping the size of the utmp file at a few tens of thousands of rows due to PID reuse. Reducing kernel.pid_max to its previous value (which could be as low as 32768, see <https://access.redhat.com/solutions/874023> for the algorithm) may be a workaround.
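If applied, that workaround could look like the following (65536 is an example value only; the appropriate pid_max depends on the workload, and lowering it only caps future growth via PID reuse, it does not shrink an existing /run/utmp):

```shell
# Check the current value (4194304 is the RHEL 8 default).
sysctl kernel.pid_max

# Lower it at runtime (example value; must stay above the number of
# processes/threads the system needs concurrently).
sysctl -w kernel.pid_max=65536

# Persist the setting across reboots.
echo 'kernel.pid_max = 65536' > /etc/sysctl.d/90-pid-max.conf
```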

Which parts of vsftpd session support does the customer need? Is it both the session logging via utmp and the PAM support? Keeping session logging in utmp would require vsftpd to assign its own small number to each concurrent session, and stop using the PID for that, I think.

Comment 8 jtougne 2021-03-26 09:19:39 UTC
(In reply to Florian Weimer from comment #6)
> I think this issue is very different from the previous bugs we had in this
> area.
> 
> Looking at the utmpdump output, the order of the columns is:
> 
> type, PID, ID, user, line, host, address, time
> 
> So “vsftpd:92690” is the ”line” column of the first row. Apparently, it
> contains the PID (“92690” is also the value of the first column). vsftpd
> uses pututxline to update utmp file, which requires a match on the “line”
> column. Since the PID is part of the column value, over time, a row for
> every possible PID in the system is added to the file. Given the simple file
> structure, this results in very poor performance.
> 
> I don't think this behavior has changed. It suspect it wasn't visible before
> because the previous system had a much lower kernel.pid_max value,
> essentially capping the size of the utmp file at a few tens of thousands of
> rows due to PID reuse. Reducing kernel.pid_max to its previous value (which
> could be as low as 32768, see <https://access.redhat.com/solutions/874023>
> for the algorithm) may be a workaround.
> 
> Which parts of vsftpd session support does the customer need? Is it both the
> session logging via utmp and the PAM support? Keeping session logging in
> utmp would require vsftpd to assign its own small number to each concurrent
> session, and stop using the PID for that, I think.


Hi, I can comment on the last question: "Which parts of vsftpd session support does the customer need?"
I checked with the customer and the answer is simple: none. They will go back to the default value of session_support=no in /etc/vsftpd/vsftpd.conf.
Someone configured that when their FTP server was running on RHEL 6. While preparing the new FTP server, the setting looked important and its description was not very clear, so we migrated it over to the RHEL 8 vsftpd configuration.


That said, coming back to this BZ about glibc and /run/utmp getting too big and causing issues: it is still a problem in general (maybe not anymore for that customer).



(You can find some more explanation in the customer support case: https://access.redhat.com/support/cases/#/case/02896756/discussion, though I think that content is off topic as an answer to your question.)

Hope that clarifies,

Comment 9 Joe Orton 2021-03-26 13:19:59 UTC
(In reply to Florian Weimer from comment #6)
> So “vsftpd:92690” is the ”line” column of the first row. Apparently, it
> contains the PID (“92690” is also the value of the first column). vsftpd
> uses pututxline to update utmp file, which requires a match on the “line”
> column. Since the PID is part of the column value, over time, a row for
> every possible PID in the system is added to the file. Given the simple file
> structure, this results in very poor performance.

Thanks for the excellent analysis, Florian.

Is there some best practice here?  Should vsftpd construct ut_line differently for remote sessions, or is there some way to recycle/reuse the DEAD_PROCESS entries which seem to be accumulating here?  Should vsftpd do this manually?

Comment 10 Florian Weimer 2021-03-26 14:19:37 UTC
(In reply to Joe Orton from comment #9)
> (In reply to Florian Weimer from comment #6)
> > So “vsftpd:92690” is the ”line” column of the first row. Apparently, it
> > contains the PID (“92690” is also the value of the first column). vsftpd
> > uses pututxline to update utmp file, which requires a match on the “line”
> > column. Since the PID is part of the column value, over time, a row for
> > every possible PID in the system is added to the file. Given the simple file
> > structure, this results in very poor performance.
> 
> Thanks for the excellent analsysis Florian.
> 
> Is there some best practice here?  Should vsftpd construct ut_line
> differently for remote sessions, or is there some way to recycle/reuse the
> DEAD_PROCESS entries which seem to be accumulating here?  Should vsftpd do
> this manually?

We have ut_id matching (and ut_id matching only) if ut_id is not empty, so perhaps vsftpd should simply stop logging anonymous sessions with an empty ut_id string. If ut_id is not empty and refers to a user, the previous entry for that user will be reused. Assuming that only authenticated users are logged, this will bound the number of records to the number of configured users.

If we changed glibc to strictly implement the POSIX requirement (only match ut_id, even if it is empty), the current vsftpd logging of anonymous sessions would be fairly useless because there would be just one entry that keeps getting overwritten again and again. So I think making vsftpd not log anonymous sessions would still make sense.

(We need to verify all of this on Red Hat Enterprise Linux 6, and make sure that we did not regress behavior. I won't be able to do that work anytime soon, sorry.)

Comment 12 Florian Weimer 2021-04-14 19:23:08 UTC
I used the following approximation to the vsftpd behavior:

/* Test mostly extracted from vsftpd.  See vsf_insert_uwtmp and
   vsf_remove_uwtmp.  */

#include <utmpx.h>
#include <err.h>
#include <unistd.h>
#include <stdio.h>

int
main (void)
{
  struct utmpx ent = { };
  ent.ut_type = USER_PROCESS;
  ent.ut_pid = getpid ();
  snprintf (ent.ut_line, sizeof (ent.ut_line),
            "vsftpd:%lu", (unsigned long int) ent.ut_pid);
  setutxent ();
  if (pututxline (&ent) == NULL)
    err (1, "pututxline (USER_PROCESS)");
  endutxent ();
  updwtmpx (WTMPX_FILE, &ent);

  setutxent ();
  ent.ut_type = DEAD_PROCESS;
  if (pututxline (&ent) == NULL)
    err (1, "pututxline (DEAD_PROCESS)");
  endutxent ();
  updwtmpx (WTMPX_FILE, &ent);
}

I ran this in a loop, using:

# time for x in {1..100000} ; do \time ./pututxline ; done

I verified that with Red Hat Enterprise Linux 6.10 (glibc-2.12-1.212.el6_10.3.x86_64, kernel-2.6.32-754.35.1.el6.x86_64), kernel.pid_max defaults to 49152 on a random test machine. 

# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
44530

On a random Red Hat Enterprise Linux 7.9 machine (glibc-2.17-324.el7_9.x86_64, kernel-3.10.0-1160.24.1.el7.x86_64), kernel.pid_max is 65536, and we get:

# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
63764

On a Red Hat Enterprise Linux 8.4 machine (glibc-2.28-151.el8.x86_64, kernel-4.18.0-304.el8.x86_64), kernel.pid_max is 4194304, and the job is still running. It reached:

# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
146488

If I set kernel.pid_max manually to 999999 on the Red Hat Enterprise Linux 6.10 test machine, the file grows effectively without bounds, too:

# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
92088

I conclude that the reason for the observed behavioral change is indeed the kernel.pid_max change.

My recommendation would be for vsftpd to disable logging of anonymous sessions (where there is no identifier that can be used as a key) even if session logging is enabled. As there is nothing to change on the glibc side, I'm closing this bug.