Bug 1942891
| Summary: | glibc seems to have no mechanism to avoid growing /run/utmp, resulting in login slowness and increased CPU consumption | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | jtougne |
| Component: | glibc | Assignee: | glibc team <glibc-bugzilla> |
| Status: | CLOSED NOTABUG | QA Contact: | qe-baseos-tools-bugs |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 8.2 | CC: | aegorenk, akarlsso, ashankar, bwelterl, codonell, dj, fweimer, jbreitwe, jorton, mnewsome, omejzlik, pfrankli, rmetrich, sipoyare |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-04-14 19:23:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | vsftpd configuration (attachment 1766501) | | |
In glibc-2.28-101 we did significant work to improve both the performance and correctness of several of the related APIs. The work was done as part of bug 1749439 (https://bugzilla.redhat.com/show_bug.cgi?id=1749439), which was driven by an observation similar to what you see using vsftpd (https://bugzilla.redhat.com/show_bug.cgi?id=1688848) in RHEL 7. If you ran a system before the glibc-2.28-101 update, then you might have lots of stale utmp entries that could cause performance problems as they are iterated. Has the customer tried clearing the utmp entries and then ensuring that glibc-2.28-101 is in use, to avoid the stale entries being generated?

I'm CC'ing the vsftpd team because last time we ran into this we were looking for a more stable reproducer.

Hello,

The customer is running glibc-2.28-101.el8.x86_64.

I have a local reproducer using vsftpd; it shows that /run/utmp increases in size every time an FTP session opens. The file never shrinks, which causes performance issues over time, as the large number of entries delays everything that goes through the PAM stack and the pam_unix module: sshd, vsftpd, sudo, su, ...

Reproducer below:

1. Install vsftpd, lftp and strace

# yum -y install vsftpd lftp strace

2. Configure vsftpd (file attached for convenience)

/etc/vsftpd/vsftpd.conf:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
idle_session_timeout=180
chroot_local_user=YES
chroot_list_enable=NO
listen=YES
listen_ipv6=NO
pam_service_name=vsftpd
session_support=YES
dual_log_enable=YES
hide_ids=NO
setproctitle_enable=NO
text_userdb_names=NO
max_per_ip=500
force_dot_files=YES
allow_writeable_chroot=YES
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

# systemctl start vsftpd

3. Create a user for vsftpd

# useradd ftpuser1
# echo "redhat" | passwd --stdin ftpuser1

4. Connect to the FTP service multiple times

# SEQ=0; while [ $SEQ -ne 4000 ]; do let SEQ++; lftp -u ftpuser1,redhat localhost -e 'ls;quit' >/dev/null; done

Result:

- /run/utmp grows indefinitely

# utmpdump /run/utmp | wc -l
Utmp dump of /run/utmp
4000

- logins / PAM services will be delayed over time, as setutent() parses the file completely

# strace -fttTvyy -o sshd.strace -s 1024 -p $(cat /run/sshd.pid) &
# ssh localhost true
# grep -c "/run/utmp" sshd.strace
4007

Created attachment 1766501 [details]
vsftpd configuration
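To make the accumulation visible without utmpdump, a small diagnostic sketch (a hypothetical helper, not part of the reproducer) can walk the active utmp database with getutxent() and count records by type:

/* Hypothetical diagnostic: count the records in the active utmp
   database by type, to show USER_PROCESS/DEAD_PROCESS accumulation.  */
#include <stdio.h>
#include <utmpx.h>

int
main (void)
{
  long user = 0, dead = 0, other = 0;
  struct utmpx *ut;

  setutxent ();                 /* rewind to the start of the database */
  while ((ut = getutxent ()) != NULL)
    {
      if (ut->ut_type == USER_PROCESS)
        ++user;
      else if (ut->ut_type == DEAD_PROCESS)
        ++dead;
      else
        ++other;
    }
  endutxent ();

  printf ("USER_PROCESS: %ld  DEAD_PROCESS: %ld  other: %ld\n",
          user, dead, other);
  return 0;
}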
Hi,

FYI the /run/utmp is massive due to vsftpd, see BZ 1942597:

We did a utmpdump and confirmed the above; vsftpd is the culprit (together with the parameter session_support=YES):

utmpdump /run/utmp
[8] [92690] [ ] [ ] [vsftpd:92690] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
[8] [92725] [ ] [ ] [vsftpd:92725] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
[8] [92788] [ ] [ ] [vsftpd:92788] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
[8] [92796] [ ] [ ] [vsftpd:92796] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
[8] [92812] [ ] [ ] [vsftpd:92812] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
[8] [92844] [ ] [ ] [vsftpd:92844] [ ] [0.0.0.0 ] [1970-01-01T00:00:00,000000+00:00]
... (thousands and thousands of similar lines; note also the incorrect timestamp)

We also did a grep -v to exclude vsftpd, and there are hardly any lines not related to it.

Also, we did not run that system before the glibc-2.28-101 update: we deployed the server on RHEL 8.2 a few weeks ago and have not performed any update since then. It already had glibc.x86_64 2.28-101.el8.

One of the workarounds the customer applied was to restart the server, which reduced the size of /run/utmp (and reduced the problem). At the same time they doubled the CPU specs of that VM. What we see after that is the issue still happening (the server slowly eating more CPU), but more slowly. We have graphs to prove that. So it is not about stale entries.

On another note, the customer will propose this, as root: "> /run/utmp" (to reduce the size).

I think this issue is very different from the previous bugs we had in this area.

Looking at the utmpdump output, the order of the columns is:

type, PID, ID, user, line, host, address, time

So “vsftpd:92690” is the “line” column of the first row. Apparently, it contains the PID (“92690” is also the value of the first column). vsftpd uses pututxline to update the utmp file, which requires a match on the “line” column. Since the PID is part of the column value, over time a row for every possible PID in the system is added to the file. Given the simple file structure, this results in very poor performance.

I don't think this behavior has changed. I suspect it wasn't visible before because the previous system had a much lower kernel.pid_max value, essentially capping the size of the utmp file at a few tens of thousands of rows due to PID reuse. Reducing kernel.pid_max to its previous value (which could be as low as 32768, see <https://access.redhat.com/solutions/874023> for the algorithm) may be a workaround.

Which parts of vsftpd session support does the customer need? Is it both the session logging via utmp and the PAM support? Keeping session logging in utmp would require vsftpd to assign its own small number to each concurrent session, and stop using the PID for that, I think.
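Illustrating that last point, a minimal sketch of a hypothetical vsftpd variant (the slot parameter and function name are invented for illustration) that keys the record on a small, reused session number instead of the PID, so pututxline overwrites the matching row instead of appending a new one:

/* Hypothetical variant of the vsftpd utmp logging: ut_line carries a
   small per-session slot assigned by the daemon rather than the PID,
   so the row count is bounded by the number of concurrent sessions.  */
#include <stdio.h>
#include <unistd.h>
#include <utmpx.h>

void
log_ftp_session (int slot, const char *user)
{
  struct utmpx ent = { };
  ent.ut_type = USER_PROCESS;
  ent.ut_pid = getpid ();
  /* Reused key: slot numbers are recycled when sessions end.  */
  snprintf (ent.ut_line, sizeof (ent.ut_line), "vsftpd:%d", slot);
  snprintf (ent.ut_user, sizeof (ent.ut_user), "%s", user);
  setutxent ();
  pututxline (&ent);            /* reuses the row whose ut_line matches */
  endutxent ();
}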
(In reply to Florian Weimer from comment #6)
> I think this issue is very different from the previous bugs we had in this
> area.
>
> Looking at the utmpdump output, the order of the columns is:
>
> type, PID, ID, user, line, host, address, time
>
> So “vsftpd:92690” is the “line” column of the first row. Apparently, it
> contains the PID (“92690” is also the value of the first column). vsftpd
> uses pututxline to update the utmp file, which requires a match on the
> “line” column. Since the PID is part of the column value, over time a row
> for every possible PID in the system is added to the file. Given the simple
> file structure, this results in very poor performance.
>
> I don't think this behavior has changed. I suspect it wasn't visible before
> because the previous system had a much lower kernel.pid_max value,
> essentially capping the size of the utmp file at a few tens of thousands of
> rows due to PID reuse. Reducing kernel.pid_max to its previous value (which
> could be as low as 32768, see <https://access.redhat.com/solutions/874023>
> for the algorithm) may be a workaround.
>
> Which parts of vsftpd session support does the customer need? Is it both the
> session logging via utmp and the PAM support? Keeping session logging in
> utmp would require vsftpd to assign its own small number to each concurrent
> session, and stop using the PID for that, I think.

Hi,

I can comment on the last question: "Which parts of vsftpd session support does the customer need?"

I verified with the customer and the answer is simple: none. They will go back to the default value of session_support=NO in /etc/vsftpd/vsftpd.conf. Someone configured that setting when their FTP server was running on RHEL 6. While preparing the new FTP server on RHEL 8, it looked like an important setting, its description was not very clear, and we migrated it over to the RHEL 8 vsftpd configuration.

That said, coming back to this BZ about glibc and /run/utmp getting too big and causing issues: it is still a problem in general (maybe not anymore for that customer).

(You can find some more explanation in the customer support case: https://access.redhat.com/support/cases/#/case/02896756/discussion; I think that content is off topic as an answer to your question.)

Hope that clarifies,

(In reply to Florian Weimer from comment #6)
> So “vsftpd:92690” is the “line” column of the first row. Apparently, it
> contains the PID (“92690” is also the value of the first column). vsftpd
> uses pututxline to update the utmp file, which requires a match on the
> “line” column. Since the PID is part of the column value, over time a row
> for every possible PID in the system is added to the file. Given the simple
> file structure, this results in very poor performance.

Thanks for the excellent analysis, Florian.

Is there some best practice here? Should vsftpd construct ut_line differently for remote sessions, or is there some way to recycle/reuse the DEAD_PROCESS entries which seem to be accumulating here? Should vsftpd do this manually?

(In reply to Joe Orton from comment #9)
> Is there some best practice here? Should vsftpd construct ut_line
> differently for remote sessions, or is there some way to recycle/reuse the
> DEAD_PROCESS entries which seem to be accumulating here? Should vsftpd do
> this manually?

We have ut_id matching (and ut_id matching only) if ut_id is not empty, so perhaps vsftpd should simply stop logging anonymous sessions with an empty ut_id string. If ut_id is not empty and refers to a user, the previous entry for that user will be reused. Assuming that only authenticated users are logged, this bounds the number of records to the number of configured users.
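To make that matching rule concrete, here is a simplified model (not the actual glibc implementation, which lives in login/utmp_file.c and handles more cases) of the predicate that decides whether pututxline overwrites an existing row rather than appending one:

/* Simplified model of the record matching described above.  */
#include <string.h>
#include <utmpx.h>

static int
record_matches (const struct utmpx *old, const struct utmpx *new)
{
  if (new->ut_id[0] != '\0')
    /* Non-empty ut_id: match on ut_id only; one row per identifier.  */
    return memcmp (old->ut_id, new->ut_id, sizeof (new->ut_id)) == 0;
  /* Empty ut_id (the vsftpd case): fall back to comparing ut_line.
     A ut_line embedding the PID never matches an older record, so
     every new PID appends another row.  */
  return strncmp (old->ut_line, new->ut_line, sizeof (new->ut_line)) == 0;
}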
If we changed glibc to strictly implement the POSIX requirement (only match ut_id, even if it is empty), the current vsftpd logging of anonymous sessions would be fairly useless because there would be just one entry that keeps getting overwritten again and again. So I think making vsftpd not log anonymous sessions would still make sense.

(We need to verify all of this on Red Hat Enterprise Linux 6, and make sure that we did not regress behavior. I won't be able to do that work anytime soon, sorry.)

I used the following approximation of the vsftpd behavior:
/* Test mostly extracted from vsftpd.  See vsf_insert_uwtmp and
   vsf_remove_uwtmp.  */
#include <utmpx.h>
#include <err.h>
#include <unistd.h>
#include <stdio.h>

int
main (void)
{
  struct utmpx ent = { };
  ent.ut_type = USER_PROCESS;
  ent.ut_pid = getpid ();
  snprintf (ent.ut_line, sizeof (ent.ut_line),
            "vsftpd:%lu", (unsigned long int) ent.ut_pid);
  setutxent ();
  if (pututxline (&ent) == NULL)
    err (1, "pututxline (USER_PROCESS)");
  endutxent ();
  updwtmpx (WTMPX_FILE, &ent);
  setutxent ();
  ent.ut_type = DEAD_PROCESS;
  if (pututxline (&ent) == NULL)
    err (1, "pututxline (DEAD_PROCESS)");
  endutxent ();
  updwtmpx (WTMPX_FILE, &ent);
}
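Assuming the test program is saved as pututxline.c, it can be built with:

# gcc -o pututxline pututxline.c

It has to run with permission to write the utmp and wtmp files, i.e. as root in this test.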
I ran this in a loop, using:
# time for x in {1..100000} ; do \time ./pututxline ; done
I verified this on a random Red Hat Enterprise Linux 6.10 test machine (glibc-2.12-1.212.el6_10.3.x86_64, kernel-2.6.32-754.35.1.el6.x86_64), where kernel.pid_max defaults to 49152:
# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
44530
On a random Red Hat Enterprise Linux 7.9 machine (glibc-2.17-324.el7_9.x86_64, kernel-3.10.0-1160.24.1.el7.x86_64), kernel.pid_max is 65536, and we get:
# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
63764
On a Red Hat Enterprise Linux 8.4 machine (glibc-2.28-151.el8.x86_64, kernel-4.18.0-304.el8.x86_64), kernel.pid_max is 4194304, and the job is still running. It reached:
# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
146488
If I set kernel.pid_max manually to 999999 on the Red Hat Enterprise Linux 6.10 test machine, the file grows effectively without bounds, too:
# utmpdump /var/run/utmp | wc -l
Utmp dump of /var/run/utmp
92088
I conclude that the reason for the observed behavioral change is indeed the kernel.pid_max change.
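For reference, kernel.pid_max can be inspected and the workaround suggested above applied with sysctl; the target value 65536 is illustrative only (see the linked solution article for the actual sizing algorithm):

# sysctl kernel.pid_max
kernel.pid_max = 4194304
# sysctl -w kernel.pid_max=65536

To make the change persistent across reboots, the setting would go into a drop-in file under /etc/sysctl.d/.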
My recommendation would be for vsftpd to disable logging of anonymous sessions (where there is no identifier that can be used as a key) even if session logging is enabled. As there is nothing to change on the glibc side, I'm closing this bug.
Description of problem:

Hi,

As seen in customer case 02896756, glibc seems to have no mechanism to avoid growing /run/utmp, resulting in login slowness and increased CPU consumption.

This could be a real issue even on "normal" systems not experiencing this yet, since a long-living system would see /run/utmp grow, which would result in higher CPU consumption (and login slowness) over time.

Here is a technical extract:

> So the file /run/utmp is huge and slow to process because sshd (through calling the PAM stack, hence pam_unix) processes 384-byte entries.
> In the strace, it stopped in timeout after processing ~105000 entries.
>
> The related code in PAM is shown below (backtrace created when PAM opens /run/utmp):
> -------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
> (gdb) bt
> #0  __GI___open64_nocancel (file=0x7fedd7159630 <default_file_name> "/var/run/utmp", oflag=524288)
>     at ../sysdeps/unix/sysv/linux/open64_nocancel.c:45
> #1  0x00007fedd7103cb1 in __libc_setutent () at ../login/utmp_file.c:146
> #2  0x00007fedd710348c in __setutent () at getutent_r.c:35
> #3  0x00007fedd91a7f53 in pam_modutil_getlogin (pamh=pamh@entry=0x562d8eec7400) at pam_modutil_getlogin.c:51
> #4  0x00007fedd488535c in pam_sm_open_session (pamh=0x562d8eec7400, flags=<optimized out>, argc=<optimized out>,
>     argv=<optimized out>) at pam_unix_sess.c:91
> #5  0x00007fedd91a17b4 in _pam_dispatch_aux (use_c

Version-Release number of selected component (if applicable):
RHEL8.2
glibc.x86_64 2.28-101.el8

How reproducible:
Have a huge /run/utmp file (50-300 MB)

Steps to Reproduce:
1.
2.
3.

Actual results:
/run/utmp keeps growing. Only a system reboot reduces the file size. When it is too big, the system consumes a lot of CPU and operations time out. The customer also said that no sftp and/or ftp connections could be made. The command "who" would never complete, or would take very long.

Expected results:
The system should not be impacted by a large /run/utmp. It should be able to recover from that on its own, maybe by rotating the file if it gets too big.

Additional info:
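The backtrace above ends in setutent(), so every session opened through pam_unix pays for a scan of the whole file. A rough sketch of that scan (an approximation for illustration, not the actual pam_modutil_getlogin code; the terminal-name handling is an assumption):

/* Approximation of the utmp scan behind pam_modutil_getlogin: walk
   every record looking for the USER_PROCESS entry whose ut_line
   matches the current terminal.  Cost is linear in the file size.  */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <utmpx.h>

int
main (void)
{
  const char *tty = ttyname (STDIN_FILENO);
  if (tty == NULL)
    return 1;
  if (strncmp (tty, "/dev/", 5) == 0)
    tty += 5;                   /* utmp stores lines without the /dev/ prefix */

  struct utmpx *ut;
  setutxent ();                 /* opens and rewinds /run/utmp */
  while ((ut = getutxent ()) != NULL)
    if (ut->ut_type == USER_PROCESS
        && strncmp (ut->ut_line, tty, sizeof (ut->ut_line)) == 0)
      {
        printf ("login: %.*s\n", (int) sizeof (ut->ut_user), ut->ut_user);
        break;
      }
  endutxent ();
  return 0;
}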