1871397 – glibc: Fix fgetsgent_r data corruption bug


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	glibc
Sub Component:
Version:	8.4
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	rc
Target Release:	8.0
Assignee:	DJ Delorie
QA Contact:	Sergey Kolosov
Docs Contact:	Zuzana Zoubkova
URL:
Whiteboard:
Duplicates (1):	1793577
Depends On:
Blocks:	1877115

Reported:	2020-08-23 02:55 UTC by Carlos O'Donell
Modified:	2024-10-01 16:48 UTC
CC List:	15 users
Fixed In Version:	glibc-2.28-132.el8
Doc Type:	Bug Fix
Doc Text:	.Reading configuration files with `fgetsgent()` and `fgetsgent_r()` is now more robust
	Specifically structured entries in the `/etc/gshadow` file, or changes in file sizes while reading, sometimes caused the `fgetsgent()` and `fgetsgent_r()` functions to return invalid pointers. Consequently, applications that used these functions to read `/etc/gshadow`, or other configuration files in `/etc/`, failed with a segmentation fault error. This update modifies `fgetsgent()` and `fgetsgent_r()` to make reading of configuration files more robust. As a result, applications are now able to read configuration files successfully.
Clone Of:
Environment:
Last Closed:	2021-05-18 14:36:39 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Links
System	ID	Priority	Status	Summary	Last Updated
Github	systemd systemd issues 6512	None	closed	systemd-sysusers segfaults	2021-02-15 17:03:09 UTC
Red Hat Bugzilla	1793577	unspecified	CLOSED	glibc: Parsing of /etc/gshadow can return incorrect pointers causing application segfaults	2023-07-18 14:30:35 UTC
Red Hat Bugzilla	1927040	unspecified	CLOSED	glibc: After upgrade, before reboot, systemd services using USER= do not start (caused by fix for bug 1871397)	2023-10-13 10:29:08 UTC
Sourceware	20338	P2	RESOLVED	Parsing of /etc/gshadow can return bad pointers causing segfaults in applications	2021-02-15 17:03:10 UTC

Internal Links: 1927040

Description Carlos O'Donell 2020-08-23 02:55:16 UTC

Backport the following commits to fix the fgetsgent_r data corruption bug:

299210c1fa67e2dfb564475986fce11cd33db9ad
nss_files: Consolidate file opening in __nss_files_fopen

23ed36735af09c258e542266aaed92cdd8571c6c
nss_compat: Do not use mmap to read database files (bug 26258)

e9b2340998ab22402a8e968ba674c380a625b9dc
nss_files: Consolidate line parse declarations in <nss_files.h>

9980bf0b307368959cb29f3ca3f7446ad92347f1
nss_files: Use generic result pointer in parse_line

d4b4586315974d2471486d41891aa9463a5838ad
libio: Add fseterr_unlocked for internal use

bdee910e88006ae33dc83ac3d2c0708adb6627d0
nss: Add __nss_fgetent_r

4f62a21d0ed19ff29bba704167179b862140d011
grp: Implement fgetgrent_r using __nss_fgetent_r

2add4235ef674988948155f9a8f60a8c7b09bcff
gshadow: Implement fgetsgent_r using __nss_fgetent_r (bug 20338)

ee1c062be09da006e82ab34c1c9b5c82dd2af92c
pwd: Implement fgetpwent_r using __nss_fgetent_r

00bc6830e3fe3f10495917afe0835ddd19133c6a
shadow: Implement fgetspent_r using __nss_fgetent_r

ec2f1fddf29053957d061dfe310f106388472a4f
libio: Remove __libc_readline_unlocked
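
For readers unfamiliar with the interface these commits rework, here is a minimal sketch of how an application might call `fgetsgent_r()` on a gshadow-format stream. The helper name `count_gshadow_entries` and the fixed 4096-byte buffer are assumptions made for illustration; neither comes from the glibc patches listed above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <gshadow.h>
#include <stdio.h>

/* Hypothetical helper (not part of glibc): read every entry from a
   gshadow-format stream and print its name and member count.
   Returns the number of entries read, or -1 on an unexpected error. */
int count_gshadow_entries(FILE *stream)
{
    assert(stream != NULL);

    char buffer[4096];   /* assumed large enough for this illustration */
    int count = 0;

    for (;;) {
        struct sgrp entry;
        struct sgrp *result = NULL;

        /* fgetsgent_r stores the strings and pointer arrays of the
           entry inside the caller-supplied buffer. */
        int err = fgetsgent_r(stream, &entry, buffer, sizeof buffer, &result);
        if (err != 0 || result == NULL) {
            /* ENOENT is expected at end of input; anything else
               (for example ERANGE for a too-small buffer) is
               reported to the caller as a failure. */
            return err == ENOENT ? count : -1;
        }

        int members = 0;
        for (char **m = result->sg_mem; m != NULL && *m != NULL; m++)
            members++;

        printf("%s: %d member(s)\n", result->sg_namp, members);
        count++;
    }
}
```

A caller would open a gshadow-format file with `fopen()`, pass the stream to `count_gshadow_entries()`, and close it afterwards. The pointers in `entry` are only valid until the buffer is reused, which is why the sketch consumes each entry before the next call.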

Comment 1 Carlos O'Donell 2020-08-25 13:33:05 UTC

*** Bug 1793577 has been marked as a duplicate of this bug. ***

Comment 14 ifelmail@gmail.com 2021-02-02 04:05:34 UTC

Thank you for working on this. The fix broke systemd service startup for services that use User=.
We run a CentOS 8 Stream snapshot taken on 2020-12-01. At that time, the glibc version was glibc-2.28-129.el8.x86_64. I started testing a newer version that includes this fix, glibc-2.28-145.el8.x86_64 (it also includes other fixes), and found that some services fail to start. Here is the error from the systemd log:
```
Feb 01 18:47:17 <hostname was here> systemd[1]: Starting TPM2 Access Broker and Resource Management Daemon...
░░ Subject: A start job for unit tpm2-abrmd.service has begun execution
░░ Defined-By: systemd
░░
░░ A start job for unit tpm2-abrmd.service has begun execution.
░░
░░ The job identifier is 716412.
Feb 01 18:47:17 <hostname was here> systemd[1084136]: tpm2-abrmd.service: Failed to determine user credentials: No such process
Feb 01 18:47:17 <hostname was here> systemd[1084136]: tpm2-abrmd.service: Failed at step USER spawning /usr/sbin/tpm2-abrmd: No such process
░░ Subject: Process /usr/sbin/tpm2-abrmd could not be executed
░░ Defined-By: systemd
░░
░░ The process /usr/sbin/tpm2-abrmd could not be executed and failed.
░░
░░ The error number returned by this process is ERRNO.
Feb 01 18:47:17 <hostname was here> systemd[1]: tpm2-abrmd.service: Main process exited, code=exited, status=217/USER
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░
░░ An ExecStart= process belonging to unit tpm2-abrmd.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 217.
Feb 01 18:47:17 <hostname was here> systemd[1]: tpm2-abrmd.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░
░░ The unit tpm2-abrmd.service has entered the 'failed' state with result 'exit-code'.
Feb 01 18:47:17 <hostname was here> systemd[1]: Failed to start TPM2 Access Broker and Resource Management Daemon.
░░ Subject: A start job for unit tpm2-abrmd.service has failed
░░ Defined-By: systemd
░░
░░ A start job for unit tpm2-abrmd.service has finished with a failure.
░░
░░ The job identifier is 716412 and the job result is failed.
```

Here is the service definition (note that this is not the only service that fails this way; every service with User=xxx seems to):
```
# /usr/lib/systemd/system/tpm2-abrmd.service
[Unit]
Description=TPM2 Access Broker and Resource Management Daemon

[Service]
Type=dbus
Restart=always
RestartSec=5
EnvironmentFile=-/etc/default/tpm2-abrmd
BusName=com.intel.tss2.Tabrmd
StandardOutput=syslog
ExecStart=/usr/sbin/tpm2-abrmd
User=tss
Environment=G_MESSAGES_DEBUG=all
MemoryMax=200M

[Install]
WantedBy=multi-user.target
```

I pulled the sources from https://git.centos.org/rpms/glibc/tree/c8s, first rebuilt them unmodified, installed the result on the test machine, and tried to restart the service. It failed with the same error.
Then I commented out the 11 patch lines from this bug (Patch370: glibc-rh1871397-1.patch through Patch380: glibc-rh1871397-11.patch), rebuilt, installed the packages on the test host, and tried to restart the service. It restarted without errors, so it looks like the fix breaks these services.

Would you please check into this regression? If you need a new bug to be opened, LMK, I will open one.

Comment 15 Florian Weimer 2021-02-02 07:49:37 UTC

(In reply to ifelmail from comment #14)
> Would you please check into this regression? If you need a new bug to be
> opened, LMK, I will open one.

I cannot reproduce this. Please open a new bug, include the systemd version and the contents of /etc/nsswitch.conf. Thanks.

Comment 16 ifelmail@gmail.com 2021-02-03 16:43:57 UTC

Florian, thank you for checking into this.

nsswitch.conf is pretty much harmless:
```
aliases: files
automount: files
ethers: files
group: files mymachines systemd
hosts: files dns mymachines myhostname dns
initgroups: files
netgroup: files
netmasks: files
networks: files
passwd: files mymachines systemd
protocols: files
publickey: files
rpc: files
services: files
shadow: files
```

The systemd on the machine is not the one that ships with C8s but a newer one. We use systemd version 246.1-1 with these changes:
- Backport PR #17495 to fix program leak
- Backport PR #17495 to fix BPF program lifecycle
- Backport PR #17422 to clean up cgroups more reliably after exit
- Backport PR #17497 to add FixedRandomDelay= support
- Backport PR #16838 and #16857 to improve $PATH handling
- Backport PR #16940 to fix ECONN handling in sockets
- Backport PR #17031 to fix rate limiting on units in restart loop
- Backport PR #17082 to get nspawn TTY tweaks

- Don't compile in systemd-repart (needs libfdisk >= 2.33 and C8 has 2.32)
- Remove unused systemd-journal-remote.xml and systemd-journal-gatewayd.xml files since we never used firewalld

I'm trying to repro this with the version of systemd comes with the OS now. Another thing I found, if machine is rebooted, the issue is gone (at least for that service). This happens on provisioning time.

Please hold on, I will get back, once I have more data.

Comment 17 ifelmail@gmail.com 2021-02-05 04:24:52 UTC

I provisioned a test host twice. The first time, systemd and glibc both came from the latest version of the repo, and the service failed with the error. Here is the strace output: https://gist.githubusercontent.com/ifel/3e8fa070ad70c87e07a4610051ec1cf1/raw/320fdf46e07600e665aed3f1b4440209173e8956/gistfile1.txt
Versions:
[root@<hostname> ~]# rpm -qa | grep ^systemd
systemd-pam-239-43.el8.x86_64
systemd-udev-239-43.el8.x86_64
systemd-container-239-43.el8.x86_64
systemd-libs-239-43.el8.x86_64
systemd-239-43.el8.x86_64
[root@<hostname> ~]# rpm -qa |grep ^glibc
glibc-all-langpacks-2.28-145.el8.x86_64
glibc-common-2.28-145.el8.x86_64
glibc-devel-2.28-145.el8.x86_64
glibc-headers-2.28-145.el8.x86_64
glibc-2.28-145.el8.x86_64
[root@<hostname> ~]#

The second time, I provisioned the box with systemd from the latest version of the repo and with the latest glibc rebuilt without the glibc-rh1871397-1.patch through glibc-rh1871397-11.patch patches; everything else was the same. The service started. Here is the strace output: https://gist.githubusercontent.com/ifel/248c6e820462b4aeb2a470c289b1ebc7/raw/0c3622379eb78c2f9574a25672fabc1c4d3f387b/gistfile1.txt (note that I cut parts of passwd to show only the root and tss accounts).

As you can see, when it works (without the patches), it reads /etc/passwd, and when it fails, it does not read the file. I am under the impression that it either uses a cached version or tries some other means, even though nsswitch.conf instructs it to use files.

Hope this is helpful.

Comment 18 ifelmail@gmail.com 2021-02-05 04:26:26 UTC

Versions used in the second test (when the service worked):
[root@sparefullten25865 ~]# rpm -qa | grep ^systemd
systemd-pam-239-43.el8.x86_64
systemd-udev-239-43.el8.x86_64
systemd-container-239-43.el8.x86_64
systemd-libs-239-43.el8.x86_64
systemd-239-43.el8.x86_64
[root@sparefullten25865 ~]# rpm -qa |grep ^glibc
glibc-2.28-145.fb2.el8.x86_64
glibc-headers-2.28-145.fb2.el8.x86_64
glibc-devel-2.28-145.fb2.el8.x86_64
glibc-all-langpacks-2.28-145.fb2.el8.x86_64
glibc-common-2.28-145.fb2.el8.x86_64
[root@sparefullten25865 ~]#

Comment 19 Florian Weimer 2021-02-05 08:52:08 UTC

(In reply to ifelmail from comment #16)
> I'm trying to repro this with the version of systemd comes with the OS now.
> Another thing I found, if machine is rebooted, the issue is gone (at least
> for that service). This happens on provisioning time.

This update cannot be installed with a reboot because the internal GLIBC_PRIVATE ABI changes. As a result, the nss_files module cannot be loaded, and the system behaves as if the information is not present.

I will check if we can make this experience smoother.

Comment 20 Florian Weimer 2021-02-05 09:16:39 UTC

(In reply to Florian Weimer from comment #19)
> (In reply to ifelmail from comment #16)
> > I'm trying to repro this with the version of systemd comes with the OS now.
> > Another thing I found, if machine is rebooted, the issue is gone (at least
> > for that service). This happens on provisioning time.
> 
> This update cannot be installed with a reboot because the internal
> GLIBC_PRIVATE ABI changes. As a result, the nss_files module cannot be
> loaded, and the system behaves as if the information is not present.
> 
> I will check if we can make this experience smoother.

Sorry, I meant to write “cannot be installed *without* a reboot”.
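
As a rough illustration of how one might spot a process that is still running against a replaced library, the sketch below scans text in the format of `/proc/<pid>/maps` for mappings whose backing file has been deleted. It assumes that the package update unlinks the old library file, in which case Linux appends ` (deleted)` to the path in the maps output. The helper name `count_deleted_mappings` is hypothetical, and this check is not part of glibc or of the fix discussed here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines in a /proc/<pid>/maps-style
   stream that contain `needle` and end in " (deleted)".
   Lines longer than the local buffer are read in pieces, which is
   acceptable for a sketch but not for production use. */
int count_deleted_mappings(FILE *maps, const char *needle)
{
    assert(maps != NULL && needle != NULL);

    static const char suffix[] = " (deleted)";
    const size_t suffix_len = sizeof suffix - 1;
    char line[4096];
    int count = 0;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* Strip the trailing newline, if any. */
        size_t len = strcspn(line, "\n");
        line[len] = '\0';

        if (len < suffix_len)
            continue;
        if (strcmp(line + len - suffix_len, suffix) != 0)
            continue;
        if (strstr(line, needle) != NULL)
            count++;
    }
    return count;
}
```

A caller could open `/proc/self/maps` (or, with sufficient privileges, the maps file of another process) and pass `"libc"` as the needle. A non-zero count suggests that the process predates the update and needs a restart, or the reboot mentioned above, before it can load modules built for the new library.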

Comment 21 ifelmail@gmail.com 2021-02-08 17:19:17 UTC

Thank you. Let me know if you need any help with this - like test or something.

Comment 22 Florian Weimer 2021-02-10 19:38:41 UTC

(In reply to ifelmail from comment #21)
> Thank you. Let me know if you need any help with this - like test or
> something.

Thanks for the offer. The reboot requirement is tracked in bug 1927040.

Comment 26 errata-xmlrpc 2021-05-18 14:36:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: glibc security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1585
