Bug 1292238
Summary: | extreme memory usage in libnfsidmap sss.so plug-in when resolving groups with many members
---|---
Product: | Red Hat Enterprise Linux 7
Component: | sssd
Version: | 7.2
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Hardware: | Unspecified
OS: | Linux
Target Milestone: | rc
Reporter: | James Ralston <ralston>
Assignee: | SSSD Maintainers <sssd-maint>
QA Contact: | Steeve Goveas <sgoveas>
CC: | grajaiya, jhrozek, lslebodn, mkosek, mniranja, mzidek, pbrezina, ralston, sbose
Doc Type: | Bug Fix
Fixed In Version: | sssd-1.14.0-0.1.alpha.el7
Type: | Bug
Last Closed: | 2016-11-04 07:14:07 UTC
Attachments: | call trace of rpc.idmapd when OOM killer killed it (attachment 1106513); evidence of a "domain users" group lookup malloc bomb (attachment 1106892); nfs idmap: fix infinite loop (attachment 1107202)
Description
James Ralston
2015-12-16 19:39:12 UTC
Created attachment 1106513 [details]
call trace of rpc.idmapd when OOM killer killed it
On the NFS server, one of the invocations of the OOM killer managed to capture the call trace of the rpc.idmapd process at the time it killed it.
Unfortunately, there doesn't seem to be much to go on here.
I'm open to suggestions on how to debug this further.
We could revert to using the sss plug-in for idmapd, and attach to it with gdb, but I'm not sure how to capture any useful information before the OOM killer is invoked and performs a kill -9 on it.
I'm not sure if reverting to the sss plug-in and running valgrind rpc.idmapd -f -vvv until the malloc bomb happened would produce any useful information. Thoughts?
I'm a Fedora developer, so I'd be happy to re-roll local sssd RPMs with extra debugging patches applied.
Created attachment 1106892 [details]
evidence of a "domain users" group lookup malloc bomb
I believe I've figured out the problem.
This attachment is the output of "ltrace -f rpc.idmapd -f" interleaved with full rpc.idmapd debugging from /var/log/messages.
When libnfsidmap attempts to resolve gr_gid = 1651947284 to the group name, it malloc bombs the host.
In our domain, gr_gid = 1651947284 is the "domain users" group:
$ getent group 1651947284 | cut -d: -f1
domain users
This group contains every single user in our domain:
$ getent group "domain users" | tr , '\012' | wc -l
925
I strongly suspect that the extreme memory usage is dependent on the (non-trivial) number of members of the group.
Resolving the "domain users" group doesn't malloc bomb the host every time. However, I can most easily reproduce the malloc bomb when the host has been quiescent for some time. Therefore, I suspect the extreme memory usage only occurs when the "domain users" group has expired from sssd's cache.
We appreciate the fact that sssd provides the libnfsidmap plugin. It finally allows us to use NFSv4+krb5+AD, which wasn't possible before.
But this is a SHOWSTOPPER-class problem. It is completely unreasonable for the libnfsidmap plugin to consume gigabytes of memory—however briefly—in the course of normal operations. It needs to be able to function in a reasonably small memory footprint.
(In reply to James Ralston from comment #3)
> Resolving the "domain users" group doesn't malloc bomb the host every time.
> However, I can most easily reproduce the malloc bomb when the host has been
> quiescent for some time. Therefore, I suspect the extreme memory usage only
> occurs when the "domain users" group has expired from sssd's cache.

Can you check whether adding

[sss]
memcache = False

exposes the malloc bomb every time? From your description, I assume that as long as the group has a valid entry in the memory cache, no memory leak occurs. If this is true, it would help to track down the code path where the memory usage happens.

Looks like I was wrong: the issue is in the memcache path, so with the above config option you shouldn't see the malloc bomb.

You are right, it is an infinite loop, and the attached patch fixes it for me. I'll try to prepare a test build.

Please find a test build at http://koji.fedoraproject.org/koji/taskinfo?taskID=12238159.

Created attachment 1107202 [details]
nfs idmap: fix infinite loop
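The patch itself is not inlined in this page, but the comment that follows pins the cause down precisely: a grow-the-buffer retry loop that kept passing the initial buffer size instead of the grown size. The sketch below is a minimal, self-contained illustration of that failure mode, not the actual sssd plug-in source; lookup() is a hypothetical stand-in, and the only values taken from this report are the 4096-byte initial size and the 7545-byte group entry:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_LEN 4096     /* initial buffer size, as in the plug-in */
#define ENTRY_LEN 7545   /* size of the "domain users" entry, per comment 8 */

/* Stand-in for the group lookup: fails with ERANGE unless the supplied
 * buffer can hold the whole entry.  Illustrative only, not the sssd API. */
static int lookup(char *buf, size_t buflen)
{
    if (buflen < ENTRY_LEN) {
        return ERANGE;
    }
    memset(buf, 'x', ENTRY_LEN - 1);
    buf[ENTRY_LEN - 1] = '\0';
    return 0;
}

int main(void)
{
    size_t size = BUF_LEN;
    char *buf = malloc(size);
    int ret;

    if (buf == NULL) {
        return 1;
    }

    /* BUG: the retry passes the *initial* size, so the lookup keeps
     * failing with ERANGE no matter how large the buffer grows... */
    while ((ret = lookup(buf, BUF_LEN)) == ERANGE) {
        /* ...while the allocation doubles on every pass: a malloc bomb. */
        size *= 2;
        char *tmp = realloc(buf, size);
        if (tmp == NULL) {
            free(buf);
            return 1;
        }
        buf = tmp;
        fprintf(stderr, "retrying with %zu bytes allocated\n", size);

        /* Safety valve for this demo only; the real plug-in had none,
         * which is why the OOM killer got involved. */
        if (size > (1UL << 30)) {
            fprintf(stderr, "stopping demo at 1 GiB\n");
            break;
        }
    }

    /* The fix amounts to passing the grown size instead of the initial
     * one -- while ((ret = lookup(buf, size)) == ERANGE) -- after which
     * this demo succeeds on the first retry, since 8192 bytes already
     * hold the 7545-byte entry. */
    free(buf);
    return ret;
}

With that one-argument fix applied, the loop terminates immediately; without it, the allocation doubles every iteration until the OOM killer steps in, matching the behavior described in this report.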
Good catch! Yes, that was clearly an infinite loop bug / malloc bomb: the size argument was the initial size of the buffer, not the grown size.

The malloc bomb only triggered when the initial buffer size of 4096 bytes was inadequate:

#define BUF_LEN (4096)

In a test environment without many users, the "domain users" group entry wouldn't have been large enough to exceed BUF_LEN and trigger the malloc bomb. But at larger sites with many hundreds or thousands of users, BUF_LEN would be inadequate, triggering the bug. And ~925 users is definitely enough to exceed BUF_LEN:

$ getent group 1651947284 | wc -c
7545

I've rebuilt local versions of sssd with your patch, and I've updated our test NFSv4 server/client to use them.

I'll post an update on the results of our testing (which might take longer than usual, because many people will be on vacation over the next two weeks).

Upstream ticket: https://fedorahosted.org/sssd/ticket/2909

(In reply to James Ralston from comment #8)
> Good catch! Yes, that was clearly an infinite loop bug / malloc bomb: the
> size argument was the initial size of the buffer, not the grown size.
> [...]

The patch makes sense. But I would like to ask whether you hit another issue in your testing.

With the patch, we haven't been able to reproduce the OOM problem, and we haven't noticed any other issues from using it. However, as I said in comment 8, we had very light usage of the test hosts last week (only one person logged in the entire week), so last week wasn't a good test. We should be able to test the hosts more thoroughly this week. So if we cannot reproduce the OOM problem after this week, and we don't see any other problems, then we can assume the problem is fixed.

master: 2a256e4e4b64891fe846e933589506daa68aa13e
sssd-1-13: 93e404ea1b3bda5961fb61e6d22d081adfed889a
sssd-1-12: 097edb7fa766ab50c768c049c5b1e15361785ed6

For the record, after another week of testing, we have not been able to reproduce the malloc bomb in the NFS idmap plugin, and we have not noticed any additional problems introduced by the patch.

(In reply to James Ralston from comment #13)
> For the record, after another week of testing, we have not been able to
> reproduce the malloc bomb in the NFS idmap plugin, and we have not noticed
> any additional problems introduced by the patch.

Great, thanks again for testing. FYI, I also committed that code to RHEL-6, since 6.8 will enable the NFS plugin as well -- and this bug would make the plugin unusable. There is a COPR repo with 6.8 preview builds if you're interested in testing.

This bug was accidentally moved from POST to MODIFIED via an error in automation; please see mmccune with any questions.
To reproduce this you need a group with many users. The actual number depends a bit on the mean name length of the members: since the issue is a 4096-byte buffer, with a mean name length of 5 characters you would need more than 820 members, and with a mean length of 10, more than 410 members (see the short arithmetic sketch after this comment).

Instead of a full NFS setup, it should be possible to reproduce the issue by calling

nfsidmap 12345 uid:group_with_many_members@domain

The 12345 is a keyring ID, but since the keyring operations happen after the group lookup, the exact value does not matter. Without the patch, 'nfsidmap' should not return and should consume more and more memory. With the patch, 'nfsidmap' should just return.

(In reply to Sumit Bose from comment #18)
> To reproduce this you need a group with many users.
> [...]

Sorry, the loop is triggered by the lookup by GID, so please use:

nfsidmap 12345 group:987654

where 987654 is the GID of the group with many members.
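A quick sanity check of the member-count estimate above. This is pure illustration, not sssd code; the only value taken from this report is the 4096-byte BUF_LEN, and, like the estimate itself, it ignores the one-byte comma separators and the group-line prefix:

#include <stdio.h>

#define BUF_LEN 4096   /* the plug-in's initial buffer size */

int main(void)
{
    int lens[] = { 5, 10 };   /* mean member-name lengths considered above */

    for (int i = 0; i < 2; i++) {
        /* members needed before the group entry no longer fits */
        printf("mean name length %2d -> overflows beyond ~%d members\n",
               lens[i], BUF_LEN / lens[i]);
    }
    return 0;
}

Compiled and run, this prints thresholds of roughly 819 and 409 members, matching the "more than 820" and "more than 410" figures above.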
1. Reproduce the issue with the older version
=============================================

[root@client2 ~]# rpm -qa | grep sss
sssd-client-1.13.0-40.el7_2.9.x86_64
sssd-ipa-1.13.0-40.el7_2.9.x86_64
sssd-proxy-1.13.0-40.el7_2.9.x86_64
libsss_idmap-1.13.0-40.el7_2.9.x86_64
libsss_nss_idmap-1.13.0-40.el7_2.9.x86_64
sssd-common-1.13.0-40.el7_2.9.x86_64
sssd-common-pac-1.13.0-40.el7_2.9.x86_64
sssd-ad-1.13.0-40.el7_2.9.x86_64
sssd-ldap-1.13.0-40.el7_2.9.x86_64
python-sssdconfig-1.13.0-40.el7_2.9.noarch
sssd-krb5-common-1.13.0-40.el7_2.9.x86_64
sssd-krb5-1.13.0-40.el7_2.9.x86_64
sssd-1.13.0-40.el7_2.9.x86_64

2. Configure the RHEL 7.3 client to authenticate to AD using sss, and join AD using the net ads command.

3. Create a group TestGroup1 with 1001 members:

[root@client2 ~]# getent group "TestGroup1" | tr , '\012' | wc -l
1001

4. cat /etc/sssd/sssd.conf

[sssd]
domains = centaur.test
config_file_version = 2
services = nss, pam

[domain/centaur.test]
ad_domain = centaur.test
krb5_realm = CENTAUR.TEST
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
access_provider = ad

5. cat /etc/idmapd.conf

[General]
Domain = centaur.test
Verbosity = 10

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody

[Translation]
Method = sss

[Static]

[UMICH_SCHEMA]
LDAP_server = ldap-server.local.domain.edu
LDAP_base = dc=local,dc=domain,dc=edu

6. Restart the nfs-idmapd service:

$ systemctl restart nfs-idmapd

7. Run the nfsidmap command:

[root@client2 ~]# nfsidmap 1131314 group:1993601671
[root@client2 ~]# /usr/bin/time -v nfsidmap 1131313 group:1993601671
Command exited with non-zero status 1
	Command being timed: "nfsidmap 1131313 group:1993601671"
	User time (seconds): 1.19
	System time (seconds): 1.08
	Percent of CPU this job got: 33%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.72
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 885776
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 25
	Minor (reclaiming a frame) page faults: 727361
	Voluntary context switches: 102
	Involuntary context switches: 162
	Swaps: 0
	File system inputs: 26760
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 1

The following messages are seen:

Sep 22 20:44:28 client2.example.test nfsidmap[3075]: name_lookup: keyctl_instantiate failed: Operation not permitted
Sep 22 20:48:50 client2.example.test nfsidmap[3092]: key: 0x114332 type: group value: 1993601671 timeout 600
Sep 22 20:48:50 client2.example.test nfsidmap[3092]: nfs4_gid_to_name: calling sss_nfs->gid_to_name
Sep 22 20:48:57 client2.example.test nfsidmap[3092]: sss_nfs_gid_to_name: rc=0 msg=Success
Sep 22 20:48:57 client2.example.test nfsidmap[3092]: nfs4_gid_to_name: sss_nfs->gid_to_name returned 0
Sep 22 20:48:57 client2.example.test nfsidmap[3092]: nfs4_gid_to_name: final return value is 0
Sep 22 20:48:57 client2.example.test nfsidmap[3092]: name_lookup: keyctl_instantiate failed: Operation not permitted

8. Update sssd to 1.14:

sssd-common-1.14.0-41.el7.x86_64
sssd-krb5-1.14.0-41.el7.x86_64
sssd-ipa-1.14.0-41.el7.x86_64
libsss_nss_idmap-1.13.0-40.el7_2.9.x86_64
sssd-client-1.14.0-41.el7.x86_64
sssd-krb5-common-1.14.0-41.el7.x86_64
sssd-ad-1.14.0-41.el7.x86_64
sssd-ldap-1.14.0-41.el7.x86_64
python-sssdconfig-1.14.0-41.el7.noarch
libsss_idmap-1.14.0-41.el7.x86_64
sssd-common-pac-1.14.0-41.el7.x86_64
sssd-proxy-1.14.0-41.el7.x86_64
sssd-1.14.0-41.el7.x86_64

9. Run nfsidmap again:

Command exited with non-zero status 1
	Command being timed: "nfsidmap 1131314 group:1993601671"
	User time (seconds): 0.00
	System time (seconds): 0.00
	Percent of CPU this job got: 50%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 1144
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 354
	Voluntary context switches: 1
	Involuntary context switches: 0
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 1

With the fixed packages, the maximum resident set size drops from 885776 kB (~865 MiB) to 1144 kB, and the lookup completes immediately instead of taking several seconds.

Sep 22 20:52:23 client2.example.test systemd[1]: Started Network Manager Script Dispatcher Service.
Sep 22 20:52:23 client2.example.test nm-dispatcher[3199]: Dispatching action 'dhcp4-change' for eth0
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: key: 0x114332 type: group value: 1993601671 timeout 600
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: nfs4_gid_to_name: calling sss_nfs->gid_to_name
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: found gid 1993601671 in memcache
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: sss_nfs_gid_to_name: rc=0 msg=Success
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: nfs4_gid_to_name: sss_nfs->gid_to_name returned 0
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: nfs4_gid_to_name: final return value is 0
Sep 22 20:52:26 client2.example.test nfsidmap[3212]: name_lookup: keyctl_instantiate failed: Operation not permitted

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2476.html