
Bug 1068907

Summary: BUG: a callback did not free its request. May leak memory
Product: Red Hat Enterprise Linux 6
Reporter: Brian J. Murrell <brian.murrell>
Component: sssd
Assignee: Jakub Hrozek <jhrozek>
Status: CLOSED DUPLICATE
QA Contact: Kaushik Banerjee <kbanerje>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 6.5
CC: brian.murrell, grajaiya, jgalipea, lslebodn, mkosek, pbrezina, preichl
Target Milestone: rc
Keywords: Reopened
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-05-29 09:21:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description    Flags
ssh log        none
domain log     none

Description Brian J. Murrell 2014-02-23 04:09:08 UTC
Description of problem:
When trying to ssh to another system in an Identity Management (IdM) managed cluster, ssh blocks for a long time while polling on /var/lib/sss/pipes/ssh.

Version-Release number of selected component (if applicable):
sssd-client-1.9.2-129.el6_5.4.x86_64
sssd-1.9.2-129.el6_5.4.x86_64

How reproducible:
100%

Steps to Reproduce:
1. install idm
2. add an idm client
3. ssh to another node inside the domain

Actual results:
The connection hangs for a few tens of seconds while polling on /var/lib/sss/pipes/ssh, but it does complete eventually.

Expected results:
Connection should complete quickly.

Additional info:
When this happens the following is logged to /var/log/sssd/sssd_ssh.log:

(Sat Feb 22 19:58:17 2014) [sssd[ssh]] [sss_dp_req_destructor] (0x0010): BUG: a callback did not free its request. May leak memory
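
For anyone who wants to check how often this message appears in a log, the line above can be split into its fields with a short script. This is only a sketch: the pattern is inferred from the single sample line quoted here, so other SSSD versions may format their log lines differently, and the names parse_log_line and count_destructor_warnings are made up for this example.

```python
import re

# Pattern inferred from the one sample line in this report (an assumption):
# (timestamp) [process] [function] (0xLEVEL): message
LOG_LINE = re.compile(
    r"^\((?P<timestamp>[^)]+)\) "
    r"\[(?P<process>.+?)\] "
    r"\[(?P<function>[^\]]+)\] "
    r"\((?P<level>0x[0-9a-fA-F]+)\): "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # The level is printed as a hexadecimal bit mask; keep it as an integer.
    fields["level"] = int(fields["level"], 16)
    return fields


def count_destructor_warnings(lines):
    """Count lines logged by sss_dp_req_destructor (hypothetical helper)."""
    count = 0
    for line in lines:
        fields = parse_log_line(line)
        if fields is not None and fields["function"] == "sss_dp_req_destructor":
            count += 1
    return count
```

Feeding it the lines of /var/log/sssd/sssd_ssh.log (for example via open(...).readlines()) would give a rough count of how many requests hit this code path.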

This is the strace just before the poll that has to time out before ssh continues:

[pid  8175] connect(3, {sa_family=AF_FILE, path="/var/lib/sss/pipes/ssh"}, 110) = 0
[pid  8175] fstat(3, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
[pid  8175] poll([{fd=3, events=POLLOUT}], 1, 300000) = 1 ([{fd=3, revents=POLLOUT}])
[pid  8175] sendto(3, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16
[pid  8175] poll([{fd=3, events=POLLOUT}], 1, 300000) = 1 ([{fd=3, revents=POLLOUT}])
[pid  8175] sendto(3, "\0\0\0\0", 4, MSG_NOSIGNAL, NULL, 0) = 4
[pid  8175] poll([{fd=3, events=POLLIN}], 1, 300000) = 1 ([{fd=3, revents=POLLIN}])
[pid  8175] read(3, "\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0", 16) = 16
[pid  8175] poll([{fd=3, events=POLLIN}], 1, 300000) = 1 ([{fd=3, revents=POLLIN}])
[pid  8175] read(3, "\0\0\0\0", 4)      = 4
[pid  8175] poll([{fd=3, events=POLLOUT}], 1, 300000) = 1 ([{fd=3, revents=POLLOUT}])
[pid  8175] sendto(3, ">\0\0\0\342\0\0\0\0\0\0\0\0\0\0\0", 16, MSG_NOSIGNAL, NULL, 0) = 16
[pid  8175] poll([{fd=3, events=POLLOUT}], 1, 300000) = 1 ([{fd=3, revents=POLLOUT}])
[pid  8175] sendto(3, "\1\0\0\0\33\0\0\0mgmt-1.example.c"..., 46, MSG_NOSIGNAL, NULL, 0) = 46
[pid  8175] poll([{fd=3, events=POLLIN}], 1, 300000
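
To make the trace easier to read, the 16-byte messages can be decoded with a few lines of Python. This is a sketch under an assumption: it reads each 16-byte message as four little-endian 32-bit integers. The trace supports that reading, since the first value then equals 16 plus the size of the payload that follows (4 and 46 bytes respectively), but the field labels in the comments are guesses rather than names taken from the SSSD sources.

```python
import struct

# Assumption: the 16-byte header is four little-endian unsigned 32-bit integers.
HEADER = struct.Struct("<4I")


def decode_header(raw):
    """Unpack a 16-byte header into a tuple of four integers."""
    if len(raw) != HEADER.size:
        raise ValueError("expected %d bytes, got %d" % (HEADER.size, len(raw)))
    return HEADER.unpack(raw)


# Byte strings copied from the sendto() calls in the strace output above.
first_request = b"\24\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"
second_request = b">\0\0\0\342\0\0\0\0\0\0\0\0\0\0\0"

# Hypothetical labels: (total length, command, status, reserved)
print(decode_header(first_request))   # (20, 1, 0, 0)
print(decode_header(second_request))  # (62, 226, 0, 0)
```

Read this way, the client sends a small request (value 1 in the second field) and gets a reply, then sends a second request (value 226, i.e. 0xe2) carrying the host name, and it is the reply to that second request that the final poll() is waiting for.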

Comment 2 Jakub Hrozek 2014-02-24 09:40:00 UTC
Can you provide the whole sssd_ssh log as well as the domain log with a high debug_level (7 or higher)?
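
For reference, debug_level is set per section in sssd.conf, so both the [ssh] section and the [domain/...] section need it for the two logs requested here. The sketch below shows one way to do that on a copy of the file's text. The function name raise_debug_level and the domain name in the usage note are made up for illustration, and Python's configparser discards comments when it writes the file back, so it is safer to treat the result as a preview than to overwrite the real file with it.

```python
import configparser
import io


def raise_debug_level(conf_text, level=7):
    """Return conf_text with debug_level set in [ssh] and every [domain/...] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    for section in parser.sections():
        if section == "ssh" or section.startswith("domain/"):
            parser[section]["debug_level"] = str(level)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Called on a file containing a [domain/example.com] section (a hypothetical domain name) and an [ssh] section, it adds debug_level = 7 to both and leaves [sssd] alone. SSSD has to be restarted before the new level takes effect.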

Comment 3 Jakub Hrozek 2014-03-05 16:15:46 UTC
Hi, any luck getting the logs?

Comment 4 Jakub Hrozek 2014-04-03 09:54:12 UTC
I wasn't able to reproduce the problem on our end, and without the requested logs there is no way to see what went wrong. I'm going to close this bug; please reopen it when you have the log files.

It would also be nice if you could test with the 6.6 preview packages:
http://copr-fe.cloud.fedoraproject.org/coprs/jhrozek/SSSD-1.11-RHEL6/

Chances are the bug is already fixed there.

Comment 5 Brian J. Murrell 2014-05-02 19:24:09 UTC
My apologies for my tardiness on this.  I just got back to this task.  I will attach the requested log files.

Comment 6 Brian J. Murrell 2014-05-02 19:25:13 UTC
Created attachment 891999
ssh log

Comment 7 Brian J. Murrell 2014-05-02 19:25:53 UTC
Created attachment 892000
domain log

Comment 8 Jakub Hrozek 2014-05-15 12:04:16 UTC
Brian, did you have a chance to try the 1.11 packages from the COPR repository? It would be nice to see if the bug was already fixed in the rebase.

Comment 9 Brian J. Murrell 2014-05-15 12:35:52 UTC
Jakub: I did not try any other packages.  We have to stay pretty strict here with only running released EL software.

In any case, things have been pretty smooth since I abandoned our previously hand-crafted sssd.conf file and just went with the one provided by ipa-client-install.

That only works on EL6 though, sadly.

Because some of the nodes we install are test nodes they are re-provisioned very frequently and as such need to use "--force-join" with ipa-client-install and as you know that's only available on the version of freeipa that's in EL6 and not EL5.  So on EL5 we still have to hand-craft the sssd.conf file and only use freeipa as an LDAP provider due to ipa-client-install barfing when it finds the node is already joined.

Comment 10 Jakub Hrozek 2014-05-15 14:37:57 UTC
(In reply to Brian J. Murrell from comment #9)
> Because some of the nodes we install are test nodes they are re-provisioned
> very frequently and as such need to use "--force-join" with
> ipa-client-install and as you know that's only available on the version of
> freeipa that's in EL6 and not EL5.  So on EL5 we still have to hand-craft
> the sssd.conf file and only use freeipa as an LDAP provider due to
> ipa-client-install barfing when it finds the node is already joined.

Ah, interesting, but then I wonder if --uninstall wouldn't be better than --force-join ?

Anyhow, even with forced join I would suggest using the IPA ID provider and not the LDAP provider. If the keytab is in place, the ID provider should just work. Actually, we keep the config file backwards compatible, so the EL6 config file should work when copied to the EL5 client verbatim.

Comment 11 Brian J. Murrell 2014-05-15 15:58:43 UTC
(In reply to Jakub Hrozek from comment #10)
> 
> Ah, interesting, but then I wonder if --uninstall wouldn't be better than
> --force-join ?

Would --uninstall work on a node that has not had ipa-client-install run on it yet, because it has just been re-installed since the last time ipa-client-install was run?  I.e. the order of operations is:

1. install O/S
2. ipa-client-install
3. goto 1

Presumably you are suggesting a step:

1.5 ipa-client-install --uninstall

right?
 
> Anyhow, even with forced join

s/with/without?

> I would suggest to use the IPA ID provider and
> not the LDAP provider.

But it won't work without a key (which is installed by ipa-client-install) will it?

> If the keytab is in place,

But it wouldn't be because the O/S was re-installed and it's fresh.  Maybe you are suggesting to fetch the keytab rather than a full ipa-client-install?  Any hints on how to do that?

> the ID provider should
> just work. Actually, we keep the config file backwards compatible, so the
> EL6 config file should work when copied to the EL5 client verbatim.

Comment 12 Jakub Hrozek 2014-05-16 14:20:38 UTC
(In reply to Brian J. Murrell from comment #11)
> (In reply to Jakub Hrozek from comment #10)
> > 
> > Ah, interesting, but then I wonder if --uninstall wouldn't be better than
> > --force-join ?
> 
> Would --uninstall work on a node that does not yet have ipa-client-install
> run on it yet because it's just been re-installed since the last time
> ipa-client-install was run?  I.e. the order of operations are:
> 
> 1. install O/S
> 2. ipa-client-install
> 3. goto 1
> 
> Presumably you are suggesting a step:
> 

Ah, sorry, when you said 'reinstalled' I understood it as 're-enrolled', not that the whole client had been wiped out.

> 1.5 ipa-client-install --uninstall
> 
> right?
>  
> > Anyhow, even with forced join
> 
> s/with/without?
> 

With and without :-) I'd always recommend using the IPA ID provider; the only trick is retrieving the keytab (and maybe the IPA CA certificate, ca.crt).

> > I would suggest to use the IPA ID provider and
> > not the LDAP provider.
> 
> But it won't work without a key (which is installed by ipa-client-install)
> will it?
>

Correct, you need a keytab in order to use the IPA backend.

If the client was already enrolled with the server, you could grab the previous keytab with ipa-getkeytab, independently of ipa-client-install.
 
> > If the keytab is in place,
> 
> But it wouldn't be because the O/S was re-installed and it's fresh.  Maybe
> you are suggesting to fetch the keytab rather than a full
> ipa-client-install?  Any hints on how to do that?
> 

Right :) You can use ipa-getkeytab, the man page has a nice example.

> > the ID provider should
> > just work. Actually, we keep the config file backwards compatible, so the
> > EL6 config file should work when copied to the EL5 client verbatim.
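
As a rough illustration of the ipa-getkeytab suggestion above, the command line can be assembled in a provisioning script like this. It is a sketch, not a tested procedure: the server and client host names are hypothetical, the function name getkeytab_command is made up, and the caller is assumed to already hold a Kerberos ticket that is allowed to manage the host principal. The -s, -p and -k options are the server, principal and keytab-file options described in the ipa-getkeytab man page. Note also that, depending on the FreeIPA version, ipa-getkeytab may generate a new key for the principal rather than return the old one, so the man page is worth checking before relying on this.

```python
import shlex


def getkeytab_command(server, hostname, keytab="/etc/krb5.keytab"):
    """Build the argv list for fetching the host keytab (nothing is executed here)."""
    return [
        "ipa-getkeytab",
        "-s", server,              # IPA server to contact
        "-p", "host/" + hostname,  # host principal of the client
        "-k", keytab,              # keytab file to write
    ]


# Hypothetical host names, used only for illustration.
argv = getkeytab_command("ipa.example.com", "client1.example.com")
print(" ".join(shlex.quote(arg) for arg in argv))
```

The list form can be passed directly to subprocess.run(argv, check=True) once the names are replaced with real ones.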

Comment 13 Jakub Hrozek 2014-05-22 09:58:58 UTC
Brian, can you clarify your comment #9 a bit? Are you still seeing the 'BUG' message with the config file generated by IPA client install?

Comment 14 Brian J. Murrell 2014-05-22 13:22:50 UTC
(In reply to Jakub Hrozek from comment #13)
> Brian, can you clarify your comment #9 a bit? Are you still seeing the 'BUG'
> message with the config file generated by IPA client install?

No, that message is gone.

Comment 15 Jakub Hrozek 2014-05-29 09:17:21 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1751

Comment 16 Jakub Hrozek 2014-05-29 09:21:08 UTC
I'm pretty sure the issue you were seeing was solved by ticket #1751 upstream, which is going to be included in RHEL-6.6.

Moreover, using the SSH responder with a non-IPA back end is not a supported configuration at the moment.

I'm going to close this bug as a duplicate of bug #1019285. Please reopen if you disagree and thanks for filing the bug report.

Comment 17 Jakub Hrozek 2014-05-29 09:21:35 UTC

*** This bug has been marked as a duplicate of bug 1019285 ***