Bug 1859426
| Summary: | When encountered KDC policy reject, samba-winbind should not flood the network by retrying intensively | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Ding-Yi Chen <dchen> |
| Component: | samba | Assignee: | Isaac Boukris <iboukris> |
| Status: | CLOSED NOTABUG | QA Contact: | sssd-qe <sssd-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.7 | CC: | gdeschner, iboukris, jarrpa, metze |
| Target Milestone: | rc | Keywords: | Reopened |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-07-31 05:39:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Ding-Yi Chen
2020-07-22 01:46:04 UTC
> The trusted domain has "selective authentication",

That helps, looking into it.

We already have "winbind scan trusted domains = no", would that have fixed the problem too? It would be good to know of any reason why we still need the scanning at all. But for sure we should fix the regression.

(In reply to Stefan Metzmacher from comment #3)
> We already have "winbind scan trusted domains = no", would that have fixed
> the problem too?

I didn't know the scanning wasn't really needed.

@ding-yi: sounds like a good idea to suggest to the customer, at least for the meantime until we fix it back to skip forests with selective-authentication (as we used to).

I'm not sure if it's needed in the customer's configuration, but if it turns out it's not needed I'd keep it off forever. The wbinfo -m --verbose, wbinfo -u and wbinfo -g outputs differ, but that should not impact any real user interaction, see also https://www.samba.org/samba/history/samba-4.8.0.html

In 4.8 times some idmap backends and pam_winbind with krb5 required the scanning, but I'm not sure if it's still required today.

@Isaac the customer mentioned that "winbind scan trusted domains = no" does not work for them.

Will ask him about the symptom and log.

(In reply to Ding-Yi Chen from comment #6)
> @Isaac the customer mentioned that "winbind scan trusted domains = no" does not
> work for them.
>
> Will ask him about the symptom and log.

I have a question: in your tests, did you enable selective-authentication only on the trusted forest side or also on the local one?

And can you also test the old samba version with selective-authentication configured only on the trusted forest side, not the local one?

Thanks

(In reply to Ding-Yi Chen from comment #0)
> Expected results:
>
> Retry after 1 hour
>
> Additional info:
>
> A configuration item for setting scan interval in smb.conf, such as:
>
> winbind rescan trusted domains interval

On a closer look, we already have such a configuration option, "winbind reconnect delay", which you can set to 3600. From the man page:

    This parameter specifies the number of seconds the winbindd(8) daemon will wait between attempts to contact a Domain controller for a domain that is determined to be down or not contactable.

    Default: winbind reconnect delay = 30

Let me know if this works out for the customer, as I'm not sure we'll change it back to ignore trusts with selective-authentication, as it isn't necessarily correct.

@Isaac
> I have a question: in your tests, did you enable selective-authentication only on the trusted forest side or also on the local one?
> And can you also test the old samba version with selective-authentication configured only on the trusted forest side, not the local one?

Selective authentication is on the trusted forest. The local one does not have it.

The old version did not scan the trusted domain. tcpdump shows no traffic to the trusted domain when samba-winbind is starting.

> On a closer look, we already have such a configuration option, "winbind reconnect delay", which you can set to 3600
> ...
> Default: winbind reconnect delay = 30

From the previous log, it seems to contact the domain again in less than one second (from 14:17:24.482116 ~ 14:17:24.575087).
Do you think something else might also be involved?

Log:
~~~
[2020/07/20 14:17:24.482116, 1] ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
  Failed to prepare SMB connection to ad.trusted.example.com: NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.574381, 0] ../../source3/librpc/crypto/gse.c:543(gse_get_client_auth_token)
  gse_get_client_auth_token: gss_init_sec_context failed with [Unspecified GSS failure. Minor code may provide more information: KDC policy rejects request](2529638924)
[2020/07/20 14:17:24.574518, 1] ../../auth/gensec/spnego.c:596(gensec_spnego_client_negTokenInit_step)
  gensec_spnego_client_negTokenInit_step: gse_krb5: creating NEG_TOKEN_INIT for cifs/ad.trusted.example.com failed (next[(null)]): NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.574854, 1] ../../source3/winbindd/winbindd_cm.c:1166(cm_prepare_connection)
  authenticated session setup to ad.trusted.example.com using SAMBA1$@HOME.EXAMPLE.COM failed with NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.575087, 1] ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
  Failed to prepare SMB connection to ad.trusted.example.com: NT_STATUS_LOGON_FAILURE
~~~

That said, I will ask the customer to apply the option.

(In reply to Ding-Yi Chen from comment #9)
> @Isaac
>
> > I have a question: in your tests, did you enable selective-authentication only on the trusted forest side or also on the local one?
> > And can you also test the old samba version with selective-authentication configured only on the trusted forest side, not the local one?
>
> Selective authentication is on the trusted forest.
> The local one does not have it.
>
> The old version did not scan the trusted domain. tcpdump shows no traffic to the
> trusted domain when samba-winbind is starting.

This doesn't match my testing in the lab or the code: when the local one doesn't have selective-auth and only the trusted one has, we'd retry the same way every 30 seconds even in the older version.

> > On a closer look, we already have such a configuration option, "winbind reconnect delay", which you can set to 3600
> > ...
> > Default: winbind reconnect delay = 30
>
> From the previous log, it seems to contact the domain again in less than one second (from
> 14:17:24.482116 ~ 14:17:24.575087).
> Do you think something else might also be involved?
>
> Log:
> ~~~
> [2020/07/20 14:17:24.482116, 1]
> ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
> Failed to prepare SMB connection to ad.trusted.example.com:
> NT_STATUS_LOGON_FAILURE
> [2020/07/20 14:17:24.574381, 0]
> ../../source3/librpc/crypto/gse.c:543(gse_get_client_auth_token)
> gse_get_client_auth_token: gss_init_sec_context failed with [Unspecified
> GSS failure. Minor code may provide more information: KDC policy rejects
> request](2529638924)
> [2020/07/20 14:17:24.574518, 1]
> ../../auth/gensec/spnego.c:596(gensec_spnego_client_negTokenInit_step)
> gensec_spnego_client_negTokenInit_step: gse_krb5: creating NEG_TOKEN_INIT
> for cifs/ad.trusted.example.com failed (next[(null)]):
> NT_STATUS_LOGON_FAILURE
> [2020/07/20 14:17:24.574854, 1]
> ../../source3/winbindd/winbindd_cm.c:1166(cm_prepare_connection)
> authenticated session setup to ad.trusted.example.com using
> SAMBA1$@HOME.EXAMPLE.COM failed with NT_STATUS_LOGON_FAILURE
>
> [2020/07/20 14:17:24.575087, 1]
> ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
> Failed to prepare SMB connection to ad.trusted.example.com:
> NT_STATUS_LOGON_FAILURE
> ~~~
>
> That said, I will ask the customer to apply the option.

Those are a couple of requests within a single retry; we only retry every 30 seconds (based on this option).

While the customer has not yet provided the log with "winbind scan trusted domains = no", he did mention that "winbind reconnect delay = 3600" worked for him, as the error messages now appear only on an hourly basis.

The customer would like to know whether there is a way to specify delays for certain domains, such as:

winbind reconnect * : delay = 30
winbind reconnect SELECTIVE_TRUSTED : delay = 3600

(In reply to Ding-Yi Chen from comment #12)
> The customer would like to know whether there is a way to specify delays for
> certain domains, such as:
>
> winbind reconnect * : delay = 30
> winbind reconnect SELECTIVE_TRUSTED : delay = 3600

No.

I'm closing as not a bug as we aren't going to filter out selective-auth as noted.

Hi,

Just wondering, is it possible to have an additional option that addresses rejected domains?

For example, "winbind reject retry = 7200" would mean: if a domain returns the KDC policy reject, winbind retries after 2 hours.

Having this would simplify configuration for mass deployments: the system admin would not need to worry about the error messages flooding the system, and individual users could apply for permission without changing the configuration files.

(In reply to Ding-Yi Chen from comment #14)
> Hi,
>
> Just wondering, is it possible to have an additional option that
> addresses rejected domains?
>
> For example, "winbind reject retry = 7200" would mean: if a domain returns the KDC
> policy reject, winbind retries after 2 hours.

No, there is no such option.
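
The question raised in the thread, whether the five log entries within about 0.1 s are separate retries or one retry attempt, can be checked mechanically by grouping log header timestamps into bursts and measuring the time between burst starts. Below is a minimal Python sketch of that idea. It assumes log headers in the `[YYYY/MM/DD HH:MM:SS.ffffff, level]` form shown in the log above; the function names and the 1-second burst threshold are illustrative choices, not anything provided by Samba.

```python
import re
from datetime import datetime, timedelta

# Timestamp at the start of a winbindd log header line, e.g.
# "[2020/07/20 14:17:24.482116, 1] ../../source3/winbindd/winbindd_cm.c:1306(...)"
HEADER_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}),\s*\d+\]"
)


def parse_timestamps(log_text):
    """Return the timestamps of all log header lines, in file order."""
    stamps = []
    for line in log_text.splitlines():
        match = HEADER_RE.match(line.strip())
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S.%f")
            )
    return stamps


def group_attempts(stamps, max_gap=timedelta(seconds=1)):
    """Group timestamps into bursts; a gap above max_gap starts a new burst.

    max_gap is an assumed threshold for "same connection attempt".
    """
    bursts = []
    for ts in stamps:
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


def attempt_intervals(bursts):
    """Seconds from the start of each burst to the start of the next one."""
    starts = [burst[0] for burst in bursts]
    return [(b - a).total_seconds() for a, b in zip(starts, starts[1:])]


# Header lines and messages copied from the log quoted in this report.
SAMPLE = """\
[2020/07/20 14:17:24.482116, 1] ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
  Failed to prepare SMB connection to ad.trusted.example.com: NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.574381, 0] ../../source3/librpc/crypto/gse.c:543(gse_get_client_auth_token)
  gse_get_client_auth_token: gss_init_sec_context failed with [Unspecified GSS failure. Minor code may provide more information: KDC policy rejects request](2529638924)
[2020/07/20 14:17:24.574518, 1] ../../auth/gensec/spnego.c:596(gensec_spnego_client_negTokenInit_step)
  gensec_spnego_client_negTokenInit_step: gse_krb5: creating NEG_TOKEN_INIT for cifs/ad.trusted.example.com failed (next[(null)]): NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.574854, 1] ../../source3/winbindd/winbindd_cm.c:1166(cm_prepare_connection)
  authenticated session setup to ad.trusted.example.com using SAMBA1$@HOME.EXAMPLE.COM failed with NT_STATUS_LOGON_FAILURE
[2020/07/20 14:17:24.575087, 1] ../../source3/winbindd/winbindd_cm.c:1306(cm_prepare_connection)
  Failed to prepare SMB connection to ad.trusted.example.com: NT_STATUS_LOGON_FAILURE
"""

if __name__ == "__main__":
    bursts = group_attempts(parse_timestamps(SAMPLE))
    print("bursts:", len(bursts))
    print("seconds between burst starts:", attempt_intervals(bursts))
```

Run against a longer winbindd log, the intervals between burst starts should be close to the configured "winbind reconnect delay" if the behaviour matches what is described in this report. The excerpt above contains only a single burst, so it yields no intervals on its own.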