Bug 691313
Summary: | Need TLS/SSL error messages in repl status and errors log | |
---|---|---|---|
Product: | [Retired] 389 | Reporter: | Moisés Barba Pérez <mbarper>
Component: | Replication - General | Assignee: | Nathan Kinder <nkinder>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Viktor Ashirov <vashirov>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 1.2.5 | CC: | amsharma, mniranja, nkinder, rmeggins
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | i386 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 720461 | Environment: |
Last Closed: | 2015-12-07 17:06:00 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 690318, 708096, 720461 | |
Attachments: | | |

Description
Moisés Barba Pérez
2011-03-28 07:53:16 UTC
I think the problem here is that the replication status and the errors log should contain the 'hostname does not match CN in peer certificate' message, or any other relevant message that could help diagnose connection problems.

I think so too, because 'stop_fatal_error' is too general to make the problem easy to diagnose.

With replication-level logging, I see the following in the errors log on my system the first time the certificate hostname mismatch error is encountered:

---------------------------------------------------------------------------
[08/Jul/2011:11:03:21 -0700] NSMMReplicationPlugin - agmt="cn=meTolocalhost.localdomain1120" (localhost:2636): Replication bind with SIMPLE auth failed: LDAP error -1 (Can't contact LDAP server) (TLS: hostname does not match CN in peer certificate)
---------------------------------------------------------------------------

After this first error, replication goes into backoff mode. This message is not logged a second time. This is a different state than "fatal stop", so the original scenario must be a bit different from my test case. I'm not sure why my test doesn't exactly match the originally reported case, though.

When we encounter an error during the bind stage of a replication session, we only log the error the first time we encounter it. We do this to prevent filling the logs with the same error over and over. From looking at the code, I believe that the certificate hostname error was likely logged on the reporter's system when it first tried to start the replication session. The message is then suppressed until we get a different return code (or the DS instance is restarted).

Created attachment 511991
Patch
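
For readers following the thread, the log-once behavior described above (log a bind failure the first time, then stay quiet until the return code changes) can be modeled with a small sketch. This is an illustration only, not the code from the attached patch or from the replication plugin; the class name `BindErrorLogger`, the method `record`, and the in-memory `messages` list are all hypothetical.

```python
class BindErrorLogger:
    """Hypothetical model: log a bind failure only when its return code
    differs from the previous bind result."""

    def __init__(self):
        # Return code of the previous bind attempt; None after a (re)start.
        self.last_rc = None
        # Stands in for the errors log in this sketch.
        self.messages = []

    def record(self, rc, detail):
        """Record one bind result. Returns True if a message was logged."""
        if rc == self.last_rc:
            # Same result as last time: suppress to avoid flooding the log.
            return False
        self.last_rc = rc
        if rc == 0:
            # A successful bind is not an error, but it does reset the state.
            return False
        self.messages.append(
            f"Replication bind failed: LDAP error {rc} ({detail})"
        )
        return True


logger = BindErrorLogger()
logger.record(-1, "TLS: hostname does not match CN in peer certificate")  # logged
logger.record(-1, "TLS: hostname does not match CN in peer certificate")  # suppressed
logger.record(49, "Invalid credentials")                                  # logged again
```

Creating a new `BindErrorLogger` plays the role of restarting the instance: the remembered return code is gone, so the next failure is logged again. This also shows why a message written once at startup can be easy to miss later.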
Pushed to master. Thanks to Noriko for her review!

Counting objects: 15, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 1.08 KiB, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   354249c..a24d9cc  master -> master

Followed the steps in comment #8 and got:

[08/Nov/2011:17:41:06 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:06 +051800] NSMMReplicationPlugin - agmt="cn=mmr2" (snmaptest:636): Replication bind with SIMPLE auth failed: LDAP error -1 (Can't contact LDAP server) (TLS: hostname does not match CN in peer certificate)
[08/Nov/2011:17:41:10 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:13 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:13 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:16 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:17 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)
[08/Nov/2011:17:41:17 +051800] slapi_ldap_bind - Error: could not send bind request for id [cn=replication manager,cn=config] mech [SIMPLE]: error -1 (Can't contact LDAP server) -8157 (Certificate extension not found.) 115 (Operation now in progress)

Hence marking as VERIFIED.
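
Since the TLS detail now appears in the NSMMReplicationPlugin line, it can be pulled out of an errors log mechanically. The sketch below is one possible way to do that. The regular expression is inferred only from the log lines quoted in this bug and is an assumption about the format, not a documented specification; the names `BIND_FAILURE` and `parse_bind_failure` are hypothetical.

```python
import re

# Pattern inferred from the NSMMReplicationPlugin lines quoted in this bug.
# It is an assumption about the format, not an official grammar.
BIND_FAILURE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\]\s+NSMMReplicationPlugin - '
    r'agmt="(?P<agreement>[^"]+)" \((?P<peer>[^)]+)\): '
    r'Replication bind with (?P<mech>\S+) auth failed: '
    r'LDAP error (?P<code>-?\d+) \((?P<ldap_msg>[^)]*)\)'
    r'(?: \((?P<detail>[^)]*)\))?'
)


def parse_bind_failure(line):
    """Return a dict of fields for a replication bind failure line,
    or None if the line is some other kind of log entry."""
    match = BIND_FAILURE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])
    return fields


sample = (
    '[08/Nov/2011:17:41:06 +051800] NSMMReplicationPlugin - agmt="cn=mmr2" '
    "(snmaptest:636): Replication bind with SIMPLE auth failed: LDAP error -1 "
    "(Can't contact LDAP server) (TLS: hostname does not match CN in peer certificate)"
)
info = parse_bind_failure(sample)
print(info["peer"], info["code"], info["detail"])
```

The trailing parenthesized group is optional in the pattern, so lines that carry only the LDAP error text still parse, with `detail` set to `None`. The slapi_ldap_bind lines use a different layout and are deliberately not matched.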