Bug 245790
Summary: | TPS can't do token operation against clone CA
---|---
Product: | Red Hat Certificate System
Reporter: | Issue Tracker <tao>
Component: | TPS
Assignee: | Christina Fu <cfu>
Status: | CLOSED NOTABUG
QA Contact: | Chandrasekar Kannan <ckannan>
Severity: | high
Priority: | high
Version: | 7.2
CC: | awnuk, benl, cfu, tao
Target Milestone: | rc
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2009-10-20 18:04:42 UTC
Bug Blocks: | 445047
Description
Issue Tracker
2007-06-26 18:29:30 UTC
Description of problem:
TPS can't do token operation against clone CA

How reproducible:
01) Set up TPS against master CA/TKS/DRM.
02) Make sure you can successfully do token operations, such as enroll and format, against master CA/TKS/DRM.
03) Set up a clone CA with HSM. It must be with HSM.
04) After finishing the clone CA, change the TPS configuration to use the clone CA.
05) TPS is NOT able to communicate with the clone CA.

[2007-05-16 09:27:12] 9eaea48 RA_Format_Processor - Origin is 4090006200010000003A, Current is 4090006200010000003A
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 CertEnroll::sendReqToCA - Failed connecting to CA after 3 retries

This event sent from IssueTracker by msauton [SEG - Certificate System Engineering] issue 121474

Kent - The TPS subsystem does not support cloning (it is the only subsystem with this characteristic). Therefore, it is not possible to set up a TPS subsystem on the master and merely redirect the ESC client component to point at the clone. For failover support, multiple TPS instances must be created. Here are the details from Chapter 7 of the Administrator's Guide:

7.1.1. Configuring Failover Support

The subsystem instance to which the TPS connects is set in the conn.subsystem#.hostport parameter of the CS.cfg configuration file. For example, the CA instance is set in the following parameter:

conn.ca1.hostport=aCA.example.com:9443

To configure failover support, list multiple instances in the conn.subsystem#.hostport parameter, separated by commas.
For example:

conn.ca1.hostport=aCA.example.com:9443,bCA.example.com:9543,cCA.example.com:9643

For failover support to be properly configured, all of the subsystem instances must have the same policies and configuration; this means all of the subsystems must be clones. For example, if the TPS is configured to communicate with three CAs, the three CAs must be clones of each other. This means that the values of the other configuration parameters are the same between the instances. The CA configuration parameters are listed in Table 7.2, “CA Connection Settings”. The TKS configuration parameters are listed in Table 7.3, “TKS Connection Settings”. The DRM configuration parameters are listed in Table 7.4, “DRM Connection Settings”.

Thanks, Marco

Kent, I use the master TKS subsystem. The TPS subsystem can use either the master TKS or a clone TKS. I do NOT have a clone TKS subsystem. My TPS is using
01) Clone CA
02) Master TKS
03) Master DRM
Thanks, Fu

Internal Status set to 'Waiting on Support'
Status set to: Waiting on Tech

Fu, We need to have the TKS subsystem on the clone CA.
kent

Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client

What do you mean by that? Do you mean the Clone CA can't work with Master TKS? I want to use the clone CA with the Master TKS. Look at our architecture picture. The TPS can be configured to connect to either the master CA/TKS/DRM or any clone CA/TKS/DRM. Production is using Clone CA, Master TKS, Master DRM. For failover, if the Master DRM is down, I can change TPS to point to any Clone DRM. Do you really understand how CS 7.2 works? Why do I have to spend time to educate you? If you don't know, ask the developers. I want TPS to work with the following combinations.
01) TPS does token operations with Clone CA, Master TKS, and Master DRM
02) TPS does token operations with Clone CA, Clone TKS, and Master DRM
03) TPS does token operations with Clone CA, Clone TKS, and Clone DRM
04) TPS does token operations with Clone CA, Master TKS, and Clone DRM
05) TPS does token operations with Master CA, Clone TKS, and Clone DRM
06) TPS does token operations with Master CA, Master TKS, and Master DRM
07) TPS does token operations with Master CA, Clone TKS, and Master DRM
08) TPS does token operations with Master CA, Master TKS, and Clone DRM
Thanks, Fu

Internal Status set to 'Waiting on Support'
Status set to: Waiting on Tech

Kent, I'm not sure what you're asking. It sounds like you're saying that the TKS must reside on the same host as the CA. That is not how we currently have CS 7.1 deployed and running in production; please refer back to our architecture diagram. If the TKS must run on the CA host, this is a new requirement to us. Can you please clarify?

Fu, Based on the info below, I now have a more complete picture of what you are trying to do. The diagram you referenced does not tell me 1) where TPS resides in your environment or 2) what your logic flow is. Without that information, we may occasionally assume a configuration (like TPS on the master CA) that is incorrect for your environment. In the future, if you could include the server names based on that diagram, it will help us troubleshoot your issues faster and lower frustrations by having correct information. An example: "TPS on oslo-ds2 cannot communicate with ca-mb". With that information and your diagram, we have the exact information on what's happening and how your specialized environment comes into play. Hope this makes sense. Now, on to the problem.... You are correct with TPS in a standalone environment. If you can please provide me with the corresponding logs and the CS.cfg from the TPS instance, that will help us find what's happening.
kent

Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client

Kent - Just went through the debug logs - seems to be an issue with retrieving certs for the CA/TKS/DRM subsystems from the HSM that is connected to the clone. I will need to replicate Fu's environment using a few virtual machines and run tests against the clone.
-Marco

mrhodes assigned to issue for SEG - Certificate System Engineering.

Fu, Looks like there is an issue retrieving certs from the HSM that is connected to the clone. Will keep you updated....
Kent, Would you mind giving me an explanation of why you think that is "an issue retrieving certs from the HSM"? What evidence makes you think it is related to the HSM? Thanks in advance, Fu

Internal Status set to: 'Waiting on Support'

In the tps-error log, here is what I see at the start of the capture -->

[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A ca certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A tks certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A drm certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:51] 9d62e10 RA::InitializeHttpConnections - A ca certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:51] 9d62e10 RA::InitializeHttpConnections - A tks certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:51] 9d62e10 RA::InitializeHttpConnections - A drm certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!

I've had a chance to dig a bit deeper into the log and I see that the certificates are eventually recognized -->

[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A ca certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.
[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A tks certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.
[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A drm certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.

Please let Fu know that this was our initial analysis and that after further discussion with Eng, it's safe to ignore those initial errors. Given this, I will need to perform further log analysis and replicate their topology and configuration for further testing. These tasks are all in progress.
Thanks, - Marco

Fu, that was our initial response reading your logs:

[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A ca certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A tks certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!
[2007-05-03 08:38:47] 9d62e10 RA::InitializeHttpConnections - A drm certificate nicknamed "[HSM_LABEL][NICKNAME]" could NOT be found in the certificate database for connection 1!

Later, though, it does connect:

[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A ca certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.
[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A tks certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.
[2007-05-03 08:44:39] 8752e10 RA::InitializeHttpConnections - A drm certificate nicknamed "epki-core-tps:subsystemCert cert-rhpki-tps9001" was found in the certificate database for connection 1.

After further research, it's safe to ignore those errors. Currently, we are replicating the topology and configuration to further test this out.

Internal Status set to 'Waiting on SEG'

Kent, If you look near the bottom of the log you'll see similar but different errors that seem to indicate that the certificates CAN be found in the HSM/database, but the connection between the CA and TPS fails. Fu did a tcpdump on the CA and noticed that just before the error message, the TPS was not even initiating a connection to the CA, which suggests that it may not be a cert database issue.

Kent, The TPS was not even initiating a connection to the CLONE CA.

[2007-05-16 09:27:12] 9eaea48 RA_Format_Processor - Origin is 4090006200010000003A, Current is 4090006200010000003A
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke
[2007-05-16 09:27:12] 9eaea48 CertEnroll::sendReqToCA - Failed connecting to CA after 3 retries
[2007-05-16 09:34:43] 8115e10 mod_tokendb::mod_tokendb_terminate - The Tokendb module has been terminated!
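For anyone triaging a similar TPS log, the excerpt above can be summarized with a short script. This is only a rough sketch: the two regular expressions are assumptions derived from the log lines quoted in this report, not from any documented TPS log format, and the names `summarize`, `REQUEST_RE`, `FAILURE_RE` and `SAMPLE` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shapes, copied from the excerpt quoted in this report.
REQUEST_RE = re.compile(
    r"HttpConnection::getResponse - Send request to host (\S+):(\d+) servlet (\S+)"
)
FAILURE_RE = re.compile(r"Failed connecting to CA after (\d+) retries")


def summarize(lines):
    """Count request attempts per (host, port) and the number of CA connection failures."""
    attempts = Counter()
    failures = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            attempts[(match.group(1), int(match.group(2)))] += 1
        elif FAILURE_RE.search(line):
            failures += 1
    return attempts, failures


# Sample input: the lines from the excerpt, one entry per log record.
SAMPLE = [
    "[2007-05-16 09:27:12] 9eaea48 RA_Format_Processor - Origin is 4090006200010000003A, Current is 4090006200010000003A",
    "[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke",
    "[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke",
    "[2007-05-16 09:27:12] 9eaea48 HttpConnection::getResponse - Send request to host ca-ma.epki.sstest.office.aol.com:443 servlet /ca/subsystem/ca/doRevoke",
    "[2007-05-16 09:27:12] 9eaea48 CertEnroll::sendReqToCA - Failed connecting to CA after 3 retries",
]

if __name__ == "__main__":
    attempts, failures = summarize(SAMPLE)
    for (host, port), count in attempts.items():
        print(f"{host}:{port} requested {count} time(s)")
    print(f"CA connection failures: {failures}")
```

Grouping by host and port makes it easier to see which CA the TPS was actually targeting when the retries ran out, which is the question raised in the comments that follow.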
Fu, Just confirming what we talked about on our call today: Seeing both cama and cada in the tps-debug logs is ok. You were troubleshooting the issue, and attempted against another CA. Also, seeing both ports 443 and 1443 is ok. Again, you were troubleshooting the connectivity and changed ports. As you guys are preparing to move facilities, you will get us the tks debug and tks configs when you are able to do so. I think that's all for this ticket from today's call. If I missed anything, please correct me.
thanks
kent

Internal Status set to 'Waiting on Customer'
Status set to: Waiting on Client
Version set to: '7.1'

File uploaded: tks-da-debug
it_file 94073

File uploaded: tks-da-CS.cfg
it_file 94074

Created attachment 157948 [details]
tps log
Reply to Comment #22: I think, please correct me if I am wrong, the customer reported that TPS failed to connect to the clone CA. So it will be helpful to see the TPS debug log along with the clone CA log so that we can tell if the clone CA received the request or not. Can we ask the customer to change the TPS debug level to 10 so that we can get some additional information in the TPS debug log? Thanks!

No problem, I think Marco mentioned that to me.

OK, I saw about 14 "Failed connecting to CA ..." in the TPS log. Most of them are for connecting to ca-ma.epki.sstest.office.aol.com:443. Just want to confirm if "ca-ma.epki.sstest.office.aol.com" is the clone CA. And I don't see "ca-ma" in the tps configuration provided in Comment #23. Can we do this? Can we ask the customer to try to reproduce the problem one more time? Change the tps debug level to 10. Then, run the test and send us the tps cfg, tps log, and clone ca log.

Closing Issue Tracker ticket 121474 per Fu. Token operations against the clone CA are working correctly after a correction to the server cert nickname was made in the clone's configuration file. SEG was not able to reproduce the error.

User nkwan's account has been closed

operator issue. not a bug. closing bug.
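The configuration check asked for in the comments above (is the host seen in the TPS log actually listed in the TPS configuration?) could be scripted roughly as follows. This is a sketch under assumptions: it treats CS.cfg as plain key=value lines and relies only on the conn.subsystem#.hostport convention and comma-separated failover list quoted from the Administrator's Guide earlier in this report. The helper names `parse_hostports` and `is_configured` are hypothetical, and the sample value is the example from the guide, not the customer's real configuration.

```python
def parse_hostports(cfg_text):
    """Return {connection_id: [(host, port), ...]} for conn.<id>.hostport entries."""
    result = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if "=" not in line:
            continue  # not a key=value line
        key, value = line.split("=", 1)
        parts = key.strip().split(".")
        if len(parts) != 3 or parts[0] != "conn" or parts[2] != "hostport":
            continue  # some other parameter
        targets = []
        for item in value.split(","):  # failover entries are comma separated
            item = item.strip()
            if ":" not in item:
                continue  # skip empty or malformed entries
            host, _, port = item.rpartition(":")
            targets.append((host, int(port)))
        result[parts[1]] = targets
    return result


def is_configured(hostports, host):
    """True if the given host appears in any connection's failover list."""
    return any(h == host for targets in hostports.values() for h, _ in targets)


# Sample value taken from the Administrator's Guide excerpt quoted in this report.
SAMPLE_CFG = (
    "conn.ca1.hostport=aCA.example.com:9443,bCA.example.com:9543,cCA.example.com:9643\n"
)

if __name__ == "__main__":
    hostports = parse_hostports(SAMPLE_CFG)
    print(hostports)
    print(is_configured(hostports, "ca-ma.epki.sstest.office.aol.com"))
```

A mismatch between the hosts in the log and the hosts in the configuration would point at a stale or edited configuration rather than at the clone itself. Note that this only checks the hostport lists; it says nothing about certificate nicknames, which is where the fix was eventually made.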