Bug 1571897 - OSP13 Deployment with TLS everywhere fails - http://freeipa-0.redhat.local/ipa/crl/MasterCRL.bin returned 404 Not Found
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 13.0 (Queens)
Assignee: Harry Rybacki
QA Contact: Pavan
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-25 15:56 UTC by Itzik Brown
Modified: 2018-11-02 20:28 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The CA does not create a CRL. Consequence: Deployments fail because the CRL is not present when Puppet tries to fetch it. Fix: Modify the configuration to ensure the CRL is generated, then restart the CA's daemon. Result: The CRL is generated at startup, which avoids this issue.
Clone Of:
Environment:
Last Closed: 2018-11-02 20:28:52 UTC
Target Upstream Version:


Attachments
Deployment log (552.56 KB, text/plain)
2018-04-25 15:56 UTC, Itzik Brown


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 571805 0 'None' MERGED Ensure IPA CA generates CRL on startup 2020-09-28 13:26:57 UTC

Description Itzik Brown 2018-04-25 15:56:01 UTC
Created attachment 1426746 [details]
Deployment log

Description of problem:
Deployment with TLS everywhere fails.
The details are in the attached deployment log.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-8.0.2-0.20180416194362.29a5ad5.el7ost.noarch

How reproducible:


Steps to Reproduce:
1. 
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Tim Rozet 2018-05-03 13:06:00 UTC
This is seen intermittently because the fetch of the CRL from the CA fails, but if you curl the same URL again after the deployment fails, it works. It looks like some kind of timing/network issue. We could fix this by modifying this code:
https://github.com/openstack/puppet-tripleo/blob/master/manifests/certmonger/ca/crl.pp#L106

Instead of a file resource there that pulls from an HTTP site, we could use an exec resource with retries.
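
A minimal sketch of the retry idea, written in Python rather than Puppet purely for illustration. The function name `fetch_with_retries`, the retry count, and the delay are assumptions and are not taken from puppet-tripleo; the parameter names merely echo the `tries` and `try_sleep` attributes of Puppet's exec type.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(url, tries=5, try_sleep=3.0, fetch=None, sleep=time.sleep):
    """Return the body at `url`, retrying on HTTP or network errors.

    `tries` and `try_sleep` are illustrative defaults (assumptions).
    `fetch` and `sleep` are injectable so the logic can be exercised offline.
    """
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=10) as resp:
                return resp.read()

    last_error = None
    for attempt in range(1, tries + 1):
        try:
            return fetch(url)
        except (urllib.error.URLError, OSError) as exc:
            # HTTPError (e.g. 404) is a subclass of URLError, so it lands here too.
            last_error = exc
            if attempt < tries:
                sleep(try_sleep)
    raise RuntimeError(f"giving up on {url} after {tries} attempts") from last_error
```

Retries only help if the CRL becomes available within the retry window; they do not help when the CA has never generated one.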

Comment 4 Prasanth Anbalagan 2018-06-01 12:49:28 UTC
Observing this consistently. Raising the priority as it is blocking Regression testing of OSP13.

2018-05-31 16:05:30Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.0]: CREATE_IN_PROGRESS  state changed
2018-05-31 16:05:31Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1]: CREATE_IN_PROGRESS  state changed
2018-05-31 16:06:29Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1]: SIGNAL_IN_PROGRESS  Signal: deployment 77046748-fdc8-46fa-806c-2914bfe74421 failed (2)
2018-05-31 16:06:30Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1]: CREATE_FAILED  Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-05-31 16:06:30Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1]: CREATE_FAILED  Resource CREATE failed: Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
2018-05-31 16:06:30Z [overcloud.AllNodesDeploySteps.ComputeDeployment_Step1]: CREATE_FAILED  Error: resources.ComputeDeployment_Step1.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-31 16:06:30Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED  Resource CREATE failed: Error: resources.ComputeDeployment_Step1.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-31 16:06:30Z [overcloud.AllNodesDeploySteps]: CREATE_FAILED  Error: resources.AllNodesDeploySteps.resources.ComputeDeployment_Step1.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2
2018-05-31 16:06:30Z [overcloud]: CREATE_FAILED  Resource CREATE failed: Error: resources.AllNodesDeploySteps.resources.ComputeDeployment_Step1.resources[1]: Deployment to server failed: deploy_status_code: Deployment exited with non-zero status code: 2

 Stack overcloud CREATE_FAILED 

overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.1:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: 77046748-fdc8-46fa-806c-2914bfe74421
  status: CREATE_FAILED
  status_reason: |
    Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "Warning: ModuleLoader: module 'openstacklib' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules", 
            "Error: /Stage[main]/Tripleo::Certmonger::Ca::Crl/File[tripleo-ca-crl]: Could not evaluate: Could not retrieve information from environment production source(s) http://ipa-ca/ipa/crl/MasterCRL.bin", 
            "Warning: /Stage[main]/Tripleo::Certmonger::Ca::Crl/Exec[tripleo-ca-crl-process-command]: Skipping because of failed dependencies"
        ]
    }
    	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/dfea285f-27b5-484d-a373-f514bf095a14_playbook.retry
    
    PLAY RECAP *********************************************************************
    localhost                  : ok=23   changed=11   unreachable=0    failed=1   
    
    (truncated, view all with --long)
  deploy_stderr: |

Heat Stack create failed.
Heat Stack create failed.

overcloud.AllNodesDeploySteps.ComputeDeployment_Step1.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: c5314772-fe16-46f5-8a3a-b949e3ac28de
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
  deploy_stdout: |
    ...
            "Warning: ModuleLoader: module 'openstacklib' has unresolved dependencies - it will only see those that are resolved. Use 'puppet module list --tree' to see information about modules", 
            "Error: /Stage[main]/Tripleo::Certmonger::Ca::Crl/File[tripleo-ca-crl]: Could not evaluate: Could not retrieve information from environment production source(s) http://ipa-ca/ipa/crl/MasterCRL.bin", 
            "Warning: /Stage[main]/Tripleo::Certmonger::Ca::Crl/Exec[tripleo-ca-crl-process-command]: Skipping because of failed dependencies"
        ]
    }
    	to retry, use: --limit @/var/lib/heat-config/heat-config-ansible/94f38581-aea4-46ef-8c54-456d5424d9de_playbook.retry
    
    PLAY RECAP *********************************************************************
    localhost                  : ok=23   changed=11   unreachable=0    failed=1   
    
    (truncated, view all with --long)
  deploy_stderr: |

(undercloud) [stack@undercloud-0 ~]$

Comment 5 Ade Lee 2018-06-01 19:26:41 UTC
This is caused by the IPA server not having generated a CRL yet.

Making the tripleo code not require a CRL really isn't a good option, because HAProxy won't restart without it.

The better fix is to ensure that IdM generates the CRL when it first starts up.
This is done through a configuration change in CS.cfg on the Dogtag CA. 
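
The comment does not name the setting. As a sketch only, assuming the relevant key is `ca.crl.MasterCRL.publishOnStart` (an assumption, not confirmed anywhere in this report), a helper like the hypothetical `set_cs_cfg_option` below could rewrite the `key=value` text of CS.cfg:

```python
def set_cs_cfg_option(cfg_text, key, value):
    """Return `cfg_text` with `key=value` set, replacing an existing line or appending one."""
    lines = cfg_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # CS.cfg is treated here as plain key=value lines.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Assumed key name; verify it against the CA's own CS.cfg before use.
CRL_ON_START_KEY = "ca.crl.MasterCRL.publishOnStart"

# Hypothetical file contents, for illustration only.
sample = "some.other.key=1\nca.crl.MasterCRL.publishOnStart=false\n"
print(set_cs_cfg_option(sample, CRL_ON_START_KEY, "true"))
```

Per the Doc Text on this bug, the CA's daemon has to be restarted after the configuration change for it to take effect.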

The CA lacks a CRL only when it is first deployed, which in practice happens almost exclusively in testing environments.

There has been a change to configure the CA to do this when infrared deploys the Dogtag CA.  

https://review.gerrithub.io/c/redhat-openstack/infrared/+/408733/

We need a similar change to oooq for when that component deploys the CA.

Finally, we should document that the CRL needs to be present before deploying TLS everywhere.
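
One way to check that requirement before deploying is a small pre-flight probe of the CRL URL. This is a sketch under assumptions: the function name `crl_is_published` is hypothetical, and the URL in the comment is the one from the error message in comment 4.

```python
import urllib.error
import urllib.request


def crl_is_published(url, opener=urllib.request.urlopen):
    """Return True if `url` answers with HTTP 200, False on an HTTP error such as 404.

    Network-level failures (DNS, refused connection) are not caught and propagate,
    since they point at a different problem than a missing CRL.
    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False


# Example call (not executed here):
# crl_is_published("http://ipa-ca/ipa/crl/MasterCRL.bin")
```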

Comment 6 Ade Lee 2018-06-01 19:27:58 UTC
https://projects.engineering.redhat.com/browse/RHOSINFRA-1617 is the case in infrared.

Comment 7 Harry Rybacki 2018-06-01 20:00:23 UTC
Issue is resolved within Infrared per comment#5 and comment#6.

Upstream patch submitted to fix the issue within OOOQ-Extras[1].

[1] - https://review.openstack.org/571805

Comment 9 Harry Rybacki 2018-06-06 14:15:23 UTC
Upstream review[1] has merged. Moving bug to POST.

[1] - https://review.openstack.org/571805

Comment 10 Harry Rybacki 2018-06-07 14:53:43 UTC
Ade, is there anything we need to do in puppet-tripleo? At this point we have the fix in both deployment tools used for CI. If there is nothing to do inside of puppet-tripleo, I propose we move this bug to Documentation and ensure the requirement is clear.

Comment 11 Jon Schlueter 2018-07-19 02:22:00 UTC
If you want this in OSP 13 you need to get it backported to stable/queens branch upstream.

