Bug 1825171
| Summary: | Octavia Service Unavailable (HTTP 503) after deployment | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | rohit londhe <rlondhe> |
| Component: | openstack-octavia | Assignee: | Carlos Goncalves <cgoncalves> |
| Status: | CLOSED DUPLICATE | QA Contact: | Bruna Bonguardo <bbonguar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 16.0 (Train) | CC: | cgoncalves, ihrachys, lpeer, majopela, scohen |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-06 14:29:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |

Description
rohit londhe
2020-04-17 10:01:51 UTC
I believe this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1797670. The fix is scheduled to be available in 16.0.2.

As a temporary workaround, please remove the OVN provider driver from the enabled_provider_drivers list in octavia.conf and restart the octavia-api container on all controllers.

Hey Carlos,

You are fast :) Appreciated! Sure, I'll check the fix and the workaround.

Hello,

I disabled the OVN provider driver in the enabled_provider_drivers list in octavia.conf and restarted the octavia-api container on all controllers. The API then does not throw any error when executing openstack loadbalancer list.

But the following error occurs when I try to create a load balancer:

2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server [-] Exception during message handling: octavia.common.exceptions.CertificateGenerationException: Could not sign the certificate request: Failed to load CA Certificate /etc/octavia/certs/ca_01.pem.
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/certificates/generator/local.py", line 49, in _validate_cert
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     ca_cert = open(CONF.certificates.ca_certificate, 'rb').read()
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server FileNotFoundError: [Errno 2] No such file or directory: '/etc/octavia/certs/ca_01.pem'
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server During handling of the above exception, another exception occurred:
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 274, in dispatch
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/controller/queue/v1/endpoints.py", line 45, in create_load_balancer
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     self.worker.create_load_balancer(load_balancer_id, flavor)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 292, in wrapped_f
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     return self.call(f, *args, **kw)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 358, in call
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     do = self.iter(retry_state=retry_state)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 319, in iter
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     return fut.result()
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     return self.__get_result()
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     raise self._exception
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/tenacity/__init__.py", line 361, in call
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     result = fn(*args, **kwargs)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/controller/worker/v1/controller_worker.py", line 344, in create_load_balancer
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     create_lb_tf.run()
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/taskflow/engines/action_engine/engine.py", line 247, in run
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     for _state in self.run_iter(timeout=timeout):
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/taskflow/engines/action_engine/engine.py", line 340, in run_iter
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     failure.Failure.reraise_if_any(er_failures)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/taskflow/types/failure.py", line 339, in reraise_if_any
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     failures[0].reraise()
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/taskflow/types/failure.py", line 346, in reraise
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     six.reraise(*self._exc_info)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     raise value
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     result = task.execute(**arguments)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/controller/worker/v1/tasks/cert_task.py", line 47, in execute
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     validity=CONF.certificates.cert_validity_time)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/certificates/generator/local.py", line 234, in generate_cert_key_pair
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     cert = cls.sign_cert(csr, validity, **kwargs)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/certificates/generator/local.py", line 91, in sign_cert
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     cls._validate_cert(ca_cert, ca_key, ca_key_pass)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server   File "/usr/lib/python3.6/site-packages/octavia/certificates/generator/local.py", line 53, in _validate_cert
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server     .format(CONF.certificates.ca_certificate)
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server octavia.common.exceptions.CertificateGenerationException: Could not sign the certificate request: Failed to load CA Certificate /etc/octavia/certs/ca_01.pem.
2020-04-20 23:17:55.045 23 ERROR oslo_messaging.rpc.server

The file "/etc/octavia/certs/ca_01.pem" does not exist on the 3 controllers.
The file /var/lib/config-data/puppet-generated/octavia/etc/octavia/certs/ca_01.pem does not exist either.

I then applied https://access.redhat.com/solutions/4909471, modified all octavia.conf files again and restarted octavia-api: same result.

I deleted the overcloud, cleaned the node and redeployed; the Octavia directory error is back:

TASK [Write octavia inventory] *************************************************
Monday 20 April 2020 21:27:25 +0200 (0:00:00.757) 0:41:59.657 **********
fatal: [undercloud]: FAILED! => {"changed": false, "checksum": "f95c84fe009e8462fc8fde4e3faae97b012e839c", "msg": "Destination /var/lib/mistral/overcloud/octavia-ansible not writable"}

I reapplied https://bugs.launchpad.net/tripleo/+bug/1847608, deployed again and applied the fix to octavia.conf. It still fails with the non-existent ca_01.pem. As a workaround, I could force this certificate with OctaviaCaCert (and others), but I cannot find the procedure to generate those certificates.

OK, I think I understand why it is failing for you. Let me go through it step by step to explain my thinking.

(In reply to rohit londhe from comment #5)
> Disabled OVN provider driver from the enabled_provider_drivers list in
> octavia.conf and restarted the octavia-api container in all controllers.
> Then the API does not throw any error when executing openstack loadbalancer
> list.
[...]
> The file "/etc/octavia/certs/ca_01.pem" does not exist on the 3 controllers.
> The file
> /var/lib/config-data/puppet-generated/octavia/etc/octavia/certs/ca_01.pem
> does not exist either.

Right. The certificate files were already missing prior to disabling the OVN provider driver, so disabling the provider driver won't change anything in that regard.

> Then applied https://access.redhat.com/solutions/4909471 and then modified
> again all octavia.conf and restarted octavia-api : same result.

Just applying the workaround in the file octavia-deployment-config.yaml is not enough. You need to run an overcloud update.

>
> Deleted the overcloud and cleaned the node, then redeployed, the Octavia
> directory error is back :
[...]

Deleting the overcloud was the reason why the certificate files were not generated. In the previous step you changed TripleO to generate the certificates on UPDATE, but since you deleted the overcloud and deployed a new one, it is a CREATE stack action.
> TASK [Write octavia inventory]
> *************************************************
> Monday 20 April 2020 21:27:25 +0200 (0:00:00.757) 0:41:59.657
> **********
> fatal: [undercloud]: FAILED! => {"changed": false, "checksum":
> "f95c84fe009e8462fc8fde4e3faae97b012e839c", "msg": "Destination
> /var/lib/mistral/overcloud/octavia-ansible not writable"}

This will be fixed in 16.0.2. See https://bugzilla.redhat.com/show_bug.cgi?id=1824068.

> Reapplied https://bugs.launchpad.net/tripleo/+bug/1847608, deployed again &
> applied the fix on octavia.conf. Still fails with the non existing
> ca_01.pem. As a workaround, I could force this certificate with OctaviaCaCert
> (and others) but I cannot find the procedure to generate those certificates.

I'm guessing you deleted the overcloud and created a new one again, yes? If so, the stack action is then CREATE rather than UPDATE, which contradicts the workaround in https://access.redhat.com/solutions/4909471.

Hello,

Thanks for your suggestions. It worked after the following operations:

- Delete the overcloud
- Reset the stack action to 'CREATE'
- Redeploy (without any permission error, so that the first deploy is successful; if an error occurs, delete again, fix the error and deploy again)
- Modify all octavia.conf files and restart octavia-api to disable the OVN provider driver in the enabled_provider_drivers list (as in https://bugzilla.redhat.com/show_bug.cgi?id=1825171 )

Load balancer commands are now OK and the certificates are deployed in the Octavia directory.

I reproduced these steps successfully on 2 platforms which had the same problem; Octavia is finally working. It looks like https://access.redhat.com/solutions/4909471 does not work if the first deploy fails (for example due to the permission problem, which was the initial issue).

*** This bug has been marked as a duplicate of bug 1797670 ***
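
For reference, the temporary workaround discussed in this bug (removing the OVN provider driver from enabled_provider_drivers in octavia.conf) can be sketched in Python as below. This is only an illustration under assumptions that are not stated in the bug itself: that the option lives in an [api_settings] section, that its value is a comma-separated list of name:description pairs, and that the OVN entry is keyed "ovn". The helper name drop_provider and the sample descriptions are hypothetical. The sketch only computes the new value from an in-memory sample; it does not edit any file or restart the octavia-api container.

```python
import configparser

# Hypothetical sample; a real octavia.conf contains many more options.
SAMPLE_CONF = """\
[api_settings]
enabled_provider_drivers = amphora:The Octavia Amphora driver,ovn:Octavia OVN driver
"""


def drop_provider(value, name="ovn"):
    """Return the driver list without the entry whose key equals `name`.

    Assumes `value` is a comma-separated list of name:description pairs.
    """
    kept = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        key = entry.split(":", 1)[0].strip()
        if key != name:
            kept.append(entry)
    return ",".join(kept)


def new_driver_value(conf_text):
    """Parse the config text and compute the value with the OVN driver removed."""
    # interpolation=None keeps '%' characters literal; strict=False tolerates
    # repeated keys, which can occur in oslo.config style files.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    current = parser.get("api_settings", "enabled_provider_drivers")
    return drop_provider(current)


print(new_driver_value(SAMPLE_CONF))
```

The computed value would still have to be written into octavia.conf on every controller, followed by a restart of the octavia-api container, as described in the comments above.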