Description of problem:
Under load, IPA certificate processing struggles:

~~~
[Fri Jul 14 03:55:03.251161 2023] [:error] [pid 14263] data = read_input(environ)
[Fri Jul 14 03:55:03.251187 2023] [:error] [pid 14263] File "/usr/lib/python2.7/site-packages/ipaserver/rpcserver.py", line 200, in read_input
[Fri Jul 14 03:55:03.251210 2023] [:error] [pid 14263] return environ['wsgi.input'].read(length).decode('utf-8')
[Fri Jul 14 03:55:03.251227 2023] [:error] [pid 14263] IOError: request data read error
[Fri Jul 14 03:55:03.251482 2023] [:error] [pid 14263] ipa: INFO: [xmlserver] host/compute4.localdomain@LOCALDOMAIN: None: InternalError
~~~

In this case we are resubmitting a couple hundred certificates to IPA, and some of the requests return CA_UNREACHABLE. Once the overcloud deployment has failed, we restart certmonger on the hosts where the resubmit failed, and the resubmit then goes through.

Version-Release number of selected component (if applicable):
RHEL 7.6 / ipa-server-4.6.4-10.el7_6.3.x86_64

How reproducible:
Always, but on a random host each time

Steps to Reproduce:
1. Deploy an overcloud with more than 100 computes
2. Run overcloud_deploy.sh
3.

Actual results:
Failure to resubmit certs on random hosts

Expected results:
No failures

Additional info:
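For reference, the manual workaround described above (restart certmonger on an affected host, then resubmit) could be scripted roughly as follows. This is only a sketch, not something we run in the deployment: the function names are hypothetical, it assumes Python 3 is available on the host, and it assumes `getcert list` prints a `Request ID '...'` header followed by a `status:` line for each tracked request.

```python
import re
import subprocess

# Assumed layout of `getcert list` output: a "Request ID '<id>':" header
# followed by indented "key: value" lines, one of which is "status:".
REQUEST_RE = re.compile(r"^Request ID '([^']+)':")
STATUS_RE = re.compile(r"^\s*status:\s*(\S+)")


def unreachable_request_ids(getcert_output):
    """Return the request IDs whose status is CA_UNREACHABLE."""
    ids = []
    current = None
    for line in getcert_output.splitlines():
        header = REQUEST_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.match(line)
        if status and current is not None and status.group(1) == "CA_UNREACHABLE":
            ids.append(current)
    return ids


def retry_unreachable():
    """Hypothetical helper: restart certmonger, then resubmit failed requests.

    Must run as root on the affected host. It does not wait for certmonger
    to settle after the restart, which a real script would probably need.
    """
    listing = subprocess.run(
        ["getcert", "list"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    failed = unreachable_request_ids(listing)
    if not failed:
        return failed
    subprocess.run(["systemctl", "restart", "certmonger"], check=True)
    for request_id in failed:
        subprocess.run(["getcert", "resubmit", "-i", request_id], check=True)
    return failed
```

Calling `retry_unreachable()` on a host returns the list of request IDs it resubmitted. It only automates the workaround and does nothing about the read errors on the IPA server side.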
Is the Apache error log available from the IPA server during the failure(s)?
Yes, we should have sosreports from both IPA servers attached to the case, but I'm not an IdM support engineer, so my knowledge of where the logs are is pretty limited beyond pointing you back to supportshell.