Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1843175

Summary: Baremetal Deployment OSP16 overcloud stops with json.decoder.JSONDecodeError
Product: Red Hat OpenStack
Reporter: Jason Grosso <jgrosso>
Component: tripleo-ansible
Assignee: Alex Schultz <aschultz>
Status: CLOSED ERRATA
QA Contact: David Rosenfeld <drosenfe>
Severity: high
Docs Contact:
Priority: medium
Version: 16.0 (Train)
CC: aschultz, elicohen, emacchi, hbrock, jslagle, mburns, morazi, nweinber
Target Milestone: ---
Keywords: Reopened, Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: tripleo-ansible-0.5.1-0.20200706173411.c53bf61.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1850715 (view as bug list)
Environment:
Last Closed: 2021-09-15 07:08:44 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 3 Alex Schultz 2020-06-04 20:00:34 UTC
For future reference, when you get this error it is basically an execution timeout. In this case NetworkConfig executed, the systems became unavailable, and ansible just hung. Eventually the mistral -> zaqar connection either ends or a blank response is sent, and the json decode error is thrown because the response is empty.
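
To illustrate why an empty response surfaces as this particular exception, here is a minimal sketch in plain Python. It does not use the real mistral or zaqar client code; `decode_response` is a hypothetical stand-in for whatever decodes the message body.

```python
import json


def decode_response(body: str):
    """Hypothetical stand-in for the client code that decodes a message body."""
    return json.loads(body)


error_text = None
try:
    # An empty body, as when the connection ends or a blank response is sent.
    decode_response("")
except json.JSONDecodeError as exc:
    error_text = str(exc)
    print(f"json.decoder.JSONDecodeError: {error_text}")
```

The exception only says that there was nothing to parse; it carries no hint that the underlying cause was a timeout.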


To troubleshoot, check the `openstack workflow execution list` and see if there is a 'failed' task. You can then run `openstack workflow execution show <id> -f yaml` to get the error. If you hit this, it will tell you to look in the ansible.log for the failure. If the last task executing is NetworkConfig, then the situation described above occurred. You will need to get on the system's console and troubleshoot the network config.
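
The "last task executing" check can be sketched as a small Python helper. This is only an illustration: it assumes the log contains Ansible's usual `TASK [name]` banner lines, and the helper names are hypothetical. The location of ansible.log is not given here, so the helper takes the log text as input.

```python
import re
from typing import Optional

# Assumption: task banners appear in the log as "TASK [some name] ****".
TASK_LINE = re.compile(r"TASK \[(?P<name>.+?)\]")


def last_task(log_text: str) -> Optional[str]:
    """Return the name of the last task banner found in the log, if any."""
    last = None
    for line in log_text.splitlines():
        match = TASK_LINE.search(line)
        if match:
            last = match.group("name")
    return last


def stalled_on_network_config(log_text: str) -> bool:
    """True if the last task that started was a NetworkConfig task."""
    name = last_task(log_text)
    return name is not None and "NetworkConfig" in name
```

If `stalled_on_network_config` returns True for the contents of ansible.log, the deployment most likely hung while applying the network configuration.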


The improvement here might be to improve the 'timeout' messaging by catching this error and handling it better. In future versions we've cleaned up this condition by removing the mistral to zaqar interactions.
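
One way such error handling could look, as a rough sketch only: treat an empty body as a probable timeout and raise an error that says so. This is not the actual fix that shipped; `DeploymentTimeoutError` and `decode_workflow_message` are hypothetical names.

```python
import json
from typing import Optional


class DeploymentTimeoutError(RuntimeError):
    """Hypothetical error raised when no usable response came back."""


def decode_workflow_message(body: Optional[str]):
    """Decode a message body, reporting an empty body as a likely timeout."""
    if body is None or not body.strip():
        raise DeploymentTimeoutError(
            "Received an empty response; the execution most likely timed out. "
            "Check the workflow execution and ansible.log for the last task."
        )
    # A non-empty body that is still malformed raises JSONDecodeError as usual.
    return json.loads(body)
```

With this, the operator sees a message pointing at a timeout instead of a bare JSONDecodeError.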

Comment 15 David Rosenfeld 2021-07-21 13:16:53 UTC
DF doesn't have baremetal servers. Checked with submitter and problem is no longer seen.

Comment 17 errata-xmlrpc 2021-09-15 07:08:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483