Bug 2227776 - bulk_pull error after compute node reboot resulting in timeout [NEEDINFO]
Summary: bulk_pull error after compute node reboot resulting in timeout
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Slawek Kaplonski
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-07-31 12:27 UTC by Paul Jany
Modified: 2023-08-16 20:20 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:
skaplons: needinfo? (pgodwin)




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-27039 0 None None None 2023-07-31 12:29:49 UTC

Description Paul Jany 2023-07-31 12:27:22 UTC
Description of problem:
After a compute node reboot, there is a bulk_pull error and the instances lose connectivity. The suggestions from bugzilla 2212348 did not help.
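
A minimal sketch of how the symptom might be confirmed on the affected node. The log path below assumes a director-style layout; this deployment uses custom CVIM containers, so the actual location may differ:

    # Log path assumed; adjust to the CVIM container layout.
    grep -i "bulk_pull" /var/log/containers/neutron/openvswitch-agent.log

    # Check whether the agents on the rebooted node came back alive.
    openstack network agent list --host <compute-hostname>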

Version-Release number of selected component (if applicable):
13

How reproducible:
The case has been re-opened. Of the suggestions we provided in Comment 3, the customer could only increase rpc_response_timeout, and that did not help.

Below are the steps the customer has performed:
- Rebooted the compute node; the bulk_pull error was observed.
- Increased rpc_response_timeout and restarted the neutron, OVS agent, and neutron-sriov containers (a sketch of the change is shown after this list). That did not help.
- The customer has to re-deploy the instances to restore connectivity.
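
For reference, a minimal sketch of the rpc_response_timeout change described above. The file path and the value 300 are assumptions; the option's default is 60 seconds, and the exact layout inside the customer's CVIM containers is not known:

    # /etc/neutron/neutron.conf (path assumed inside the CVIM containers)
    [DEFAULT]
    # Default is 60; raised to give slow bulk_pull RPC replies more time.
    rpc_response_timeout = 300

The neutron server and agent containers then need to be restarted for the new value to take effect, which matches what the customer did.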

I have asked about the possibility of an upgrade, since this is not the latest version of OSP 13, and was told that the Red Hat OpenStack software runs in custom containers without OSP Director; the upgrade has to be performed using their custom CVIM software.

