Bug 1734813

Summary: [OSP 15] rbd connection timeouts will lead to n-cpu hanging
Product: Red Hat OpenStack
Reporter: Lee Yarwood <lyarwood>
Component: openstack-nova
Assignee: Lee Yarwood <lyarwood>
Status: CLOSED DEFERRED
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: medium
Docs Contact:
Priority: medium
Version: 15.0 (Stein)
CC: dasmith, eglynn, jhakimra, kchamart, sbauza, sclewis, sgordon, vromanso
Target Milestone: z1
Keywords: Triaged, ZStream
Target Release: 15.0 (Stein)
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1734814
Environment:
Last Closed: 2019-09-19 14:28:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1734814, 1734815, 1734818    

Description Lee Yarwood 2019-07-31 14:16:09 UTC
Description of problem:
This was originally reported upstream but has recently been seen downstream by a number of customers with connectivity issues in their environments.

Nova waits indefinitely on ceph client hangs due to network problems
https://bugs.launchpad.net/nova/+bug/1834048


Version-Release number of selected component (if applicable):
OSP 15.beta

How reproducible:
Always.

Steps to Reproduce:
1. Deploy environment.
2. Later, block access to the rbd cluster.

Actual results:
n-cpu (nova-compute) will wait indefinitely while attempting to connect to the rbd cluster.

Expected results:
n-cpu should fail and mark the host as DOWN until connectivity is re-established. 
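
To illustrate the difference between the actual and expected behaviour, here is a minimal, generic sketch of bounding a blocking call with a timeout so that the caller can report a failure instead of hanging. This is not the nova code or the upstream fix. The names call_with_timeout and ConnectTimeout are hypothetical, and a plain thread is assumed here, whereas a real service may use a different concurrency model.

```python
import threading


class ConnectTimeout(Exception):
    """Hypothetical error raised when the wrapped call does not return in time."""


def call_with_timeout(func, timeout, *args, **kwargs):
    """Run func in a daemon thread and stop waiting after `timeout` seconds."""
    result = {}

    def _worker():
        try:
            result["value"] = func(*args, **kwargs)
        except Exception as exc:
            # Keep the callee's own failure so it can be re-raised below.
            result["error"] = exc

    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if thread.is_alive():
        # The worker is still blocked. Give up waiting so the caller can
        # surface the failure rather than hang along with it.
        raise ConnectTimeout("no response within %s seconds" % timeout)
    if "error" in result:
        raise result["error"]
    return result["value"]
```

With a wrapper like this, a connect attempt that never returns turns into an exception the service can act on (for example by reporting itself as unavailable). Note that the blocked worker thread is only abandoned, not cancelled.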

Additional info: