Bug 1594768 - OpenShift 3.7 Problems with different availability zones when integrating with OpenStack
Summary: OpenShift 3.7 Problems with different availability zones when integrating wi...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Documentation
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Vikram Goyal
QA Contact: Vikram Goyal
Docs Contact: Vikram Goyal
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-25 11:58 UTC by Christian Stark
Modified: 2022-03-13 15:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-23 11:01:46 UTC
Target Upstream Version:
Embargoed:



Description Christian Stark 2018-06-25 11:58:33 UTC
Description of problem:

The customer is integrating OpenShift with OpenStack and is running into a problem
with mismatched availability zones.

In OpenStack, availability zones are defined separately for storage (Cinder)
and compute (Nova).

When scheduling pods with Cinder PVs, Kubernetes tries to find a node that is in the availability zone of the Cinder volume for which the PV was created. Because the storage availability zone was called "nova" while the compute availability zones were called "a" and "b", scheduling did not work.
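
The mismatch described above can be illustrated with a minimal Python sketch. It is not the Kubernetes implementation; it only mimics the idea of comparing the zone label on a PV with the zone label on a node. The node names and the helper `zone_conflict` are hypothetical, and treating an unlabeled node or PV as "no conflict" is a simplifying assumption.

```python
# Simplified illustration of a volume-zone check (not the real scheduler code).
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def zone_conflict(node_labels, pv_labels):
    """Return True when the PV and the node both carry a zone label and they differ."""
    pv_zone = pv_labels.get(ZONE_LABEL)
    node_zone = node_labels.get(ZONE_LABEL)
    # Assumption: a missing label on either side imposes no constraint.
    if pv_zone is None or node_zone is None:
        return False
    return pv_zone != node_zone


# Hypothetical nodes labeled with the compute AZs "a" and "b".
nodes = {
    "node-1": {ZONE_LABEL: "a"},
    "node-2": {ZONE_LABEL: "b"},
}
# The PV carries the Cinder (volume) AZ, which is "nova".
pv = {ZONE_LABEL: "nova"}

schedulable = [name for name, labels in nodes.items() if not zone_conflict(labels, pv)]
print(schedulable)  # [] : no node matches the volume zone
```

With these labels no node qualifies, which corresponds to the NoVolumeZoneConflict failure on every node reported below.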

From the project:
The label failure-domain.beta.kubernetes.io/zone=nova refers to the volume AZ, not to the compute AZ. Currently there is only one volume AZ, called nova, but the nodes are labelled with nonexistent volume AZs such as production.
So we have to either create volume AZs with the same names as the compute AZs, or label the nodes with AZ nova.

The conclusion after some internal discussion was previously:
Currently there is no way to get OCP working with a zoned OpenStack where Cinder is also zoned. Kubernetes always expects the zone names to be the same between compute and Cinder zones.
A storage class is no help here, because Kubernetes looks at the Cinder zone.


Concrete Customer Problem (while installing OpenShift 3.7 on OpenStack):
When trying to start the Cassandra pods, scheduling fails with NoVolumeZoneConflict. The scheduler event reads:
  3h   3m   666   default-scheduler   Warning   FailedScheduling   0/7 nodes are available: 4 CheckServiceAffinity, 4 MatchNodeSelector, 7 NoVolumeZoneConflict.


----------------------------------------------------------------------------------------------------------------------------------------------------------------

A workaround that has been discussed is to remove the
NoVolumeZoneConflict predicate from the scheduler, as described here:
https://access.redhat.com/solutions/3251651
The customer has already confirmed that this works.
(The other workaround, relabeling the nodes, works as well but seems quite complex.)
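
As a rough sketch of what removing the predicate amounts to, the Python snippet below drops one entry from a scheduler policy held in memory. It assumes the policy is a JSON-style document with a `predicates` list of `{"name": ...}` entries; the sample policy and the helper `remove_predicate` are illustrative only, and the linked solution remains the authoritative procedure.

```python
import copy


def remove_predicate(policy, name):
    """Return a copy of the policy without the predicate called `name`."""
    updated = copy.deepcopy(policy)
    updated["predicates"] = [
        p for p in updated.get("predicates", []) if p.get("name") != name
    ]
    return updated


# Illustrative policy; a real scheduler policy contains more entries.
policy = {
    "kind": "Policy",
    "apiVersion": "v1",
    "predicates": [
        {"name": "NoVolumeZoneConflict"},
        {"name": "MatchNodeSelector"},
        {"name": "PodFitsResources"},
    ],
}

patched = remove_predicate(policy, "NoVolumeZoneConflict")
print([p["name"] for p in patched["predicates"]])
# ['MatchNodeSelector', 'PodFitsResources']
```

The original policy object is left untouched, so the unmodified version can be kept for rollback.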


Version-Release number of selected component (if applicable):
OpenShift 3.7


Additional info:

We don't see any information on this in the reference architecture guide
https://access.redhat.com/documentation/en-us/reference_architectures/2018/html-single/deploying_and_managing_openshift_3.9_on_red_hat_openstack_platform_10/
although we think availability zones should be covered there.



Questions from this bug:
1. Can the documentation be improved regarding availability zones in the
   OpenShift/OpenStack integration?
2. Can the architecture guide be enhanced regarding availability zones?
3. Are there any concerns with the workaround, and can it generally be used with OpenShift/OpenStack integrations?






Document URL: 
https://docs.openshift.com/container-platform/3.7/install_config/configuring_openstack.html

https://access.redhat.com/documentation/en-us/reference_architectures/2018/html-single/deploying_and_managing_openshift_3.9_on_red_hat_openstack_platform_10/

Comment 2 Vikram Goyal 2018-07-19 05:23:56 UTC
Hey Christian,

The OpenShift product docs team doesn't manage the reference architecture guide. A separate team does that. Once you confirm, I will transfer this bug to that team and liaise with them to get a resolution.

We maintain the product docs which are hosted here [1] or here [2].

I know that you have linked to the Configuring OpenStack document in the product docs [3], but I wasn't sure what you wanted us to improve there. Did you want us to add information about availability zones in the OpenShift/OpenStack integration there, as well as in the ref arch?

[1] https://docs.openshift.com/index.html

[2] https://access.redhat.com/documentation/en-us/openshift_container_platform/3.9/

[3] https://docs.openshift.com/container-platform/3.7/install_config/configuring_openstack.html

Comment 3 rlopez 2018-07-19 16:28:07 UTC
Will investigate this for OCP3.10 documentation with the goal of backporting to OCP3.9 documentation.

