Bug 1420432
Summary: | osp-d 11 - ipv6/vlan deployment fails on "unable to get cib" - missing corosync.conf | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | Pavel Sedlák <psedlak>
Component: | openstack-tripleo-image-elements | Assignee: | Michele Baldessari <michele>
Status: | CLOSED ERRATA | QA Contact: | Amit Ugol <augol>
Severity: | urgent | Priority: | unspecified
Version: | 11.0 (Ocata) | Target Release: | 11.0 (Ocata)
Target Milestone: | rc | Keywords: | Automation
Hardware: | Unspecified | OS: | Unspecified
Fixed In Version: | openstack-tripleo-image-elements-6.0.0-0.20170131024050.8597926 | Doc Type: | If docs needed, set a value
Last Closed: | 2017-05-17 19:57:45 UTC | Type: | Bug
CC: | aschultz, dbecker, jschluet, mburns, mcornea, michele, mkrcmari, morazi, psedlak, rhel-osp-director-maint, royoung | |

Description
Pavel Sedlák
2017-02-08 15:52:36 UTC
Created attachment 1248632: /home/stack/virt/debug.yaml
Created attachment 1248633: /home/stack/virt/hostnames.yml
Created attachment 1248635: /home/stack/virt/network/network-environment-v6.yaml (original ipv6/ipv4 ranges replaced with dummy example)

It seems that it's caused by a failed command that sets up cluster authentication:

    /sbin/pcs cluster auth controller-0 controller-1 controller-2 -u hacluster -p ***** --force

The pacemaker communication is blocked by iptables at the time the command is executed; the ipv6 rules that open the ports are only added later, after the command has already run. It's reproducible only on ipv6-based deployments, because the ipv4 iptables rules are empty when the command is executed:

    [heat-admin@controller-0 ~]$ sudo iptables -L
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination

    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination

    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination

because:

    [heat-admin@controller-0 ~]$ sudo cat /etc/sysconfig/iptables
    # empty ruleset created by tripleo-image-elements

But ip6tables already includes some rules at the time of command execution:

    [heat-admin@controller-0 ~]$ sudo ip6tables -L
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination
    ACCEPT     all      anywhere             anywhere             state RELATED,ESTABLISHED
    ACCEPT     ipv6-icmp    anywhere         anywhere
    ACCEPT     all      anywhere             anywhere
    ACCEPT     tcp      anywhere             anywhere             state NEW tcp dpt:ssh
    ACCEPT     udp      anywhere             fe80::/64            udp dpt:dhcpv6-client state NEW
    REJECT     all      anywhere             anywhere             reject-with icmp6-adm-prohibited

    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination
    REJECT     all      anywhere             anywhere             reject-with icmp6-adm-prohibited

    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination

And:

    [heat-admin@controller-0 ~]$ sudo cat /etc/sysconfig/ip6tables
    # sample configuration for ip6tables service
    # you can edit this manually or use system-config-firewall
    # please do not ask us to add additional ports/services to this default configuration
    *filter
    :INPUT ACCEPT [0:0]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
    -A INPUT -p ipv6-icmp -j ACCEPT
    -A INPUT -i lo -j ACCEPT
    -A INPUT -p tcp -m state --state NEW -m tcp --dport 22 -j ACCEPT
    -A INPUT -d fe80::/64 -p udp -m udp --dport 546 -m state --state NEW -j ACCEPT
    -A INPUT -j REJECT --reject-with icmp6-adm-prohibited
    -A FORWARD -j REJECT --reject-with icmp6-adm-prohibited
    COMMIT

So from an initial look this is the downstream ipv6 manifestation of this bug: https://bugs.launchpad.net/tripleo/+bug/1657108/. The super short version is that if we start off from an image that has prepopulated /etc/sysconfig/ip[6]tables rules (and the iptables package does ship such rules, which only allow ssh and icmp), pcs is executed before the firewall module has kicked in to open up the pacemaker/pcs ports, and so it fails.

To verify/disprove this theory, can you try the following on the undercloud:

    echo '' > /tmp/iptables
    echo '' > /tmp/ip6tables
    virt-copy-in -a overcloud-full.qcow2 /tmp/iptables /etc/sysconfig/
    virt-copy-in -a overcloud-full.qcow2 /tmp/ip6tables /etc/sysconfig/
    openstack overcloud image upload --image-path . --update-existing

And then try and redeploy? Note that we already have fixes in order to empty these stock rules from the image building process. I assume that they have not yet hit downstream, because if that were the case we would not see the entries in ip[6]tables at comment 5.

(In reply to Michele Baldessari from comment #7)
> And then try and redeploy? Note that we already have fixes in order to empty
> these stock rules from the image building process. I assume that they have
> not yet hit downstream, because if that were the case we would not see the
> entries in ip[6]tables at comment 5.

I am confirming Michele's assumption: the deployment was successful after placing empty iptables rules into the overcloud image and relabeling selinux.

Mike, any idea when we will build images that have the following t-i-e patch?
https://review.openstack.org/#/c/426144/

thanks,
Michele

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245
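
For anyone who wants to check an image or a node for the condition discussed in this report, the diagnosis can be sketched as a small shell check. This is only an illustration, not part of pcs, TripleO or tripleo-image-elements: the function names `ruleset_is_empty` and `input_verdict` are hypothetical, and the rule matching is deliberately simplified (it reads an iptables-save style file, looks only at the INPUT chain, and treats any rule restricted by `-i`, `-s` or `-d` as not matching). It assumes the connection of interest is a new TCP connection to pcsd, which listens on port 2224 by default.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting /etc/sysconfig/ip[6]tables style files.

# ruleset_is_empty FILE
# Succeeds when FILE contains only blank lines and '#' comments, like the
# "# empty ruleset created by tripleo-image-elements" file quoted above.
ruleset_is_empty() {
    ! grep -Eqv '^[[:space:]]*(#.*)?$' "$1"
}

# input_verdict FILE PROTO PORT
# Prints the first ACCEPT/REJECT/DROP verdict the INPUT chain in FILE would
# give a NEW connection to PROTO/PORT, or the chain policy if no rule matches.
# Simplifying assumptions (not real iptables semantics):
#   - rules restricted by -i, -s or -d never match
#   - --state rules match only if the state list contains NEW
#   - port ranges and user-defined chains are not handled
#   - a file without an INPUT policy line is treated as policy ACCEPT
input_verdict() {
    awk -v proto="$2" -v port="$3" '
        $1 == ":INPUT" { policy = $2; next }
        $1 == "-A" && $2 == "INPUT" {
            match_rule = 1; target = ""
            for (i = 3; i <= NF; i++) {
                if ($i == "-p" && $(i+1) != proto) match_rule = 0
                else if ($i == "--dport" && $(i+1) != port) match_rule = 0
                else if ($i == "-i" || $i == "-s" || $i == "-d") match_rule = 0
                else if ($i == "--state" && ("," $(i+1) ",") !~ /,NEW,/) match_rule = 0
                else if ($i == "-j") target = $(i+1)
            }
            if (match_rule && (target == "ACCEPT" || target == "REJECT" || target == "DROP")) {
                print target; found = 1; exit
            }
        }
        END { if (!found) print (policy == "" ? "ACCEPT" : policy) }
    ' "$1"
}
```

Run as `input_verdict /etc/sysconfig/ip6tables tcp 2224` against the stock file quoted in this report, the sketch falls through to the final catch-all rule and prints REJECT, while the same call against an emptied file prints ACCEPT. `ruleset_is_empty /etc/sysconfig/ip6tables` can be used the same way to confirm that an image already carries the emptied rules.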