Bug 1463227

Summary: [osp11][update] Minor update of OSP11 to OSP11 + RHEL 7.4 failed due to missing NAT rules for br-ctrlplane - compute node cannot reach external repos
Product: Red Hat OpenStack
Reporter: Artem Hrechanychenko <ahrechan>
Component: rhosp-director
Assignee: Sofer Athlan-Guyot <sathlang>
Status: CLOSED DUPLICATE
QA Contact: Amit Ugol <augol>
Severity: low
Priority: low
Version: 11.0 (Ocata)
CC: ahrechan, amuller, dbecker, emacchi, mbultel, mburns, mcornea, morazi, rhel-osp-director-maint, sasha, sathlang
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: x86_64
OS: Linux
Doc Type: If docs needed, set a value
Story Points: ---
Last Closed: 2017-08-28 13:38:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Attachments:
install-undercloud.log (flags: none)
sosreport from undercloud (flags: none)

Description Artem Hrechanychenko 2017-06-20 12:10:27 UTC
Created attachment 1289621
install-undercloud.log

Description of problem:
The minor update of OSP11 failed at the overcloud upgrade stage, on the compute node:
http://pastebin.test.redhat.com/495720

The compute node cannot reach external networks:
[heat-admin@compute-0 ~]$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms



NAT table on the undercloud node:
[stack@undercloud-0 ~]$ sudo iptables -nL -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
nova-api-PREROUTING  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
nova-api-OUTPUT  all  --  0.0.0.0/0            0.0.0.0/0           
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
nova-api-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0           
nova-postrouting-bottom  all  --  0.0.0.0/0            0.0.0.0/0           
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0           

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain nova-api-OUTPUT (1 references)
target     prot opt source               destination         

Chain nova-api-POSTROUTING (1 references)
target     prot opt source               destination         

Chain nova-api-PREROUTING (1 references)
target     prot opt source               destination         

Chain nova-api-float-snat (1 references)
target     prot opt source               destination         

Chain nova-api-snat (1 references)
target     prot opt source               destination         
nova-api-float-snat  all  --  0.0.0.0/0            0.0.0.0/0           

Chain nova-postrouting-bottom (1 references)
target     prot opt source               destination         
nova-api-snat  all  --  0.0.0.0/0            0.0.0.0/0           
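
The listing above contains a single MASQUERADE rule, for source 172.17.0.0/16. A minimal sketch of how that check could be scripted against the same "iptables -nL -t nat" output is shown below. The helper name has_masquerade and the 192.168.24.0/24 subnet in the usage comment are illustrative assumptions, not values taken from this report.

```shell
#!/bin/sh
# has_masquerade SUBNET
# Reads `iptables -nL -t nat` output on stdin and succeeds (exit 0) if any
# chain holds a MASQUERADE rule whose source column equals SUBNET.
# In -nL output the columns are: target prot opt source destination.
has_masquerade() {
    awk -v subnet="$1" '
        $1 == "MASQUERADE" && $4 == subnet { found = 1 }
        END { exit !found }
    '
}

# Hypothetical usage on the undercloud (subnet is a placeholder):
#   sudo iptables -nL -t nat | has_masquerade 192.168.24.0/24 \
#       || echo "no MASQUERADE rule for that subnet"
```

The match is an exact string comparison on the source column, so a rule written with a wider or narrower prefix would not be detected by this sketch.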


Version-Release number of selected component (if applicable):
OSP11 + rhel7.4
How reproducible:


Steps to Reproduce:
1. Deploy OSP11 using infrared:
infrared virsh -v --host-address 10.9.76.22 --host-key ~/.ssh/id_rsa --cleanup yes &&
infrared virsh -v --host-address 10.9.76.22 --host-key ~/.ssh/id_rsa --topology-nodes undercloud:1,controller:1,compute:1 -e override.controller.cpu=8 -e override.controller.memory=32768 -e override.undercloud.disks.disk1.size=100G &&
infrared tripleo-undercloud --version 11 --images-task=rpm &&
infrared tripleo-overcloud -v --introspect yes --tagging yes --post no --deployment-files virt --version 11 --deploy yes

2. Run "sudo rhos-release 11 -r 7.4" on the undercloud and on the overcloud nodes.

3. Perform the minor update:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/11/html/upgrading_red_hat_openstack_platform/


Actual results:
overcloud update failed

Expected results:
overcloud update passed

Additional info:

Comment 1 Red Hat Bugzilla Rules Engine 2017-06-20 12:10:32 UTC
This bugzilla has been removed from the release and needs to be reviewed and Triaged for another Target Release.

Comment 2 Alexander Chuzhoy 2017-06-20 16:51:06 UTC
Related to https://bugzilla.redhat.com/show_bug.cgi?id=1463375 ?

Comment 3 Artem Hrechanychenko 2017-06-20 16:53:38 UTC
Maybe. I didn't check the NAT rules after the undercloud upgrade, because the upgrade failed only on the compute node, after the controller node upgrade had finished.

Comment 4 Sofer Athlan-Guyot 2017-06-26 13:05:12 UTC
Hi,

Could we have the sos-report from the undercloud?

Comment 5 Artem Hrechanychenko 2017-06-26 13:11:45 UTC
reproduced

Comment 6 Artem Hrechanychenko 2017-06-26 13:12:17 UTC
Created attachment 1291977
sosreport from undercloud

Comment 7 Sofer Athlan-Guyot 2017-06-27 09:24:38 UTC
Hi,

The sos-report looks a little empty.

As a side note, this seems to be particular to test environments; I don't think we've seen it before on a non-virtual test environment. But it would be nice to get to the bottom of it.

As a workaround I run this after the upgrade steps:

if ! /usr/sbin/ip a | grep vlan10; then
    sudo ifup ifcfg-vlan10
fi

if ! sudo /usr/sbin/iptables -L BOOTSTACK_MASQ -nvx -t nat | grep 10.0.0.0; then
    sudo iptables -t nat -A BOOTSTACK_MASQ -o eth0 -s 10.0.0.0/24 -j MASQUERADE
fi

I think a similar command, adjusted to your particular infrared setup, should help you move forward.
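
For adapting the workaround to a different setup, the rule check can be parameterized. This is only a sketch under stated assumptions: the function name ensure_masquerade and the IPTABLES override variable are hypothetical, and the chain, subnet and interface must be replaced with the values of the actual environment. It uses "iptables -C", which tests for an exact rule match, instead of grepping the listing.

```shell
#!/bin/sh
# IPTABLES may be overridden (e.g. IPTABLES="sudo iptables"); it is left
# unquoted below on purpose so that such a value splits into words.
IPTABLES="${IPTABLES:-iptables}"

# ensure_masquerade CHAIN SUBNET OUT_IF
# Appends a MASQUERADE rule to CHAIN in the nat table unless an identical
# rule is already present. The chain itself must already exist.
ensure_masquerade() {
    chain="$1"
    subnet="$2"
    out_if="$3"
    # -C exits non-zero when no matching rule exists in the chain.
    if ! $IPTABLES -t nat -C "$chain" -o "$out_if" -s "$subnet" -j MASQUERADE 2>/dev/null; then
        $IPTABLES -t nat -A "$chain" -o "$out_if" -s "$subnet" -j MASQUERADE
    fi
}

# Hypothetical usage, with the values from the workaround above:
#   IPTABLES="sudo iptables" ensure_masquerade BOOTSTACK_MASQ 10.0.0.0/24 eth0
```

Note that the NAT listing in the description shows no BOOTSTACK_MASQ chain at all. If the chain is missing, both the check and the append would fail, so the chain may have to be recreated before a rule can be added to it.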

Comment 8 Artem Hrechanychenko 2017-06-27 10:04:44 UTC
Yep, I have a workaround for that issue, and this isn't a blocker.

Comment 9 Sofer Athlan-Guyot 2017-06-28 09:47:25 UTC
Hi Artem,

Putting low priority on this one. We will look at it as time permits.

Thanks for the report. If you have the complete sos-report, it may be handy when we look back at it.

Comment 10 Artem Hrechanychenko 2017-07-24 11:56:41 UTC
(In reply to Sofer Athlan-Guyot from comment #9)
> Hi Artem,
> 
> putting low priority on this one.  We will look at it as time permit.
> 
> Thanks for the report, if you have the complete sos-report that may be handy
> when we look back at it.

The root cause of this bz is the same as for https://bugzilla.redhat.com/show_bug.cgi?id=1460116

Comment 11 Sofer Athlan-Guyot 2017-07-25 09:54:19 UTC
Hi,

Noted. I will add it to the See Also section for reference while we try to find the root cause of that other bug.

Thanks,

Comment 12 Assaf Muller 2017-08-28 13:38:12 UTC

*** This bug has been marked as a duplicate of bug 1460116 ***