Bug 1463375 - rhosp-director: after rebooting the undercloud node iptables fails to start properly.
Status: ASSIGNED
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ga
Target Release: 12.0 (Pike)
Assigned To: Ben Nemec
QA Contact: Amit Ugol
Keywords: Triaged
Depends On:
Blocks:
Reported: 2017-06-20 12:44 EDT by Alexander Chuzhoy
Modified: 2017-07-27 12:45 EDT (History)
Type: Bug

Attachments
install-undercloud.log (352.02 KB, text/plain)
2017-06-26 18:59 EDT, Alexander Chuzhoy

Description Alexander Chuzhoy 2017-06-20 12:44:32 EDT
rhosp-director: masquerade_network directive in undercloud.conf is being ignored.

Environment:
instack-undercloud-7.0.1-0.20170609013145.el7ost.noarch


Steps to reproduce:

(undercloud) [stack@undercloud-0 ~]$ cat undercloud.conf 
[DEFAULT]
# Network interface on the Undercloud that will be handling the PXE
# boots and DHCP for Overcloud instances. (string value)
local_interface = eth0

# 192.168.24.0 subnet is by default used since RHOS11
local_ip = 192.168.24.1/24
network_gateway = 192.168.24.1
undercloud_public_vip = 192.168.24.2
undercloud_admin_vip = 192.168.24.3
network_cidr = 192.168.24.0/24
masquerade_network = 192.168.24.0/24
dhcp_start = 192.168.24.5
dhcp_end = 192.168.24.24
inspection_iprange = 192.168.24.100,192.168.24.120


Deploy the undercloud.

See the iptables rules:


(undercloud) [stack@undercloud-0 ~]$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
nova-api-PREROUTING  all  --  anywhere             anywhere            
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
nova-api-OUTPUT  all  --  anywhere             anywhere            
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
nova-api-POSTROUTING  all  --  anywhere             anywhere            
MASQUERADE  all  --  172.17.0.0/16        anywhere            
nova-postrouting-bottom  all  --  anywhere             anywhere            

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere            

Chain nova-api-OUTPUT (1 references)
target     prot opt source               destination         

Chain nova-api-POSTROUTING (1 references)
target     prot opt source               destination         

Chain nova-api-PREROUTING (1 references)
target     prot opt source               destination         

Chain nova-api-float-snat (1 references)
target     prot opt source               destination         

Chain nova-api-snat (1 references)
target     prot opt source               destination         
nova-api-float-snat  all  --  anywhere             anywhere            

Chain nova-postrouting-bottom (1 references)
target     prot opt source               destination         
nova-api-snat  all  --  anywhere             anywhere 



Expected result: 

A NAT (masquerade) rule for 192.168.24.0/24 is present.
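
For anyone who wants to check this automatically, here is a minimal sketch that scans the text of `sudo iptables -t nat -L` for a MASQUERADE rule whose source is the masquerade_network value. The function name `has_masquerade_rule` is hypothetical, and the parsing assumes the column layout shown in the listings in this bug (target, prot, opt, source, destination). Sources printed as "anywhere" or as resolved hostnames are skipped rather than matched.

```python
import ipaddress


def has_masquerade_rule(nat_listing: str, network: str) -> bool:
    """Return True if the listing has a MASQUERADE rule sourced from `network`.

    `nat_listing` is the text printed by `iptables -t nat -L`;
    `network` is a CIDR string such as "192.168.24.0/24".
    """
    wanted = ipaddress.ip_network(network)
    for line in nat_listing.splitlines():
        fields = line.split()
        # Rule rows look like: target prot opt source destination [extras...]
        if len(fields) < 5 or fields[0] != "MASQUERADE":
            continue
        try:
            source = ipaddress.ip_network(fields[3])
        except ValueError:
            continue  # source is "anywhere" or a hostname, not a CIDR
        if source == wanted:
            return True
    return False
```

Fed the listing above, this should report the rule as missing, since the only MASQUERADE row there is the Docker one for 172.17.0.0/16.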
Comment 1 Alexander Chuzhoy 2017-06-20 12:51:55 EDT
Related to http://bugzilla.redhat.com/show_bug.cgi?id=1463227 ?
Comment 2 Ben Nemec 2017-06-20 17:03:46 EDT
Hmm, I'm not seeing this behavior:

[cloud-user@undercloud-rhos ~]$ rpm -qa | grep instack
instack-undercloud-7.0.1-0.20170609013145.el7ost.noarch

[cloud-user@undercloud-rhos ~]$ sudo iptables -L -t nat
[snip]
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
BOOTSTACK_MASQ  all  --  anywhere             anywhere            
nova-api-POSTROUTING  all  --  anywhere             anywhere            
nova-postrouting-bottom  all  --  anywhere             anywhere            
MASQUERADE  all  --  172.17.0.0/16        anywhere            

Chain BOOTSTACK_MASQ (1 references)
target     prot opt source               destination         
MASQUERADE  all  --  192.168.24.0/24     !192.168.24.0/24
[rest of output snipped]

I used the supplied undercloud.conf, except that I had to change to eth1 since eth0 on my VM is the one providing external connectivity.  I suppose it's possible that is relevant here.  Could you provide the ~/.instack/install-undercloud.log file so I can see what the install is doing with iptables?
Comment 3 Alexander Chuzhoy 2017-06-22 13:28:26 EDT
Hmm. Doesn't always reproduce - I didn't see it on my last deployment.
Will add the log once I catch it again.
Comment 4 Alexander Chuzhoy 2017-06-26 18:59 EDT
Created attachment 1292072 [details]
install-undercloud.log
Comment 5 Ben Nemec 2017-06-28 17:45:10 EDT
Strange.  The commands being run to set up iptables in that log are exactly the same as the ones in my deployment where the rules are correctly configured.  I don't see any reason that the rule would be missing.

The only tiny difference I see is the ordering of the other three POSTROUTING rules.  In the bad deployment they are ordered like this:

nova-api-POSTROUTING  all  --  anywhere             anywhere            
MASQUERADE  all  --  172.17.0.0/16        anywhere            
nova-postrouting-bottom  all  --  anywhere             anywhere            

Whereas in mine they are:

nova-api-POSTROUTING  all  --  anywhere             anywhere            
nova-postrouting-bottom  all  --  anywhere             anywhere            
MASQUERADE  all  --  172.17.0.0/16        anywhere            

Maybe there's a race in setting up the rules in this chain that causes the undercloud masquerade rule to get dropped?  I don't know why that would be the case though. :-/

Is there any external tooling in these environments that might be making changes to the iptables rules after the undercloud deployment completes?  Maybe something that adds a masquerade rule for the overcloud external network?
Comment 6 Artem Hrechanychenko 2017-07-26 04:24:09 EDT
Hi,
Please check iptables service status.
We have a similar issue when deploying OSP 12, updating OSP 12, and updating and upgrading OSP 11.
https://bugzilla.redhat.com/show_bug.cgi?id=1460116
https://bugzilla.redhat.com/show_bug.cgi?id=1463227
Comment 7 Alexander Chuzhoy 2017-07-26 12:43:21 EDT
Well, with iptables (vs firewalld), it's only a matter of having the rule.
BTW, the workaround I use in automation is a single command that re-adds that NAT rule.
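
The exact workaround command is not given in this bug. As a hedged sketch only, the helper below builds an iptables argument list that mirrors the shape of the rule in comment 2 (source in the network, destination outside it, target MASQUERADE). The helper name `masquerade_rule_args`, the default of appending straight to POSTROUTING instead of the BOOTSTACK_MASQ chain, and the use of `-w` to wait for the xtables lock are all assumptions, not what the automation actually runs.

```python
import ipaddress


def masquerade_rule_args(network: str, chain: str = "POSTROUTING") -> list:
    """Build an iptables argv that appends a masquerade rule for `network`.

    Hypothetical helper: it only builds the argument list and does not
    run anything. Raises ValueError if `network` is not a valid CIDR.
    """
    cidr = str(ipaddress.ip_network(network))  # validates the CIDR
    return [
        "iptables", "-w",        # -w: wait for the xtables lock
        "-t", "nat",
        "-A", chain,             # append to the chosen chain
        "-s", cidr,              # traffic coming from the network...
        "!", "-d", cidr,         # ...and leaving it
        "-j", "MASQUERADE",
    ]
```

The list can be passed to `subprocess.run(..., check=True)` as root. Note that `-A` appends unconditionally, so running it twice would add a duplicate rule.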
Comment 8 Alexander Chuzhoy 2017-07-27 12:31:32 EDT
(undercloud) [stack@undercloud-0 ~]$ sudo service iptables status
Redirecting to /bin/systemctl status iptables.service
● iptables.service - IPv4 firewall with iptables
   Loaded: loaded (/usr/lib/systemd/system/iptables.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2017-07-27 10:04:58 EDT; 2h 20min ago
  Process: 567 ExecStart=/usr/libexec/iptables/iptables.init start (code=exited, status=1/FAILURE)
 Main PID: 567 (code=exited, status=1/FAILURE)

Jul 27 10:04:58 undercloud-0.redhat.local systemd[1]: Starting IPv4 firewall with iptables...
Jul 27 10:04:58 undercloud-0.redhat.local iptables.init[567]: iptables: Applying firewall rules: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Jul 27 10:04:58 undercloud-0.redhat.local iptables.init[567]: [FAILED]
Jul 27 10:04:58 undercloud-0.redhat.local systemd[1]: iptables.service: main process exited, code=exited, status=1/FAILURE
Jul 27 10:04:58 undercloud-0.redhat.local systemd[1]: Failed to start IPv4 firewall with iptables.
Jul 27 10:04:58 undercloud-0.redhat.local systemd[1]: Unit iptables.service entered failed state.
Jul 27 10:04:58 undercloud-0.redhat.local systemd[1]: iptables.service failed.


Rebooted the undercloud and the service was in failed state again. Changing the title accordingly.
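
To spot this failure mode after a reboot without reading the status by eye, a small sketch like the following could be run over the captured text of `systemctl status iptables.service`. The function name `failed_on_xtables_lock` is hypothetical, and the match strings are taken verbatim from the status output above, so other iptables versions may word the message differently.

```python
LOCK_MESSAGE = "Another app is currently holding the xtables lock"


def failed_on_xtables_lock(status_output: str) -> bool:
    """Return True if the unit is failed and the log shows the lock error.

    `status_output` is the text printed by
    `systemctl status iptables.service`.
    """
    lines = status_output.splitlines()
    # The "Active:" line reports the unit state, e.g. "Active: failed (...)".
    unit_failed = any(
        line.strip().startswith("Active: failed") for line in lines
    )
    # The journal excerpt carries the xtables lock message from iptables.init.
    lock_held = any(LOCK_MESSAGE in line for line in lines)
    return unit_failed and lock_held
```

A True result matches what is shown here: the rules were never applied at boot because another process held the lock, and a later manual start of the service restores them.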


(undercloud) [stack@undercloud-0 ~]$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  172.17.0.0/16        anywhere            

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere            




(undercloud) [stack@undercloud-0 ~]$ sudo service iptables start
Redirecting to /bin/systemctl start iptables.service





(undercloud) [stack@undercloud-0 ~]$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
REDIRECT   tcp  --  anywhere             169.254.169.254      tcp dpt:http redir ports 8775
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
BOOTSTACK_MASQ  all  --  anywhere             anywhere            
MASQUERADE  all  --  172.17.0.0/16        anywhere            

Chain BOOTSTACK_MASQ (1 references)
target     prot opt source               destination         
MASQUERADE  all  --  192.168.24.0/24     !192.168.24.0/24     

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere
Comment 9 Alex Schultz 2017-07-27 12:41:24 EDT
Is this 7.4? I think there's another bug for iptables not starting correctly on reboot, but I need to find it.
Comment 10 Alex Schultz 2017-07-27 12:44:34 EDT
Bug 1465382
Comment 11 Alexander Chuzhoy 2017-07-27 12:45:57 EDT
7.4 yes.
