Bug 1016619

Summary: RHEV Hypervisor - 6.5 - 20130918.0.el6 - Network error after host approval (icmp blocked by iptables on host)
Product: Red Hat Enterprise Linux 6 Reporter: Martin Pavlik <mpavlik>
Component: ovirt-node Assignee: Joey Boggs <jboggs>
Status: CLOSED ERRATA QA Contact: Martin Pavlik <mpavlik>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.5 CC: acathrow, alonbl, bsarathy, cshao, fdeutsch, gklein, gouyang, hadong, huiwa, iheim, jboggs, leiwang, mburns, mpavlik, ovirt-maint, tlavigne, yaniwang, ycui, yeylon
Target Milestone: rc Keywords: TestOnly
Target Release: 6.5   
Hardware: x86_64   
OS: Linux   
Whiteboard: network
Fixed In Version: ovirt-node-3.0.1-5.el6 Doc Type: Bug Fix
Doc Text:
After registering a hypervisor to the Manager with the "automatically configure host firewall" option, ICMP traffic to the host was blocked by the iptables rules. This caused the hypervisor to become non-responsive. host-deploy now configures iptables rules correctly after a host is registered to the Manager.
Last Closed: 2014-01-21 19:53:15 UTC Type: Bug
Attachments:
Description Flags
log_collector none

Description Martin Pavlik 2013-10-08 12:25:57 UTC
Created attachment 809245 [details]
log_collector

Description of problem:
After approving a 6.5 RHEV-H host in RHEV-M, the host becomes non-responsive.

It seems that at some point in the approval process RHEV-M tries to verify via ICMP that the host is alive:
14:15:45.939740 IP 10.34.66.51 > 10.34.63.69: ICMP host 10.34.66.51 unreachable - admin prohibited, length 68

This request is rejected by iptables on the host by the following rule:
9    REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited 

This rule needs to be removed or modified to allow ICMP from RHEV-M, otherwise RHEV-M considers the host non-responsive. 

After removing the rule by running
iptables -D INPUT 9
the host immediately goes UP.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.3.0-0.24.master.el6ev 
RHEV Hypervisor - 6.5 - 20130918.0.el6

How reproducible:
100%

Steps to Reproduce:
1. Install the host with the following PXE line, adjusted accordingly:

APPEND rootflags=loop initrd=images/RHEVH/rhev-hypervisor6-6.5-20130918.0.el6ev/initrd0.img root=live:/rhevh-latest-6.iso rootfstype=auto ro liveimg nomodeset check rootflags=ro crashkernel=512M-2G:64M,2G-:128M elevator=deadline processor.max_cstate=1 rd_NO_LVM rd_NO_LUKS rd_NO_MD rd_NO_DM console=tty0 console=ttyS1,115200n81 firstboot storage_init=/dev/sda storage_vol=::::: ssh_pwauth=1 adminpw=LMi16hIGAvm0A ntp=10.34.32.125 edd=off rhevm_admin_password=gPA37ATxRODnA management_server=mp-rhevm33.rhev.lab.eng.brq.redhat.com:443
    IPAPPEND 2

2. Approve the host in RHEV-M.
3. Wait until the host is installed and becomes non-responsive.

Actual results:
The host is non-responsive after being added to RHEV-M.

Expected results:
The host is UP.

Additional info:

RHEV Hypervisor - 6.5 - 20130918.0.el6

2013-10-08 14:13:48,472 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (pool-5-thread-6) [6b00b4f2] Host installation failed for host ea706e93-84c3-420c-9a8c-84e66777cee2, dell-r210ii-05.rhev.lab.eng.brq.redhat.com.: org.ovirt.engine.core.bll.InstallVdsCommand$VdsInstallException: Network error during communication with the host
	at org.ovirt.engine.core.bll.InstallVdsCommand.configureManagementNetwork(InstallVdsCommand.java:290) [bll.jar:]
	at org.ovirt.engine.core.bll.InstallVdsCommand.installHost(InstallVdsCommand.java:205) [bll.jar:]
	at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:105) [bll.jar:]
	at org.ovirt.engine.core.bll.ApproveVdsCommand.executeCommand(ApproveVdsCommand.java:49) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1135) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1220) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1896) [bll.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:]
	at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:]
	at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1240) [bll.jar:]
	at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:366) [bll.jar:]
	at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:175) [bll.jar:]
	at org.ovirt.engine.core.bll.MultipleActionsRunner.RunCommands(MultipleActionsRunner.java:156) [bll.jar:]
	at org.ovirt.engine.core.bll.MultipleActionsRunner$1.run(MultipleActionsRunner.java:94) [bll.jar:]
	at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [utils.jar:]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_40]
	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_40]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_40]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_40]
	at java.lang.Thread.run(Thread.java:724) [rt.jar:1.7.0_40]

2013-10-08 14:13:48,477 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-5-thread-6) [6b00b4f2] START, SetVdsStatusVDSCommand(HostName = dell-r210ii-05.rhev.lab.eng.brq.redhat.com, HostId = ea706e93-84c3-420c-9a8c-84e66777cee2, status=NonResponsive, nonOperationalReason=NONE), log id: 6132809f
2013-10-08 14:13:48,497 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (pool-5-thread-6) [6b00b4f2] FINISH, SetVdsStatusVDSCommand, log id: 6132809f
2013-10-08 14:13:48,530 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-6) [6b00b4f2] Correlation ID: 6b00b4f2, Job ID: 37375e29-f47b-4dd5-96ba-966fe3b232dd, Call Stack: null, Custom Event ID: -1, Message: Host dell-r210ii-05.rhev.lab.eng.brq.redhat.com installation failed. Network error during communication with the host.
2013-10-08 14:13:49,716 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-24) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.NoRouteToHostException: No route to host


[root@dell-r210ii-05 ~]# iptables -L INPUT --line-numbers
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination         
1    ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
2    ACCEPT     icmp --  anywhere             anywhere            
3    ACCEPT     all  --  anywhere             anywhere            
4    ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:16514 
5    ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh 
6    ACCEPT     tcp  --  anywhere             anywhere            multiport dports xprtld:6166 
7    ACCEPT     tcp  --  anywhere             anywhere            multiport dports 49152:49216 
8    ACCEPT     udp  --  anywhere             anywhere            udp dpt:snmp 
9    REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited

Comment 3 Guohua Ouyang 2013-10-11 08:27:02 UTC
(In reply to Martin Pavlik from comment #0)
> [root@dell-r210ii-05 ~]# iptables -L INPUT --line-numbers
> Chain INPUT (policy ACCEPT)
> num  target     prot opt source               destination         
> 1    ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
> 2    ACCEPT     icmp --  anywhere             anywhere            
> 3    ACCEPT     all  --  anywhere             anywhere            

Isn't rule 2 accepting the ICMP packets?

I tested it on build rhevh-6.5-20130930.0.auto665.el6, which has no vdsm; after the host network is up, pinging outside works.
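
The question about rule 2 can be checked mechanically against the numbered listing from comment 0. The following is only a minimal sketch: the function names are made up for illustration, and it deliberately looks only at rules whose protocol column is "icmp", because `iptables -L` without `-v` does not show interfaces, so the "ACCEPT all" rule may well be restricted to a single interface such as loopback.

```python
from typing import List, Optional, Tuple

# Numbered INPUT listing as posted in comment 0.
LISTING = """\
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination
1    ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED
2    ACCEPT     icmp --  anywhere             anywhere
3    ACCEPT     all  --  anywhere             anywhere
4    ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:16514
5    ACCEPT     tcp  --  anywhere             anywhere            tcp dpt:ssh
6    ACCEPT     tcp  --  anywhere             anywhere            multiport dports xprtld:6166
7    ACCEPT     tcp  --  anywhere             anywhere            multiport dports 49152:49216
8    ACCEPT     udp  --  anywhere             anywhere            udp dpt:snmp
9    REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited
"""


def numbered_rules(listing: str) -> List[Tuple[int, str, str]]:
    """Return (rule number, target, protocol) for each numbered rule line."""
    rules = []
    for line in listing.splitlines():
        fields = line.split()
        # Rule lines start with a number; chain and column headers do not.
        if len(fields) >= 6 and fields[0].isdigit():
            rules.append((int(fields[0]), fields[1], fields[2]))
    return rules


def first_icmp_accept(listing: str) -> Optional[int]:
    """Number of the first ICMP-specific ACCEPT rule that precedes any REJECT."""
    for num, target, prot in numbered_rules(listing):
        if target == "ACCEPT" and prot == "icmp":
            return num
        if target == "REJECT":
            return None
    return None


print(first_icmp_accept(LISTING))
```

For this particular listing the ICMP ACCEPT rule does sit ahead of the REJECT rule, so the listing alone does not show ICMP being blocked.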

Comment 4 Guohua Ouyang 2013-10-12 07:02:44 UTC
This should be a vdsm issue: it changes the host's iptables rules when the default option "automatically configure host firewall" under advanced parameters is used.

1. iptables output before adding the host to rhevm
# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:54321 
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED 
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:16514 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:22 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 5634:6166 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 49152:49216 
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:161 
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited 

2. iptables output after adding the host to rhevm; the "icmp" rule has been dropped.
# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED 
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:54321 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:22 
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:161 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:16514 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 5634:6166 
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 49152:49216 
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited
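
The before/after comparison above can also be done programmatically. This is a rough sketch, assuming the plain `iptables -L -n` layout shown in this comment; `parse_rules` is a hypothetical helper, and rule order is ignored on purpose since the two listings order their rules differently.

```python
from typing import List, Tuple

BEFORE = """\
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:54321
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:16514
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:22
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 5634:6166
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 49152:49216
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:161
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited
"""

AFTER = """\
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:54321
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:22
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0           udp dpt:161
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:16514
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 5634:6166
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           multiport dports 49152:49216
REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited
"""


def parse_rules(listing: str) -> List[Tuple[str, str, str]]:
    """Return (target, protocol, match text) for each rule in the listing."""
    rules = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip the chain header, the column header row, and blank lines.
        if len(fields) < 5 or fields[0] in ("Chain", "target"):
            continue
        rules.append((fields[0], fields[1], " ".join(fields[5:])))
    return rules


before = parse_rules(BEFORE)
after = parse_rules(AFTER)

# Compare by content only; position in the chain is not considered.
removed = [rule for rule in before if rule not in after]
added = [rule for rule in after if rule not in before]

print("removed:", removed)
print("added:", added)
```

With the two listings from this comment, the only difference reported is the ICMP ACCEPT rule, which is present before registration and absent afterwards.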

Comment 5 Alon Bar-Lev 2013-10-14 10:57:50 UTC
host-deploy is not configuring iptables:

2013-10-08 12:13:12 DEBUG otopi.context context.dumpEnvironment:454 ENV NETWORK/iptablesEnable=bool:'False'
2013-10-08 12:13:12 DEBUG otopi.context context.dumpEnvironment:454 ENV NETWORK/iptablesRules=NoneType:'None'
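
To pull such settings out of a longer host-deploy log, something along these lines may help. It is a sketch only: the regular expression is inferred from the two lines quoted above, not from any otopi format specification, and `parse_env` is a hypothetical helper name.

```python
import re
from typing import Dict, Tuple

LOG = """\
2013-10-08 12:13:12 DEBUG otopi.context context.dumpEnvironment:454 ENV NETWORK/iptablesEnable=bool:'False'
2013-10-08 12:13:12 DEBUG otopi.context context.dumpEnvironment:454 ENV NETWORK/iptablesRules=NoneType:'None'
"""

# Assumed line shape: "... ENV <key>=<type>:'<value>'"
ENV_LINE = re.compile(r"ENV (?P<key>\S+)=(?P<type>\w+):'(?P<value>.*)'\s*$")


def parse_env(log: str) -> Dict[str, Tuple[str, str]]:
    """Map each environment key to its (type name, value text) pair."""
    env = {}
    for line in log.splitlines():
        match = ENV_LINE.search(line)
        if match:
            env[match.group("key")] = (match.group("type"), match.group("value"))
    return env


env = parse_env(LOG)
for key in ("NETWORK/iptablesEnable", "NETWORK/iptablesRules"):
    print(key, "->", env.get(key))
```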

Comment 6 Fabian Deutsch 2013-10-22 08:28:51 UTC
Martin or Ouyang,

could you please provide the file /etc/sysconfig/iptables before and after the registration?

Comment 7 Ying Cui 2013-10-22 08:48:09 UTC
Hi Fabian,
  Comment 4 already provides the iptables output before and after adding the host to rhevm.

Thanks
Ying

Comment 8 Fabian Deutsch 2013-10-22 09:06:08 UTC
(In reply to Ying Cui from comment #7)
> Hi Fabian,
>   Comment 4 already provides the iptables output before and after
> adding the host to rhevm.
> 
> Thanks
> Ying

Hey Ying,

Yes, that is the iptables command output, but I wonder whether it differs from the on-disk iptables file.
My point is to figure out whether some rule got deleted at runtime.
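
A comparison of that kind could be sketched as follows, assuming both inputs are in iptables-save format (the on-disk /etc/sysconfig/iptables file and the output of `iptables-save` on the running host). The two sample dumps below are hypothetical placeholders, not data from this bug, and the helper names are made up for illustration.

```python
from typing import List

# Hypothetical sample data; replace with the real file contents and
# the real `iptables-save` output from the host.
ON_DISK = """\
*filter
:INPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
"""

RUNTIME = """\
*filter
:INPUT ACCEPT [0:0]
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
"""


def append_rules(dump: str) -> List[str]:
    """Return the '-A ...' rule lines from iptables-save formatted text."""
    stripped = (line.strip() for line in dump.splitlines())
    return [line for line in stripped if line.startswith("-A ")]


def missing_at_runtime(on_disk: str, runtime: str) -> List[str]:
    """Rules present in the on-disk file but absent from the live rule set."""
    live = set(append_rules(runtime))
    return [rule for rule in append_rules(on_disk) if rule not in live]


print(missing_at_runtime(ON_DISK, RUNTIME))
```

An empty result would suggest the on-disk file and the runtime rules agree, which would point at the file itself having been rewritten rather than a rule being deleted at runtime.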

Comment 9 Martin Pavlik 2013-10-22 09:51:35 UTC
It seems that the issue no longer reproduces with

Red Hat Enterprise Virtualization Manager Version: 3.3.0-0.26.master.el6ev 
and
RHEV Hypervisor - 6.5 - 20131011.0.el6

Comment 10 Fabian Deutsch 2013-10-22 13:05:39 UTC
(In reply to Martin Pavlik from comment #9)
> It seems that the issue no longer reproduces with
> 
> Red Hat Enterprise Virtualization Manager Version: 3.3.0-0.26.master.el6ev 
> and
> RHEV Hypervisor - 6.5 - 20131011.0.el6

Thanks for testing. That sounds good.

Comment 11 Martin Pavlik 2013-10-25 06:47:53 UTC
Marking this as verified per comment 9.

Comment 13 errata-xmlrpc 2014-01-21 19:53:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0033.html