Description of problem:

ovirt-ha-agent connects to vdsm via the management interface IP address. A network failure may put the management interface down (if it is not a VM network -> no bridge -> link down), so a network failure causes ovirt-ha-agent to time out talking to vdsm. This breaks HA failover, as the agent gets stuck restarting and retrying the connection to vdsm. I believe ha-agent and vdsm should talk via the loopback device (localhost/127.0.0.1, which should never go down), another address family such as AF_UNIX, or something else that is more reliable.

Version-Release number of selected component (if applicable):
vdsm-4.18.11-1.el7ev.x86_64
ovirt-hosted-engine-ha-2.0.3-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Unplug the network cable of the management network (if it is not a VM network).
2. The IP address of the interface goes down with it (it is removed from the FIB).
3. ha-agent gets stuck trying to talk to vdsm.
4. ha-agent does not realize the broker is no longer pinging the gateway; the score is not reduced by 1600 points.
5. ha-agent does not shut down the HE VM.
6. Other hosts cannot start the HE VM, as the lock is still held.

As a reproducer, in case the management network is a VM network (bridged), just run 'ip link set dev ovirtmgmt down': the management IP address goes down with it and vdsm and ovirt-ha-agent lose contact.

Actual results:
- Host score not lowered
- HE does not fail over to another host

Expected results:
- HE fails over

Additional info:
ha-agent should connect to vdsm via 127.0.0.1, so that when the IP address is removed the agent won't get stuck; the host score will then be reduced, the HE VM stopped, and another host will start it.

192.168.7.22 is the management interface IP address for the host:

$ cat sos_commands/process/lsof_-b_M_-n_-l | grep ovirt-ha | grep TCP
ovirt-ha- 19064            36   14u  IPv4  39445  0t0  TCP 192.168.7.22:45752->192.168.7.22:54321 (ESTABLISHED)
ovirt-ha- 19064 24267      36   14u  IPv4  39445  0t0  TCP 192.168.7.22:45752->192.168.7.22:54321 (ESTABLISHED)

Broker pinging gateway:
Thread-2039::INFO::2016-09-18 14:41:40,815::ping::52::ping.Ping::(action) Successfully pinged 192.168.7.254

Management network goes down:
Thread-2039::WARNING::2016-09-18 14:42:00,846::ping::48::ping.Ping::(action) Failed to ping 192.168.7.254

At the same time the agent fails to talk to vdsm and loops, restarting:

MainThread::INFO::2016-09-18 14:41:44,867::states::421::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
MainThread::INFO::2016-09-18 14:41:44,870::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2016-09-18 14:41:48,394::hosted_engine::639::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
MainThread::INFO::2016-09-18 14:41:48,394::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
MainThread::INFO::2016-09-18 14:41:51,856::storage_server::232::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
MainThread::INFO::2016-09-18 14:41:51,921::hosted_engine::666::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
MainThread::INFO::2016-09-18 14:41:51,922::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
MainThread::INFO::2016-09-18 14:41:53,926::util::194::ovirt_hosted_engine_ha.lib.image.Image::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:41:55,930::util::194::ovirt_hosted_engine_ha.lib.image.Image::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:41:57,934::util::194::ovirt_hosted_engine_ha.lib.image.Image::(connect_vdsm_json_rpc) Waiting for VDSM to reply
.....
MainThread::INFO::2016-09-18 14:51:28,027::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::ERROR::2016-09-18 14:51:28,032::agent::205::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Error: 'Couldnt connect to VDSM within 240 seconds' - trying to restart agent
MainThread::WARNING::2016-09-18 14:51:33,038::agent::208::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Restarting agent, attempt '1'

And there it goes again...

MainThread::INFO::2016-09-18 14:51:33,086::hosted_engine::612::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
MainThread::INFO::2016-09-18 14:51:35,169::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:37,173::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:39,176::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:41,179::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:43,182::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:45,186::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
MainThread::INFO::2016-09-18 14:51:47,189::util::194::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(connect_vdsm_json_rpc) Waiting for VDSM to reply
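To illustrate the point from the description, a minimal Python sketch (not actual agent code) comparing the two connection targets. The hostname typically resolves to the management IP (192.168.7.22 here), which disappears from the FIB when the link drops, while 127.0.0.1 stays bound to the lo device:

```python
import socket

def can_reach_vdsm(host, port=54321, timeout=5):
    """Try a plain TCP connect to vdsm's port; return True on success.
    Illustrative only -- the real agent speaks JSON-RPC over TLS."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# The address the hostname resolves to is the management IP; once the
# cable is unplugged, connect() fails (address gone, no route).
mgmt_ok = can_reach_vdsm(socket.gethostbyname(socket.gethostname()))

# Loopback is unaffected by any physical link state.
loop_ok = can_reach_vdsm("127.0.0.1")
```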
- Given there is no network, sanlock should expire, because we write the timestamp to the whiteboard, so I'd expect sanlock to free (kill) the resource (VM). Can you output `sanlock log_dump` when this happens?

- We clearly see a ping fail; how come we didn't lower the score? Can you run this with debug log level?
(In reply to Roy Golan from comment #2)
> - Given there is no network then the sanlock should expire, cause we write
> the timestamp to the whiteboard, so I'd expect sanlock to free(kill) the
> resource(VM)
> can you output `sanlock log_dump` when this happens?

sanlock may be on FC, or reachable via an interface other than the mgmt one, no?

> - We clearly see a ping fails, how come we didn't lower the score? can you
> run this with debug log level?
Hi Roy,

First of all, thank you for the prompt response.

It's Fibre Channel storage, so sanlock will not expire. And there is also the case where storage might be accessed via a different NIC and network.

It did not lower the score because the agent is just looping (restarting and dying).

I can easily reproduce this in my env: just put the ovirtmgmt network administratively down. Provided you reach your NFS/iSCSI storage via a different network, you will see the exact same problem.

To me, the main problem is this, where the IP address may be attached to an interface that can go down by unplugging a cable:

$ cat sos_commands/process/lsof_-b_M_-n_-l | grep ovirt-ha | grep TCP
ovirt-ha- 19064            36   14u  IPv4  39445  0t0  TCP 192.168.7.22:45752->192.168.7.22:54321 (ESTABLISHED)
ovirt-ha- 19064 24267      36   14u  IPv4  39445  0t0  TCP 192.168.7.22:45752->192.168.7.22:54321 (ESTABLISHED)

Unplugging a cable should not make vdsm and ha-agent stop talking.

If you need more info, please let me know.

Cheers,
Germano
This change should fix it: changing 'socket.gethostname()' to 'localhost' in the vdsm lib the agent is using.

https://gerrit.ovirt.org/#/c/63308/2/lib/vdsm/jsonrpcvdscli.py

Please confirm.
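For reference, a paraphrased sketch of what the change amounts to (the exact code is in the gerrit link above; the function name and signature here are illustrative, not the literal diff):

```python
import socket

def connect(host=None, port=54321):
    # Before the fix, the default host was the machine's own hostname:
    #   host = host or socket.gethostname()
    # which resolves to the management IP and vanishes with the link.
    # After the fix, the default is loopback, which survives any
    # physical NIC failure:
    host = host or 'localhost'
    return socket.create_connection((host, port))
```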
After applying that fix it seems to fall back to the IPv6 loopback:

ovirt-ha- 1503           36    8u  IPv6  78992  0t0  TCP [::1]:54156->[::1]:54321 (ESTABLISHED)
ovirt-ha- 1503 1912      36    8u  IPv6  78992  0t0  TCP [::1]:54156->[::1]:54321 (ESTABLISHED)

Looks good.
(In reply to Germano Veit Michel from comment #6)
> This change should fix it:
>
> Changing 'socket.gethostname()' to 'localhost' in the vdsm lib the agent is
> using.
>
> https://gerrit.ovirt.org/#/c/63308/2/lib/vdsm/jsonrpcvdscli.py
>
> Please confirm.

Yes. It was made to prevent other problems with the client, but it will also resolve the problem of the interface going down.

One more problem (which may be masked by this fix) is that we keep looping and do not lower the score, even though the ping failed, which doesn't look right. Martin?
Guys, I think the same issue will happen to MOM and the vdsClient tools. We use the default address the vdsm library provides to us, and this should be fixed there, as Germano found out.

Francesco: can you please confirm this? It might affect any vdsm client, not just HE (what about supervdsm, btw?)
(In reply to Martin Sivák from comment #9)
> Guys, I think the same issue will happen to MOM and vdsClient tools. We use
> the default address the vdsm library provides to us and this should be fixed
> there as Germano found out.
>
> Francesco: can you please confirm this? It might affect any vdsm client, not
> just HE (what about supervdsm btw?)

An added bonus is that we can probably skip SSL in this case, but I assume it requires some further changes. It would be nice to move to UNIX sockets.
> One more problem (which may be masked by this fix) is that we keep
> looping and do not lower the score, even though the ping failed, which
> doesn't look right. Martin?

The ping monitor tries to reach the network gateway; it is not related to the vdsm socket and should still fail. We keep looping because we never get to the actual score calculation (and we no longer publish any updates). Other nodes should stop seeing updates from this host and act accordingly, though. On the other hand, nobody will kill the local VM if sanlock is still fine.
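To make that ordering concrete, a hedged sketch of the monitoring cycle implied above (illustrative names and simplified logic, not the actual HA-agent code; the 1600-point penalty and 240 s timeout come from this report):

```python
import socket
import subprocess

BASE_SCORE = 3400
GATEWAY_PENALTY = 1600  # deducted when the gateway ping fails

def ping_gateway(gateway):
    """Return True if the gateway answers a single ICMP echo."""
    return subprocess.call(
        ["ping", "-c", "1", "-W", "2", gateway],
        stdout=subprocess.DEVNULL) == 0

def monitoring_cycle(vdsm_host, gateway, port=54321):
    # The agent connects to vdsm *before* computing and publishing the
    # score. While the management IP is gone, this hangs until the
    # 240 s timeout and the agent restarts, so the failed gateway ping
    # below never gets a chance to lower the score.
    conn = socket.create_connection((vdsm_host, port), timeout=240)
    score = BASE_SCORE
    if not ping_gateway(gateway):
        score -= GATEWAY_PENALTY
    # ... publish score to the shared-storage whiteboard ...
    conn.close()
```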
Ah, I see the localhost patch was already merged to the necessary branches. This requires no code change to hosted engine and is therefore test only.
(In reply to Martin Sivák from comment #9)
> Guys, I think the same issue will happen to MOM and vdsClient tools. We use
> the default address the vdsm library provides to us and this should be fixed
> there as Germano found out.
>
> Francesco: can you please confirm this? It might affect any vdsm client, not
> just HE (what about supervdsm btw?)

jsonrpcvdscli is indeed the recommended way to talk to Vdsm, so yes, it will affect each and every client. Vdsm talks to Supervdsm over one UNIX domain socket, so this part should be fine.
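Since the original description also floated AF_UNIX, and supervdsm already uses a UNIX domain socket, here is a minimal sketch of that transport for comparison (the socket path below is hypothetical, not vdsm's actual path):

```python
import socket

SOCK_PATH = "/var/run/vdsm/example.sock"  # hypothetical path

def unix_connect(path=SOCK_PATH, timeout=5):
    """Connect over AF_UNIX: no IP address is involved, so the
    connection is immune to NIC link state entirely."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    s.connect(path)
    return s
```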
Verified on vdsm-4.18.13-1.el7ev.x86_64 and ovirt-hosted-engine-ha-2.0.4-1.el7ev.noarch.

I do not have the possibility to check it on FC, so I checked it on NFS.

1) Configure two interfaces on the host and use one as the management network.
2) Bring down the management interface: "ip link set dev ovirtmgmt down".
3) Check that vdsm still succeeds in talking to the ovirt-ha-agent.
4) Block the connection to the gateway via iptables and check that ovirt-ha-agent drops the host score to 1800.
5) Check lsof:

# lsof | grep ovirt-ha | grep TCP
ovirt-ha- 119122         vdsm    8u  IPv6  1342661  0t0  TCP localhost:52498->localhost:54321 (ESTABLISHED)
ovirt-ha- 119122         vdsm   12u  IPv6  1362041  0t0  TCP localhost:52964->localhost:54321 (ESTABLISHED)
ovirt-ha- 119122 119531  vdsm    8u  IPv6  1342661  0t0  TCP localhost:52498->localhost:54321 (ESTABLISHED)
ovirt-ha- 119122 119531  vdsm   12u  IPv6  1362041  0t0  TCP localhost:52964->localhost:54321 (ESTABLISHED)
ovirt-ha- 119122 120865  vdsm    8u  IPv6  1342661  0t0  TCP localhost:52498->localhost:54321 (ESTABLISHED)
ovirt-ha- 119122 120865  vdsm   12u  IPv6  1362041  0t0  TCP localhost:52964->localhost:54321 (ESTABLISHED)
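A small helper in the same spirit as step 5, for programmatically checking that the agent's vdsm connections target loopback (hypothetical, assumes lsof is installed and vdsm listens on 54321):

```python
import subprocess

def agent_uses_loopback():
    """Return True if every established ovirt-ha-agent connection to
    vdsm's port targets the loopback address (IPv4 or IPv6)."""
    out = subprocess.check_output(
        ["lsof", "-nP", "-iTCP:54321", "-sTCP:ESTABLISHED"],
        text=True)
    conns = [l for l in out.splitlines() if "ovirt-ha" in l]
    return bool(conns) and all(
        "127.0.0.1" in l or "[::1]" in l for l in conns)
```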