Created attachment 625455
vdsm + engine logs
Description of problem:
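Concurrent ReconstructMaster requests are sent to the host when a request takes longer than 3 minutes and times out on the engine side, even though the original request is still running on the host.
The following is visible in the vdsm logs (note that Thread-781 starts at 17:14:55, before Thread-500 returns at 17:17:35):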
Thread-129::INFO::2012-10-10 16:58:40,382::logUtils::37::dispatcher::(wrapper) Run and protect: reconstructMaster(spUUID='92093b80-07da-489b-b31c-78f9234665a1', poolName='TestDataCenter', masterDom='5efd1c4f-a4cb-
Thread-129::INFO::2012-10-10 17:06:05,538::logUtils::39::dispatcher::(wrapper) Run and protect: reconstructMaster, Return response: None
Thread-500::INFO::2012-10-10 17:06:44,042::logUtils::37::dispatcher::(wrapper) Run and protect: reconstructMaster(spUUID='92093b80-07da-489b-b31c-78f9234665a1', poolName='TestDataCenter', masterDom='5efd1c4f-a4cb-
Thread-781::INFO::2012-10-10 17:14:55,383::logUtils::37::dispatcher::(wrapper) Run and protect: reconstructMaster(spUUID='92093b80-07da-489b-b31c-78f9234665a1', poolName='TestDataCenter', masterDom='5efd1c4f-a4cb-
Thread-500::INFO::2012-10-10 17:17:35,937::logUtils::39::dispatcher::(wrapper) Run and protect: reconstructMaster, Return response: None
Thread-892::INFO::2012-10-10 17:19:08,812::logUtils::37::dispatcher::(wrapper) Run and protect: reconstructMaster(spUUID='92093b80-07da-489b-b31c-78f9234665a1', poolName='TestDataCenter', masterDom='5efd1c4f-a4cb-
Thread-892::INFO::2012-10-10 17:19:32,724::logUtils::39::dispatcher::(wrapper) Run and protect: reconstructMaster, Return response: None
Version-Release number of selected component (if applicable):
vdsm-4.9.6-37.0.el6_3.x86_64
rhevm-3.1.0-20.el6ev.noarch
How reproducible:
Unknown
Steps to Reproduce:
1. Set up an environment with 1 host and 2 storage domains (NFS, on different servers)
2. Block the connection to the master storage domain
3. Check the vdsm log to see whether concurrent reconstructMaster threads are running on the host
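The check in step 3 can be automated. The following is only a rough sketch, not part of vdsm or the engine: the helper name find_overlaps is hypothetical, and the regular expression assumes that log lines look exactly like the excerpt above ("Run and protect: reconstructMaster(" on entry, "Run and protect: reconstructMaster, Return response" on exit). It reports every reconstructMaster call that starts while an earlier one has not yet returned.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the log excerpt in this report:
#   Thread-N::INFO::<timestamp>::...::(wrapper) Run and protect: reconstructMaster(...
#   Thread-N::INFO::<timestamp>::...::(wrapper) Run and protect: reconstructMaster, Return response: ...
LINE_RE = re.compile(
    r"^(?P<thread>Thread-\d+)::INFO::"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r".*Run and protect: reconstructMaster(?P<kind>\(|, Return response)"
)


def find_overlaps(lines):
    """Return (running_thread, new_thread, start_time) for each overlapping call.

    Hypothetical helper: a call counts as overlapping when it starts while
    another thread has logged an entry line but no return line yet.
    """
    running = {}   # thread name -> time its reconstructMaster call started
    overlaps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a reconstructMaster entry/return line
        thread = match.group("thread")
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        if match.group("kind") == "(":
            # Entry line: every call still in flight overlaps with this one.
            for other in running:
                overlaps.append((other, thread, ts))
            running[thread] = ts
        else:
            # Return line: the call on this thread has finished.
            running.pop(thread, None)
    return overlaps
```

It can be fed an open log file, for example find_overlaps(open(path)), where path points to the vdsm log on the host. On the excerpt above it reports Thread-781 starting while Thread-500 is still running, and Thread-892 starting while Thread-781 has no return line in the excerpt.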