Bug 850859
Summary: | RHEVM - Backend: Remove of DC with one corrupted domain fails and leaves healthy domain stuck in Locked status | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Daniel Paikov <dpaikov>
Component: | ovirt-engine | Assignee: | Federico Simoncelli <fsimonce>
Status: | CLOSED NOTABUG | QA Contact: | Haim <hateya>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 3.1.0 | CC: | abaron, amureini, dyasny, hateya, iheim, lpeer, Rhev-m-bugs, sgrinber, yeylon, ykaul
Target Milestone: | --- | |
Target Release: | 3.1.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | storage | |
Fixed In Version: | SI21 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2012-10-17 09:19:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Created attachment 606301: vdsm.log

Created attachment 606305: vdsm.log
The logs contain an infinite loop of:

Thread-18057::ERROR::2012-08-28 01:04:21,782::SecureXMLRPCServer::77::root::(handle_error) client ('10.35.97.174', 47219)
Traceback (most recent call last):
  File "/usr/lib64/python2.6/SocketServer.py", line 560, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line 68, in finish_request
    request.do_handshake()
  File "/usr/lib64/python2.6/ssl.py", line 279, in do_handshake
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:490: error:1407609C:SSL routines:SSL23_GET_CLIENT_HELLO:http request

Are you sure the engine was able to communicate with vdsm?

(In reply to comment #3)
> The logs contain an infinite loop of:
>
> Thread-18057::ERROR::2012-08-28
> 01:04:21,782::SecureXMLRPCServer::77::root::(handle_error) client
> ('10.35.97.174', 47219)
> Traceback (most recent call last):
>   File "/usr/lib64/python2.6/SocketServer.py", line 560, in
> process_request_thread
>     self.finish_request(request, client_address)
>   File "/usr/lib64/python2.6/site-packages/vdsm/SecureXMLRPCServer.py", line
> 68, in finish_request
>     request.do_handshake()
>   File "/usr/lib64/python2.6/ssl.py", line 279, in do_handshake
>     self._sslobj.do_handshake()
> SSLError: [Errno 1] _ssl.c:490: error:1407609C:SSL
> routines:SSL23_GET_CLIENT_HELLO:http request
>
> Are you sure the engine was able to communicate with vdsm?

Yes, it just means that some other backend tries to communicate with this VDS with no success.

I tried to reproduce but the master domain (the consistent one) doesn't remain locked (it is correctly moved to maintenance) and the DC removal is prevented by the error:

"Error while executing action: Cannot remove Data Center when there are more than one Storage Domain attached."

Are you still affected by this issue? Also update the steps to reproduce if you think I missed something.
If you are going to test this again could you provide cleaner logs (both vdsm and engine) without other backends involved? Thanks.

Can't reproduce anymore, since we don't allow DC removal of DCs with more than one domain. This bug is now irrelevant since we can't reach this flow.

GUI:
Error while executing action: Cannot remove Data Center when there are more than one Storage Domain attached.

engine.log:
2012-10-17 10:57:05,756 WARN [org.ovirt.engine.core.bll.storage.RemoveStoragePoolCommand] (http--0.0.0.0-8700-5) CanDoAction of action RemoveStoragePool failed. Reasons:VAR__TYPE__STORAGE__POOL,VAR__ACTION__REMOVE,ERROR_CANNOT_REMOVE_STORAGE_POOL_WITH_NONMASTER_DOMAINS
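
The engine.log entry above shows the removal being rejected at validation time (CanDoAction) rather than failing partway through. The following is a minimal Python sketch of that kind of pre-check, not the engine's actual code (the engine is a Java application). The class names, the function name, and the rule "any attached non-master domain blocks removal" are assumptions read off the reason code in the log.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StorageDomain:
    # Hypothetical model: only the fields needed for the check.
    name: str
    is_master: bool


@dataclass
class DataCenter:
    name: str
    domains: List[StorageDomain] = field(default_factory=list)


def can_remove_data_center(dc: DataCenter) -> Tuple[bool, List[str]]:
    """Validate a removal request before any state is changed.

    Assumed rule: removal is refused while any non-master domain is
    still attached, so no domain is ever locked for a doomed removal.
    """
    non_master = [d for d in dc.domains if not d.is_master]
    if non_master:
        # Reason code copied from the engine.log line in this report.
        return False, ["ERROR_CANNOT_REMOVE_STORAGE_POOL_WITH_NONMASTER_DOMAINS"]
    return True, []
```

Because the check runs before anything is locked, the flow described in this bug (healthy domain left in Locked status) cannot be reached, which is consistent with the closing comment.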
Created attachment 606300: engine.log

* DC with one healthy domain (status = active/master) and one domain with bad metadata (status = inactive).
* Move the active domain to maintenance.
* Try to remove the DC.
* Removal fails: the DC is in Maintenance mode, the corrupted domain is in Inactive mode, and the healthy domain is stuck in Locked mode.
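
When re-running these steps, the cleaner vdsm logs requested in the thread could be approximated by filtering out the SSL handshake errors caused by unrelated clients. This is a minimal sketch, assuming every log record starts with a header of the form `<thread>::<LEVEL>::<timestamp>::...` as in the single line quoted in this report; the function names and the decision to drop whole records are illustrative, not part of vdsm.

```python
import re

# Assumed record header, inferred from the one log line quoted in this report.
RECORD_START = re.compile(r"^\S+::(?:DEBUG|INFO|WARNING|ERROR|CRITICAL)::")
# Substring that identifies the handshake error from the quoted traceback.
NOISE_MARKER = "SSL23_GET_CLIENT_HELLO"


def split_records(lines):
    """Group physical lines into records: a header line plus its continuation lines."""
    record = []
    for line in lines:
        if RECORD_START.match(line) and record:
            yield record
            record = []
        record.append(line)
    if record:
        yield record


def drop_handshake_noise(lines):
    """Return the lines with every record that mentions the handshake error removed."""
    kept = []
    for record in split_records(lines):
        if not any(NOISE_MARKER in line for line in record):
            kept.extend(record)
    return kept
```

A typical use would be reading the log with `open(path).read().splitlines()`, passing the result to `drop_handshake_noise`, and writing the kept lines to a new file to attach. Filtering is only a workaround; stopping the other backends before the test remains the more reliable way to get clean logs.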