| Summary: | [vdsm] Cannot import older VM |  |  |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Jiri Belka <jbelka> |
| Component: | vdsm | Assignee: | Sergey Gotliv <sgotliv> |
| Status: | CLOSED WORKSFORME | QA Contact: |  |
| Severity: | high | Docs Contact: |  |
| Priority: | urgent |  |  |
| Version: | 3.3.0 | CC: | abaron, bazulay, danken, hateya, iheim, jbelka, lpeer, mpavlik, scohen, sgotliv, yeylon |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | 3.3.0 |  |  |
| Hardware: | Unspecified |  |  |
| OS: | Unspecified |  |  |
| Whiteboard: | storage |  |  |
| Fixed In Version: |  | Doc Type: | Bug Fix |
| Doc Text: |  | Story Points: | --- |
| Clone Of: |  | Environment: |  |
| Last Closed: | 2013-10-25 20:52:54 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: |  |
| Verified Versions: |  | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |  |
| Cloudforms Team: | --- | Target Upstream Version: |  |
| Attachments: |  |  |  |
Jiri, did you try to import that VM again? I am asking because the initial import failed when the Engine lost connectivity to the SPM for a few seconds (see the Engine log around 13:11:02). Unfortunately, I can't see when this import started and can't follow the flow because the Engine log is partial, but I can see when it failed. By the way, can you attach a full log?

During VM import we copy images, so in case of failure we have to remove them from the target storage domain. To do so, the Engine sends a DeleteImage request to VDSM image by image, but since VDSM didn't copy some of them in the first place, it may respond with ImageDoesNotExistInSDError. This message in the log doesn't mean that something is wrong: the Engine handles this error as success and proceeds to delete that image from the database. So far I don't see any problem. Please try to import the VM again, and if you succeed, close this bug. I am changing the summary of this bug because it's misleading.

Yaniv, why do you think this is a regression?

Interesting. I don't see any network issue, so why does the engine think my host is non-responsive? Full logs attached.

2013-Oct-07, 17:26 Failed to import Vm jb-w8-x86 to Data Center Default, Cluster Default
2013-Oct-07, 17:26 Storage Pool Manager runs on Host dell-r210ii-13 (Address: 10.34.62.205).
2013-Oct-07, 17:25 State was set to Up for host dell-r210ii-13.
2013-Oct-07, 17:25 Host dell-r210ii-13 is non responsive.
2013-Oct-07, 17:24 Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host dell-r210ii-13, Error: Network error during communication with the Host.).
2013-Oct-07, 17:19 Starting to import Vm jb-w8-x86 to Data Center Default, Cluster Default
2013-Oct-07, 17:16 Failed to import Vm jb-w8-x86 to Data Center Default, Cluster Default
...snip...

Created attachment 808907 [details]
logs, 2nd wave
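The rollback behavior Sergey describes (DeleteImage failing with "image does not exist" being treated as success) is the standard idempotent-cleanup pattern. A minimal sketch, with hypothetical names (`delete_image`, `rollback_import`, `ImageDoesNotExistError`) standing in for the real Engine/VDSM API:

```python
# Hypothetical sketch of idempotent rollback cleanup after a failed
# import. The names below are illustrative, not the actual oVirt API.

class ImageDoesNotExistError(Exception):
    """Raised when the image is already absent from the storage domain."""

def delete_image(domain, images, image_id):
    # Remove one image from the given storage domain, or report absence.
    if image_id not in images.get(domain, set()):
        raise ImageDoesNotExistError(image_id)
    images[domain].remove(image_id)

def rollback_import(domain, images, planned_image_ids):
    """Remove partially copied images; an already-missing image counts
    as a successful delete, because some images were never copied."""
    for image_id in planned_image_ids:
        try:
            delete_image(domain, images, image_id)
        except ImageDoesNotExistError:
            # VDSM never copied this image, so there is nothing to
            # remove -- log-worthy, but not an error for the rollback.
            pass

# Only "img-a" was actually copied before the import failed.
images = {"sd1": {"img-a"}}
rollback_import("sd1", images, ["img-a", "img-b"])
assert images["sd1"] == set()
```

The point of the pattern is that rollback only needs the end state ("image gone"), so "does not exist" and "deleted" are equivalent outcomes.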
Jiri, you see it in the log, right?

2013-10-07 17:15:25,174 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-70) IrsBroker::Failed::GetStoragePoolInfoVDS due to: ConnectException: Connection refused
2013-10-07 17:15:25,191 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-83) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-10-07 17:15:25,192 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-83) Failed to refresh VDS , vds = a53ccb2c-9993-46b5-8fb9-293b02aeb8b3 : dell-r210ii-13, VDS Network Error, continuing. java.net.ConnectException: Connection refused
2013-10-07 17:15:26,166 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (DefaultQuartzScheduler_Worker-84) Command HSMGetAllTasksStatusesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-10-07 17:15:28,200 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-88) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
2013-10-07 17:15:28,201 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-88) Failed to refresh VDS , vds = a53ccb2c-9993-46b5-8fb9-293b02aeb8b3 : dell-r210ii-13, VDS Network Error, continuing. java.net.ConnectException: Connection refused
2013-10-07 17:15:31,208 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-92) Command GetCapabilitiesVDS execution failed. Exception: VDSNetworkException: java.net.Co

The Engine fails to communicate with VDSM, but the VDSM logs you provided end at 15:31. Why?
Thread-24::DEBUG::2013-10-07 15:31:47,424::fileSD::239::Storage.Misc.excCmd::(getReadDelay) SUCCESS: <err> = '0+1 records in\n0+1 records out\n374 bytes (374 B) copied, 0.000216256 s, 1.7 MB/s\n'; <rc> = 0

I need all the logs in order to understand what's going on on your host.

Completely fresh install of rhevm (is17) and rhevh (20130918.0.el6).

2013-Oct-09, 11:46 Failed to import Vm jb-w7-x64 to Data Center Default, Cluster Default
2013-Oct-09, 11:46 Storage Pool Manager runs on Host dell-r210ii-13 (Address: 10.34.62.205).
2013-Oct-09, 11:45 State was set to Up for host dell-r210ii-13.
2013-Oct-09, 11:45 Host dell-r210ii-13 is non responsive.
2013-Oct-09, 11:44 Invalid status on Data Center Default. Setting Data Center status to Non Responsive (On host dell-r210ii-13, Error: Network error during communication with the Host.).
2013-Oct-09, 11:43 Used Network resources of host dell-r210ii-13 [98%] exceeded defined threshold [95%].

Created attachment 809834 [details]
logs from logcollector
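The getReadDelay SUCCESS entry quoted above is VDSM timing a small `dd` read of the storage domain to measure read latency; the `<err>` field is GNU dd's statistics output. A hedged sketch of extracting the elapsed time from that stderr text (`parse_dd_seconds` is a hypothetical helper, not a VDSM function):

```python
import re

def parse_dd_seconds(stderr_text):
    """Extract the elapsed seconds from GNU dd's statistics line,
    e.g. '374 bytes (374 B) copied, 0.000216256 s, 1.7 MB/s'."""
    m = re.search(r"copied,\s*([\d.]+)\s*s", stderr_text)
    if m is None:
        raise ValueError("no dd statistics line found")
    return float(m.group(1))

# The <err> value from the getReadDelay log line above:
err = ("0+1 records in\n0+1 records out\n"
       "374 bytes (374 B) copied, 0.000216256 s, 1.7 MB/s\n")
delay = parse_dd_seconds(err)
assert abs(delay - 0.000216256) < 1e-12
```

A rising read delay on such probes is one of the signals monitoring can use to flag a slow or stuck storage domain before operations start timing out.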
Sergey, it seems as if you moved this bug to ON_QA by mistake. If it was intentional, please explain why.

See my comment 5. I didn't have the appropriate logs to work with, so I moved the bug to QA and asked for the missing logs. Anyway, if I remember correctly, I spoke with Jiri and Haim and we agreed that there is no storage issue here. Jiri confirmed that the flow works perfectly on other machines, and I checked it on mine.
Created attachment 806399 [details] engine.log, vdsm.log, ovf file, list of images/$uuid

Description of problem:
In the Admin Portal: Failed to import Vm jb-wxp to Data Center Default, Cluster Default

The VM is an older VM, exported from a 3.2 environment (2013-Jun-05, 16:58).

8adc79ee-8269-48b1-993b-407fb829f8ff is my data domain - str-03-jb33
ca414eb9-98ab-422f-8c4e-c2c513231058 is my export domain - str-01-jbelka-export

Thread-34372::ERROR::2013-10-02 13:11:21,834::hsm::1518::Storage.HSM::(deleteImage) Empty or not found image 14f8de7c-6c67-496a-8998-6ddbeb8db64d in SD 8adc79ee-8269-48b1-993b-407fb829f8ff. {'1ac141cf-6718-4110-bd81-8760da6baa64': ImgsPar(imgs=('45f88dd5-5219-495a-b971-059aaf146149', 'd542a9a9-47b2-45c0-84be-2b558cb15667'), parent='00000000-0000-0000-0000-000000000000'), '6e4ad20f-d63a-4991-a878-2453cac4c71d': ImgsPar(imgs=('3a2a9619-f79b-4fa2-ac5d-029b50c7cf92', '31170ed0-a452-4a79-9864-16f6c5471ef8'), parent='00000000-0000-0000-0000-000000000000'), '61689bae-e53e-4bcc-8ea8-93f2a871c606': ImgsPar(imgs=('cbe5c385-19d4-4e53-8f1f-86778bc526db',), parent=None), '445adfcf-894e-45df-9043-9a0c9fd472ac': ImgsPar(imgs=('4592eec4-0bbc-4458-879e-f872d950d135',), parent=None), '149f3399-97e7-464a-8005-b1ddb216fab1': ImgsPar(imgs=('8ac5910a-9b90-47c5-bfbe-f6d268c32045',), parent=None), '2faeb0f3-19de-4bcb-bcc2-42967bf3dc50': ImgsPar(imgs=('7fc7bade-4434-4da5-b114-d8be2c74ef6d',), parent=None), '59c8c7e8-9a05-4901-91b3-b18b00fbb3c2': ImgsPar(imgs=('631781c0-b748-4700-b64d-037c625068ba',), parent=None), '862457f0-6269-4ddc-95da-d8e37e8b02cb': ImgsPar(imgs=('31170ed0-a452-4a79-9864-16f6c5471ef8',), parent=None), '5f89ae09-fae6-406f-9346-10796a2c1d9e': ImgsPar(imgs=('631781c0-b748-4700-b64d-037c625068ba',), parent=None), '63c2949f-2738-4abc-bcd0-dbda38eda1b1': ImgsPar(imgs=('5210cbaa-6adf-456f-a7ca-5c7d56697d98', 'cbe5c385-19d4-4e53-8f1f-86778bc526db'), parent='00000000-0000-0000-0000-000000000000'), 'aad255ee-2fa3-4659-a5de-8e775b20d361': ImgsPar(imgs=('f69a71d5-912d-447c-8cf6-ae247371abbd',), parent=None), 'bfde185f-b470-4e94-822d-f1a256a44f13': ImgsPar(imgs=('d542a9a9-47b2-45c0-84be-2b558cb15667',), parent=None), '0c84422e-1276-42f4-af5f-d591a609958b': ImgsPar(imgs=('78441be0-c017-4f69-81f6-fa61fe38d600',), parent=None), '0c25a184-5d4b-418f-b10e-ae179020e6bc': ImgsPar(imgs=('f4e2f9db-7e74-4ccf-abcc-46800110a3b1',), parent=None), 'eb499557-89ff-43d0-ab0d-666242043f95': ImgsPar(imgs=('8ac5910a-9b90-47c5-bfbe-f6d268c32045',), parent=None), 'b06143aa-6b4e-4c4d-853b-7a1fbea2aa2c': ImgsPar(imgs=('e8a6f991-0cd8-42a3-a047-a0ea2c5dbce9',), parent=None), '70aa9a8d-0a3b-4760-9efe-e33608d365f1': ImgsPar(imgs=('3f0d11e6-62fa-4192-bf7b-4d3b956402c3',), parent=None)}
Thread-34372::ERROR::2013-10-02 13:11:21,834::task::850::TaskManager.Task::(_setError) Task=`655341c6-9c73-40a4-84d1-7cbfe99e7036`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1519, in deleteImage
    raise se.ImageDoesNotExistInSD(imgUUID, sdUUID)
Thread-34372::ERROR::2013-10-02 13:11:21,841::dispatcher::67::Storage.Dispatcher.Protect::(run) {'status': {'message': "Image does not exist in domain: 'image=14f8de7c-6c67-496a-8998-6ddbeb8db64d, domain=8adc79ee-8269-48b1-993b-407fb829f8ff'", 'code': 268}}

Version-Release number of selected component (if applicable):
is17
vdsm-4.12.0-170.el6ev.x86_64
libvirt-0.10.2-27.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.406.el6.x86_64
qemu-img-rhev-0.12.1.2-2.406.el6.x86_64

How reproducible:

Steps to Reproduce:
1. Have a VM exported from a 3.2 environment on the export domain.
2. Try to import this VM into 3.3.

Actual results:
Failure.

Expected results:

Additional info: