Bug 1405042
| Summary: | live merge fails (possibly on image tear down - LV deactivation fails) | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Yaniv Kaul <ykaul> | ||||||
| Component: | Core | Assignee: | Ala Hino <ahino> | ||||||
| Status: | CLOSED WORKSFORME | QA Contact: | Raz Tamir <ratamir> | ||||||
| Severity: | high | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 4.18.30 | CC: | bugs, nsoffer, tnisan, ykaul | ||||||
| Target Milestone: | ovirt-4.1.0-beta | Keywords: | AutomationBlocker, Regression | ||||||
| Target Release: | --- | Flags: | rule-engine:
ovirt-4.1+
rule-engine: blocker+ |
||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2017-01-08 16:14:26 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Attachments: |
|
||||||||
Created attachment 1232151 [details]
engine log
This fail with: Logical volume 43c129a6-2a33-48c0-bceb-a9585b78a1fd/7d630987-5be2-4cfa-bca8-a5409a64216b in use. Failing on teardown mean that live merge was completed, but we cannot deactivate the volume. This was error ignored silently in the past, but now we fail the flow. Leaving active lv is dangerous, we had image corruption bug caused by this. Is this raw volume? do we have guest lvs on this volume? (In reply to Nir Soffer from comment #2) > This fail with: > > Logical volume > 43c129a6-2a33-48c0-bceb-a9585b78a1fd/7d630987-5be2-4cfa-bca8-a5409a64216b in > use. > > Failing on teardown mean that live merge was completed, but we cannot > deactivate > the volume. This was error ignored silently in the past, but now we fail the > flow. > > Leaving active lv is dangerous, we had image corruption bug caused by this. > > Is this raw volume? do we have guest lvs on this volume? I'm now seeing failures in cold merge as well, in similar scenario. However, I've also noticed that there's an even earlier WARNING (and not ERROR, which is why I've missed it in the first place) that deactivation fails. Example: 2016-12-16 04:09:30,505 INFO (tasks/1) [storage.VolumeManifest] Tearing down volume 954c1706-3ed9-4b7f-be73-5d5d4cca209e/b790b46c-3304-4934-bab7-01aca8dcae50 justme False (blockVolume:390) 2016-12-16 04:09:30,506 INFO (tasks/1) [storage.LVM] Deactivating lvs: vg=954c1706-3ed9-4b7f-be73-5d5d4cca209e lvs=[u'b790b46c-3304-4934-bab7-01aca8dcae50'] (lvm:1306) 2016-12-16 04:09:35,354 WARN (tasks/1) [storage.ResourcesFactories] Failure deactivate LV 954c1706-3ed9-4b7f-be73-5d5d4cca209e/b790b46c-3304-4934-bab7-01aca8dcae50 (Cannot deactivate Logical Volume: ('General Storage Exception: (u"5 [] [\' WARNING: Not using lvmetad because config setting use_lvmetad=0.\', \' WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\', \' Logical volume 954c1706-3ed9-4b7f-be73-5d5d4cca209e/b790b46c-3304-4934-bab7-01aca8dcae50 in use.\']\\n954c1706-3ed9-4b7f-be73-5d5d4cca209e/[u\'b790b46c-3304-4934-bab7-01aca8dcae50\']",)',)) (resourceFactories:58) 2016-12-16 04:09:52,181 INFO (jsonrpc/5) [vdsm.api] START snapshot args=(<virt.vm.Vm object at 0x2b01c90>, [{u'baseVolumeID': u'b790b46c-3304-4934-bab7-01aca8dcae50', u'domainID': u'954c1706-3ed9-4b7f-be73-5d5d4cca209e', u'volumeID': u'85d3cc1e-0bd6-49a6-afa5-d45db60e112d', u'imageID': u'2daa88c9-7646-4e1c-a697-08222c54ecef'}], {'dst': {'device': 'disk', 'imageID': u'78658374-e538-4b9a-8b28-3896883ff19c', 'domainID': u'954c1706-3ed9-4b7f-be73-5d5d4cca209e', 'volumeID': u'13ee4f0f-1efa-491f-b995-abac8e3205b4', 'poolID': u'9fd6b273-aa60-4031-992a-09fb232c8a47'}, 'dstparams': {'device': 'disk', 'imageID': u'8051c36a-b871-418a-8043-74001386dc95', 'domainID': u'954c1706-3ed9-4b7f-be73-5d5d4cca209e', 'volumeID': u'2e74c4f6-5aa5-4009-9c1f-f8deef4ca38a', 'poolID': u'9fd6b273-aa60-4031-992a-09fb232c8a47'}}, False) kwargs={} (api:37) 2016-12-16 04:09:52,251 INFO (jsonrpc/5) [storage.LVM] Refreshing lvs: vg=954c1706-3ed9-4b7f-be73-5d5d4cca209e lvs=['b790b46c-3304-4934-bab7-01aca8dcae50'] (lvm:1294) This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP. |
Created attachment 1232138 [details] vdsm log Description of problem: While running ovirt-system-tests tests with live merge, it fails, apparentl on tear down of the image. Version-Release number of selected component (if applicable): vdsm-4.18.999-1162.gite95442e.el7.centos.x86_64.rpm How reproducible: Sometimes - I do not think it fails all the time. I see a deactivation failure: 2016-12-15 06:29:31,528 INFO (jsonrpc/0) [dispatcher] Run and protect: teardownImage(sdUUID=u'43c129a6-2a33-48c0-bceb-a9585b78a1fd', spUUID=u'34f579e9-89b5-49fd-9c61-6cbcd67f672e', imgUUID=u'11629df7-653d-4679- 8094-651e2c92dd39', volUUID=None) (logUtils:49) 2016-12-15 06:29:31,559 INFO (jsonrpc/0) [storage.LVM] Deactivating lvs: vg=43c129a6-2a33-48c0-bceb-a9585b78a1fd lvs=['f99c7c30-6613-4c36-a115-1c877d1a1d2a', '7d630987-5be2-4cfa-bca8-a5409a64216b'] (lvm:1306) 2016-12-15 06:29:33,668 INFO (periodic/3) [dispatcher] Run and protect: getVolumeSize(sdUUID=u'43c129a6-2a33-48c0-bceb-a9585b78a1fd', spUUID=u'34f579e9-89b5-49fd-9c61-6cbcd67f672e', imgUUID=u'11629df7-653d-4679 -8094-651e2c92dd39', volUUID=u'f99c7c30-6613-4c36-a115-1c877d1a1d2a', options=None) (logUtils:49) 2016-12-15 06:29:33,669 INFO (periodic/3) [dispatcher] Run and protect: getVolumeSize, Return response: {'truesize': '1073741824', 'apparentsize': '1073741824'} (logUtils:52) 2016-12-15 06:29:33,670 INFO (periodic/2) [dispatcher] Run and protect: repoStats(options=None) (logUtils:49) 2016-12-15 06:29:33,670 INFO (periodic/2) [dispatcher] Run and protect: repoStats, Return response: {u'894d6b50-b70c-4438-a2c7-f71ac10cc8b6': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000484075', 'lastCheck': '3.4', 'valid': True}, u'4ca13f73-ff87-4b31-99d8-43b5c83e4caf': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.0104277', 'lastCheck': '6.3', 'valid': True}, u'43c129a6-2a33-48c0-bceb-a9585b78a1fd': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00360224', 'lastCheck': '3.9', 'valid': True}, u'ef9c603b-eb82-4b70-b3a4-edf7c2286239': {'code': 0 , 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00021639', 'lastCheck': '8.0', 'valid': True}} (logUtils:52) 2016-12-15 06:29:39,229 INFO (jsonrpc/2) [dispatcher] Run and protect: repoStats(options=None) (logUtils:49) 2016-12-15 06:29:39,229 INFO (jsonrpc/2) [dispatcher] Run and protect: repoStats, Return response: {u'894d6b50-b70c-4438-a2c7-f71ac10cc8b6': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000484075', 'lastCheck': '8.9', 'valid': True}, u'4ca13f73-ff87-4b31-99d8-43b5c83e4caf': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000249655', 'lastCheck': '1.9', 'valid': True}, u'43c129a6-2a33-48c0-bceb-a9585b78a1fd': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00925802', 'lastCheck': '0.4', 'valid': True}, u'ef9c603b-eb82-4b70-b3a4-edf7c2286239': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000233896', 'lastCheck': '3.6', 'valid': True}} (logUtils:52) 2016-12-15 06:29:39,234 INFO (jsonrpc/2) [jsonrpc.JsonRpcServer] RPC call Host.getStats succeeded in 0.01 seconds (__init__:515) 2016-12-15 06:29:39,727 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:515) 2016-12-15 06:29:39,739 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmIoTunePolicies succeeded in 0.00 seconds (__init__:515) 2016-12-15 06:29:48,673 INFO (periodic/0) [dispatcher] Run and protect: repoStats(options=None) (logUtils:49) 2016-12-15 06:29:48,674 INFO (periodic/0) [dispatcher] Run and protect: repoStats, Return response: {u'894d6b50-b70c-4438-a2c7-f71ac10cc8b6': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000246189', 'lastCheck': '8.4', 'valid': True}, u'4ca13f73-ff87-4b31-99d8-43b5c83e4caf': {'code': 0, 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000229334', 'lastCheck': '1.3', 'valid': True} , u'43c129a6-2a33-48c0-bceb-a9585b78a1fd': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.00925802', 'lastCheck': '8.9', 'valid': True}, u'ef9c603b-eb82-4b70-b3a4-edf7c2286239': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000272641', 'lastCheck': '3.0', 'valid': True}} (logUtils:52) 2016-12-15 06:29:50,837 ERROR (jsonrpc/0) [storage.TaskManager.Task] (Task='7db61c74-eb49-40f6-a4a8-9ab0b616d2ab') Unexpected error (task:870) Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 877, in _run return fn(*args, **kargs) File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 3187, in teardownImage dom.deactivateImage(imgUUID) File "/usr/share/vdsm/storage/blockSD.py", line 1245, in deactivateImage lvm.deactivateLVs(self.sdUUID, volUUIDs) File "/usr/share/vdsm/storage/lvm.py", line 1307, in deactivateLVs _setLVAvailability(vgName, toDeactivate, "n") File "/usr/share/vdsm/storage/lvm.py", line 846, in _setLVAvailability raise error(str(e)) CannotDeactivateLogicalVolume: Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\' WARNING: Not using lvmetad because config setting use_lvmetad=0.\', \' WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\', \' Logical volume 43c129a6-2a33-48c0-bceb-a9585b78a1fd/7d630987-5be2-4cfa-bca8-a5409a64216b in use.\', \' Logical volume 43c129a6-2a33-48c0-bceb-a9585b78a1fd/f99c7c30-6613-4c36-a115-1c877d1a1d2a in use.\']\\n43c129a6-2a33-48c0-bceb-a9585b78a1fd/[\'f99c7c30-6613-4c36-a115-1c877d1a1d2a\', \'7d630987-5be2-4cfa-bca8-a5409a64216b\']",)',) 2016-12-15 06:29:50,842 INFO (jsonrpc/0) [storage.TaskManager.Task] (Task='7db61c74-eb49-40f6-a4a8-9ab0b616d2ab') aborting: Task is aborted: 'Cannot deactivate Logical Volume' - code 552 (task:1175) 2016-12-15 06:29:50,842 ERROR (jsonrpc/0) [storage.Dispatcher] {'status': {'message': 'Cannot deactivate Logical Volume: (\'General Storage Exception: ("5 [] [\\\' WARNING: Not using lvmetad because config setting use_lvmetad=0.\\\', \\\' WARNING: To avoid corruption, rescan devices to make changes visible (pvscan --cache).\\\', \\\' Logical volume 43c129a6-2a33-48c0-bceb-a9585b78a1fd/7d630987-5be2-4cfa-bca8-a5409a64216b in use.\\\', \\\' Logical volume 43c129a6-2a33-48c0-bceb-a9585b78a1fd/f99c7c30-6613-4c36-a115-1c877d1a1d2a in use.\\\']\\\\n43c129a6-2a33-48c0-bceb-a9585b78a1fd/[\\\'f99c7c30-6613-4c36-a115-1c877d1a1d2a\\\', \\\'7d630987-5be2-4cfa-bca8-a5409a64216b\\\']",)\',)', 'code': 552}} (dispatcher:77) 2016-12-15 06:29:50,842 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Image.teardown failed (error 552) in 19.32 seconds (__init__:515)