Bug 1506677
| Field | Value |
|---|---|
| Summary | Hotplug fail when attaching a disk with cow format on glusterfs |
| Product | [oVirt] ovirt-engine |
| Reporter | Avihai <aefrat> |
| Component | BLL.Storage |
| Assignee | Denis Chaplygin <dchaplyg> |
| Status | CLOSED CURRENTRELEASE |
| QA Contact | Avihai <aefrat> |
| Severity | high |
| Docs Contact | |
| Priority | unspecified |
| Version | 4.2.0 |
| CC | aefrat, amureini, bugs, dchaplyg, derez, ratamir |
| Target Milestone | ovirt-4.2.0 |
| Keywords | Automation, Regression, Reopened |
| Target Release | 4.2.0 |
| Flags | rule-engine: ovirt-4.2+, rule-engine: blocker+ |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | |
| Fixed In Version | |
| Doc Type | No Doc Update |
| Doc Text | |
| Story Points | --- |
| Clone Of | |
| Environment | |
| Last Closed | 2018-02-12 10:09:51 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| CRM | |
| Verified Versions | |
| Category | --- |
| oVirt Team | Gluster |
| RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- |
| Target Upstream Version | |
| Embargoed | |
| Bug Depends On | |
| Bug Blocks | 1509635 |
| Attachments | |
Description (Avihai, 2017-10-26 14:17:52 UTC)
Bug was found running automation: test_download_image.py::TestCase18258. Bug marked as 'regression' as it does not occur in 4.1.7.3.

I wouldn't be surprised if there's some common root cause that causes both this and bug 1506608.

@Denis - have you checked hotplug disk with libgfapi?

Raising severity. This fails a few TPs from our tier 1 automation. @Daniel, libgfapi is not enabled on this execution.

(In reply to Raz Tamir from comment #4)
> Raising severity.
> This fails a few TPs from our tier 1 automation.
>
> @Daniel,
> libgfapi is not enabled on this execution

It's enabled by default in 4.2 since commit 8dc926eae05a (merged Aug 1st). Are you explicitly switching it off and/or running on a 4.1 DC?

No, libgfapi is enabled: LibgfApiSupported: true, version: 4.2

Raz, maybe we have a misunderstanding here. To quote your last two comments:

(In reply to Raz Tamir from comment #4)
> @Daniel,
> libgfapi is not enabled on this execution

(In reply to Raz Tamir from comment #6)
> No,
>
> libgfapi is enabled:
> LibgfApiSupported: true, version: 4.2

So which is it? Or did I misunderstand one of these comments?

Too much noise from comment #4; ignore comments related to libgfapi from my side. The needinfo for Daniel is still relevant:

(In reply to Daniel Erez from comment #3)
> @Denis - have you checked hotplug disk with libgfapi?

Daniel - this is marked as a blocker for oVirt GA. What is the status of the bug?

I'm not able to reproduce it in my environment, neither with libgfapi enabled nor disabled. In my case disk hotplugging works perfectly.

(In reply to Denis Chaplygin from comment #11)
> I'm not able to reproduce it in my environment, neither with libgfapi enabled
> nor disabled.
>
> In my case disk hotplugging works perfectly.

Closing per comment 11.

Issue reproduced on my environment (storage-ge-08.scl.lab.tlv.redhat.com) with the exact same scenario,
See logs attached.
Engine:

```
2017-11-21 18:17:01,420+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] Failed in 'HotPlugDiskVDS' method
2017-11-21 18:17:01,446+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_3 command HotPlugDiskVDS failed: failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error
2017-11-21 18:17:01,446+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand' return value 'StatusOnlyReturn [status=Status [code=45, message=failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error]]'
2017-11-21 18:17:01,446+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] HostName = host_mixed_3
2017-11-21 18:17:01,447+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] Command 'HotPlugDiskVDSCommand(HostName = host_mixed_3, HotPlugDiskVDSParameters:{hostId='a1060797-25e3-4ca5-9c62-67d9d830f5e6', vmId='32196200-1e99-4393-8cd5-3ae2bc7441d4', diskId='93b45fd5-5a37-4340-a327-ac35a6a0216b', addressMap='[bus=0, controller=0, unit=1, type=drive, target=0]'})' execution failed: VDSGenericException: VDSErrorException: Failed to HotPlugDiskVDS, error = failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error, code = 45
2017-11-21 18:17:01,447+02 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] FINISH, HotPlugDiskVDSCommand, log id: 35abedc1
2017-11-21 18:17:01,451+02 ERROR [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] Command 'org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to HotPlugDiskVDS, error = failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error, code = 45 (Failed with error FailedToPlugDisk and code 45)
2017-11-21 18:17:01,771+02 ERROR [org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand] (default task-10) [75a44285-6766-40bc-a749-c8a662b6235d] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.disk.AttachDiskToVmCommand'.
```
VDSM:

```
2017-11-21 18:17:01,381+0200 DEBUG (check/loop) [storage.check] START check u'/rhev/data-center/mnt/glusterSD/gluster01.scl.lab.tlv.redhat.com:_storage__local__ge8__volume__2/db55099b-738b-4182-ba67-6560ae68fc6b/dom_md/metadata' (delay=0.00) (check:282)
2017-11-21 18:17:01,392+0200 ERROR (jsonrpc/2) [virt.vm] (vmId='32196200-1e99-4393-8cd5-3ae2bc7441d4') Hotplug failed (vm:3564)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3562, in hotplugDisk
    self._dom.attachDevice(driveXml)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 125, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 586, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 540, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error
```
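The traceback ends in libvirt's `attachDevice()`, which is where qemu/libvirt opens the gluster connection when the drive XML describes a network (gluster) disk. For orientation only, here is a minimal standalone sketch of that call path, assuming a hypothetical VM name, gluster host, volume and image path; the XML is illustrative and is not the exact XML vdsm generated in this bug:

```python
# Minimal sketch of the failing call path: attach a gluster-backed network disk
# to a running domain via libvirt. The VM name, gluster host, volume and image
# path below are placeholders, not values from the bug's environment.
import libvirt

DISK_XML = """
<disk type='network' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source protocol='gluster' name='examplevol/images/example.qcow2'>
    <host name='gluster01.example.com' port='24007'/>
  </source>
  <target dev='sdb' bus='scsi'/>
</disk>
"""

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('example-vm')
try:
    # libvirt initializes the gluster connection at this point; an unreachable
    # or unhealthy gluster volume surfaces as the libvirtError seen above.
    dom.attachDevice(DISK_XML)
except libvirt.libvirtError as e:
    print('Hotplug failed: %s' % e)
finally:
    conn.close()
```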
Created attachment 1356852 [details]: engine and vdsm logs (new reproduction)
Adding logs.
New reproduction found on builds:
Engine: 4.2.0-0.0.master.20171118160347.git80b3e22.el7.centos
VDSM: 4.20.7-35.git0f9b415.
Avihai - can you also attach libvirt/qemu's logs?

@Denis - can you please take a look at the new logs?

(In reply to Allon Mureinik from comment #15)
> Avihai - can you also attach libvirt/qemu's logs?

Added the relevant logs. Note that for some reason libvirt.log is empty; unfortunately only the previous libvirt log has content.

Created attachment 1356930 [details]: libvirt logs
Well, in that case it failed with a clear error:

libvirtError: failed to initialize gluster connection (src=0x7fcd3c000d80 priv=0x7fcd3c001d50): Input/output error

Probably you have an issue with gluster cluster connectivity or with the volume itself. You need to verify that:

* gluster01.scl.lab.tlv.redhat.com is available from your VM host
* Gluster ports are not filtered on either the VM host or any of the gluster hosts
* gluster peers can reach each other
* Volume storage_local_ge8_volume_0 is started and all bricks are running
* The volume is not under a heal/split-brain condition

(A scripted sketch of these checks appears at the end of this report.)

There are two problems:

1) The gluster cluster is in an inconsistent state, and when you try to hot plug a new drive the operation fails with a gluster error. That issue should disappear after the gluster cluster is repaired.
2) For some reason, even with libgfapi enabled, the first disk of a freshly started VM is a 'file' disk, not a 'network' disk, and I can reproduce it on my environment too. This definitely needs to be fixed urgently and I'm working on it right now. Unfortunately, once it is fixed, Avihai's VM will not be able to start, as it will require a working gluster cluster at the startup phase.

(In reply to Denis Chaplygin from comment #20)
> There are two problems:
>
> 1) The gluster cluster is in an inconsistent state, and when you try to hot
> plug a new drive the operation fails with a gluster error. That issue should
> disappear after the gluster cluster is repaired.
> 2) For some reason, even with libgfapi enabled, the first disk of a freshly
> started VM is a 'file' disk, not a 'network' disk, and I can reproduce it on
> my environment too. This definitely needs to be fixed urgently and I'm working
> on it right now. Unfortunately, once it is fixed, Avihai's VM will not be able
> to start, as it will require a working gluster cluster at the startup phase.

Thanks Denis! So I'm moving the bug to the Gluster team for further investigation.

Leave it on me, I just found the root cause inside the engine and will make a patch tomorrow.

(In reply to Denis Chaplygin from comment #19)
> Well, in that case it failed with a clear error:
>
> libvirtError: failed to initialize gluster connection (src=0x7fcd3c000d80
> priv=0x7fcd3c001d50): Input/output error
>
> Probably you have an issue with gluster cluster connectivity or with the
> volume itself. You need to verify that:
> * gluster01.scl.lab.tlv.redhat.com is available from your VM host
> * Gluster ports are not filtered on either the VM host or any of the gluster hosts
> * gluster peers can reach each other
> * Volume storage_local_ge8_volume_0 is started and all bricks are running
> * The volume is not under a heal/split-brain condition

As the root cause is known, my input is not necessary. My gluster cluster is now up and finally fixed. Please tell me if you are missing any more info.

Bug verification is blocked by Bug 1497511.

Verified on:
Engine 4.2.1.4-0.1.el7
VDSM 4.20.17-1
Libvirt 3.9.0-7
Qemu 2.10.0-18

This bugzilla is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
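For reference, a rough, non-authoritative sketch of the gluster-side health checks suggested in comment 19, assuming the standard `gluster` CLI is installed on a host in the cluster. It covers only the peer, volume-status and heal checks (host reachability and port filtering still need to be checked separately); the volume name is the one from the checklist and should be substituted as needed.

```python
# Rough sketch of the gluster health checks from comment 19, to be run on a
# host in the gluster cluster. Assumes the 'gluster' CLI is installed.
import subprocess

VOLUME = 'storage_local_ge8_volume_0'  # volume name from the checklist; substitute yours

CHECKS = [
    ['gluster', 'peer', 'status'],                  # peers can reach each other
    ['gluster', 'volume', 'status', VOLUME],        # volume started, bricks online
    ['gluster', 'volume', 'heal', VOLUME, 'info'],  # pending heals / split-brain entries
]

for cmd in CHECKS:
    print('$ %s' % ' '.join(cmd))
    try:
        print(subprocess.check_output(cmd).decode('utf-8', 'replace'))
    except (OSError, subprocess.CalledProcessError) as e:
        print('check failed: %s' % e)
```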