Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1610738

Summary: Failed to add Gluster SD with mount options
Product: [oVirt] vdsm
Reporter: Evelina Shames <eshames>
Component: Gluster
Assignee: sankarshan <sankarshan>
Status: CLOSED NOTABUG
QA Contact: Elad <ebenahar>
Severity: high
Docs Contact:
Priority: high
Version: 4.20.31
CC: bugs, eshames, ratamir, sabose, sankarshan, tnisan
Target Milestone: ovirt-4.2.7
Keywords: Automation, AutomationBlocker, Regression
Target Release: ---
Flags: rule-engine: ovirt-4.2+, rule-engine: blocker+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-08-20 14:59:52 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Gluster
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
logs (flags: none)

Description Evelina Shames 2018-08-01 11:13:05 UTC
Created attachment 1472058
logs

Description of problem:
Adding a Gluster storage domain with the mount option "backup-volfile-servers=<other gluster SDs>" fails with the following errors:


vdsm log:
2018-08-01 12:05:05,443+0300 ERROR (jsonrpc/4) [storage.HSM] Could not connect to storageServer (hsm:2398)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2395, in connectStorageServer
    conObj.connect()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 166, in connect
    self.validate()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 296, in validate
    if not self.volinfo:
  File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 283, in volinfo
    self._volinfo = self._get_gluster_volinfo()
  File "/usr/lib/python2.7/site-packages/vdsm/storage/storageServer.py", line 328, in _get_gluster_volinfo
    self._volfileserver)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in glusterVolumeInfo
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
GlusterVolumesListFailedException: Volume list failed
error: Volume does not exist
return code: 30806


supervdsm log:
MainProcess|jsonrpc/4::DEBUG::2018-08-01 12:05:05,281::supervdsm_server::96::SuperVdsm.ServerCallback::(wrapper) call volumeInfo with ('gluster01.scl.lab.tlv.redhat.com',) {}
MainProcess|jsonrpc/4::DEBUG::2018-08-01 12:05:05,281::logutils::317::root::(_report_stats) ThreadedHandler is ok in the last 2075 seconds (max pending: 5)
MainProcess|jsonrpc/4::DEBUG::2018-08-01 12:05:05,282::commands::65::root::(execCmd) /usr/bin/taskset --cpu-list 0-0 /usr/sbin/gluster --mode=script volume info --remote-host=gluster01.scl.lab.tlv.redhat.com storage_local_ge11_replica_3 --xml (cwd None)
MainProcess|jsonrpc/4::DEBUG::2018-08-01 12:05:05,441::commands::86::root::(execCmd) SUCCESS: <err> = ''; <rc> = 0
MainProcess|jsonrpc/4::ERROR::2018-08-01 12:05:05,442::supervdsm_server::100::SuperVdsm.ServerCallback::(wrapper) Error in volumeInfo
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/supervdsm_server.py", line 98, in wrapper
    res = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/gluster/cli.py", line 504, in volumeInfo
    raise ge.GlusterVolumesListFailedException(rc=e.rc, err=e.err)
GlusterVolumesListFailedException: Volume list failed
error: Volume does not exist
return code: 30806
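
Note that the supervdsm log shows the gluster CLI exiting with rc 0 while vdsm still reports return code 30806. A plausible reading is that, with --xml, the failure is carried inside the XML document rather than in the process exit status. The sketch below illustrates that idea only; it is not vdsm's code. The element names opRet, opErrno and opErrstr and the sample XML are assumptions about the CLI output format, and GlusterXmlError and check_gluster_xml are hypothetical names.

```python
import xml.etree.ElementTree as ET


class GlusterXmlError(Exception):
    """Hypothetical error type carrying the code and message from the XML."""

    def __init__(self, rc, err):
        super().__init__("rc=%s err=%s" % (rc, err))
        self.rc = rc
        self.err = err


def check_gluster_xml(xml_text):
    """Return the parsed root element, or raise if the XML reports a failure.

    Assumes the root element has opRet, opErrno and opErrstr children.
    """
    root = ET.fromstring(xml_text)
    op_ret = int(root.findtext("opRet", default="0"))
    if op_ret != 0:
        raise GlusterXmlError(
            rc=int(root.findtext("opErrno", default="0")),
            err=root.findtext("opErrstr", default=""),
        )
    return root


# Assumed shape of the output for a missing volume (not copied from the logs).
SAMPLE_MISSING = """<cliOutput>
  <opRet>-1</opRet>
  <opErrno>30806</opErrno>
  <opErrstr>Volume does not exist</opErrstr>
</cliOutput>"""

try:
    check_gluster_xml(SAMPLE_MISSING)
except GlusterXmlError as e:
    print(e.rc, e.err)
```

Under these assumptions, a process exit status of 0 says only that the CLI ran; the caller still has to inspect the XML to learn whether the volume query succeeded.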


engine log:
2018-08-01 12:05:05,262+03 INFO  [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] Running command: AddStorageServerConnectionCommand internal: false. Entities affected :  ID: aaa00000-0000-0000-0000-123456789aaa Type: SystemAction group CREATE_STORAGE_DOMAIN with role type ADMIN
2018-08-01 12:05:05,267+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] START, ConnectStorageServerVDSCommand(HostName = host_mixed_3, StorageServerConnectionManagementVDSParameters:{hostId='2f128c7b-1ae4-416f-9f7e-256e83c6b3c7', storagePoolId='00000000-0000-0000-0000-000000000000', storageType='GLUSTERFS', connectionList='[StorageServerConnections:{id='null', connection='gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3', iqn='null', vfsType='glusterfs', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]', sendNetworkEventOnFailure='true'}), log id: 4ae99575
2018-08-01 12:05:05,462+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] FINISH, ConnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=4149}, log id: 4ae99575
2018-08-01 12:05:05,480+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-9) [storagedomains_create_d2b250cf-6877-] EVENT_ID: STORAGE_DOMAIN_ERROR(996), The error message for connection gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3 returned by VDSM was: Failed to fetch Gluster Volume List
2018-08-01 12:05:05,480+03 ERROR [org.ovirt.engine.core.bll.storage.connection.FileStorageHelper] (default task-9) [storagedomains_create_d2b250cf-6877-] The connection with details 'gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3' failed because of error code '4149' and error message is: failed to fetch gluster volume list
2018-08-01 12:05:05,481+03 ERROR [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] Command 'org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand' failed: EngineException: GlusterVolumeListFailed (Failed with error GlusterVolumeListFailed and code 4149)
2018-08-01 12:05:05,503+03 ERROR [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] Transaction rolled-back for command 'org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand'.
2018-08-01 12:05:05,510+03 INFO  [org.ovirt.engine.core.bll.storage.connection.AddStorageServerConnectionCommand] (default task-9) [storagedomains_create_d2b250cf-6877-] Lock freed to object 'EngineLock:{exclusiveLocks='[gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3=STORAGE_CONNECTION]', sharedLocks=''}'
2018-08-01 12:05:05,518+03 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-9) [] Operation Failed: [Failed to fetch Gluster Volume List]


Version-Release number of selected component (if applicable):
vdsm: vdsm-4.20.35-1.el7ev.x86_64
engine: ovirt-engine-4.2.5.2-0.1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
Create gluster SD with mount options="backup-volfile-servers=<other gluster SDs>"
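
For reference, glusterfs mounts take backup-volfile-servers as a colon-separated list of server hostnames. A minimal sketch of building that option string follows; build_mount_options is a hypothetical helper, and the hostnames are placeholders rather than values from this report.

```python
def build_mount_options(backup_servers):
    """Return a glusterfs mount option string listing backup volfile servers."""
    # Drop empty or whitespace-only entries so the result has no stray colons.
    servers = [s.strip() for s in backup_servers if s and s.strip()]
    if not servers:
        return ""
    return "backup-volfile-servers=" + ":".join(servers)


# Hostnames below are placeholders, not taken from the report.
print(build_mount_options(["gluster02.example.com", "gluster03.example.com"]))
```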

Actual results:
The storage domain is not added; the operation fails with "Failed to fetch Gluster Volume List".

Expected results:
The storage domain should be added successfully.

Additional info:
logs are attached.

Comment 1 Elad 2018-08-01 13:22:33 UTC
Marking as an AutomationBlocker because this blocks the test cases for Gluster domains with additional nodes.
Marking as a regression because this operation used to work.

Comment 2 Sahina Bose 2018-08-16 13:17:32 UTC
Does this volume exist and is it started - gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3

Comment 3 Red Hat Bugzilla Rules Engine 2018-08-16 13:18:13 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Raz Tamir 2018-08-16 14:54:19 UTC
This was discovered in 4.2.5 + regression + blocker.
Please retarget to 4.2.6

Comment 5 Sahina Bose 2018-08-20 13:06:15 UTC
(In reply to Raz Tamir from comment #4)
> This was discovered in 4.2.5 + regression + blocker.
> Please retarget to 4.2.6

We can retarget once we have the requested info.

Comment 6 Elad 2018-08-20 14:59:52 UTC
(In reply to Sahina Bose from comment #2)
> Does this volume exist and is it started -
> gluster01.scl.lab.tlv.redhat.com:/storage_local_ge11_replica_3

Indeed, the volume did not exist. The test passes when it does.
Sorry for the noise, closing the bug.
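
Since the root cause was a missing volume, a test suite could check the volume before trying to add the storage domain. The sketch below reuses the gluster part of the CLI invocation shown in the supervdsm log and treats the volume as usable only if it is listed and reported as Started. The XML element names (volInfo/volumes/volume, name, statusStr) are assumptions about the CLI output, and gluster_volume_info_cmd, volume_is_started and check_volume are hypothetical helpers.

```python
import subprocess
import xml.etree.ElementTree as ET


def gluster_volume_info_cmd(server, volume):
    """Build the gluster CLI call that appears in the supervdsm log."""
    return [
        "/usr/sbin/gluster", "--mode=script", "volume", "info",
        "--remote-host=" + server, volume, "--xml",
    ]


def volume_is_started(xml_text, volume):
    """Return True only if the XML lists the volume with status 'Started'.

    Assumes volumes appear under volInfo/volumes/volume with name and
    statusStr children. Unparseable output counts as "not started".
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    for vol in root.findall("volInfo/volumes/volume"):
        if vol.findtext("name") == volume:
            return vol.findtext("statusStr") == "Started"
    return False


def check_volume(server, volume):
    """Run the CLI (needs gluster installed, usually as root) and check it."""
    result = subprocess.run(
        gluster_volume_info_cmd(server, volume),
        capture_output=True, text=True, check=False,
    )
    return volume_is_started(result.stdout, volume)
```

For the volume in this report, the call would be check_volume("gluster01.scl.lab.tlv.redhat.com", "storage_local_ge11_replica_3"); a False result would point at the environment rather than at vdsm.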