Created attachment 1528880 [details]
Full system logs from /var/log

Description of problem:
oVirt fails with "Error creating a storage domain" (on Cisco UCS hardware only).

Version-Release number of selected component (if applicable):
oVirt 4.2

How reproducible:
Reproducible every time.

Steps to Reproduce:
1. Install oVirt Node using the node-ng iso: https://resources.ovirt.org/pub/ovirt-4.3/iso/ovirt-node-ng-installer/
2. Follow the guide to deploy a hyperconverged cluster: https://www.ovirt.org/documentation/gluster-hyperconverged/chap-Single_node_hyperconverged.html
3. Proceed to creating the hosted engine. Hosted-engine deployment fails when creating the shared storage.

Actual results:
[ ERROR ] Error: Fault reason is "Operation Failed". Fault detail is "[Error creating a storage domain]". HTTP response code is 400.
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Error creating a storage domain]\". HTTP response code is 400."}

Expected results:
Deployment completes without failure.

Additional info:
I have created a thread on the oVirt users mailing list with more details and findings from Red Hat software engineers:
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/4TDNVOXEA6B4YXGHGZAGB4AT53OKZ7FG/
Full system logs are attached. Please note this only happens on Cisco UCS servers; I have had success when using other hardware brands.
Dominik, can you please review this bug?
According to vdsm.log, the error 'Invalid argument' occurs while dd tries to format the index for the lockspace. Nir, do you have an idea why this is failing?

2019-01-27 10:55:10,968+0400 INFO  (jsonrpc/6) [vdsm.api] FINISH connectStorageServer return={'statuslist': [{'status': 0, 'id': u'7af1e739-db3a-46f0-b5b5-62abb9da02fb'}]} from=::ffff:192.168.124.101,34866, flow_id=717fdcaf, task_id=2d93cc27-f77a-46d3-bfe9-72c6d4582d56 (api:52)
2019-01-27 10:55:10,968+0400 INFO  (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call StoragePool.connectStorageServer succeeded in 0.00 seconds (__init__:573)
2019-01-27 10:55:10,975+0400 INFO  (jsonrpc/1) [vdsm.api] START createStorageDomain(storageType=7, sdUUID=u'e06a2edb-3b14-4459-b185-e8cd2db93b2c', domainName=u'hosted_storage', typeSpecificArg=u'z1ppvh05.pki.darkmatter.ae:/engine', domClass=1, domVersion=u'4', options=None) from=::ffff:192.168.124.101,34866, flow_id=717fdcaf, task_id=4331548a-3158-4a9b-a146-763db3551484 (api:46)
2019-01-27 10:55:11,226+0400 INFO  (jsonrpc/1) [storage.StorageDomain] sdUUID=e06a2edb-3b14-4459-b185-e8cd2db93b2c domainName=hosted_storage remotePath=z1ppvh05.pki.darkmatter.ae:/engine domClass=1 (nfsSD:70)
2019-01-27 10:55:11,238+0400 INFO  (jsonrpc/1) [IOProcessClient] (e06a2edb-3b14-4459-b185-e8cd2db93b2c) Starting client (__init__:308)
2019-01-27 10:55:11,246+0400 INFO  (ioprocess/33031) [IOProcess] (e06a2edb-3b14-4459-b185-e8cd2db93b2c) Starting ioprocess (__init__:434)
2019-01-27 10:55:11,273+0400 INFO  (jsonrpc/1) [storage.StorageDomain] Creating domain metadata directory u'/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md' (fileSD:450)
2019-01-27 10:55:11,292+0400 INFO  (jsonrpc/1) [storage.xlease] Formatting index for lockspace u'e06a2edb-3b14-4459-b185-e8cd2db93b2c' (version=1) (xlease:653)
2019-01-27 10:55:11,318+0400 INFO  (jsonrpc/1) [vdsm.api] FINISH createStorageDomain error=Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.00666646 s, 0.0 kB/s\n" from=::ffff:192.168.124.101,34866, flow_id=717fdcaf, task_id=4331548a-3158-4a9b-a146-763db3551484 (api:50)
2019-01-27 10:55:11,318+0400 ERROR (jsonrpc/1) [storage.TaskManager.Task] (Task='4331548a-3158-4a9b-a146-763db3551484') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in createStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2608, in createStorageDomain
    storageType, domVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 87, in create
    remotePath, storageType, version)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 466, in _prepareMetadata
    cls.format_external_leases(sdUUID, xleases_path)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 1219, in format_external_leases
    xlease.format_index(lockspace, backend)
File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 661, in format_index index.dump(file) File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 761, in dump file.pwrite(INDEX_BASE, self._buf) File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 994, in pwrite self._run(args, data=buf[:]) File "/usr/lib/python2.7/site-packages/vdsm/storage/xlease.py", line 1011, in _run raise cmdutils.Error(args, rc, "[suppressed]", err) Error: Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.00666646 s, 0.0 kB/s\n" 2019-01-27 10:55:11,318+0400 INFO (jsonrpc/1) [storage.TaskManager.Task] (Task='4331548a-3158-4a9b-a146-763db3551484') aborting: Task is aborted: u'Command [\'/usr/bin/dd\', \'iflag=fullblock\', u\'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases\', \'oflag=direct,seek_bytes\', \'seek=1048576\', \'bs=256512\', \'count=1\', \'conv=notrunc,nocreat,fsync\'] failed with rc=1 out=\'[suppressed]\' err="/usr/bin/dd: error writing \'/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases\': Invalid argument\\n1+0 records in\\n0+0 records out\\n0 bytes (0 B) copied, 0.00666646 s, 0.0 kB/s\\n"' - code 100 (task:1181) 2019-01-27 10:55:11,318+0400 ERROR (jsonrpc/1) [storage.Dispatcher] FINISH createStorageDomain error=Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.00666646 s, 0.0 kB/s\n" (dispatcher:86) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/storage/dispatcher.py", line 73, in wrapper result = ctask.prepare(func, *args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 108, in wrapper return m(self, *a, **kw) File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 1189, in prepare raise self.error Error: Command ['/usr/bin/dd', 'iflag=fullblock', u'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases', 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1', 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]' err="/usr/bin/dd: error writing '/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases': Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied, 0.00666646 s, 0.0 kB/s\n" 2019-01-27 10:55:11,319+0400 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call StorageDomain.create failed (error 351) in 0.34 seconds (__init__:573)
(In reply to Dominik Holler from comment #2)
> Error: Command ['/usr/bin/dd', 'iflag=fullblock',
> u'of=/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases',
> 'oflag=direct,seek_bytes', 'seek=1048576', 'bs=256512', 'count=1',
> 'conv=notrunc,nocreat,fsync'] failed with rc=1 out='[suppressed]'
> err="/usr/bin/dd: error writing
> '/rhev/data-center/mnt/glusterSD/z1ppvh05.pki.darkmatter.ae:_engine/e06a2edb-3b14-4459-b185-e8cd2db93b2c/dom_md/xleases':
> Invalid argument\n1+0 records in\n0+0 records out\n0 bytes (0 B) copied,
> 0.00666646 s, 0.0 kB/s\n"

We are writing 256512 bytes at offset 1048576. This write is aligned to 512 bytes, but the underlying storage may use 4k drives, or VDO, with 4k alignment. We do not support 4k storage yet; support is expected in 4.3.z, and only on specific setups (e.g. RHHI) with certain storage (glusterfs).

Denis, can you help with detecting the underlying block size of this gluster-based setup?
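For reference, the arithmetic behind the alignment claim above, taken directly from the dd arguments in the log:

  seek = 1048576 = 256 * 4096    -> the offset is 4k-aligned
  bs   =  256512 = 501 *  512    -> the length is a multiple of 512
  256512 mod 4096 = 2560         -> but not a multiple of 4096

So on a device (or VDO volume) exposing a 4096-byte logical block size, a direct I/O write of this length is rejected by the kernel with EINVAL, which dd reports as "Invalid argument".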
Something like

  dd if=/dev/zero of=<file on the gluster mount> oflag=direct bs=512 count=1

should fail with "Invalid argument" on a 4k-only device, and changing bs to 4096 should work. Note that oflag=direct is what makes the probe meaningful: without it the write goes through the page cache and succeeds regardless of the device block size. The target file must also live on the storage being probed, not on a local path such as /tmp.
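For completeness, a minimal Python sketch of the same probe, assuming Python 3 on the host; the helper name probe_block_size and the target path are illustrative, not vdsm API:

  import errno
  import mmap
  import os

  def probe_block_size(path):
      # Try O_DIRECT writes of increasing size; the first size the
      # kernel accepts is the minimal direct I/O block size.
      fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
      try:
          for size in (512, 4096):
              # O_DIRECT also requires a memory-aligned buffer; an
              # anonymous mmap is page-aligned by construction.
              buf = mmap.mmap(-1, size)
              try:
                  os.write(fd, buf)
                  return size
              except OSError as e:
                  if e.errno != errno.EINVAL:
                      raise
              finally:
                  buf.close()
          raise RuntimeError("block size larger than 4096?")
      finally:
          os.close(fd)
          os.unlink(path)

  # Run against a scratch file on the gluster mount being probed:
  print(probe_block_size(
      "/rhev/data-center/mnt/glusterSD/<server>:_engine/blocksize-probe"))

On 512-byte storage this prints 512; on 4k drives or VDO the 512-byte write fails with EINVAL and it prints 4096.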
*** This bug has been marked as a duplicate of bug 798947 ***