Bug 1574900
Summary: | VDSErrorException: Failed to CreateStorageDomainVDS, error = Error creating a storage domain: ('storageType=7, ..), code = 351 | |
---|---|---|---
Product: | [oVirt] ovirt-engine | Reporter: | Petr Balogh <pbalogh>
Component: | BLL.Storage | Assignee: | Maor <mlipchuk>
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Elad <ebenahar>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 4.2.2 | CC: | bugs, kdhananj, mlipchuk, pbalogh, sabose, tnisan
Target Milestone: | ovirt-4.3.0 | Keywords: | Automation
Target Release: | --- | Flags: | rule-engine: ovirt-4.3+
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2018-07-18 09:15:39 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Petr Balogh
2018-05-04 09:59:24 UTC
Have you seen anything on VDSM? Gluster logs?

Maor, can you take a look please?

It seems like the VDSM logs from the time the exception occurred (2018-05-04 11:24:37) are missing. Can you please add the VDSM logs of host_mixed_2 and host_mixed_1 covering that entire hour? From what we have so far, I can tell that the storage domain creation from the GUI probably succeeded because host_mixed_2 was used to create it. The exception that occurred at 11:24 happened when host_mixed_1 was used to create the storage domain.

I reproduced the issue when I ran again on the same env, so I hope I have now attached all the logs you need.

It seems like the problem is that the glusterfs mount is in read-only mode (see [1]); that is why the host can't create the storage domain.

[1]
2018-05-09 17:26:15,020+0300 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='a1c7ef58-1570-4b89-a885-74a5c4f4a1e5') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<string>", line 2, in createStorageDomain
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 2591, in createStorageDomain
    storageType, domVersion)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 83, in create
    version)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/nfsSD.py", line 50, in _preCreateValidation
    fileSD.validateFileSystemFeatures(sdUUID, domPath)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 104, in validateFileSystemFeatures
    oop.getProcessPool(sdUUID).directTouch(testFilePath)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/outOfProcess.py", line 320, in directTouch
    ioproc.touch(path, flags, mode)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 567, in touch
    self.timeout)
  File "/usr/lib/python2.7/site-packages/ioprocess/__init__.py", line 451, in _sendCommand
    raise OSError(errcode, errstr)
OSError: [Errno 30] Read-only file system

Created attachment 1434239 [details]
rhev-data-center-mnt-glusterSD-gluster01.lab.eng.tlv2.redhat.com:_GE__he2__volume01
Created attachment 1434240 [details]
rhev-data-center-mnt-glusterSD-gluster01.lab.eng.tlv2.redhat.com:_GE__he2__volume02
Created attachment 1434241 [details]
rhev-data-center-mnt-glusterSD-gluster01.lab.eng.tlv2.redhat.com:_GE__he2__volume03
Created attachment 1434242 [details]
mountpoint
Created attachment 1434244 [details]
cli log
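For context on the traceback in [1]: VDSM's pre-create validation boils down to touching a test file with O_DIRECT inside the freshly mounted domain path, so a mount that has flipped to read-only surfaces immediately as OSError with errno EROFS (30). A minimal sketch of that check — the function names and structure here are illustrative, not VDSM's actual code:

```python
import errno
import os


def direct_touch(path):
    # Open the file with O_DIRECT, roughly what ioprocess touch() does.
    # On a read-only glusterfs mount this raises OSError with errno
    # EROFS (30), i.e. "Read-only file system" as in the traceback.
    flags = os.O_CREAT | os.O_WRONLY | os.O_DIRECT
    fd = os.open(path, flags, 0o644)
    os.close(fd)


def validate_writable(dom_path):
    # Hypothetical stand-in for validateFileSystemFeatures(): touch a
    # __DIRECT_IO_TEST__ file inside the would-be storage domain path.
    test_file = os.path.join(dom_path, "__DIRECT_IO_TEST__")
    try:
        direct_touch(test_file)
    except OSError as e:
        if e.errno == errno.EROFS:
            raise RuntimeError("domain path %s is read-only" % dom_path)
        raise  # other errors (ENOENT, EINVAL, ...) propagate unchanged
```

This explains why the error appears at SD-creation time even though the root cause is on the gluster side: the very first write attempt against the mount is this validation touch.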
I've attached the gluster logs from Petr's attachment. Sahina, can you please help us understand why the glusterfs mount became read-only?

Krutika, could you take a look please? There are messages of the form "failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running." in the mount logs.

(In reply to Sahina Bose from comment #16)
> Krutika, could you take a look please?
> There are messages of the form "failed to get the port number for remote
> subvolume. Please run 'gluster volume status' on server to see if brick
> process is running." in the mount logs

Sure. The funny thing is that there is no log from GE_he2_volume03-client-1 indicating whether it successfully connected to the brick or not. And a FUSE client is not supposed to process any IO (such as the "create" on __DIRECT_IO_TEST__) from the application until AFR has heard from all of its children. In this case, either AFR hasn't heard from GE_he2_volume03-client-1 at all and still notified FUSE to go ahead with IO, or there is some codepath in which a CONNECT/DISCONNECT event from GE_he2_volume03-client-1 is NOT logged. Let me dig into the client translator code and get back. Keeping the needinfo intact until then.

-Krutika

Could you please attach the glusterd logs as well from all 3 hosts? You'll find them under /var/log/glusterfs, named glusterd.log.

-Krutika

(In reply to Krutika Dhananjay from comment #18)
> Could you please attach glusterd logs as well from all 3 hosts?
>
> You'll find them under /var/log/glusterfs and named glusterd.log.
>
> -Krutika

Also attach all glusterd.log* files in case they got rotated over the weekend.

-Krutika

And also the log files under /var/log/glusterfs/bricks of all 3 hosts, please.

This issue seems to be more glusterfs oriented, and it is currently waiting for needinfo, therefore postponing it to 4.3.

(In reply to Krutika Dhananjay from comment #18)
> Could you please attach glusterd logs as well from all 3 hosts?
>
> You'll find them under /var/log/glusterfs and named glusterd.log.
>
> -Krutika

Hey, we cannot reproduce this issue with the latest builds, so we cannot provide any more logs, and the env has already been provisioned multiple times... All logs from the first host were attached in Comment 8; you can find the whole /var/log/glusterfs folder copied into the logs under ./HostsLogs/tmp/ovirt-logs-hypervisor/glusterfs, but I see there is no glusterd.log there.

Since the bug cannot be reproduced and no additional logs are available as a result, closing as INSUFFICIENT_DATA. Please reopen if you manage to reproduce in the future.
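For anyone who hits this again: a quick client-side check for the read-only symptom, independent of VDSM or gluster tooling, is whether the kernel reports the mount as read-only via statvfs. A hedged sketch (not part of any oVirt API; the mount path is illustrative):

```python
import os


def mount_is_readonly(mount_point):
    # ST_RDONLY is set in f_flag when the filesystem is mounted
    # read-only, including when the kernel or the FUSE client has
    # remounted it read-only after an error.
    return bool(os.statvfs(mount_point).f_flag & os.ST_RDONLY)
```

On an affected host this would return True for the gluster mount (e.g. somewhere under /rhev/data-center/mnt/glusterSD/, matching the attachment names above), while a healthy, writable mount returns False.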