Bug 1417165
Summary:          Unable to attach a disk to a VM (disk owned by kvm and not qemu?)
Product:          [oVirt] vdsm
Component:        General
Version:          4.18.21
Status:           CLOSED NOTABUG
Severity:         medium
Priority:         medium
Reporter:         Fabrice Bacchella <fabrice.bacchella>
Assignee:         Idan Shaby <ishaby>
QA Contact:       Raz Tamir <ratamir>
CC:               amureini, bugs, fabrice.bacchella, ishaby, nsoffer, tjelinek, tnisan, ylavi
Target Milestone: ovirt-4.1.4
Target Release:   ---
Flags:            rule-engine: ovirt-4.1+
Hardware:         Unspecified
OS:               Unspecified
Doc Type:         If docs needed, set a value
Type:             Bug
oVirt Team:       Storage
Last Closed:      2017-07-04 12:50:55 UTC
Description (Fabrice Bacchella, 2017-01-27 11:38:21 UTC)
Please provide {super,}vdsm.log and /var/log/messages. It seems the udev rule did not kick in for this disk in time.

How can I send that privately? /var/log/messages contains a lot of non-public information.

Even if you sent it to me in private, I would have to share it with others anyway in order to solve the issue. I don't have any trick other than sanitizing your log with many `sed` lines. BTW, which versions of udev, systemd, and libvirt do you have? And what about the vdsm logs?

That information is not secret. I just don't want it to be published on a public-facing site where any random crawler can get it. It's running on an up-to-date, fully patched CentOS Linux release 7.3.1611. I will give you more details tomorrow.

Created attachment 1245867 [details]: rpms versions
Created attachment 1245868 [details]: output from dmesg
Created attachment 1245869 [details]: libvirt logs for the VM failing to attach the disk
Created attachment 1245870 [details]: journalctl -b output
Created attachment 1245871 [details]: /var/log/messages
Created attachment 1245872 [details]: vdsm.log
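The "many `sed` lines" sanitization approach mentioned above can be sketched as follows; the redaction patterns, the `example.com` domain, and the sample line are illustrative assumptions, not taken from the actual logs:

```shell
# Sketch: redact IPv4 addresses and internal hostnames before sharing logs.
# The hostname pattern below is hypothetical; adapt it to your naming scheme.
sanitize() {
  sed -e 's/[0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}/x.x.x.x/g' \
      -e 's/[a-z0-9-]*\.example\.com/REDACTED_HOST/g'
}
echo 'login from 10.1.2.3 on web01.example.com' | sanitize
# prints: login from x.x.x.x on REDACTED_HOST
# In practice one would run: sanitize < /var/log/messages > messages.sanitized
```

This is lossy on purpose; review the sanitized file before attaching it, since patterns like these will not catch every sensitive token.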
Information about the users on the hosts:

    id vdsm
    uid=36(vdsm) gid=36(kvm) groups=36(kvm),179(sanlock),107(qemu)
    id qemu
    uid=107(qemu) gid=107(qemu) groups=107(qemu),11(cdrom)

Moving to storage for further investigation.

Fabrice, is the problematic disk on a block domain or a file domain?

An export of the domain returns:

    <StorageDomain href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671" id="7c5291d3-11e2-420f-99ad-47a376013671">
      <actions>
        <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/isattached" rel="isattached"/>
        <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/updateovfstore" rel="updateovfstore"/>
        <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/refreshluns" rel="refreshluns"/>
      </actions>
      <name>XXX</name>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/permissions" rel="permissions"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/templates" rel="templates"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/vms" rel="vms"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/disks" rel="disks"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/storageconnections" rel="storageconnections"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/disksnapshots" rel="disksnapshots"/>
      <link href="/ovirt-engine/api/storagedomains/7c5291d3-11e2-420f-99ad-47a376013671/diskprofiles" rel="diskprofiles"/>
      <data_centers>
        <data_center id="17434f4e-8d1a-4a88-ae39-d2ddd46b3b9b"/>
      </data_centers>
      <type>data</type>
      <external_status>
        <state>ok</state>
      </external_status>
      <master>true</master>
      <storage>
        <type>localfs</type>
        <path>/data/ovirt/data</path>
      </storage>
      <available>1989643599872</available>
      <used>56908316672</used>
      <committed>300647710720</committed>
      <storage_format>v3</storage_format>
      <wipe_after_delete>false</wipe_after_delete>
      <warning_low_space_indicator>10</warning_low_space_indicator>
      <critical_space_action_blocker>5</critical_space_action_blocker>
    </StorageDomain>

Is that what you need?

Moving out all non-blockers/exceptions.

Hi Fabrice, a few questions and requests:

1. Can you please provide the output of "ls -lZ /data/ovirt/data"?
2. The right ownership of images and snapshots is vdsm:kvm. It seems like the problematic disk has the right ownership and all the rest don't. Any idea if the ownership of the image or the storage domain's directory has changed?
3. From the output of "id qemu" I can see that it doesn't belong to the 36(kvm) group. Any idea if it was changed? Anyway, a workaround for this bug can be to add qemu to the kvm group.
4. Can you please provide the vdsm and engine logs from the time that disk was created?
5. Is this bug still reproducible on your system?

Thanks!

1. $ ls -lZ /data/ovirt/data
   drwxr-xr-x vdsm qemu ? 7c5291d3-11e2-420f-99ad-47a376013671
   -rwxr-xr-x vdsm qemu ? __DIRECT_IO_TEST__
2. I don't think so. But it was a long time ago.
3. Idem.
4 and 5: I just added a new disk, and it failed:

   -rw-rw---- 1 vdsm kvm 10G Jul  3 16:16 /data/ovirt/data/7c5291d3-11e2-420f-99ad-47a376013671/images/0243d40d-d1de-478f-93db-591d1955314c/b1f8aee3-c99f-4960-a32a-b942f8d9226b
   -rw-r--r-- 1 vdsm kvm 323 Jul  3 16:16 /data/ovirt/data/7c5291d3-11e2-420f-99ad-47a376013671/images/0243d40d-d1de-478f-93db-591d1955314c/b1f8aee3-c99f-4960-a32a-b942f8d9226b.meta

Created attachment 1293904 [details]: requested logs
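A quick way to audit ownership under a storage domain is to list files that do not match the expected vdsm:kvm ownership discussed above. This is a sketch, not something from the bug; the invocation in the comment is hypothetical:

```shell
# Sketch: report files whose ownership differs from an expected owner:group.
check_ownership() {  # usage: check_ownership <dir> <user> <group>
  find "$1" -type f ! \( -user "$2" -group "$3" \)
}
# Hypothetical invocation on the host from this bug:
#   check_ownership /data/ovirt/data/7c5291d3-11e2-420f-99ad-47a376013671/images vdsm kvm
```

An empty result means every file already has the expected ownership; any path printed is a candidate for the attach failure seen here.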
I tried the workaround; indeed it works.

Thanks Fabrice! To me it sounds like this could be the root cause of this bug. We need to find out why qemu was not added to the kvm group.

I think it might be my fault. I'm creating users using Puppet. I thought I had checked that they match exactly what oVirt creates, so I probably made a mistake. All that is missing from oVirt is a check and a slightly better log message.

OK, anyway, I've just checked it on a fresh new VM with CentOS 7.3.1611, installed vdsm-4.18.21-1.el7.centos.x86_64, and got this result for 'id qemu':

    uid=107(qemu) gid=107(qemu) groups=107(qemu),11(cdrom),36(kvm)

Therefore, on a clean system no bug should occur.
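The group-membership check and the workaround applied above can be sketched as a small helper; the `qemu`/`kvm` names match this bug, but the helper itself is an illustration and assumes the user exists on the host:

```shell
# Sketch: check whether a user is a supplementary or primary member of a group.
in_group() {  # usage: in_group <user> <group>
  id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}
# Workaround from this bug: add qemu to kvm if it is missing.
in_group qemu kvm || echo 'qemu not in kvm; fix with: usermod -a -G kvm qemu'
```

On a clean vdsm install (as verified in the last comment) `in_group qemu kvm` succeeds and nothing is printed.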