Bug 1461214
| Summary: | RFE: Enhance libvirt to allow existing file for memoryBacking type file | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Zack Cornelius <zack.cornelius> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED ERRATA | QA Contact: | Luyao Huang <lhuang> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | | |
| Version: | 7.4 | CC: | ailan, dyuan, jdenemar, jsuchane, kchamart, knoel, lmiksik, mprivozn, mtessun, plancast, rbalakri, sgordon, xuzhang, yalzhang, zack.cornelius |
| Target Milestone: | rc | Keywords: | FutureFeature |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-3.9.0-11.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1541570 | Environment: | |
| Last Closed: | 2018-04-10 10:48:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1460848 | | |
| Bug Blocks: | 1541570, 1594272, 1795933 | | |

Description
Zack Cornelius
2017-06-13 23:10:07 UTC
(In reply to Zack Cornelius from comment #0)
> Description of problem:
> When using memoryBacking source type 'file' with qemu, libvirt passes the
> directory from qemu.conf's memory_backing_dir as the mem-path argument for
> the object. This leads to qemu using a tmpfile for the file backing the
> memory.
>
> Our use case uses a libvirt hook script to create a symlink to an existing
> file for qemu to use as the backing store. For this to work, libvirt needs
> to specify a specific filename instead of just the directory for mem-path.
>
> I think this could be accomplished via having an option to use a predefined
> filename (such as the guest's UUID) or allowing the XML to specify the
> filename for the backing file.

UUID is not enough. The thing is, a domain can have multiple memory-object-files. There's <memory model='dimm'/>, which can be repeated multiple times in the domain definition, and each time we want a different path for it. In this light, letting users specify the filename in the domain XML looks better. However, there might be some drivers (hypervisors) that don't have a traditional UNIX path representation of objects, which is the reason we haven't exposed the mem-path just yet and have worked around it so far.

http://libvirt.org/formatdomain.html#elementsMemory

Zack, I've started a discussion on the upstream list:

https://www.redhat.com/archives/libvir-list/2017-July/msg01248.html

The design is still a bit unclear. For instance, what do you need the path for? Is it enough to learn it once qemu has started, or do you need to know it upfront (e.g. because Kove creates the file and qemu then merely mmap()-s it)? Also, as Dan pointed out, if you have a kernel module that implements its own version of tmpfs, shouldn't that be enough, since you'll learn the paths once the module handles the mmap() issued by qemu?

Kove dynamically creates the file(s) in a virtual filesystem used by qemu, based on allocating from a hardware backing device.
We expect to then use the libvirt prepare hooks to symlink the created file to the location libvirt/qemu is expecting. For this, we need to know, or be able to determine, the filename upfront. Because we need to allocate, and track allocations on, the hardware device, we don't act as a standard tmpfs and do not allow files to be created in the virtual filesystem outside of our allocation and connection management. So we won't be able to point memory_backing_dir at our virtual filesystem unless we can create the files under some form of predictable names before running qemu.

(In reply to Zack Cornelius from comment #4)

Zack, I don't know if you follow the upstream discussion, but the digest is that upstream doesn't want to expose paths anywhere because that is Linux specific. For instance, for hugepages we have the following:

    <memoryBacking>
      <hugepages>
        <page size='2' unit='MiB'/>
      </hugepages>
    </memoryBacking>

This is generic enough to work on any future systems (e.g. *BSDs), where hugepages are not necessarily represented as paths. Now, if we blindly allow users to set -mem-path by exposing it in the domain XML, all bets are off. However, if Kove's kernel module created a tmpfs-like FS (just like hugetlbfs is), libvirt could detect it on startup, and then no path would need to be exposed, since libvirt already puts all the files under one directory. Anyway, it'd be great if you could join the upstream discussion:

https://www.redhat.com/archives/libvir-list/2017-September/msg00089.html

@Zack and Kove team: Do you have plans for management UI changes to support the new features? Do we need Nova and/or RHEV BZs too?

After some discussion upstream, I think we finally have a clear consensus on the design. So I've implemented it:

https://www.redhat.com/archives/libvir-list/2017-October/msg01063.html

I found a problem when trying to verify this bug:

1. Make the guest use a file as the memory backend:

    <memoryBacking>
      <source type='file'/>
      <access mode='shared'/>
    </memoryBacking>

2. Start the guest:

    # virsh start vm1
    Domain vm1 started

3. Check the memory backing files:

    # ll /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    total 346948
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:39 ram-node0
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:39 ram-node1
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:39 ram-node2
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:39 ram-node3

4. Attach a memory device:

    # cat mem.xml
    <memory model='dimm' access='private'>
      <target>
        <size unit='MiB'>256</size>
        <node>0</node>
      </target>
    </memory>

    # virsh attach-device vm1 mem.xml
    Device attached successfully

    # ll /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    total 373712
    -rw-r--r--. 1 qemu qemu 268435456 Jan 11 01:41 dimm0
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node0
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node1
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node2
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node3

5. Detach this memory device:

    # virsh detach-device vm1 mem.xml
    Device detached successfully

    # ll /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    total 388616
    -rw-r--r--. 1 qemu qemu 268435456 Jan 11 01:41 dimm0
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node0
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node1
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node2
    -rw-r--r--. 1 qemu qemu 524288000 Jan 11 01:41 ram-node3

6. Attach a memory device whose size is bigger than the first one:

    # cat mem2.xml
    <memory model='dimm' access='private'>
      <target>
        <size unit='MiB'>512</size>
        <node>0</node>
      </target>
    </memory>

    # virsh attach-device vm1 mem2.xml
    error: Failed to attach device from mem2.xml
    error: internal error: unable to execute QEMU command 'object-add': backing store (null) size 0x10000000 does not match 'size' option 0x20000000

You can see that on attach -> detach -> attach libvirt reuses the same name, dimm0, and if the memory device is bigger than the one attached the first time, qemu rejects the attach request. Since Kove will manage the files in its VFS, maybe they will create the dimm memory backing file and delete it after the device is detached; in that case this problem won't happen on a Kove system.

Hi Michal,

Could you please help check whether this is a bug? Thanks in advance for your reply!

(In reply to Luyao Huang from comment #12)
>
> # virsh attach-device vm1 mem2.xml
> error: Failed to attach device from mem2.xml
> error: internal error: unable to execute QEMU command 'object-add': backing
> store (null) size 0x10000000 does not match 'size' option 0x20000000
>

This is because qemu/libvirt does not unlink the file after the first detach, so it is left lying around. Then, when you try to hotplug it again with a changed size, we advertise the new size to qemu on the monitor, but the file itself is left untouched, and this confuses qemu. I'm not quite sure who should unlink the file: libvirt, or qemu (which creates the file in the first place). Let me discuss with the qemu developers and get back to you (not clearing the needinfo flag for now).

So after some IRC discussion I came to the conclusion that it'd be best if libvirt removes the file on hot unplug. I've proposed the patch here:

https://www.redhat.com/archives/libvir-list/2018-January/msg00350.html

However, I'm not quite sure whether this fits properly into Kove's use case. Zack, can you please take a look?
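
The two hexadecimal sizes in the error above are the two DIMM sizes from the reproduction steps: 0x10000000 bytes is 256 MiB (the leftover dimm0 file) and 0x20000000 bytes is 512 MiB (the new request). The following is a minimal Python sketch of that mismatch and of why removing the file on unplug avoids it. It is an illustration only, not libvirt's or qemu's code; the function names `stale_backing_file` and `simulate_replug` are hypothetical, and a temporary directory stands in for the per-domain backing directory.

```python
import os
import tempfile

MIB = 1024 * 1024


def stale_backing_file(path: str, requested_bytes: int) -> bool:
    """Return True if a file already exists at `path` with a size that
    differs from the size about to be requested (hypothetical helper)."""
    try:
        return os.stat(path).st_size != requested_bytes
    except FileNotFoundError:
        # No leftover file, so nothing can conflict with the new size.
        return False


def simulate_replug() -> dict:
    """Replay attach(256 MiB) -> detach -> attach(512 MiB) on a scratch file."""
    with tempfile.TemporaryDirectory() as backing_dir:
        dimm = os.path.join(backing_dir, "dimm0")

        # First attach: a 256 MiB backing file is created (sparse here).
        with open(dimm, "wb") as f:
            f.truncate(256 * MIB)

        # Detach without unlinking: the old file conflicts with a 512 MiB request.
        before = stale_backing_file(dimm, 512 * MIB)

        # Detach with unlinking: the conflict disappears.
        os.unlink(dimm)
        after = stale_backing_file(dimm, 512 * MIB)

    return {
        "existing": hex(256 * MIB),
        "requested": hex(512 * MIB),
        "before_unlink": before,
        "after_unlink": after,
    }
```

Calling `simulate_replug()` reports the same pair of sizes as the qemu error message, and shows the conflict only while the leftover file is still present.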
This patch as-proposed will work for Kove's use cases.

According to comment 14, one more patch needs to be backported to fix the issue in comment 12, so I am moving this bug's status to ASSIGNED.

(In reply to Michal Privoznik from comment #17)
> V2:
>
> https://www.redhat.com/archives/libvir-list/2018-February/msg00051.html

Ah, sorry. Updated wrong bug. Ignore that comment please.

Verified this bug with libvirt-3.9.0-11.el7.x86_64:

1. Prepare a guest whose memory is configured with shared=on and backend=file:

    <memoryBacking>
      <source type='file'/>
      <access mode='shared'/>
    </memoryBacking>
    ...
    <cpu mode='host-model' check='full'>
      <model fallback='allow'/>
      <numa>
        <cell id='0' cpus='0' memory='1048576' unit='KiB'/>
        <cell id='1' cpus='1' memory='1048576' unit='KiB'/>
      </numa>
    </cpu>

2. Start the guest:

    # virsh start vm1
    Domain vm1 started

3. Check the guest command line:

    # ps aux|grep qemu
    ... -object memory-backend-file,id=ram-node0,mem-path=/var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/ram-node0,share=yes,size=1073741824 -numa node,nodeid=0,cpus=0,memdev=ram-node0 -object memory-backend-file,id=ram-node1,mem-path=/var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/ram-node1,share=yes,size=1073741824 -numa node,nodeid=1,cpus=1,memdev=ram-node1

4. Check the memory dir:

    # ll -Z /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node0
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node1

5. Attach a memory device:

    # cat mem.xml
    <memory model='dimm' access='private'>
      <target>
        <size unit='MiB'>128</size>
        <node>0</node>
      </target>
    </memory>

    # virsh attach-device vm1 mem.xml
    Device attached successfully

6. Recheck the memory dir:

    # ll -Z /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 dimm0
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node0
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node1

7. Detach the memory device:

    # virsh detach-device vm1 mem.xml
    Device detached successfully

8. Recheck the memory dir; dimm0 is gone:

    # ll -Z /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node0
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c41,c283 ram-node1

9. Destroy the guest; the memory dir for this guest is deleted:

    # virsh destroy vm1
    Domain vm1 destroyed

    # ll -Z /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/
    ls: cannot access /var/lib/libvirt/qemu/ram/libvirt/qemu/12-vm1/: No such file or directory

10. Change memory_backing_dir in qemu.conf and retest steps 1-9; the result is the same (except for the dir path).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704
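
The predictable naming that the original request asked for is visible in the verification output above: each backing file sits under the backing directory at `libvirt/qemu/<id>-<name>/<alias>`. The Python sketch below reconstructs that path and the matching `-object` argument. The layout is inferred only from the listings in this report and may differ in other libvirt versions; the helper names `backing_file_path` and `memory_backend_object` are hypothetical and are not libvirt APIs.

```python
from pathlib import PurePosixPath


def backing_file_path(memory_backing_dir: str, domain_id: int,
                      domain_name: str, alias: str) -> PurePosixPath:
    """Compose the per-device backing file path as observed in this report:
    <memory_backing_dir>/libvirt/qemu/<id>-<name>/<alias> (inferred layout)."""
    return (PurePosixPath(memory_backing_dir) / "libvirt" / "qemu"
            / f"{domain_id}-{domain_name}" / alias)


def memory_backend_object(path: PurePosixPath, alias: str,
                          size_kib: int, share: bool = True) -> str:
    """Format the -object value in the shape seen in the `ps aux` output.
    The NUMA cell size is given in KiB in the domain XML; qemu gets bytes."""
    size_bytes = size_kib * 1024
    share_flag = "yes" if share else "no"
    return (f"memory-backend-file,id={alias},mem-path={path},"
            f"share={share_flag},size={size_bytes}")
```

For example, `backing_file_path("/var/lib/libvirt/qemu/ram", 12, "vm1", "ram-node0")` gives the path listed in step 4, and passing it to `memory_backend_object` with `size_kib=1048576` gives the first `-object` value from step 3. A hook script that needs to pre-create or symlink a file would have to obtain the domain id and alias by its own means; this sketch takes them as inputs.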