Bug 1135491
Summary: <iothread> cpuset CPU binding support
Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.2
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Reporter: Stefan Hajnoczi <stefanha>
Assignee: John Ferlan <jferlan>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, dzheng, honzhang, jferlan, lhuang, mzhan, rbalakri, shyu
Fixed In Version: libvirt-1.2.14-1.el7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-11-19 05:48:04 UTC
Bug Depends On: 1101574
Bug Blocks: 1161617
Description
Stefan Hajnoczi
2014-08-29 12:56:56 UTC
John, I have assigned it to you since you are currently handling the <iothread> implementation. Feel free to assign to someone else if you will be unable to work on this.

Since 1.2.8 was the last release to allow API changes into RHEL7.1 - I'll use this BZ as the means/marker for the API changes into RHEL7.2. I changed the release flag too. Still probably have some changes that can use the other BZ including adding underpinnings necessary to get the thread id's via qmp and attach-disk/update-device type changes to add an --iothread parameter.

Extending emulatorpin is not the right answer. Sure it "allows" one to cheat short term, but once it is there it's tough to remove it. I'm going to look into perhaps adding a --cpuset argument that would define the set of CPU's IOThreads would use. Then in qemuProcessStart add some code that will get the thread tid's and assign them after the qemu process has started... No guarantees, but it might suffice at least short term...

Longer term, the following is what I think will suffice - if other thoughts come up - then we can work those out...

A new 'virsh' command that will "manage" IOThreads. This will use new libvirt API's (and also have libvirt-python API's). They'll be named something like 'virDomainIothread{G|S}et()' - I think that's all I'd need.

The virsh command would be:

virsh domiothreads domain
    [--list]
    [[--config] [--live] | [--current]]
    [--pin thread_id]
    [--cpuset "string"])

where [--list] would be the default if nothing is provided and would list:

IOThread Name  Thread Id  CPU Id  Resource(s)
-------------  ---------  ------  -----------

Where "IOThread Name" and "Thread Id" come from

(QEMU) query-iothreads
{u'return': [{u'id': u'jaftest1', u'thread-id': 30992}, {u'id': u'jaftest2', u'thread-id': 30993}]}
(QEMU)

"CPU Id" would be a get of which CPU the current Thread Id is running on

"Resource(s)" would be empty or "list" of resources (currently only disks) using the thread (possibly/hopefully).

[--config] would allow modifying the existing config to add/remove, but not affect the running system

[--live] would add iothread objects (object-add in some manner)

[--current] would be the domain's current state (but exclusive of live/config - following other examples)

[--pin thread_id] would be a way to pin an iothread to a specific CPU set (or single). It's only valid for live domain of course.

[--cpuset "string"] to manage having IOThreads assigned to specific set at startup. The [--pin] would conceivably override.

That's enough thinking for now!

(In reply to John Ferlan from comment #3)
> The virsh command would be:
>
> virsh domiothreads domain
>
> [--list]
> [[--config] [--live] | [--current]]
> [--pin thread_id]
> [--cpuset "string"])
>
> where [--list] would be the default if nothing is provided and would list:
>
> IOThread Name Thread Id CPU Id Resource(s)
> ------------- --------- ------ -----------
>
> Where "IOThread Name" and "Thread Id" come from
> (QEMU) query-iothreads
> {u'return': [{u'id': u'jaftest1', u'thread-id': 30992}, {u'id': u'jaftest2',
> u'thread-id': 30993}]}
> (QEMU)
>
> "CPU Id" would be a get of which CPU the current Thread Id is running on
>
> "Resource(s)" would be empty or "list" of resources (currently only disks)
> using the thread (possibly/hopefully).
>
> [--config] would allow modifying the existing config to add/remove, but not
> affect the running system
>
> [--live] would add iothread objects (object-add in some manner)
>
> [--current] would be the domain's current state (but exclusive of
> live/config - following other examples)
>
> [--pin thread_id] would be a way to pin an iothread to a specific CPU set
> (or single). It's only valid for live domain of course.
>
> [--cpuset "string"] to manage having IOThreads assigned to specific set at
> startup. The [--pin] would conceivably override.

Sounds good to me.

After a few review cycles and separate commits for the info vs.
change functionality, the code has been pushed upstream.

git describe 1cfc0a9990866b423e1110997f9a06f1d6d869c9
v1.2.13-142-g1cfc0a9

NB: Displaying the "Resource(s)" column was rejected upstream since it's part of the XML.

commit 1cfc0a9990866b423e1110997f9a06f1d6d869c9
Author: John Ferlan <jferlan>
Date:   Thu Mar 5 19:08:04 2015 -0500

    virsh: Add iothreadpin command

    https://bugzilla.redhat.com/show_bug.cgi?id=1135491

    $ virsh iothread --help

      NAME
        iothreadpin - control domain IOThread affinity

      SYNOPSIS
        iothreadpin <domain> <iothread> <cpulist> [--config] [--live] [--current]

      DESCRIPTION
        Pin domain IOThreads to host physical CPUs.

      OPTIONS
        [--domain] <string>  domain name, id or uuid
        [--iothread] <number>  IOThread ID number
        [--cpulist] <string>  host cpu number(s) to set
        --config         affect next boot
        --live           affect running domain
        --current        affect current domain

    Using the output from iothreadsinfo, allow changing the pinned CPUs for
    a single IOThread.

    $ virsh iothreadsinfo $dom
     IOThread ID     CPU Affinity
    ---------------------------------------------------
     1               2
     2               3
     3               0-1

    $ virsh iothreadpin $dom 3 0-2

    Then view the change

    $ virsh iothreadsinfo $dom
     IOThread ID     CPU Affinity
    ---------------------------------------------------
     1               2
     2               3
     3               0-2

    If an invalid value is supplied or require option missing,
    then an error will be displayed:

    $ virsh iothreadpin $dom 4 3
    error: invalid argument: iothread value out of range 4 > 3

    $ virsh iothreadpin $dom 3
    error: command 'iothreadpin' requires <cpulist> option

Verify this bug with libvirt-1.2.17-2.el7.x86_64 and qemu-kvm-rhev-2.3.0-12.el7.x86_64:

1. prepare a running guest with iothread:

# virsh iothreadinfo rhel7.0-rhel
 IOThread ID     CPU Affinity
---------------------------------------------------
 1               1

2.
check the cgroup settings:

# cgget -g cpuset /machine.slice/machine-qemu\\x2drhel7.0\\x2drhel.scope/iothread1
/machine.slice/machine-qemu\x2drhel7.0\x2drhel.scope/iothread1:
cpuset.memory_spread_slab: 0
cpuset.memory_spread_page: 0
cpuset.memory_pressure: 0
cpuset.memory_migrate: 1
cpuset.sched_relax_domain_level: -1
cpuset.sched_load_balance: 1
cpuset.mem_hardwall: 0
cpuset.mem_exclusive: 0
cpuset.cpu_exclusive: 0
cpuset.mems: 0
cpuset.cpus: 1

3. bind iothread to another cpu:

# virsh iothreadpin rhel7.0-rhel 1 3

4. recheck the cgroup and taskset:

# cgget -g cpuset /machine.slice/machine-qemu\\x2drhel7.0\\x2drhel.scope/iothread1
/machine.slice/machine-qemu\x2drhel7.0\x2drhel.scope/iothread1:
cpuset.memory_spread_slab: 0
cpuset.memory_spread_page: 0
cpuset.memory_pressure: 0
cpuset.memory_migrate: 1
cpuset.sched_relax_domain_level: -1
cpuset.sched_load_balance: 1
cpuset.mem_hardwall: 0
cpuset.mem_exclusive: 0
cpuset.cpu_exclusive: 0
cpuset.mems: 0
cpuset.cpus: 3

# virsh qemu-monitor-command rhel7.0-rhel '{"execute": "query-iothreads"}' --pretty
{
  "return": [
    {
      "thread-id": 30228,
      "id": "iothread1"
    }
  ],
  "id": "libvirt-14"
}

# taskset -p 30228
pid 30228's current affinity mask: 8

5. check the xml

# virsh dumpxml rhel7.0-rhel |grep iothreadpin
<iothreadpin iothread='1' cpuset='3'/>

6. restart libvirtd and recheck:

# service libvirtd restart
Redirecting to /bin/systemctl restart libvirtd.service

# virsh dumpxml rhel7.0-rhel |grep iothreadpin
<iothreadpin iothread='1' cpuset='3'/>

7.
managedsave and restart, recheck:

# virsh managedsave rhel7.0-rhel
Domain rhel7.0-rhel state saved by libvirt

# virsh start rhel7.0-rhel
Domain rhel7.0-rhel started

# virsh dumpxml rhel7.0-rhel |grep iothreadpin
<iothreadpin iothread='1' cpuset='3'/>

# cgget -g cpuset /machine.slice/machine-qemu\\x2drhel7.0\\x2drhel.scope/iothread1
/machine.slice/machine-qemu\x2drhel7.0\x2drhel.scope/iothread1:
cpuset.memory_spread_slab: 0
cpuset.memory_spread_page: 0
cpuset.memory_pressure: 0
cpuset.memory_migrate: 1
cpuset.sched_relax_domain_level: -1
cpuset.sched_load_balance: 1
cpuset.mem_hardwall: 0
cpuset.mem_exclusive: 0
cpuset.cpu_exclusive: 0
cpuset.mems: 0
cpuset.cpus: 3

# virsh qemu-monitor-command rhel7.0-rhel '{"execute": "query-iothreads"}' --pretty
{
  "return": [
    {
      "thread-id": 1300,
      "id": "iothread1"
    }
  ],
  "id": "libvirt-103"
}

# taskset -p 1300
pid 1300's current affinity mask: 8

Testing with qemu-kvm-1.5.3-97.el7.x86_64 and a guest prepared with an iothread, there seems to be an issue:

1.
# virsh dumpxml r7 |grep iothreads
<iothreads>1</iothreads>

2. bind iothread1 to a cpu:

# virsh iothreadpin r7 1 3

3.
# virsh dumpxml r7 |grep iothread
<iothreads>1</iothreads>
<iothreadids>
  <iothread id='1'/>
</iothreadids>
<iothreadpin iothread='1' cpuset='3'/>

4. the XML and the command report success, but the pinning actually failed:

# lscgroup cpuset:/machine.slice/machine-qemu\x2dr7.scope/iothread1

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dr7.scope/iothread1/tasks
(no tasks here)

# cat /proc/22868/task/22940/status | grep Cpus
Cpus_allowed:   e
Cpus_allowed_list:      1-3

Hi John,

Would you please help to check comment 8? I think this API is broken when used with qemu-kvm. Can I reopen this bug for this issue, or do I need to open a new bug?

Thanks a lot for your reply
Luyao

I have absolutely no idea what you're testing and what the question/issue is. The first sequence of commands does one thing and the last one does something else.
The first sequence seems to be using qemu 2.3 and the second using 1.5? Is one ok? Is one broken? What about the taskset output for the second sequence? Perhaps use '--cpu-list' switch in order to see the list rather than the mask so it's easier/clearer to see. Mixing in cgroups and cpuset is confusing.

It's just not clear what's being tested.

What does a similar vcpupin command sequence show? That is if you pinned the cpuset would you get a similar result? Both use the same sequence to do the job, so I would think both would have similar results.

What are you expecting?

(In reply to John Ferlan from comment #10)
> I have absolutely no idea what you're testing and what the question/issue
> is. The first sequence of commands does one thing and the last one does
> something else. The first sequence seems to be using qemu 2.3 and the
> second using 1.5? Is one ok? Is one broken? What about the taskset output

Sorry, my comment was not clear; I should have given more explanation. I found that it works well with qemu 2.3 (qemu-kvm-rhev), but it is broken with qemu 1.5 (qemu-kvm). The taskset output is like this:

# cat /proc/22868/task/22940/status | grep Cpus
Cpus_allowed:   e           <------ taskset will output
Cpus_allowed_list:      1-3 <------ more pretty list (taskset with --cpu-list)

> for the second sequence? Perhaps use '--cpu-list' switch in order to see the
> list rather than the mask so it's easier/clearer to see. Mixing in cgroups
> and cpuset is confusing.
>

Okay, good idea.

> It's just not clear what's being tested.
>

The test checks that libvirt really binds the right pid (the iothread's pid) to the right host CPUs, and that libvirt sets the right cpuset in the cgroup for the iothread if the cpuset group is available.

> What does a similar vcpupin command sequence show? That is if you pinned the
> cpuset would you get a similar result? Both use the same sequence to do the
> job, so I would think both would have similar results.
>

The vcpupin command works well with qemu 1.5. The test result:

1.
libvirt output of vcpupin:

# virsh vcpupin test4
VCPU: CPU Affinity
----------------------------------
   0: 1
   1: 0-3

2. check that libvirt set the cpuset for vcpu0 in the cgroup:

# virsh qemu-monitor-command test4 --hmp info cpus
* CPU #0: pc=0x00000000000f7fa8 (halted) thread_id=9589
  CPU #1: pc=0x00000000000f7bef (halted) thread_id=9591

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dtest4.scope/vcpu0/tasks
9589
(this step is very important: we need to make sure the right task is set in the cgroup)

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dtest4.scope/vcpu0/cpuset.cpus
1

3. check the taskset or status in /proc

# ps aux|grep qemu
qemu 9580 1.5 0.4 990736 35420 ? ...

# ll /proc/9580/task/
total 0
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:31 9580 <----emulator
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9589 <----vcpu0
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9591 <----vcpu1
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9600 <----???

# taskset --cpu-list -p 9589
pid 9589's current affinity list: 1

So vcpupin works as expected.

> What are you expecting?

Hmm... I am not sure my idea is right; maybe I missed some important information about iothreads. I think libvirt should forbid setting iothreadpin with an old qemu (or refuse to start a guest with an iothread on an old qemu).

Also, there are some strange things when I test it again (with qemu 1.5):

1. check the libvirt iothreadpin:

# virsh dumpxml test4 |grep iothread
<iothreads>1</iothreads>
<iothreadids>
  <iothread id='1'/>
</iothreadids>
<iothreadpin iothread='1' cpuset='2'/>

2. check what libvirt set in the cgroup (in this step I get a very strange result: why does libvirt apply the pin to one of the libvirtd threads?)

# ll /proc/9580/task/
total 0
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:31 9580 <----emulator
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9589 <----vcpu0
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9591 <----vcpu1
dr-xr-xr-x. 6 qemu qemu 0 Jul 27 10:32 9600 <----???

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dtest4.scope/iothread1/cpuset.cpus
2

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dtest4.scope/iothread1/tasks
4402

# ps -eLf |grep 4402
root 4391 1 4402 0 16 10:28 ? 00:00:00 /usr/sbin/libvirtd
root 16347 19012 16347 0 1 11:15 pts/0 00:00:00 grep --color=auto 4402

# taskset --cpu-list -p 4402
pid 4402's current affinity list: 2

Then try to change the iothreadpin and recheck the result:

# virsh iothreadpin test4 1 3

# taskset --cpu-list -p 4402
pid 4402's current affinity list: 3

# cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dtest4.scope/iothread1/cpuset.cpus
3

I also notice there is no "-object iothread,id=iothread1" in the qemu command line. I have another question: does iothread really work on qemu 1.5.3?

Since qemu-kvm-1.5.3-97.el7.x86_64 does not support iothread, I have opened a new bug 1249981 to track the remaining issue, and I am verifying this bug based on comment 8.

Thanks,
Luyao

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html
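
For readers comparing the affinity values in the verification steps above: `taskset -p` and `Cpus_allowed` report a hexadecimal bit mask, while `taskset --cpu-list`, `Cpus_allowed_list` and the `cpuset` strings use a CPU list. The following is a minimal Python sketch of that conversion, not part of libvirt or of this bug's fix; the helper names `mask_to_cpus` and `cpus_to_list` are hypothetical and exist only for illustration.

```python
def mask_to_cpus(mask_hex):
    """Return the CPU indices whose bits are set in a hexadecimal affinity mask."""
    # Wide masks in /proc are printed in comma-separated groups, so drop commas.
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def cpus_to_list(cpus):
    """Collapse CPU indices into range notation such as "0-1,3"."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else "%d-%d" % (a, b) for a, b in ranges)


if __name__ == "__main__":
    # Mask reported by "taskset -p" after pinning iothread 1 to CPU 3.
    print(cpus_to_list(mask_to_cpus("8")))  # prints 3
    # Cpus_allowed value seen in the qemu 1.5 run, where the pin did not apply.
    print(cpus_to_list(mask_to_cpus("e")))  # prints 1-3
```

Read this way, mask 8 corresponds to the requested pin on CPU 3, whereas mask e (CPUs 1-3) shows the thread was not restricted to CPU 3.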