Bug 1721434
| Summary: | virsh command delay issue on DPDK node | | |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Meiyan Zheng <mzheng> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED ERRATA | QA Contact: | Luyao Huang <lhuang> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 8.0 | CC: | apevec, bdobreli, cchen, cfields, cfontain, chhu, chrisw, dyuan, dzheng, fbaudin, fhallal, jdenemar, jlibosva, jraju, jsuchane, knoha, ktraynor, lhuang, maxime.coquelin, mkim, mprivozn, rbalakri, rhodain, rhos-maint, xuzhang, yalzhang |
| Target Milestone: | rc | Keywords: | Automation |
| Target Release: | 8.1 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-4.5.0-31.el8 | Doc Type: | Bug Fix |
Doc Text:

Cause:
When creating a child process (e.g. to run another binary), libvirt closes file descriptors right after the child process is created and keeps only a handful of them open. This enhances security, because the binary cannot be blindly trusted not to misuse any of the passed FDs. So far, this so-called mass close of FDs was done by iterating through every possible FD in the range from 3 to the limit of open files, either calling close() or marking the FD to survive exec(). The problem is that only a handful of FDs need to be preserved and only very few need to be closed; the majority of FDs in that range are not actually open. While that is not a technical problem, every close() is a syscall and as such adds overhead. This becomes more visible when the limit for open files is much larger than the default (1024); in that case, spawning and running a binary in a child process becomes very expensive time-wise.

Consequence:
Fetching stats for an OVS interface is delayed, which is very annoying.

Fix:
The closing of FDs was changed. libvirt now uses the /proc filesystem to learn which FDs are open and iterates only over those, instead of over the full range of possible FDs. Given the disproportion between the two (roughly 99% of FDs within the range were never open, yet close() was being called on them anyway), this approach takes far less time and achieves the same result.

Result:
Fetching statistics is now as smooth as baby skin.
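A minimal sketch of the procfs-based approach described above (this is not libvirt's actual implementation; the function name, the keep[] list, and the fixed-size buffer are illustrative assumptions):

```c
/* Sketch only: enumerate /proc/self/fd and close everything except
 * stdio and a caller-supplied keep list. Not libvirt's code. */
#include <dirent.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

static bool fd_in_keep_list(int fd, const int *keep, size_t nkeep)
{
    for (size_t i = 0; i < nkeep; i++)
        if (keep[i] == fd)
            return true;
    return false;
}

static int mass_close_procfs(const int *keep, size_t nkeep)
{
    DIR *dir = opendir("/proc/self/fd");
    if (!dir)
        return -1;

    /* Collect the FD numbers first and close them afterwards;
     * closing while iterating would pull the directory stream's
     * own descriptor out from under readdir(). */
    int fds[1024];               /* capped for brevity */
    size_t nfds = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL && nfds < 1024) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);
        if (*end != '\0')        /* skips "." and ".." */
            continue;
        if ((int)fd != dirfd(dir))
            fds[nfds++] = (int)fd;
    }
    closedir(dir);

    for (size_t i = 0; i < nfds; i++)
        if (fds[i] > STDERR_FILENO && !fd_in_keep_list(fds[i], keep, nkeep))
            close(fds[i]);
    return 0;
}
```

With this pattern, the number of close() calls scales with the number of open FDs rather than with the NOFILE limit.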
| | | | |
| --- | --- | --- | --- |
| Story Points: | --- | | |
| Clone Of: | | | |
| | 1759904 (view as bug list) | Environment: | |
| Last Closed: | 2019-11-05 20:50:27 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1759904 | | |
Comment 17
Jaison Raju
2019-06-25 15:41:14 UTC
Can we get libvirt debug logs, please? Dave Gilbert suggests attaching to the qemu process with gdb while the issue happens and dumping the backtraces of all its threads using:

```
gdb> thread apply all bt full
```

Do you think you can arrange that, Jaison?

I'm wondering if (at least on Linux) libvirt can use /proc to find which FDs it holds open and then close only those, instead of iterating over 0..MAX. Note that in specific cases (like this one) libvirt wants to 'leak' some FDs to the child, for instance because it attaches one end of a pipe to the child process' stdio. However, if a library libvirt links with has opened a file because some other thread is working with that library, then we definitely do not want to leak that FD into the child.

Patches posted upstream:
https://www.redhat.com/archives/libvir-list/2019-July/msg00089.html

Here's a scratch build with those three patches applied:
https://mprivozn.fedorapeople.org/ovs2/

With this build, the idea from comment 53 is implemented, so the workaround from comment 59 should no longer be needed.

Pushed upstream:

```
432faf259b virCommand: use procfs to learn opened FDs
c1a9bfbbba vircommand: Separate mass FD closing into a function
cc34260f5a test: Introduce virnetdevopenvswitchtest
c297eab525 virNetDevOpenvswitchInterfaceStats: Optimize for speed
```

v5.5.0-173-g432faf259b
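To illustrate the point above about deliberately 'leaking' FDs into the child, here is a self-contained sketch (run_with_output() is a made-up name, not libvirt's API): the parent captures the child's stdout through a pipe, so the dup2()ed stdio descriptors must be in the keep set of any mass close performed between fork() and exec():

```c
/* Sketch only: capture a child's stdout via a pipe. The pipe's write
 * end is dup2()ed onto FD 1 in the child before exec, which is exactly
 * the kind of FD a mass close must preserve. */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int run_with_output(char *const argv[])
{
    int pfd[2];
    if (pipe(pfd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                    /* child */
        dup2(pfd[1], STDOUT_FILENO);   /* stdout now points at the pipe */
        close(pfd[0]);
        close(pfd[1]);
        /* A mass close (as in the earlier sketch) would run here,
         * keeping FDs 0-2, before handing control to the binary. */
        execvp(argv[0], argv);
        _exit(127);                    /* exec failed */
    }

    close(pfd[1]);                     /* parent keeps only the read end */
    char buf[256];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
    close(pfd[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

int main(void)
{
    char *const argv[] = { "echo", "hello from the child", NULL };
    return run_with_output(argv);
}
```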
Verify this bug with libvirt-4.5.0-32.module+el8.1.0+4010+d6842f29.x86_64:

1. Prepare an openvswitch + DPDK environment for vhost-user VIFs:

```
# ovs-vsctl show
70ad1586-5ac7-4a4f-9898-2c239eb9886b
    Bridge "ovsbr0"
        Port "vhost-user2"
            Interface "vhost-user2"
                type: dpdkvhostuser
        Port "vhost-user1"
            Interface "vhost-user1"
                type: dpdkvhostuser
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
```

2. Check the default nofile limit of libvirtd:

```
# prlimit -p `pidof libvirtd` | grep NOFILE
NOFILE  max number of open files  8192  8192  files
```

3. Start a guest which has two vhost-user interfaces:

```
# virsh dumpxml vm1
<interface type='vhostuser'>
  <mac address='52:54:00:93:51:dd'/>
  <source type='unix' path='/var/run/openvswitch/vhost-user2' mode='client'/>
  <target dev='vhost-user2'/>
  <model type='virtio'/>
  <alias name='net1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</interface>
<interface type='vhostuser'>
  <mac address='52:54:00:a8:55:a1'/>
  <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
  <target dev='vhost-user1'/>
  <model type='virtio'/>
  <alias name='net2'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>

# virsh start vm1
Domain vm1 started
```

4. Run the domstats command:

```
# time virsh domstats
Domain: 'vm1'
...

real 0m0.065s
user 0m0.021s
sys  0m0.006s
```

5. Check the libvirtd debug log for the ovs-vsctl command runtimes and verify that libvirtd runs only one ovs-vsctl command per vhost-user interface:

```
2019-08-28 02:14:59.997+0000: 23696: debug : virCommandRunAsync:2555 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface vhost-user2
2019-08-28 02:14:59.999+0000: 23696: debug : virCommandRunAsync:2558 : Command result 0, with PID 23760
2019-08-28 02:15:00.013+0000: 23696: debug : virCommandRun:2406 : Result status 0, stdout: '["map",[["rx_1024_to_1522_packets",0],["rx_128_to_255_packets",0],["rx_1523_to_max_packets",0],["rx_1_to_64_packets",0],["rx_256_to_511_packets",0],["rx_512_to_1023_packets",0],["rx_65_to_127_packets",0],["rx_bytes",0],["rx_dropped",0],["rx_errors",0],["rx_packets",0],["tx_bytes",0],["tx_dropped",0],["tx_packets",0]]]' stderr: ''
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1024_to_1522_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_128_to_255_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1523_to_max_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1_to_64_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_256_to_511_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_512_to_1023_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_65_to_127_packets val=0
2019-08-28 02:15:00.013+0000: 23696: debug : virCommandRunAsync:2555 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface vhost-user1
2019-08-28 02:15:00.014+0000: 23696: debug : virCommandRunAsync:2558 : Command result 0, with PID 23761
2019-08-28 02:15:00.026+0000: 23696: debug : virCommandRun:2406 : Result status 0, stdout: '["map",[["rx_1024_to_1522_packets",0],["rx_128_to_255_packets",0],["rx_1523_to_max_packets",0],["rx_1_to_64_packets",0],["rx_256_to_511_packets",0],["rx_512_to_1023_packets",0],["rx_65_to_127_packets",0],["rx_bytes",0],["rx_dropped",0],["rx_errors",0],["rx_packets",0],["tx_bytes",0],["tx_dropped",0],["tx_packets",0]]]' stderr: ''
```
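The stdout captured in the log above is OVS's JSON "map" notation for the statistics column. Purely to illustrate the format, here is a hypothetical decoder; it uses the jansson library, not whatever JSON handling libvirt's virNetDevOpenvswitchInterfaceParseStats does internally:

```c
/* Sketch only: decode ovs-vsctl's ["map",[["key",value],...]] output
 * with jansson (link with -ljansson). Not libvirt's parser. */
#include <jansson.h>
#include <stdio.h>

static int parse_ovs_stats(const char *text)
{
    json_error_t err;
    json_t *root = json_loads(text, 0, &err);
    if (!root || !json_is_array(root) || json_array_size(root) != 2) {
        if (root)
            json_decref(root);
        return -1;
    }

    json_t *pairs = json_array_get(root, 1);   /* the [key,value] list */
    size_t i;
    json_t *pair;
    json_array_foreach(pairs, i, pair) {
        const char *key = json_string_value(json_array_get(pair, 0));
        long long val = json_integer_value(json_array_get(pair, 1));
        if (key)
            printf("%s = %lld\n", key, val);
    }
    json_decref(root);
    return 0;
}

int main(void)
{
    const char *sample =
        "[\"map\",[[\"rx_bytes\",0],[\"rx_packets\",0],[\"tx_bytes\",0]]]";
    return parse_ovs_stats(sample);
}
```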
6. Retest steps 4 and 5 with libvirtd running under a large NOFILE limit:

```
# cat /usr/lib/systemd/system/libvirtd.service | grep LimitNOFILE
LimitNOFILE=1048576

# prlimit -p `pidof libvirtd` | grep NOFILE
NOFILE  max number of open files  1048576  1048576  files

# time virsh domstats
...

real 0m0.081s
user 0m0.023s
sys  0m0.004s
```

libvirtd log:

```
2019-08-28 02:30:19.281+0000: 24272: debug : virCommandRunAsync:2555 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface vhost-user2
2019-08-28 02:30:19.284+0000: 24272: debug : virCommandRunAsync:2558 : Command result 0, with PID 24351
2019-08-28 02:30:19.297+0000: 24272: debug : virCommandRun:2406 : Result status 0, stdout: '["map",[["rx_1024_to_1522_packets",0],["rx_128_to_255_packets",0],["rx_1523_to_max_packets",0],["rx_1_to_64_packets",0],["rx_256_to_511_packets",0],["rx_512_to_1023_packets",0],["rx_65_to_127_packets",0],["rx_bytes",0],["rx_dropped",0],["rx_errors",0],["rx_packets",0],["tx_bytes",0],["tx_dropped",0],["tx_packets",0]]]' stderr: ''
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1024_to_1522_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_128_to_255_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1523_to_max_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_1_to_64_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_256_to_511_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_512_to_1023_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virNetDevOpenvswitchInterfaceParseStats:376 : Unused ovs-vsctl stat key=rx_65_to_127_packets val=0
2019-08-28 02:30:19.297+0000: 24272: debug : virCommandRunAsync:2555 : About to run ovs-vsctl --timeout=5 --if-exists --format=list --data=json --no-headings --columns=statistics list Interface vhost-user1
2019-08-28 02:30:19.299+0000: 24272: debug : virCommandRunAsync:2558 : Command result 0, with PID 24352
2019-08-28 02:30:19.312+0000: 24272: debug : virCommandRun:2406 : Result status 0, stdout: '["map",[["rx_1024_to_1522_packets",0],["rx_128_to_255_packets",0],["rx_1523_to_max_packets",0],["rx_1_to_64_packets",0],["rx_256_to_511_packets",0],["rx_512_to_1023_packets",0],["rx_65_to_127_packets",0],["rx_bytes",0],["rx_dropped",0],["rx_errors",0],["rx_packets",0],["tx_bytes",0],["tx_dropped",0],["tx_packets",0]]]' stderr: ''
```
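The timings barely change between the 8192 and 1,048,576 NOFILE limits, which is the point of the fix: the old strategy's cost grew with the limit. Below is a rough standalone microbenchmark sketch (illustrative, not part of this bug report) contrasting a blind close() sweep with a /proc/self/fd scan; running it under something like `prlimit --nofile=1048576 ./a.out` makes the gap obvious:

```c
/* Sketch only: time a blind close() sweep over the whole NOFILE range
 * versus enumerating just the open FDs from /proc/self/fd. */
#include <dirent.h>
#include <stdio.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    struct rlimit rl;
    getrlimit(RLIMIT_NOFILE, &rl);
    rlim_t limit = (rl.rlim_cur == RLIM_INFINITY) ? 1048576 : rl.rlim_cur;

    double t0 = now();
    for (rlim_t fd = 3; fd < limit; fd++)
        close((int)fd);          /* nearly all of these fail with EBADF */
    double t1 = now();

    long nopen = 0;
    DIR *dir = opendir("/proc/self/fd");
    if (!dir)
        return 1;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL)
        nopen++;                 /* only entries that actually exist */
    closedir(dir);
    double t2 = now();

    printf("blind sweep over %llu FDs: %.4f s\n",
           (unsigned long long)limit, t1 - t0);
    printf("procfs scan (%ld entries): %.6f s\n", nopen, t2 - t1);
    return 0;
}
```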
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3345

Covered in automated tests:

```
# avocado run --vt-type libvirt virtual_network.iface_options.iface_type.type_vhostuser
JOB ID     : 23e2ff8128d5fa6c8b2ab2cab0835e04c57e986d
JOB LOG    : /root/avocado/job-results/job-2020-09-22T00.37-23e2ff8/job.log
 (1/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.queue_size_check: PASS (56.37 s)
 (2/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.multi_queue.hotplug: PASS (56.44 s)
 (3/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.multi_queue.passthrough: PASS (179.81 s)
 (4/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.multi_guests.statistics_check: PASS (233.26 s)
 (5/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.multi_guests.interface_check.with_share_mem: PASS (224.36 s)
 (6/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.multi_guests.interface_check.without_share_mem: PASS (216.73 s)
 (7/7) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_type.type_vhostuser.check_performance: PASS (187.43 s)
RESULTS    : PASS 7 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 1155.91 s

# avocado run --vt-type libvirt virtual_network.iface_options.iface_driver.driver_vhost_rx_tx
JOB ID     : 3046a4772a9102f36050ee91cbf6aaef0445a09e
JOB LOG    : /root/avocado/job-results/job-2020-09-22T03.01-3046a47/job.log
 (1/1) type_specific.io-github-autotest-libvirt.virtual_network.iface_options.iface_driver.driver_vhost_rx_tx: PASS (48.04 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 | CANCEL 0
JOB TIME   : 49.24 s
```