Created attachment 616686 [details]
Debug log from 30 and 68 guests

Description of problem:

Version-Release number of selected component (if applicable):
0.9.x and 0.10.x

How reproducible:
- start a bunch of VMs,
- stop libvirtd,
- start libvirtd and immediately issue any call, say 'virsh version',
- the delay until the call completes can be fitted almost exactly by a quadratic function of the number of running VMs.

Actual results:
slower-than-linear initial delay

Expected results:
linear or sublinear time

Additional info:
zgrep virEventPollDispatchHandles libvirtd-68.log.gz | wc -l
92603
zgrep virEventPollDispatchHandles libvirtd-30.log.gz | wc -l
21713
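As a rough way to quantify the delay in the reproduction steps above, one can time the first client call after the daemon restart and repeat for different guest counts. This is only an illustrative sketch, not part of the report: the `virsh version` command comes from the steps above, but the helper itself and the use of `subprocess` are my own.

```python
# Sketch: measure how long the first client call takes after a libvirtd
# restart. Only the 'virsh version' command comes from the report; the
# helper function itself is hypothetical.
import subprocess
import time

def time_first_call(cmd=("virsh", "version")):
    """Return wall-clock seconds for one invocation of cmd."""
    start = time.perf_counter()
    subprocess.run(cmd, check=False, capture_output=True)
    return time.perf_counter() - start

# Usage: restart libvirtd, then immediately:
#   print(time_first_call())
# Collect one sample per guest count and fit delay vs. count.
```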
Could be completely unrelated, but one of the slowdowns I've discovered involves storage pools marked autostart. On restart, libvirt attempts to start them again, but they are already active, and the resulting bad interaction takes some time to sort itself out. iSCSI is one of the pool types that hits this and introduces a very long delay. LVM also causes a delay, though how libvirt reacts depends heavily on the LVM version.
I don't think that dealing with storage pool autostart would cause a quadratic slowdown in API handling at startup. In fact, storage pool autostart should block all libvirt API access completely, so I think there's some other explanation here. Something related to the event loop could well be relevant: each VM started adds to the number of file handles managed by the event loop, and these are dealt with using a number of linear array searches. It might be interesting to collect some profile data using oprofile.
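The event-loop hypothesis above can be illustrated with a toy model (this is not libvirt code; the function and handle counts are purely illustrative): if every dispatch performs a linear scan over the handle array, then processing one event per registered handle costs O(n^2) comparisons in total, so doubling the guest count roughly quadruples the work.

```python
# Toy model (NOT libvirt code): handles live in a flat list, and each
# dispatch linearly scans the list to find the matching handle, so the
# total work for one event per handle grows quadratically.
def dispatch_all(handles):
    """Return total comparisons for one event per handle."""
    comparisons = 0
    for target in handles:
        for h in handles:  # linear search for the matching watch
            comparisons += 1
            if h == target:
                break
    return comparisons

small = dispatch_all(list(range(30)))  # 30*31/2 = 465 comparisons
large = dispatch_all(list(range(68)))  # 68*69/2 = 2346 comparisons
# large/small is about 5, close to (68/30)^2.
```

For comparison, the attached logs show 92603 virEventPollDispatchHandles lines for 68 guests versus 21713 for 30, a ratio of about 4.3, which is at least consistent with super-linear growth in dispatch activity.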
Closing this, as the problem seems to be solved by a greatly decreased multiplier, though the worse-than-linear dependency remains.