Description of problem:

From a mail from Neil Horman:

"... Should it be disabled on virt? Maybe. Depends on the hypervisor, how it maps virtual CPUs to physical CPUs, and how its virtual APIC routes interrupts. On Xen, definitely disable it; on KVM I think you should also typically disable it, but I'm not 100% sure. ..."

Hence, something like:

  ConditionVirtualization=False

could be useful in the service file.

Version-Release number of selected component (if applicable): F19 current
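For reference, the proposal amounts to something like the following in the irqbalance unit file. This is only a sketch, not the shipped unit; the `Description` and `ExecStart` lines are illustrative. `ConditionVirtualization=` is the systemd condition that skips starting a unit based on detected virtualization (it accepts a boolean, or a specific technology such as `kvm` or `xen`):

```ini
# Sketch of the proposed change to irqbalance.service (illustrative, not the
# actual Fedora unit file).
[Unit]
Description=irqbalance daemon
# Do not start the service when running under any virtualization:
ConditionVirtualization=false

[Service]
ExecStart=/usr/sbin/irqbalance --foreground
```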
irqbalance-1.0.5-3.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/irqbalance-1.0.5-3.fc19
Is there actually a bug being fixed here or is this just code churn? I don't see why irqbalance would be disabled on KVM, or any hypervisor for that matter. Don't we still want to spread the interrupt load across the vCPUs? Interrupt affinity in the guest does work in KVM.
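To make "spread the interrupt load across the vCPUs" concrete, here is a hypothetical sketch; this is not irqbalance's actual algorithm, and the IRQ names and rates are made up. It greedily assigns each IRQ, heaviest first, to the CPU with the least accumulated load:

```python
# Hypothetical sketch (NOT irqbalance's real algorithm): greedy balancing of
# per-IRQ interrupt rates across a number of (v)CPUs.
def balance_irqs(irq_load, n_cpus):
    """irq_load: dict of irq name -> interrupts/sec. Returns irq -> cpu index."""
    cpu_load = [0] * n_cpus
    assignment = {}
    # Place heaviest IRQs first, each on the currently least-loaded CPU.
    for irq, load in sorted(irq_load.items(), key=lambda kv: -kv[1]):
        target = min(range(n_cpus), key=lambda c: cpu_load[c])
        cpu_load[target] += load
        assignment[irq] = target
    return assignment

# Made-up example: four IRQs spread across two vCPUs.
result = balance_irqs({"eth0": 800, "virtio1": 600, "sda": 300, "usb": 100}, 2)
print(result)  # eth0 and usb land on CPU 0, virtio1 and sda on CPU 1
```

With irqbalance disabled, the equivalent "assignment" is whatever the default affinity mask gives you, typically concentrating load rather than spreading it.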
Given the relationship of vCPUs to pCPUs in different hypervisors, it seems unclear that redistributing the interrupts actually accomplishes anything if the vCPU is going to be randomly scheduled on a variety of pCPUs. See Neil's comment. I'll buy that it might need refinement based on which hypervisor/how the guests are set up.
Sure, deciding which CPU is optimal to move an interrupt to is a problem, but irqbalance supports plenty of systems where it doesn't know the optimal affinity of an interrupt. Isn't there an obvious problem with turning off irqbalance, namely that we now potentially have all interrupts on a single CPU? That's clearly worse than non-optimal affinity per IRQ.
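The failure mode is easy to see in /proc/interrupts. Here is a small illustrative sketch (the sample below is made up and heavily shortened) that totals the per-CPU interrupt counts; without balancing, every device interrupt can end up counted against CPU0:

```python
# Made-up, shortened /proc/interrupts sample for illustration only.
SAMPLE = """\
           CPU0       CPU1
  0:      52143          0   IO-APIC-edge      timer
 24:     911042          0   PCI-MSI-edge      virtio0-input.0
 25:     377820          0   PCI-MSI-edge      virtio1-requests
"""

def per_cpu_totals(text):
    """Sum the per-CPU interrupt counts from /proc/interrupts-style text."""
    lines = text.splitlines()
    n_cpus = len(lines[0].split())  # header row: one "CPUn" label per CPU
    totals = [0] * n_cpus
    for line in lines[1:]:
        fields = line.split()
        for i, count in enumerate(fields[1:1 + n_cpus]):
            totals[i] += int(count)
    return totals

print(per_cpu_totals(SAMPLE))  # all of the load is on CPU0, none on CPU1
```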
Disabling irq balancing on KVM would be a serious bug. I'm not sure about other hypervisors, but it would likely be a bug there too. Generally, hypervisors make an effort to behave just like a physical machine; the hypervisor CPUID leaf is there for very specific corner cases where we ship paravirtualization, and handling it should always be done entirely in the kernel. If userspace software starts detecting hypervisors and adding special code, this will just start an arms race of hypervisors hiding from your code. If a hypervisor wants to disable rebalancing, it already has a way to do this in the kernel, and likely already does. Did you make this change? Please don't make it without consulting the maintainers of the KVM and Xen hypervisors.
(In reply to Bill Nottingham from comment #3)
> Given the relationship of vCPUs to pCPUs in different hypervisors, it seems
> unclear that redistributing the interrupts actually accomplishes anything if
> the vCPU is going to be randomly scheduled on a variety of pCPUs.

irqbalance, when running in a KVM guest, balances virtual interrupts between virtual CPUs. At no point does the relationship between vCPUs and pCPUs come into play; HW interrupts are also irrelevant here. Please always talk to virtualization people before making changes that affect virtualization.
CC'ing Drew for Xen.

(In reply to Bill Nottingham from comment #3)
> Given the relationship of vCPUs to pCPUs in different hypervisors, it seems
> unclear that redistributing the interrupts actually accomplishes anything if
> the vCPU is going to be randomly scheduled on a variety of pCPUs.

It does accomplish something: by keeping the irq load homogeneous across VCPUs, the hypervisor's notion of VCPUs carrying equal scheduling weight is not violated. The load you split in the guest the hypervisor can merge, but what you join in the guest the hypervisor can't split. That can be useful sometimes, but probably shouldn't be the default (same as "taskset" is likely the exception for guest processes).

Of course this depends on the "homogeneity" that irqbalance strives for, and the stats that it uses as input. If irqbalance ensures something more sophisticated than even distribution, then *that* could actually interfere with the hypervisor's default notion of VCPUs being equal -- if the stats irqbalance collects as input make no sense on a hypervisor (they don't actually mean what irqbalance takes them for), then irqbalance's decisions are bound to be bogus. (See also: forwarding TCP-over-IP over TCP-over-IP -- the lower-level TCP violates the upper-level TCP's assumptions about the middle IP, and the timers go haywire.)

Just my two cents.
According to discussion above, I've removed irqbalance-1.0.5-3.fc19 update from Bodhi so far.
Xen uses event channels for PV and PVHVM guests. Each channel is associated with a single vcpu (not necessarily the same vcpu for every channel). Or, if a guest isn't hypervisor-aware (i.e. not paravirtualized in any way), then Xen injects interrupts to the next vcpu, round-robin selected from the list of eligible vcpus that the guest has identified by programming its LAPIC (a vLAPIC). I don't see any reason to disallow irqbalance for Xen guests, but it may not help much either. We can/should leave it to the users to decide.
Gleb caught me on my comment. The round-robin selection is only used for lowest-priority delivery, which is rarely used. So irqbalance is likely to benefit the guest.
I second Laszlo's comment 7. Removing the hypervisor's visibility into the guest's desired policy can only do harm. On KVM, the hypervisor (actually the Linux scheduler) at least tries not to migrate vCPUs gratuitously from one physical CPU to another, and in some workloads they might well be pinned by management. For example, virtio-scsi (and virtio-net, I think) multiqueue completely depends on irqbalance working correctly inside guests.
The patch has also been reverted upstream, and in the meantime rawhide was upgraded to 1.0.6.
irqbalance-1.0.5-4.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/irqbalance-1.0.5-4.fc19
irqbalance-1.0.5-4.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report.