Created attachment 1273445 [details] engine log Description of problem: Start VM failed with the exception, if the score module does not respond before the timeout 2017-04-23 04:44:41,346-04 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-6-thread-41) [cd2602e9-8af0-48c0-9916-be8efbbc1290] Command 'org.ovirt.engine.core.bll.RunVmCommand' failed: null 2017-04-23 04:44:41,348-04 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-6-thread-41) [cd2602e9-8af0-48c0-9916-be8efbbc1290] Exception: java.lang.NullPointerException at org.ovirt.engine.core.bll.scheduling.SchedulingManager.sumScoreResults(SchedulingManager.java:816) [bll.jar:] at org.ovirt.engine.core.bll.scheduling.SchedulingManager.runExternalFunctions(SchedulingManager.java:810) [bll.jar:] at org.ovirt.engine.core.bll.scheduling.SchedulingManager.runFunctions(SchedulingManager.java:766) [bll.jar:] at org.ovirt.engine.core.bll.scheduling.SchedulingManager.selectBestHost(SchedulingManager.java:456) [bll.jar:] at org.ovirt.engine.core.bll.scheduling.SchedulingManager.schedule(SchedulingManager.java:330) [bll.jar:] at org.ovirt.engine.core.bll.RunVmCommand.getVdsToRunOn(RunVmCommand.java:826) [bll.jar:] at org.ovirt.engine.core.bll.RunVmCommand.runVm(RunVmCommand.java:227) [bll.jar:] at org.ovirt.engine.core.bll.RunVmCommand.perform(RunVmCommand.java:401) [bll.jar:] at org.ovirt.engine.core.bll.RunVmCommand.executeVmCommand(RunVmCommand.java:326) [bll.jar:] at org.ovirt.engine.core.bll.VmCommand.executeCommand(VmCommand.java:120) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1251) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1391) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:2055) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1451) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:397) [bll.jar:] at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.executeValidatedCommand(PrevalidatingMultipleActionsRunner.java:204) [bll.jar:] at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.runCommands(PrevalidatingMultipleActionsRunner.java:176) [bll.jar:] at org.ovirt.engine.core.bll.SortedMultipleActionsRunnerBase.runCommands(SortedMultipleActionsRunnerBase.java:20) [bll.jar:] at org.ovirt.engine.core.bll.PrevalidatingMultipleActionsRunner.lambda$invokeCommands$3(PrevalidatingMultipleActionsRunner.java:182) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:92) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_121] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_121] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121] at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121] 2017-04-23 04:44:41,359-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-41) [cd2602e9-8af0-48c0-9916-be8efbbc1290] EVENT_ID: USER_FAILED_RUN_VM(54), Correlation ID: cd2602e9-8af0-48c0-9916-be8efbbc1290, Job ID: ffd1b3a8-87cb-4e31-a929-05d9d6a84c58, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM golden_env_mixed_virtio_0 (User: admin@internal-authz). Version-Release number of selected component (if applicable): ovirt-scheduler-proxy-0.1.3-7.el7ev.noarch How reproducible: Always Steps to Reproduce: 1. Install and enable via the engine ovirt-scheduler-proxy 2. Create simple python script under /usr/share/ovirt-scheduler-proxy/plugins/test_plugin.py #!/bin/env python from time import sleep class ExternalSchedulerTimeoutPlugin(object): def do_score(self, hosts, vm, args): sleep(125) print [("some_host_uuid", 50)] # chmod +x /usr/share/ovirt-scheduler-proxy/plugins/test_plugin.py 3. Restart ovirt-scheduler-proxy and the ovirt-engine services 4. Create new scheduling policy with the weight ExternalSchedulerTimeoutPlugin module 5. Attach scheduling policy to the cluster 6. Start the VM Actual results: Start the VM fails, with the traceback above Expected results: Start the VM succeeds with the warning that the engine skipped external scheduling score modules(the same behavior that we have for the filter modules) Additional info:
This bug is on modified, with all referenced patches merged on ovirt-engine while the bug is on ovirt-scheduler-proxy. Please check if the product is the right one, I suppose it should be ovirt-engine. Please also check modified status is correct, looks like one of the referenced patches has not been backported to 4.1 branch.
One of the patches is for ovirt-scheduler-proxy [1]. It has not been backported because scheduler-proxy does not have 4.1 branch. [1] - https://gerrit.ovirt.org/#/c/76500/
(In reply to Andrej Krejcir from comment #2) > One of the patches is for ovirt-scheduler-proxy [1]. It has not been > backported because scheduler-proxy does not have 4.1 branch. > > [1] - https://gerrit.ovirt.org/#/c/76500/ Can you please build it for being released in 4.1.3 and add it to 4.1.3_rc1 release configuration as in https://gerrit.ovirt.org/77960 ?
Verified on rhevm-4.1.3.2-0.1.el7.noarch 2017-06-18 07:30:33,454-04 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (default task-30) [77df6c5e-e853-4925-9883-058535975d3a] FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 1540741b 2017-06-18 07:30:33,543-04 INFO [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-6-thread-32) [77df6c5e-e853-4925-9883-058535975d3a] Running command: RunVmCommand internal: false. Entities affected : ID: f3ee19f2-8f89-43a9-ba10-445c10e861f2 Type: VMAction group RUN_VM with role type USER 2017-06-18 07:31:03,602-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-32) [77df6c5e-e853-4925-9883-058535975d3a] EVENT_ID: EXTERNAL_SCHEDULER_ERROR(10,501), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Running the external scheduler failed: 'Waiting on score functions timed out'