Bug 1201700 - When no next move is doable, OptaPlanner gets stuck in a step and, if termination is not based on time, it cycles forever
Keywords:
Status: CLOSED EOL
Alias: None
Product: JBoss BRMS Platform 6
Classification: Retired
Component: OptaPlanner
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Geoffrey De Smet
QA Contact: Jiri Locker
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-03-13 09:56 UTC by jvahala
Modified: 2020-03-27 19:11 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-27 19:11:32 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
reproducer (13.80 KB, application/zip)
2015-03-31 11:54 UTC, jvahala


Links
System: Red Hat Issue Tracker
ID: PLANNER-241
Private: 0
Priority: Blocker
Status: Open
Summary: If all entities are immovable, the solver should simply do nothing. It should not log bailing out warnings.
Last Updated: 2017-09-06 07:45:02 UTC

Description jvahala 2015-03-13 09:56:39 UTC
Consider a TSP with one domicile and one entity. The construction heuristic (CH) builds the best solution, and then local search is performed.

Say that local search termination is configured like this: 

<termination>
  <stepCountLimit>1</stepCountLimit>
</termination>

Obviously, there is no move that Planner could make, and that is the problem. If Planner is in a state with no way out, it gets stuck forever, and nothing can terminate it except a time-based termination.
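
To illustrate why a step count limit cannot help in this situation, here is a minimal Java sketch of a local search loop. It is a toy model, not OptaPlanner's actual code: the class `StepCountSketch`, its method, and the `maxEvaluations` safety cap are hypothetical names introduced only for this illustration. The sketch assumes that the limit is checked between steps and that a step completes only once a doable move has been found.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model only; none of these names exist in OptaPlanner.
class StepCountSketch {

    /**
     * Runs a loop whose only termination is a step count limit and returns
     * the number of move evaluations spent. The maxEvaluations cap is a
     * demo-only escape hatch so that this sketch always halts.
     * The moves list must not be empty.
     */
    static long evaluationsUntilStop(List<String> moves, Predicate<String> doable,
                                     int stepCountLimit, long maxEvaluations) {
        long evaluations = 0;
        int stepsTaken = 0;
        while (stepsTaken < stepCountLimit) {      // the limit is only checked here, between steps
            boolean stepFinished = false;
            while (!stepFinished) {                // a step ends only when a doable move is found
                if (evaluations >= maxEvaluations) {
                    return evaluations;            // demo-only: give up instead of spinning forever
                }
                String move = moves.get((int) (evaluations % moves.size()));
                evaluations++;
                if (doable.test(move)) {
                    stepFinished = true;
                }
            }
            stepsTaken++;
        }
        return evaluations;
    }
}
```

With a single move that is never doable, `StepCountSketch.evaluationsUntilStop(List.of("visit -> domicile"), m -> false, 1, 1000)` returns 1000: the inner loop consumes the whole safety cap and the step counter never advances, so a limit of 1 step is never reached. With a doable move, the same call returns 1.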

Comment 2 Geoffrey De Smet 2015-03-19 14:48:26 UTC
There's no intrinsic requirement that a step should finish within x amount of time. But it would indeed be helpful that LocalSearch recognizes that there are no doable moves. I think it does if you configure a cacheType PHASE or STEP, instead of the default of JIT.

It's not just non-doable moves, but also filtered moves, that can cause this issue (although filtering has a bail-out). As far as I know, that case too only applies to JIT selection.

Jiri, could you verify that move selection with cacheType PHASE or STEP doesn't suffer from this issue? If it does, that would worry me and I'd fix it asap. As for JIT (the default cacheType), I currently don't see a reasonable way of fixing it (without killing the scalability gain of JIT selection).
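
The difference between cached and just-in-time selection described in this comment can be sketched in plain Java. This is a simplified illustration under stated assumptions, not OptaPlanner's implementation: `SelectionSketch` and all of its methods are hypothetical, the cached variant assumes the full move list is materialized up front, and the JIT variant assumes moves are drawn at random from an iterator that never signals exhaustion. The attempt-limited loop shows one possible shape of a bail-out.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Toy model only; none of these names exist in OptaPlanner.
class SelectionSketch {

    /** Cached selection: the move list is finite, so "no doable move" is detectable in one pass. */
    static <M> boolean hasDoableMoveCached(List<M> allMoves, Predicate<M> doable) {
        for (M move : allMoves) {
            if (doable.test(move)) {
                return true;
            }
        }
        return false;
    }

    /** JIT random selection: an endless stream of candidates that never reports "exhausted". */
    static <M> Iterator<M> justInTimeRandom(List<M> candidates, Random random) {
        return new Iterator<M>() {
            @Override
            public boolean hasNext() {
                return !candidates.isEmpty();
            }

            @Override
            public M next() {
                return candidates.get(random.nextInt(candidates.size()));
            }
        };
    }

    /** Bail-out: stop after maxAttempts candidates instead of looping forever; null means "gave up". */
    static <M> M firstDoableOrNull(Iterator<M> moves, Predicate<M> doable, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts && moves.hasNext(); attempt++) {
            M move = moves.next();
            if (doable.test(move)) {
                return move;
            }
        }
        return null;
    }
}
```

In this model the cached check costs one full pass over the moves, which is the kind of work JIT selection avoids. Without the `maxAttempts` bound, `firstDoableOrNull` would never return when no candidate is doable.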

Comment 3 jvahala 2015-03-20 10:38:44 UTC
Geoffrey, 

I think the easiest way to provide a reproducer is to make small alterations to any TSP example you have.

1. Take any TSP Solution instance and remove all locations except two, so that there is only one Domicile and one Visit.

2. Use this localSearch:

<localSearch>
  <termination>
    <stepCountLimit>1</stepCountLimit>
  </termination>
  <changeMoveSelector>
    <cacheType>PHASE</cacheType>
  </changeMoveSelector>
</localSearch>

3. Run the solver.

I hope this is enough to reproduce the problem.

Comment 4 Geoffrey De Smet 2015-03-25 08:31:39 UTC
Jiri, have you checked whether it reproduces with cacheType PHASE or STEP? That would be a high-priority bug.
The fact that it reproduces with JIT selection is a less important known issue (because it's intrinsic to JIT selection and any fix might be worse than the problem).

Comment 5 jvahala 2015-03-31 11:54:34 UTC
Created attachment 1008979
reproducer

Just run mvn clean install.

There is a 10-second timeout. One step on such a small problem should take less than 10 seconds.
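
The attachment's code is not reproduced in this report. As a hedged sketch of how a 10-second guard of this kind could be written with plain JDK concurrency classes: the class `TimeoutGuardSketch` and the method `finishesWithin` are hypothetical, and the `Runnable` stands in for whatever call starts the solver.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not taken from the attached reproducer.
class TimeoutGuardSketch {

    /** Returns true if the task completed within the timeout, false if it had to be abandoned. */
    static boolean finishesWithin(Runnable task, long timeout, TimeUnit unit) {
        // Daemon worker thread, so a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "timeout-guard-worker");
            thread.setDaemon(true);
            return thread;
        });
        Future<?> future = executor.submit(task);
        try {
            future.get(timeout, unit);             // wait at most 'timeout' for completion
            return true;
        } catch (TimeoutException e) {
            future.cancel(true);                   // interrupt the worker thread
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();    // preserve the caller's interrupt status
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A call such as `TimeoutGuardSketch.finishesWithin(task, 10, TimeUnit.SECONDS)` returning false would then indicate that the task was still running after 10 seconds. Cancelling only interrupts the worker thread, so a loop that never checks for interruption keeps running in the background until the JVM exits.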

