Bug 1044577 - Intermittent ArrayIndexOutOfBoundsException under high load in drools
Summary: Intermittent ArrayIndexOutOfBoundsException under high load in drools
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: JBoss Enterprise BRMS Platform 5
Classification: JBoss
Component: BRE (Expert, Fusion)
Version: BRMS 5.3.1
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: GA
: ---
Assignee: Mario Fusco
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1022758 1044581 1054313
TreeView+ depends on / blocked
 
Reported: 2013-12-18 15:31 UTC by Anton Giertli
Modified: 2025-02-10 03:34 UTC (History)
5 users (show)

Fixed In Version:
Clone Of:
: 1044581 1054313 (view as bug list)
Environment:
Last Closed: 2025-02-10 03:34:41 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
reproducer (822.17 KB, application/zip)
2013-12-18 15:31 UTC, Anton Giertli
no flags Details
Test to reproduce the issue (822.17 KB, application/zip)
2013-12-30 15:11 UTC, Martin Weiler
no flags Details
Test case illustrating need of transactions (7.55 KB, application/zip)
2014-02-17 11:42 UTC, Marek Winkler
no flags Details

Description Anton Giertli 2013-12-18 15:31:05 UTC
Created attachment 838407 [details]
reproducer

Description of problem:
A customer is running CEP with a JPA persisted ksession (created by JPAKnowledgeService.newStatefulKnowledgeSession()) on SOA-P. They are seeing the Exception below intermittently under high-load.

Version-Release number of selected component (if applicable):


How reproducible:
See the attachment

Steps to Reproduce:
- Create one ksession by JPAKnowledgeService.newStatefulKnowledgeSession()
- Insert events to a shared EntryPoint from multiple threads
Run CEPTest
Actual results:
Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
        at org.drools.marshalling.impl.ProtobufOutputMarshaller.orderFacts(ProtobufOutputMarshaller.java:592)
        at org.drools.marshalling.impl.ProtobufOutputMarshaller.writeFactHandles(ProtobufOutputMarshaller.java:533)
        at org.drools.marshalling.impl.ProtobufOutputMarshaller.serializeSession(ProtobufOutputMarshaller.java:139)
        at org.drools.marshalling.impl.ProtobufOutputMarshaller.writeSession(ProtobufOutputMarshaller.java:100)
...

Expected results:


Additional info:

Comment 2 Mario Fusco 2013-12-20 16:23:23 UTC
I tried to investigate this issue but I couldn't reproduce it.
In particular I haven't been able to use the reproducer attached to this ticket since it doesn't have a pom file and uses specific db configurations.
However I tried to reproduce exactly the same things done by that test case in a junit test I developed in the JpaPersistentStatefulSessionTest class on the 5.3.x branch.
I am pasting the complete test at the end of this email. I tried to change multiple times the number of threads (both creating them explicitly and using an ExecutorService), the number of events and the delay of each thread, but I've never been able to reproduce the reported problem.

    @Test
    public void testParallelInsertsOnPersistedSession() {
        String str =
                "import org.drools.persistence.session.JpaPersistentStatefulSessionTest.ParkingEvent;\n" +
                "\n" +
                "declare ParkingEvent\n" +
                "    @role( event )\n" +
                "    @expires(2s)\n" +
                "end\n" +
                "\n" +
                "rule \"Hello Event\"\n" +
                "    when\n" +
                "        $p : ParkingEvent( ) from entry-point \"parking\"\n" +
                "    then\n" +
                "        System.out.println( $p.toString() );\n" +
                "end\n";

        KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
        kbuilder.add( ResourceFactory.newByteArrayResource( str.getBytes() ),
                      ResourceType.DRL );
        KnowledgeBase kbase = KnowledgeBaseFactory.newKnowledgeBase();

        if ( kbuilder.hasErrors() ) {
            fail( kbuilder.getErrors().toString() );
        }

        kbase.addKnowledgePackages( kbuilder.getKnowledgePackages() );

        StatefulKnowledgeSession ksession = JPAKnowledgeService.newStatefulKnowledgeSession( kbase, null, env );

        WorkingMemoryEntryPoint ep = ksession.getWorkingMemoryEntryPoint("parking");

        Thread[] threads = new Thread[100];
        for (int i = 0; i < 100; i++) {
            TestEventRunner runner = new TestEventRunner();
            runner.setEp(ep);
            runner.setKsession(ksession);
            threads[i] = new Thread(runner);
        }

        for (Thread thread : threads) {
            thread.start();
        }

        try {
            Thread.sleep(10000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }

        System.out.println("ksession.fireUntilHalt()");
        ksession.fireUntilHalt();
    }

    public static class TestEventRunner implements Runnable {

        private WorkingMemoryEntryPoint ep;

        private StatefulKnowledgeSession ksession;

        public StatefulKnowledgeSession getKsession() {
            return ksession;
        }

        public void setKsession(StatefulKnowledgeSession ksession) {
            this.ksession = ksession;
        }

        public WorkingMemoryEntryPoint getEp() {
            return ep;
        }

        public void setEp(WorkingMemoryEntryPoint ep) {
            this.ep = ep;
        }

        public void run() {
            try {
                Thread.sleep(500); // make sure ksession started
                for (int i = 0; i < 100; i++) {
                    ep.insert(new ParkingEvent("CarA", "ParkingA", ParkingEvent.ENTER));
                    ep.insert(new ParkingEvent("CarB", "ParkingA", ParkingEvent.ENTER));
                    ep.insert(new ParkingEvent("CarA", "ParkingA", ParkingEvent.EXIT));
                    ep.insert(new ParkingEvent("CarA", "ParkingB", ParkingEvent.ENTER));
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }

    public static class ParkingEvent implements Serializable {

        public static final int ENTER = 0;
        public static final int EXIT = 1;

        private String car;

        private String zone;

        private int type;

        public ParkingEvent() {

        }

        public ParkingEvent(String car, String zone, int type) {
            this.car = car;
            this.zone = zone;
            this.type = type;
        }

        public String getCar() {
            return car;
        }

        public void setCar(String car) {
            this.car = car;
        }

        public String getZone() {
            return zone;
        }

        public void setZone(String zone) {
            this.zone = zone;
        }

        public int getType() {
            return type;
        }

        public void setType(int type) {
            this.type = type;
        }

        @Override
        public String toString() {
            return "ParkingEvent : car = " + car + ", zone = " + zone + ", type = " + type;
        }
    }

Comment 3 Martin Weiler 2013-12-30 15:11:33 UTC
Created attachment 843448 [details]
Test to reproduce the issue

Attaching Toshiya's simpler test which doesn't depend on SOA-P:

- Create one ksession by JPAKnowledgeService.newStatefulKnowledgeSession()
- Insert events to a shared EntryPoint from multiple threads

Comment 4 Toshiya Kobayashi 2014-01-07 05:24:32 UTC
Hi Mario,

I have sent a pull request for reproducer. Sorry about not providing this earlier.

https://github.com/droolsjbpm/drools/pull/304

The key points are:

- STREAM mode
- @expires happens during the test so I added "Thread.sleep(200)" between each insert()

This test case constantly reproduces the ArrayIndexOutOfBoundsException in 5.3.x branch on my side (It shows up in STDOUT instead of surefire report log). If you can't reproduce yet, please try increasing the number of threads.

Thanks!

Comment 5 Toshiya Kobayashi 2014-01-10 09:44:34 UTC
Hi Mario,

The ArrayIndexOutOfBoundsException is caused by the following sequence:

Thread A: Tries to commit a transaction
Thread A: In ProtobufOutputMarshaller.orderFacts(), gets the size of SingleThreadedObjectStore.assertMap by "objectStore.size();"
Thread B: Inserts an Event to entry point so SingleThreadedObjectStore.assertMap increases 
Thread A: ArrayIndexOutOfBoundsException is thrown in the 'for' loop

======
    public static InternalFactHandle[] orderFacts(ObjectStore objectStore) {
        // this method is just needed for testing purposes, to allow round tripping
        int size = objectStore.size();
        InternalFactHandle[] handles = new InternalFactHandle[size];
        int i = 0;
        for ( Iterator< ? > it = objectStore.iterateFactHandles(); it.hasNext(); ) {
            handles[i++] = (InternalFactHandle) it.next();
        }

        Arrays.sort( handles,
                     new HandleSorter() );

        return handles;
    }
======

The suggestion in Comment 1 fixes the particular issue but I guess that ksession (including EntryPoint/ObjectStore) should be protected at upper level (e.g. SessionMarshallingHelper.getSnapshot() or ProtobufMarshaller.marshall(() etc.)

----
8541 10/01 17:16:28,936[pool-11-thread-1] WARN  persistence.jta.JtaTransactionManager.commit  - Unable to commit transaction
javax.persistence.PersistenceException: java.lang.ArrayIndexOutOfBoundsException: 309
	at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:516)
	at bitronix.tm.BitronixTransaction.fireBeforeCompletionEvent(BitronixTransaction.java:478)
	at bitronix.tm.BitronixTransaction.commit(BitronixTransaction.java:193)
	at bitronix.tm.BitronixTransactionManager.commit(BitronixTransactionManager.java:120)
	at org.drools.persistence.jta.JtaTransactionManager.commit(JtaTransactionManager.java:179)
	at org.drools.persistence.SingleSessionCommandService.execute(SingleSessionCommandService.java:363)
	at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:34)
	at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:14)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 309
	at org.drools.marshalling.impl.ProtobufOutputMarshaller.orderFacts(ProtobufOutputMarshaller.java:663)
	at org.drools.marshalling.impl.ProtobufOutputMarshaller.writeFactHandles(ProtobufOutputMarshaller.java:599)
	at org.drools.marshalling.impl.ProtobufOutputMarshaller.serializeSession(ProtobufOutputMarshaller.java:151)
	at org.drools.marshalling.impl.ProtobufOutputMarshaller.writeSession(ProtobufOutputMarshaller.java:106)
	at org.drools.marshalling.impl.ProtobufMarshaller.marshall(ProtobufMarshaller.java:169)
	at org.drools.marshalling.impl.ProtobufMarshaller.marshall(ProtobufMarshaller.java:151)
	at org.drools.persistence.SessionMarshallingHelper.getSnapshot(SessionMarshallingHelper.java:72)
	at org.drools.persistence.info.SessionInfo.update(SessionInfo.java:85)
	at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.hibernate.ejb.event.BeanCallback.invoke(BeanCallback.java:23)
	at org.hibernate.ejb.event.EntityCallbackHandler.callback(EntityCallbackHandler.java:80)
	at org.hibernate.ejb.event.EntityCallbackHandler.preUpdate(EntityCallbackHandler.java:65)
	at org.hibernate.ejb.event.EJB3FlushEntityEventListener.invokeInterceptor(EJB3FlushEntityEventListener.java:41)
	at org.hibernate.event.def.DefaultFlushEntityEventListener.handleInterception(DefaultFlushEntityEventListener.java:330)
	at org.hibernate.event.def.DefaultFlushEntityEventListener.scheduleUpdate(DefaultFlushEntityEventListener.java:270)
	at org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity(DefaultFlushEntityEventListener.java:151)
	at org.hibernate.event.def.AbstractFlushingEventListener.flushEntities(AbstractFlushingEventListener.java:219)
	at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:99)
	at org.hibernate.event.def.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:49)
	at org.hibernate.impl.SessionImpl.flush(SessionImpl.java:1028)
	at org.hibernate.impl.SessionImpl.managedFlush(SessionImpl.java:366)
	at org.hibernate.ejb.AbstractEntityManagerImpl$1.beforeCompletion(AbstractEntityManagerImpl.java:504)
	... 13 more
----

Note: Thanks to "@expires", I could reproduce the issue because it will trigger commit by SelfRemovalJob. But in customer case, a commit happens in ESB thread in SOA-P.

Comment 6 Mario Fusco 2014-01-16 15:36:04 UTC
Fixed by https://github.com/droolsjbpm/drools/commit/39f393ca2

Comment 7 Mario Fusco 2014-01-17 17:02:53 UTC
The operations executed on an entry-point created from a persisted session are now transactional as the ones issued directly on the session itself. However the provided test is not valid because it is not possible to ensure atomicity of serialization on a session running asynchronously. For this reason trying to invoke fireUntilHalt on a persisted session now causes an UnsupportedOperationException.

Done with https://github.com/droolsjbpm/drools/commit/4415313d0

Comment 10 Marek Winkler 2014-02-13 17:55:34 UTC
Verified on 5.3.1.BRMS-P05.

Please be aware that the entry-point operations are now transactional (as Mario commented), so they might now require a transaction to work properly (as it was the case with my test).

Comment 11 Marek Winkler 2014-02-17 11:42:34 UTC
Created attachment 864067 [details]
Test case illustrating need of transactions

Attached a maven project with test indicating the need for manual transaction demarcation with P05 (in contrast to P04). I do not think it is necessarily a bug, but a change in behaviour which could surprise existing 5.3.1 users.

See CepPersistenceTest, lines 209 and 252. With P04, the test passed even without calling begin/commit/rollback on transaction manager. With P05, 
the exception org.drools.persistence.jpa.JpaTimerJobInstance - Unable to execute timer job! is thrown (see below for stacktrace). The root cause is  java.lang.IllegalStateException: no transaction started on this thread.

One more observation - the problem manifests only when all following conditions are true:

- using events with expiration
- realtime clock
- executing entryPoint.insert() without an active transaction

I guess that the problem is that when events expire (with realtime clock), they are getting cleaned from the WM by a timer thread, which is executed after the original transaction has commited.

Full stacktrace of the exception:

org.drools.persistence.jpa.JpaTimerJobInstance - Unable to execute timer job!
java.lang.RuntimeException: Could not commit session or rollback
	at org.drools.persistence.SingleSessionCommandService.rollbackTransaction(SingleSessionCommandService.java:392)
	at org.drools.persistence.SingleSessionCommandService.execute(SingleSessionCommandService.java:373)
	at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:34)
	at org.drools.persistence.jpa.JpaTimerJobInstance.call(JpaTimerJobInstance.java:14)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: Unable to rollback transaction
	at org.drools.persistence.jta.JtaTransactionManager.rollback(JtaTransactionManager.java:197)
	at org.drools.persistence.SingleSessionCommandService.rollbackTransaction(SingleSessionCommandService.java:389)
	... 9 more
Caused by: java.lang.IllegalStateException: no transaction started on this thread
	at bitronix.tm.BitronixTransactionManager.setRollbackOnly(BitronixTransactionManager.java:171)
	at org.drools.persistence.jta.JtaTransactionManager.rollback(JtaTransactionManager.java:193)

Comment 12 Marek Winkler 2014-02-18 13:48:40 UTC
After discussion with Toshiya, it is confirmed that the problem is caused by the test - by shutting down Transaction manager before the timer threads finish their work. The fix is therefore indeed right.

Thanks Toshiya for help!

Comment 15 Red Hat Bugzilla 2025-02-10 03:34:41 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.


Note You need to log in before you can comment on or make changes to this bug.