Description of problem:
There is a transactional/timing issue that can occur when the RHQ server is persisting a change set and subsequently requests file content from the agent. It results in an exception that prevents the file content from being persisted. Here is the relevant part of the exception from the server log:

2011-11-30 07:05:38,984 INFO [org.rhq.enterprise.server.drift.JPADriftServerBean] Skipping bad drift file
javax.ejb.EJBException: java.lang.IllegalArgumentException: JPADriftFile not found [eec86c6712976844ffe31411c982fc7b6aa6d4e89b6c759273cf6c888872efb1]

Here is what happens when I reproduce the issue. I have drift definitions for two EAP servers. The agent sends the initial change set report for EAP server 1. The RHQ server processes the report, creating and persisting drift records as well as JPADriftFile instances as needed. The RHQ server sends a request to the agent for the content of each JPADriftFile that is created.

While the agent is gathering content, the drift detector task kicks off and generates the initial change set report for EAP server 2. The RHQ server processes it and sends a request for content to the agent. The number of files for which the RHQ server requests content for EAP server 2 will be very small because most of that content has already been requested with EAP server 1. Because the number of files is small, the agent is able to process that request and send the content back to the server before the large transaction that is processing the initial change set has committed.

When the agent sends file content to the server, the server assumes that there is already a JPADriftFile in the database for each file that the agent is sending. This makes sense because the agent should only be sending content for files that the server knows about. The problem, though, is that the transaction in which the JPADriftFiles are persisted has not yet been committed. So to the thread handling the file content sent from the agent, there are some files that the agent is sending that the server does not know about; hence the IllegalArgumentException above.

We have a race condition here and need to ensure that those JPADriftFile entities are persisted and visible to the later transaction that handles the file content sent from the agent.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
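The race described above can be illustrated with a toy model. This is only a sketch and not RHQ code: the class DriftFileRaceSketch, the in-memory DriftFileStore, and the methods persistInOpenTx, commit, storeContent, and trySendContent are all hypothetical names invented for illustration. The store models a single open transaction whose rows are invisible to other callers until commit, and the interleaving is forced by call order rather than by real threads.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the visibility race; all names are hypothetical, not RHQ APIs. */
public class DriftFileRaceSketch {

    /** In-memory stand-in for the database, modelling one open transaction. */
    public static class DriftFileStore {
        private final Set<String> uncommitted = new HashSet<>();
        private final Set<String> committed = new HashSet<>();
        private final Map<String, byte[]> content = new HashMap<>();

        /** Change set processing: the entity exists only inside the open transaction. */
        public void persistInOpenTx(String hash) {
            uncommitted.add(hash);
        }

        /** Commit makes the pending entities visible to other callers. */
        public void commit() {
            committed.addAll(uncommitted);
            uncommitted.clear();
        }

        /** Content upload path: only sees committed entities. */
        public void storeContent(String hash, byte[] bytes) {
            if (!committed.contains(hash)) {
                throw new IllegalArgumentException("JPADriftFile not found [" + hash + "]");
            }
            content.put(hash, bytes);
        }

        public boolean hasContent(String hash) {
            return content.containsKey(hash);
        }
    }

    /** Simulates the agent sending content; returns false if the server skipped the file. */
    public static boolean trySendContent(DriftFileStore store, String hash) {
        try {
            store.storeContent(hash, new byte[] {1, 2, 3});
            return true;
        } catch (IllegalArgumentException e) {
            System.out.println("Skipping bad drift file: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        DriftFileStore store = new DriftFileStore();
        String hash = "example-sha256"; // placeholder, not a real digest

        // Server persists the entity, but the transaction is still open.
        store.persistInOpenTx(hash);

        // Agent answers quickly: the upload arrives before the commit and is skipped.
        System.out.println("stored before commit: " + trySendContent(store, hash));

        // Once the transaction commits, the same upload succeeds.
        store.commit();
        System.out.println("stored after commit: " + trySendContent(store, hash));
    }
}
```

In this model the upload only succeeds once the entity is committed, which is the ordering the server needs to guarantee. The sketch does not show how that guarantee should be implemented.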
master commit b31e3a66a1e75dcad0070b5b78bbd3f8e9005533

This should resolve the timing issue where it was possible for the agent to submit DriftFile content before the DriftFile entity was committed to the database, which generated exceptions due to the missing entity and caused a failure to store the required content.

This is not easy to test. jsanda had a good reproduction environment, and if he verifies the fix, that should be sufficient. I have done some mock testing, which has been successful, and the code changes have been reviewed by john and mazz.
release_jon3.x commit 66a4abdf1e8661a926869f23b8dbd0d357a8c11a