* server can be put into a "maintenance mode"
** in this mode, agent requests will be denied (maybe just shut down the invoker?)
* agent needs to back off (maybe model after TCP sliding-window?)

This will help facilitate the testing listed towards the bottom of RHQ-678.
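A rough sketch of what agent-side back-off could look like. It is not RHQ code: the class name, method names, and delay values are all hypothetical. It also uses plain capped exponential back-off rather than anything modeled on a TCP sliding window, since the note above leaves the exact scheme open.

```java
/** Hypothetical agent-side back-off policy; names and values are illustrative only. */
final class AgentBackoff {
    private final long initialDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs;

    AgentBackoff(long initialDelayMs, long maxDelayMs) {
        if (initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("need 0 < initialDelayMs <= maxDelayMs");
        }
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.currentDelayMs = initialDelayMs;
    }

    /** Called when the server denies a request; returns how long to wait before retrying. */
    long onRejected() {
        long delay = currentDelayMs;
        // Double the next delay, capped at maxDelayMs (the division guards against overflow).
        currentDelayMs = (currentDelayMs > maxDelayMs / 2) ? maxDelayMs : currentDelayMs * 2;
        return delay;
    }

    /** Called when a request is accepted again; resets to the initial delay. */
    void onSuccess() {
        currentDelayMs = initialDelayMs;
    }
}
```

With an initial delay of 1 second and a cap of 8 seconds, successive rejections would wait 1s, 2s, 4s, 8s, 8s, and so on until a request succeeds.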
R1239, interim check-in. Introduced an HA Server section in Administration and a "List HA Servers" page to allow GUI manipulation of the server's operation mode (normal | maintenance). Updated schema, upgrade, and installer to introduce the new rhq_server.mode column as well as a unique constraint on rhq_server.name.
R1255 - Initial impl of HA server-side Maintenance Mode (MM). Server can come up in Maintenance or Normal mode, and the mode can be changed via the HA Admin Console. In MM all comm services are stopped; in Normal mode all comm services are started.

ToDo - need to discuss with Mazz how best to suppress server-side ServletInvokerServlet errors when the comm level is down.

ToDo - need to understand whether we can eliminate Server To Agent guaranteed messaging (GM). This is related only in that we don't need/want the server-side comm services stop() to spool any messages. The semantics of how GM works wrt HA agent failover are unclear, so the initial question is whether we need GM at all.
Note, a new Quartz job has been introduced to monitor for and execute operation mode changes. It currently executes at 60 second intervals.
The MM mechanism is now based on a command listener capable of blanket rejection of incoming agent commands. The NotProcessedException indicates to the agent that the server is unavailable and that the agent should respond as if the server were unreachable. This approach lets us leave the comm layer up and functioning, which protects us from violating certain assumptions for a running server. A server can be toggled between NORMAL and MAINTENANCE mode via the HAAC List Servers page.
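For illustration, the blanket-rejection idea might look roughly like the following. This is only a sketch under assumptions: the listener class, its method names, and the OperationMode enum are hypothetical, and the NotProcessedException here is a local stand-in whose constructor and hierarchy may differ from the real class.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical enum mirroring the two modes named above. */
enum OperationMode { NORMAL, MAINTENANCE }

/** Stand-in for the real exception; its actual signature is an assumption. */
class NotProcessedException extends RuntimeException {
    NotProcessedException(String reason) {
        super(reason);
    }
}

/** Hypothetical listener consulted before any incoming agent command is dispatched. */
final class MaintenanceModeListener {
    private final AtomicReference<OperationMode> mode =
            new AtomicReference<>(OperationMode.NORMAL);

    void setMode(OperationMode newMode) {
        mode.set(Objects.requireNonNull(newMode));
    }

    /** Rejects every command while in MAINTENANCE; the comm layer itself stays up. */
    void receivedCommand(String commandName) {
        if (mode.get() == OperationMode.MAINTENANCE) {
            throw new NotProcessedException(
                    "Server is in maintenance mode; rejected " + commandName);
        }
        // In NORMAL mode the command passes through untouched.
    }
}
```

The point of the design is that rejection happens per command rather than by stopping services, so nothing below the listener has to cope with a half-stopped comm layer.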
btw, the monitoring Quartz job is executing now at 30 second intervals with an initial 60 second delay.
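To make the timing concrete, here is a sketch of a mode-change poller using the 60 second initial delay and 30 second interval mentioned above. It deliberately swaps Quartz for the JDK's ScheduledExecutorService to stay self-contained, and every class, method, and parameter name is hypothetical. The supplier stands in for whatever reads the persisted mode (presumably rhq_server.mode).

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical poller that detects and applies operation mode changes. */
final class OperationModePoller {
    static final long INITIAL_DELAY_SECONDS = 60;
    static final long INTERVAL_SECONDS = 30;

    private final Supplier<String> persistedMode; // assumed to read the stored mode
    private final Consumer<String> onChange;      // assumed to apply the new mode
    private String lastSeen;

    OperationModePoller(Supplier<String> persistedMode,
                        Consumer<String> onChange,
                        String initialMode) {
        this.persistedMode = persistedMode;
        this.onChange = onChange;
        this.lastSeen = initialMode;
    }

    /** One polling pass: apply the persisted mode only if it differs from the last one seen. */
    synchronized void checkOnce() {
        String current = persistedMode.get();
        if (!current.equals(lastSeen)) {
            lastSeen = current;
            onChange.accept(current);
        }
    }

    /** Schedules checkOnce() with the delay and interval described in the comment above. */
    ScheduledFuture<?> start(ScheduledExecutorService scheduler) {
        return scheduler.scheduleWithFixedDelay(
                this::checkOnce, INITIAL_DELAY_SECONDS, INTERVAL_SECONDS, TimeUnit.SECONDS);
    }
}
```

With polling like this, a mode change made in the console would take effect at the next pass, so up to one interval later.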
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-672