Bug 536308 (RHQ-672) - server-side "maintenance mode"
Summary: server-side "maintenance mode"
Keywords:
Status: CLOSED NEXTRELEASE
Alias: RHQ-672
Product: RHQ Project
Classification: Other
Component: Agent
Version: unspecified
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jay Shaughnessy
QA Contact: Corey Welton
URL: http://jira.rhq-project.org/browse/RH...
Whiteboard:
Depends On:
Blocks: RHQ-644
Reported: 2008-07-14 19:48 UTC by Joseph Marques
Modified: 2009-11-10 21:22 UTC

Fixed In Version: 1.1
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Joseph Marques 2008-07-14 19:48:00 UTC
* server can be put into a "maintenance mode"
** in this mode, the agent requests will be denied (maybe just shut down the invoker?)
* agent needs to back-off (maybe model after TCP sliding-window?)
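One way the agent-side back-off could look is sketched below. This is not RHQ code: the class name AgentBackoff and its methods are hypothetical, and it uses capped exponential back-off (similar in spirit to how TCP backs off its retransmission timer) rather than a literal sliding window, which is an assumption about what the description is after.

```java
// Hypothetical sketch, not taken from the RHQ code base.
// Capped exponential back-off: each failed request doubles the wait,
// up to a maximum; a successful request resets the wait.
final class AgentBackoff {
    private final long initialDelayMs;
    private final long maxDelayMs;
    private long currentDelayMs;

    AgentBackoff(long initialDelayMs, long maxDelayMs) {
        if (initialDelayMs <= 0 || maxDelayMs < initialDelayMs) {
            throw new IllegalArgumentException("need 0 < initialDelayMs <= maxDelayMs");
        }
        this.initialDelayMs = initialDelayMs;
        this.maxDelayMs = maxDelayMs;
        this.currentDelayMs = initialDelayMs;
    }

    /** Call when the server denied or failed a request; returns the wait before the next retry. */
    long onFailure() {
        long delay = currentDelayMs;
        // Double the next delay, guarding against overflow and clamping to the cap.
        currentDelayMs = (currentDelayMs > maxDelayMs / 2) ? maxDelayMs : currentDelayMs * 2;
        return delay;
    }

    /** Call when a request succeeded; the next failure starts from the initial delay again. */
    void onSuccess() {
        currentDelayMs = initialDelayMs;
    }
}
```

With an initial delay of 1000 ms and a cap of 8000 ms, consecutive failures would wait 1000, 2000, 4000, 8000, 8000 ms, and a success resets the sequence.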

This will help facilitate the testing listed towards the bottom of RHQ-678.

Comment 1 Jay Shaughnessy 2008-08-14 18:58:49 UTC
R1239, intermittent check-in.   Introduced HA Server section to Administration and a "List HA Servers" page to allow GUI manipulation of the server's operation mode (normal | maintenance).  Updated schema, upgrade, and installer to introduce the new rhq_server.mode column as well as a unique constraint on rhq_server.name.

Comment 2 Jay Shaughnessy 2008-08-18 19:33:55 UTC
R1255 - Initial impl of HA server-side Maintenance Mode (MM).  Server can come up in Maintenance or Normal Mode, and the mode can be changed via the HA Admin Console.  In MM all comm services are stopped; in Normal mode all comm services are started.

ToDo - need to discuss with Mazz about how to best suppress server-side ServletInvokerServlet errors when comm level is down.

ToDo - need to understand whether we can eliminate Server To Agent guaranteed messaging (GM).  This is a related topic only in that we don't need/want the server-side comm services stop() to spool any messages.  The semantics of how GM works with respect to HA agent failover are unclear, so the initial question is whether we need GM at all.



Comment 3 Jay Shaughnessy 2008-08-18 19:34:56 UTC
Note, a new Quartz job has been introduced to monitor for and execute operation mode changes. It currently executes at 60 second intervals.


Comment 4 Jay Shaughnessy 2008-09-11 18:31:22 UTC
The MM mechanism is now based on a command listener capable of blanket rejection of incoming agent commands.  The NotProcessedException is an indicator to the agent that the server is unavailable and that the agent should respond as if the server were unreachable.  This approach allows us to leave the comm layer up and functioning, which protects us from violating certain assumptions for a running server.  A server can be toggled between NORMAL and MAINTENANCE mode via the HA Admin Console (HAAC) List Servers page.
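A rough sketch of the blanket-rejection idea follows. It is illustrative only: OperationMode, MaintenanceGate, and the local NotProcessedException class are stand-ins defined here for the example, not the actual RHQ listener API or the real exception class, whose signatures are not shown in this bug.

```java
import java.util.function.Function;

// Hypothetical stand-ins for illustration; not the RHQ classes.
enum OperationMode { NORMAL, MAINTENANCE }

/** Local stand-in for the exception that tells the agent the command was not processed. */
class NotProcessedException extends RuntimeException {
    NotProcessedException(String reason) {
        super(reason);
    }
}

/** Sits in front of the real command handler and rejects everything while in MAINTENANCE. */
final class MaintenanceGate<C, R> {
    // volatile so a mode change made by an admin thread is visible to comm threads
    private volatile OperationMode mode = OperationMode.NORMAL;
    private final Function<C, R> delegate;

    MaintenanceGate(Function<C, R> delegate) {
        this.delegate = delegate;
    }

    void setMode(OperationMode newMode) {
        this.mode = newMode;
    }

    R handle(C command) {
        if (mode == OperationMode.MAINTENANCE) {
            // The comm layer stays up; only the command itself is refused.
            throw new NotProcessedException("server is in maintenance mode");
        }
        return delegate.apply(command);
    }
}
```

The point of the design is that the transport keeps running, so nothing that assumes a live comm layer breaks; the agent just sees every command refused and treats the server as unreachable.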

Comment 5 Jay Shaughnessy 2008-09-11 18:32:43 UTC
Btw, the monitoring Quartz job is now executing at 30-second intervals with an initial 60-second delay.
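For illustration, the polling schedule described here (60 second initial delay, then every 30 seconds) could be sketched as below. Note the swap: this uses the JDK's ScheduledExecutorService instead of Quartz, and ModeMonitor, desiredMode, and applyMode are hypothetical names, so it shows only the shape of the job, not the actual implementation.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch; the real job is a Quartz job, this uses a JDK scheduler instead.
final class ModeMonitor<M> {
    private final Supplier<M> desiredMode; // e.g. reads the mode column from the database
    private final Consumer<M> applyMode;   // e.g. switches the server's command handling
    private M lastApplied;

    ModeMonitor(M initialMode, Supplier<M> desiredMode, Consumer<M> applyMode) {
        this.lastApplied = initialMode;
        this.desiredMode = desiredMode;
        this.applyMode = applyMode;
    }

    /** One polling pass; returns true if a mode change was detected and applied. */
    synchronized boolean checkOnce() {
        M desired = desiredMode.get();
        if (desired == null || desired.equals(lastApplied)) {
            return false;
        }
        applyMode.accept(desired);
        lastApplied = desired;
        return true;
    }

    /** Schedules checkOnce() with a 60 second initial delay and a 30 second period. */
    ScheduledFuture<?> start(ScheduledExecutorService scheduler) {
        return scheduler.scheduleAtFixedRate(() -> {
            try {
                checkOnce();
            } catch (RuntimeException e) {
                // Swallow so one failed pass does not cancel later executions.
            }
        }, 60, 30, TimeUnit.SECONDS);
    }
}
```

A caller would pass something like Executors.newSingleThreadScheduledExecutor() to start(); with this polling approach a mode change made in the console takes effect within one period rather than immediately.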

Comment 6 Red Hat Bugzilla 2009-11-10 21:14:28 UTC
This bug was previously known as http://jira.rhq-project.org/browse/RHQ-672


