Bug 846623

Summary: Group Alert Definition with CLI Script Notification throws NullPointerException when Myself is set for User To Run The Script As
Product: [Other] RHQ Project
Reporter: Lukas Krejci <lkrejci>
Component: Resource Grouping, Alerts
Assignee: RHQ Project Maintainer <rhq-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.5
CC: hrupp, loleary
Target Milestone: ---   
Target Release: RHQ 4.5.0   
Hardware: All   
OS: All   
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 836388
Environment:
Last Closed: 2013-09-01 06:10:27 EDT
Type: Bug
Bug Depends On: 836388, 836390    
Bug Blocks: 853170, 853407    

Description Lukas Krejci 2012-08-08 05:34:04 EDT
+++ This bug was initially created as a clone of Bug #836388 +++

Created attachment 595146 [details]
Excerpt from server log showing complete stack trace

Description of problem:
When creating or editing an alert definition on a compatible group that contains a notification of type CLI Script, an error is displayed in the UI and a very large stack trace is logged (error.txt).

Version-Release number of selected component (if applicable):
JON 3.0.1

How reproducible:

Steps to Reproduce:
1.  Install and configure JON server and agent
2.  Start JON server and agent
3.  Add platform to inventory
4.  Create a compatible group of platforms
    1.  From **Inventory > Compatible Groups** select **New**

        *   **Name:** `Some Platforms`
        *   **Category:** **Platform**
        *   Add platform to groups **Assigned resources**
        *   Click **Finish**

5.  Add an alert definition to the compatible group with a CLI Notification
    1.  From **Inventory > Compatible Groups** select **Some Platforms**
    2.  Select the **Alerts** tab
    3.  Select the **Definitions** subtab
    4.  Click **New**

        *   **Name:** `Alert Definition 1`
        *   **Condition 1:**
                **Measurement Absolute Value Threshold**
                **Free Memory** 
                **< (Less than)** 
        *   **Notification 1:**
                **Notification Sender:** **CLI Script**
                **User To Run The Script As:**   **Myself**
                **Repository:** **JBoss Patches**
                Upload new script showMetricsData.js or any CLI script

    5.  Click **Save**

Actual results:
    JON UI Message Center contains the following message: 
        Alert definition creation failed
        	java.lang.RuntimeException:[1340919066677] org.rhq.enterprise.server.alert.AlertDefinitionCreationException:Could not create alert definition child for Resources [10001] with group AlertDefinition[ id=10042, name=Alert Definition 1 ] -> javax.ejb.EJBTransactionRolledbackException:null -> java.lang.NullPointerException:null
            --- STACK TRACE FOLLOWS ---
            [1340919066677] org.rhq.enterprise.server.alert.AlertDefinitionCreationException:Could not create alert definition child for Resources [10001] with group AlertDefinition[ id=10042, name=Alert Definition 1 ] -> javax.ejb.EJBTransactionRolledbackException:null -> java.lang.NullPointerException:null
               at Unknown.java_lang_RuntimeException_$RuntimeException__Ljava_lang_RuntimeException_2Ljava_lang_RuntimeException_2(Unknown source:0)
               at Unknown.com_google_gwt_user_client_rpc_core_java_lang_RuntimeException_1FieldSerializer_instantiate__Lcom_google_gwt_user_client_rpc_SerializationStreamReader_2Ljava_lang_RuntimeException_2(Unknown source:0)
               at Unknown.com_google_gwt_user_client_rpc_impl_SerializerBase$MethodMap_$instantiate__Lcom_google_gwt_user_client_rpc_impl_SerializerBase$MethodMap_2Lcom_google_gwt_user_client_rpc_SerializationStreamReader_2Ljava_lang_String_2Ljava_lang_Object_2(Unknown source:0)
               at Unknown.com_google_gwt_user_client_rpc_impl_SerializerBase_$instantiate__Lcom_google_gwt_user_client_rpc_impl_SerializerBase_2Lcom_google_gwt_user_client_rpc_SerializationStreamReader_2Ljava_lang_String_2Ljava_lang_Object_2(Unknown source:0)
               at Unknown.com_google_gwt_user_client_rpc_impl_AbstractSerializationStreamReader_$readObject__Lcom_google_gwt_user_client_rpc_impl_AbstractSerializationStreamReader_2Ljava_lang_Object_2(Unknown source:0)
               at Unknown.com_google_gwt_user_client_rpc_impl_RequestCallbackAdapter_$onResponseReceived__Lcom_google_gwt_user_client_rpc_impl_RequestCallbackAdapter_2Lcom_google_gwt_http_client_Request_2Lcom_google_gwt_http_client_Response_2V(Unknown source:0)
               at Unknown.org_rhq_enterprise_gui_coregui_client_util_rpc_TrackingRequestCallback_onResponseReceived__Lcom_google_gwt_http_client_Request_2Lcom_google_gwt_http_client_Response_2V(Unknown source:0)
               at Unknown.com_google_gwt_http_client_Request_$fireOnResponseReceived__Lcom_google_gwt_http_client_Request_2Lcom_google_gwt_http_client_RequestCallback_2V(Unknown source:0)
               at Unknown.com_google_gwt_http_client_RequestBuilder$1_onReadyStateChange__Lcom_google_gwt_xhr_client_XMLHttpRequest_2V(Unknown source:0)
               at Unknown.anonymous(Unknown source:0)
               at Unknown.com_google_gwt_core_client_impl_Impl_entry0__Ljava_lang_Object_2Ljava_lang_Object_2Ljava_lang_Object_2Ljava_lang_Object_2(Unknown source:0)
               at Unknown.anonymous(Unknown source:0)
               at Unknown.anonymous(Unknown source:0)

    Server log contains:
        org.rhq.enterprise.server.alert.AlertDefinitionCreationException: Could not create alert definition child for Resources [10001] with group AlertDefinition[ id=10042, name=Alert Definition 1 ]
            at org.rhq.enterprise.server.alert.GroupAlertDefinitionManagerBean.createGroupAlertDefinitions(GroupAlertDefinitionManagerBean.java:177)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        Caused by: javax.ejb.EJBTransactionRolledbackException
            at org.jboss.ejb3.tx.Ejb3TxPolicy.handleInCallerTx(Ejb3TxPolicy.java:87)
            at org.jboss.aspects.tx.TxPolicy.invokeInCallerTx(TxPolicy.java:130)
            at org.jboss.aspects.tx.TxInterceptor$Required.invoke(TxInterceptor.java:195)
            at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:84)
            at $Proxy218.checkAuthentication(Unknown Source)
            at org.rhq.enterprise.server.plugins.alertCli.CliSender.validateAndFinalizeConfiguration(CliSender.java:233)
            at org.rhq.enterprise.server.alert.AlertNotificationManagerBean.finalizeNotifications(AlertNotificationManagerBean.java:317)
            at sun.reflect.GeneratedMethodAccessor450.invoke(Unknown Source)
            at org.jboss.ejb3.stateless.StatelessLocalProxy.invoke(StatelessLocalProxy.java:84)
            at $Proxy464.finalizeNotifications(Unknown Source)
            at org.rhq.enterprise.server.alert.AlertDefinitionManagerBean.checkAlertDefinition(AlertDefinitionManagerBean.java:629)
            at org.rhq.enterprise.server.alert.AlertDefinitionManagerBean.createAlertDefinition(AlertDefinitionManagerBean.java:202)
            at $Proxy314.createAlertDefinition(Unknown Source)
            at org.rhq.enterprise.server.alert.GroupAlertDefinitionManagerBean.createGroupAlertDefinitions(GroupAlertDefinitionManagerBean.java:167)
            ... 125 more
        Caused by: java.lang.NullPointerException
            at org.rhq.enterprise.server.auth.SubjectManagerBean._checkAuthentication(SubjectManagerBean.java:372)
            at org.rhq.enterprise.server.auth.SubjectManagerBean.checkAuthentication(SubjectManagerBean.java:363)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Expected results:
No error is raised, and the CLI script is executed when the alert condition occurs.

Additional info:
This appears to be confined to alert definitions defined on a group. Defining the exact same alert definition on an individual resource works. Additionally, switching from **Myself** to entering my own user name and password makes the group alert definition work as well.

--- Additional comment from ccrouch@redhat.com on 2012-07-03 11:00:58 EDT ---

Dropping priority as per triage
Comment 1 Lukas Krejci 2012-08-08 07:03:45 EDT
A similar workflow also fails when creating an alert definition with a CLI script notification on templates.
Comment 2 Lukas Krejci 2012-08-08 07:43:40 EDT
master http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=581324eeb3d32f4d8879d4749423fbde0f76de8c
Author: Lukas Krejci <lkrejci@redhat.com>
Date:   Wed Aug 8 13:41:26 2012 +0200

    [BZ 846623] - When creating the "child" alert definitions of group or template alert definitions, pass the real user that creates the alert def and circumvent authz.
    This behaves exactly the same as before but instead of bypassing the authz by passing the overlord when creating the child alert,
    a new local SLSB method is used that doesn't perform the authz checks and can therefore receive the original user that requested the creation of the group/template
    alert def.
    This is good for the CLI alert sender that, when creating an alert script to be run as "myself", checks if the user creating the alert def is the same as the one set
    to run it.
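
The idea in this commit message can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `User`, `OVERLORD`, `validateRunAsMyself` and both `createChild…` methods are hypothetical names, not RHQ classes or methods, and the sketch does not reproduce the actual code path that threw the NullPointerException. It only shows why a "run as Myself" check cannot pass when the child definition is created on behalf of the overlord instead of the requesting user.

```java
public class RunAsMyselfSketch {

    /** Hypothetical stand-in for an authenticated user; not RHQ's real Subject class. */
    static final class User {
        final int id;
        final String name;

        User(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Hypothetical superuser that internal code passes to bypass authorization. */
    static final User OVERLORD = new User(1, "overlord");

    /** Sender-side check: a script set to run as "Myself" must be saved by that same user. */
    static boolean validateRunAsMyself(User creator, User runAs) {
        return creator != null && runAs != null && creator.id == runAs.id;
    }

    /** Old approach as described: the child definition is created on behalf of the overlord. */
    static boolean createChildAsOverlord(User requester) {
        return validateRunAsMyself(OVERLORD, requester);
    }

    /** New approach as described: the original requester is passed through unchanged. */
    static boolean createChildAsRequester(User requester) {
        return validateRunAsMyself(requester, requester);
    }

    public static void main(String[] args) {
        User alice = new User(42, "alice");
        System.out.println("created as overlord passes check:  " + createChildAsOverlord(alice));
        System.out.println("created as requester passes check: " + createChildAsRequester(alice));
    }
}
```

In the sketch the overlord variant fails the check and the requester variant passes it, which mirrors the reported difference between group and single-resource alert definitions.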
Comment 3 Lukas Krejci 2012-08-29 09:45:20 EDT
Reopening, the fix is not complete at all.

See https://bugzilla.redhat.com/show_bug.cgi?id=836388#c5 for details of one of the use cases that didn't get covered.

Investigation is ongoing on what other use cases could trigger the erroneous behavior described by this bug. I will list the use cases in here once I am done with that.
Comment 4 Lukas Krejci 2012-08-30 12:13:55 EDT
The partial fix that was applied by commit http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=581324eeb3d32f4d8879d4749423fbde0f76de8c opened up new areas where this bug manifests itself:

1) Updating any data on an alert definition with a CLI script notification set to run as a different user than the currently authenticated one. This applies to all resource, group and template alert definitions. (I.e. create an alert def with a CLI script notification set to run as "Myself", save, log out, log in as a different user, and try to edit the alert def.)
2) Adding a member to a group that has an alert definition with a CLI script notification (applies to both "simple" compatible groups and compatible groups generated by a dynagroup expression).
3) Discovering a new resource of a type that has a template alert definition with a CLI script notification.
Comment 5 Lukas Krejci 2012-08-30 12:21:13 EDT
Note that I raised a separate bug, 853170, for issue 1) from the list above.
Comment 6 Lukas Krejci 2012-08-31 08:43:02 EDT
*** Bug 836390 has been marked as a duplicate of this bug. ***
Comment 7 Lukas Krejci 2012-09-13 05:58:11 EDT
master http://git.fedorahosted.org/cgit/rhq/rhq.git/diff/?id=e95b37895324937307723862c6ea229c2f91f375
Author: Lukas Krejci <lkrejci@redhat.com>
Date:   Thu Sep 13 10:10:24 2012 +0200

    [BZ 846623] - Finishing the fix:
    1) Make sure the UI doesn't contain stale data after an update of an alert def.
    2) Provide a "copy-creation" of alert definition used for syncing the defs
       on a resource with the corresponding group and template alert defs
       (this is done by adding a "validateNotificationConfiguration" boolean to
       the AlertDefinitionManagerLocal.createAlertDefinition()).
    3) A new method for updating dependent alert defs (similar to previously
       added method for creating dependent alert defs)
    4) Make sure conditions and notifications are loaded before leaving the update methods
       so that lazy load exceptions don't occur.
    5) Only perform validation and finalization on new or changed notifications.
    6) Marking several methods that supported the old JSF UI and are unused
       as deprecated.
    7) new integration tests to check the behavior of alert senders that modify
       the notification configuration.
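
Points 2) and 5) of this commit message can be sketched roughly as follows. This is an illustrative assumption, not the committed code: the `Notification` class, the `toValidate` method and the use of id `0` for "not yet persisted" are hypothetical, and only the name of the `validateNotificationConfiguration` flag is taken from the message above. The sketch picks out the notifications that would need validation: none for a "copy-creation", otherwise only those that are new or whose configuration changed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SelectiveValidationSketch {

    /** Hypothetical notification: an id (0 = not yet persisted) plus its configuration. */
    static final class Notification {
        final int id;
        final Map<String, String> config;

        Notification(int id, Map<String, String> config) {
            this.id = id;
            this.config = config;
        }
    }

    /** Returns the notifications that should go through validation and finalization. */
    static List<Notification> toValidate(List<Notification> incoming,
                                         Map<Integer, Notification> stored,
                                         boolean validateNotificationConfiguration) {
        List<Notification> result = new ArrayList<>();
        if (!validateNotificationConfiguration) {
            return result; // "copy-creation": trust the already validated source definition
        }
        for (Notification n : incoming) {
            Notification old = stored.get(n.id);
            // New notification, or configuration differs from the stored copy.
            if (old == null || !old.config.equals(n.config)) {
                result.add(n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Notification> stored = new HashMap<>();
        stored.put(1, new Notification(1, Map.of("runAs", "alice")));

        List<Notification> incoming = List.of(
            new Notification(1, Map.of("runAs", "alice")), // unchanged, skipped
            new Notification(0, Map.of("runAs", "bob")));  // new, validated

        System.out.println("validated on update: " + toValidate(incoming, stored, true).size());
        System.out.println("validated on copy:   " + toValidate(incoming, stored, false).size());
    }
}
```

Skipping unchanged notifications means a user who edits an unrelated field is not re-checked against a "run as" setting that another user saved earlier, which is the situation described in comment 4.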
Comment 8 Heiko W. Rupp 2013-09-01 06:10:27 EDT
Bulk closing of items that are on_qa and in old RHQ releases, which have been out for a long time and where the issue has not been re-opened since.