Description of problem: Version-Release number of selected component (if applicable): How reproducible: Often Steps to Reproduce: 1) Unzip JON 3.2.0 installer jar xf ~/Downloads/jon-server-3.2.0.GA.zip 2) Make all *.sh script and rhqctl scripts executable cd $JON_SEVER_HOME find . -name "*.sh"| xargs chmod u+x $1 chmod u+x rhqctl 3) start the installer $ ./rhqctl install 14:53:48,334 INFO [org.jboss.modules] JBoss Modules version 1.2.2.Final-redhat-1 The [jboss.bind.address] property is required but not set in [rhq-server.properties]. Do you want to set [jboss.bind.address] value now? yes|no: yes jboss.bind.address: 10.33.23.152 Is [10.33.23.152] correct? yes|no: yes 4) After the text "Still waiting for server to initialize..." shows up once or twice 14:55:31,920 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] Installing RHQ EAR subsystem 14:55:35,001 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 5) ctrl-C ^C14:55:37,549 WARN [org.rhq.server.control.command.Install] UNDO: Removing server-installed marker file and management user 14:55:37,550 WARN [org.rhq.server.control.command.Install] UNDO: Stopping component: --server Trying to stop the RHQ Server... RHQ Server (pid=14249) is stopping... RHQ Server has stopped. 14:55:54,731 WARN [org.rhq.server.control.command.Install] UNDO: Stopping component: --storage Stopping RHQ storage node... RHQ storage node (pid=14083) is stopping... RHQ storage node has stopped 14:55:58,767 WARN [org.rhq.server.control.command.Install] UNDO: Removing storage node data and install directories 14:55:59,172 WARN [org.rhq.server.control.command.Install] UNDO: Reverting server properties file NOTE: The first time the issue occured, I saw the following console logs before running the ctl-c command ---------- 16:42:14,144 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] Installing RHQ EAR subsystem 16:42:29,287 INFO [org.rhq.server.control.command.Install] It has been over [2] minutes - you may want to ensure your server initialization is proceeding as expected. You can check the log at [/home/lakagwu/jboss/jon-server-3.2.0.GA/logs/server.log]. 16:42:59,685 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 16:43:39,063 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 16:44:13,301 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... ^C16:44:25,635 WARN [org.rhq.server.control.command.Install] UNDO: Removing server-installed marker file and management user 16:44:25,691 WARN [org.rhq.server.control.command.Install] UNDO: Stopping component: --server Trying to stop the RHQ Server... RHQ Server (pid=27390) is stopping... ---------- 6) start the server again, and the install is never successful 14:57:17,128 INFO [org.rhq.server.control.command.Install] The RHQ Server must be started to complete its installation. Starting the RHQ server in preparation of running the server installer... 14:57:17,133 INFO [org.rhq.server.control.command.Install] Waiting for the RHQ Server to start in preparation of running the server installer... Trying to start the RHQ Server... RHQ Server (pid 15413 ) IS starting 14:57:23,817 INFO [org.jboss.modules] JBoss Modules version 1.2.2.Final-redhat-1 14:57:24,456 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] The server is preconfigured and ready for auto-install. 14:57:24,521 INFO [org.xnio] XNIO Version 3.0.7.GA-redhat-1 14:57:24,530 INFO [org.xnio.nio] XNIO NIO Implementation Version 3.0.7.GA-redhat-1 14:57:24,541 INFO [org.jboss.remoting] JBoss Remoting version 3.2.16.GA-redhat-1 14:57:30,480 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] Installing into app server version [7.2.1.Final-redhat-10] 14:57:30,562 INFO [org.rhq.enterprise.server.installer.Installer] The installer has already been told to perform its work. The server should be ready soon. 14:57:30,581 INFO [org.rhq.server.control.command.Install] The RHQ Server is ready to be upgraded by the server installer. 14:57:30,581 INFO [org.rhq.server.control.command.Install] Installing RHQ server 14:57:30,582 INFO [org.rhq.server.control.command.Install] The server installer is running 14:57:30,662 INFO [org.xnio] XNIO Version 3.0.7.GA-redhat-1 14:57:30,669 INFO [org.xnio.nio] XNIO NIO Implementation Version 3.0.7.GA-redhat-1 14:57:30,677 INFO [org.jboss.remoting] JBoss Remoting version 3.2.16.GA-redhat-1 14:57:31,011 INFO [org.jboss.modules] JBoss Modules version 1.2.2.Final-redhat-1 14:57:31,084 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] The server is preconfigured and ready for auto-install. 14:57:31,130 INFO [org.xnio] XNIO Version 3.0.7.GA-redhat-1 14:57:31,136 INFO [org.xnio.nio] XNIO NIO Implementation Version 3.0.7.GA-redhat-1 14:57:31,142 INFO [org.jboss.remoting] JBoss Remoting version 3.2.16.GA-redhat-1 14:57:36,565 INFO [org.rhq.enterprise.server.installer.InstallerServiceImpl] Installing into app server version [7.2.1.Final-redhat-10] 14:57:36,606 INFO [org.rhq.enterprise.server.installer.Installer] The installer has already been told to perform its work. The server should be ready soon. 14:58:01,180 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 14:58:31,507 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 14:59:01,697 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 14:59:31,957 INFO [org.rhq.server.control.command.Install] It has been over [2] minutes - you may want to ensure your server initialization is proceeding as expected. You can check the log at [/home/lakagwu/jboss/jon-server-3.2.0.GA/logs/server.log]. 15:00:02,172 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:00:32,522 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:01:02,861 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:01:33,016 INFO [org.rhq.server.control.command.Install] It has been over [2] minutes - you may want to ensure your server initialization is proceeding as expected. You can check the log at [/home/lakagwu/jboss/jon-server-3.2.0.GA/logs/server.log]. 15:02:03,121 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:02:33,206 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:03:03,387 INFO [org.rhq.server.control.command.Install] Still waiting for server to initialize... 15:03:33,483 INFO [org.rhq.server.control.command.Install] It has been over [2] minutes - you may want to ensure your server initialization is proceeding as expected. You can check the log at [/home/lakagwu/jboss/jon-server-3.2.0.GA/logs/server.log]. 7) Thread dump shows thread is stuck in AbstractInstall.waitForRHQServerToInitialize(AbstractInstall.java:187) "main" prio=10 tid=0x00007fb950007800 nid=0x3b5c waiting on condition [0x00007fb955138000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.rhq.server.control.command.AbstractInstall.waitForRHQServerToInitialize(AbstractInstall.java:187) at org.rhq.server.control.command.Install.exec(Install.java:166) at org.rhq.server.control.ControlCommand.exec(ControlCommand.java:153) at org.rhq.server.control.RHQControl.exec(RHQControl.java:87) at org.rhq.server.control.RHQControl.main(RHQControl.java:307) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.jboss.modules.Module.run(Module.java:270) at org.jboss.modules.Main.main(Main.java:411) The only remedy is to delete the rhn-data directory and reinstall the again. Actual results: Expected results: Additional info:
The first time the installer was interrupted, the following entry was captured in the EAP 6 logs file --------- 14:55:41,158 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.deployment.unit."rhq.ear".STRUCTURE: org.jboss.msc.service.StartException in service jboss.deployment.unit."rhq.ear".STRUCTURE: JBAS018733: Failed to process phase STRUCTURE of deployment "rhq.ear" at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:127) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_27] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_27] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_27] Caused by: java.lang.IllegalStateException: Container is down at org.jboss.msc.service.ServiceContainerImpl.install(ServiceContainerImpl.java:509) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceTargetImpl.install(ServiceTargetImpl.java:201) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$ChildServiceTarget.install(ServiceControllerImpl.java:2228) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceTargetImpl.install(ServiceTargetImpl.java:201) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$ChildServiceTarget.install(ServiceControllerImpl.java:2228) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceBuilderImpl.install(ServiceBuilderImpl.java:307) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.as.server.deployment.SubDeploymentProcessor.deploy(SubDeploymentProcessor.java:67) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:120) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] ... 5 more 14:55:41,161 INFO [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment rhq.ear (runtime-name: rhq.ear) in 2572ms --------- On restart, the following was captured in the EAP 6 server.log repeatedly. But the log level needs to be set to DEBUG for the root logger in ../jbossas/standalone/configuration/standalone-full.xml to see this. ----- 15:29:12,121 DEBUG [org.jboss.as.controller.management-operation] (management-handler-thread - 3) JBAS014616: Operation ("read-resource") failed - address: ([("deployment" => "rhq.ear")]) - failure description: "JBAS014807: Management resource '[(\"deployment\" => \"rhq.ear\")]' not found" -----
Something is really strange. Usually, the only time you get those repeating "waiting for server to initialize" is when the server is misconfigured and can't start fully. Might have to ask that you rerun this again on your machine and then attach the full logs (all logs, server and install, would be helpful, but certainly need the server logs). If this is solved by deleting the rhq-data directory, that is not the RHQ Server's directory - that is the storage node directory. So there might be a problem with the storage node starting up which then, in turn, is causing the server to fail to start.
(In reply to John Mazzitelli from comment #3) > Might have to ask that you rerun this again on your machine and then attach > the full logs (all logs, server and install, would be helpful, but certainly > need the server logs). Note: I think the storage node logs is going to be needed to - I suspect the storage node is having a problem.
hmm... ok, now I'm seeing it by doing this: 1) rhqctl install 2) control-c when its trying to deploy the ear 3) rhqctl install Not sure why - the second attempt never even attempted to deploy the ear. I think the installer was canceled in just the right spot to put the server in a bad state.
git commit to master: a329de9 I think this is a good fix. I'll ask Jay to peer review since he knows about the marker file that this fix now uses to determine if the installer aborted.
jay peer-reviewed this fix - it appears this should be good. git commit to release/jon3.2.x: 2f11ae20faad1e26301c2a0a5fbc63ed8ffcf9cb
Moving to ON_QA as available to test with brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993
# VERIFIED I did tried twice to stop the installer (ctrl+C) every time in different seconds passed during install of EAR subsystem - at the end I did get the system installed correctly however. I consider the problem gone. No such errors in logs either. version === 13:48:57,112 INFO [SystemInfoManager] (http-/0.0.0.0:7080-3) SystemInformation: ******** ACTIVE_DRIFT_PLUGIN: [drift-jpa] AGENT_MAX_QUIET_TIME_ALLOWED: [300000] ALERT_PURGE: [2678400000] AS config dir: [/root/jon-server-3.3.0.ER02/jbossas/standalone/configuration] AS product name: [EAP] AS product version: [6.3.0.GA] AS version: [7.4.0.Final-redhat-19] AVAILABILITY_PURGE: [31536000000] Agent ibm-x3550m3-11.lab.eng.brq.redhat.com: [Agent[id=10001,name=ibm-x3550m3-11.lab.eng.brq.redhat.com,address=10.34.36.137,port=16163,remote-endpoint=socket://10.34.36.137:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200,last-availability-ping=1410868100169,last-availability-report=1410867951485]] AlertCount: [0] AlertDefinitionCount: [8] BuildNumber: [4fbb183:7da54e2] CAM_BASELINE_DATASET: [604800000] CAM_BASELINE_FREQUENCY: [259200000] CAM_BASE_URL: [http://ibm-x3550m3-11.lab.eng.brq.redhat.com:7080/] CAM_DATA_MAINTENANCE: [3600000] CAM_DATA_PURGE_1D: [31536000000] CAM_DATA_PURGE_1H: [1209600000] CAM_DATA_PURGE_6H: [2678400000] CAM_GUIDE_ENABLED: [true] CAM_HELP_PASSWORD: [- non null -] CAM_HELP_USER: [web] CAM_JAAS_PROVIDER: [false] CAM_LDAP_BASE_DN: [o=JBoss,c=US] CAM_LDAP_BIND_DN: [] CAM_LDAP_BIND_PW: [- non null -] CAM_LDAP_FILTER: [] CAM_LDAP_FOLLOW_REFERRALS: [false] CAM_LDAP_LOGIN_PROPERTY: [cn] CAM_LDAP_NAMING_FACTORY_INITIAL: [com.sun.jndi.ldap.LdapCtxFactory] CAM_LDAP_NAMING_PROVIDER_URL: [ldap://localhost/] CAM_LDAP_PROTOCOL: [false] CAM_RT_COLLECT_IP_ADDRS: [true] CAM_SYSLOG_ACTIONS_ENABLED: [false] DATABASE_CONNECTION_URL: [jdbc:postgresql://127.0.0.1:5432/rhq?loginTimeout=0&socketTimeout=0&prepareThreshold=5&unknownLength=2147483647&loglevel=0&tcpkeepalive=false&binaryTransfer=true] DATABASE_DRIVER_NAME: [PostgreSQL Native Driver] DATABASE_DRIVER_VERSION: [PostgreSQL 9.2 JDBC4 (build 1002)] DATABASE_PRODUCT_NAME: [PostgreSQL] DATABASE_PRODUCT_VERSION: [8.4.18] DATA_REINDEX_NIGHTLY: [false] DB_SCHEMA_VERSION: [2.159] DRIFT_FILE_PURGE: [2678400000] ENABLE_AGENT_AUTO_UPDATE: [true] ENABLE_LOGIN_WITHOUT_ROLES: [false] EVENT_PURGE: [1209600000] FullName: [JBoss Operations Network] Name: [JBoss ON] OPERATION_HISTORY_PURGE: [0] PlatformCount: [1] RESOURCE_GENERIC_PROPERTIES_UPGRADE: [false] RHQ_SESSION_TIMEOUT: [3600000] RT_DATA_PURGE: [2678400000] SERVER_HOME_DIR: [/root/jon-server-3.3.0.ER02/jbossas/standalone] SERVER_IDENTITY: [ibm-x3550m3-11.lab.eng.brq.redhat.com] SERVER_INSTALL_DIR: [/root/jon-server-3.3.0.ER02] SERVER_LOCAL_TIME: [16 September 2014 13:48:56 o'clock CEST] SERVER_TIMEZONE: [Central European Time] SERVER_VERSION: [4.12.0.JON330ER02] SchedulesPerMinute: [66] ServerCount: [5] ServiceCount: [292] Storage_Node ibm-x3550m3-11.lab.eng.brq.redhat.com: [storageNode.addresss=ibm-x3550m3-11.lab.eng.brq.redhat.com, hostname=ibm-x3550m3-11.lab.eng.brq.redhat.com, beginTime=1410839337053, beginTime=1410839337053, unackAlerts=0, heapUsed=null, heapPercentageUsed=Min: 0.08552215835960973, Max: 0.22246770424190973, Avg: 0.1450845587534133 (%), load=null, dataUsedPercentage=null, dataDiskUsed=null, tokens=null, actuallyOwns=null] TRAIT_PURGE: [31536000000] Version: [3.3.0.ER02]