Bug 1045496
Summary: | JON 3.2.0 install fails if the installer was terminated previously during the installation of the RHQ EAR subsystem | ||
---|---|---|---|
Product: | [JBoss] JBoss Operations Network | Reporter: | Lami Akagwu <lakagwu> |
Component: | Core Server | Assignee: | John Mazzitelli <mazz> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Garik Khachikyan <gkhachik> |
Severity: | urgent | Docs Contact: | |
Priority: | unspecified | ||
Version: | JON 3.2 | CC: | gkhachik, mazz, mkoci, myarboro |
Target Milestone: | DR01 | ||
Target Release: | JON 3.3.0 | ||
Hardware: | All | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-12-11 14:04:52 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Lami Akagwu
2013-12-20 15:19:12 UTC
The first time the installer was interrupted, the following entry was captured in the EAP 6 logs file --------- 14:55:41,158 ERROR [org.jboss.msc.service.fail] (MSC service thread 1-1) MSC000001: Failed to start service jboss.deployment.unit."rhq.ear".STRUCTURE: org.jboss.msc.service.StartException in service jboss.deployment.unit."rhq.ear".STRUCTURE: JBAS018733: Failed to process phase STRUCTURE of deployment "rhq.ear" at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:127) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:1811) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1746) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_27] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_27] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_27] Caused by: java.lang.IllegalStateException: Container is down at org.jboss.msc.service.ServiceContainerImpl.install(ServiceContainerImpl.java:509) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceTargetImpl.install(ServiceTargetImpl.java:201) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$ChildServiceTarget.install(ServiceControllerImpl.java:2228) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceTargetImpl.install(ServiceTargetImpl.java:201) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceControllerImpl$ChildServiceTarget.install(ServiceControllerImpl.java:2228) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.msc.service.ServiceBuilderImpl.install(ServiceBuilderImpl.java:307) [jboss-msc-1.0.4.GA-redhat-1.jar:1.0.4.GA-redhat-1] at org.jboss.as.server.deployment.SubDeploymentProcessor.deploy(SubDeploymentProcessor.java:67) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] at org.jboss.as.server.deployment.DeploymentUnitPhaseService.start(DeploymentUnitPhaseService.java:120) [jboss-as-server-7.2.1.Final-redhat-10.jar:7.2.1.Final-redhat-10] ... 5 more 14:55:41,161 INFO [org.jboss.as.server.deployment] (MSC service thread 1-6) JBAS015877: Stopped deployment rhq.ear (runtime-name: rhq.ear) in 2572ms --------- On restart, the following was captured in the EAP 6 server.log repeatedly. But the log level needs to be set to DEBUG for the root logger in ../jbossas/standalone/configuration/standalone-full.xml to see this. ----- 15:29:12,121 DEBUG [org.jboss.as.controller.management-operation] (management-handler-thread - 3) JBAS014616: Operation ("read-resource") failed - address: ([("deployment" => "rhq.ear")]) - failure description: "JBAS014807: Management resource '[(\"deployment\" => \"rhq.ear\")]' not found" ----- Something is really strange. Usually, the only time you get those repeating "waiting for server to initialize" is when the server is misconfigured and can't start fully. Might have to ask that you rerun this again on your machine and then attach the full logs (all logs, server and install, would be helpful, but certainly need the server logs). If this is solved by deleting the rhq-data directory, that is not the RHQ Server's directory - that is the storage node directory. So there might be a problem with the storage node starting up which then, in turn, is causing the server to fail to start. (In reply to John Mazzitelli from comment #3) > Might have to ask that you rerun this again on your machine and then attach > the full logs (all logs, server and install, would be helpful, but certainly > need the server logs). Note: I think the storage node logs is going to be needed to - I suspect the storage node is having a problem. hmm... ok, now I'm seeing it by doing this: 1) rhqctl install 2) control-c when its trying to deploy the ear 3) rhqctl install Not sure why - the second attempt never even attempted to deploy the ear. I think the installer was canceled in just the right spot to put the server in a bad state. git commit to master: a329de9 I think this is a good fix. I'll ask Jay to peer review since he knows about the marker file that this fix now uses to determine if the installer aborted. jay peer-reviewed this fix - it appears this should be good. git commit to release/jon3.2.x: 2f11ae20faad1e26301c2a0a5fbc63ed8ffcf9cb Moving to ON_QA as available to test with brew build of DR01: https://brewweb.devel.redhat.com//buildinfo?buildID=373993 # VERIFIED I did tried twice to stop the installer (ctrl+C) every time in different seconds passed during install of EAR subsystem - at the end I did get the system installed correctly however. I consider the problem gone. No such errors in logs either. version === 13:48:57,112 INFO [SystemInfoManager] (http-/0.0.0.0:7080-3) SystemInformation: ******** ACTIVE_DRIFT_PLUGIN: [drift-jpa] AGENT_MAX_QUIET_TIME_ALLOWED: [300000] ALERT_PURGE: [2678400000] AS config dir: [/root/jon-server-3.3.0.ER02/jbossas/standalone/configuration] AS product name: [EAP] AS product version: [6.3.0.GA] AS version: [7.4.0.Final-redhat-19] AVAILABILITY_PURGE: [31536000000] Agent ibm-x3550m3-11.lab.eng.brq.redhat.com: [Agent[id=10001,name=ibm-x3550m3-11.lab.eng.brq.redhat.com,address=10.34.36.137,port=16163,remote-endpoint=socket://10.34.36.137:16163/?rhq.communications.connector.rhqtype=agent&numAcceptThreads=1&maxPoolSize=303&clientMaxPoolSize=304&socketTimeout=60000&enableTcpNoDelay=true&backlog=200,last-availability-ping=1410868100169,last-availability-report=1410867951485]] AlertCount: [0] AlertDefinitionCount: [8] BuildNumber: [4fbb183:7da54e2] CAM_BASELINE_DATASET: [604800000] CAM_BASELINE_FREQUENCY: [259200000] CAM_BASE_URL: [http://ibm-x3550m3-11.lab.eng.brq.redhat.com:7080/] CAM_DATA_MAINTENANCE: [3600000] CAM_DATA_PURGE_1D: [31536000000] CAM_DATA_PURGE_1H: [1209600000] CAM_DATA_PURGE_6H: [2678400000] CAM_GUIDE_ENABLED: [true] CAM_HELP_PASSWORD: [- non null -] CAM_HELP_USER: [web] CAM_JAAS_PROVIDER: [false] CAM_LDAP_BASE_DN: [o=JBoss,c=US] CAM_LDAP_BIND_DN: [] CAM_LDAP_BIND_PW: [- non null -] CAM_LDAP_FILTER: [] CAM_LDAP_FOLLOW_REFERRALS: [false] CAM_LDAP_LOGIN_PROPERTY: [cn] CAM_LDAP_NAMING_FACTORY_INITIAL: [com.sun.jndi.ldap.LdapCtxFactory] CAM_LDAP_NAMING_PROVIDER_URL: [ldap://localhost/] CAM_LDAP_PROTOCOL: [false] CAM_RT_COLLECT_IP_ADDRS: [true] CAM_SYSLOG_ACTIONS_ENABLED: [false] DATABASE_CONNECTION_URL: [jdbc:postgresql://127.0.0.1:5432/rhq?loginTimeout=0&socketTimeout=0&prepareThreshold=5&unknownLength=2147483647&loglevel=0&tcpkeepalive=false&binaryTransfer=true] DATABASE_DRIVER_NAME: [PostgreSQL Native Driver] DATABASE_DRIVER_VERSION: [PostgreSQL 9.2 JDBC4 (build 1002)] DATABASE_PRODUCT_NAME: [PostgreSQL] DATABASE_PRODUCT_VERSION: [8.4.18] DATA_REINDEX_NIGHTLY: [false] DB_SCHEMA_VERSION: [2.159] DRIFT_FILE_PURGE: [2678400000] ENABLE_AGENT_AUTO_UPDATE: [true] ENABLE_LOGIN_WITHOUT_ROLES: [false] EVENT_PURGE: [1209600000] FullName: [JBoss Operations Network] Name: [JBoss ON] OPERATION_HISTORY_PURGE: [0] PlatformCount: [1] RESOURCE_GENERIC_PROPERTIES_UPGRADE: [false] RHQ_SESSION_TIMEOUT: [3600000] RT_DATA_PURGE: [2678400000] SERVER_HOME_DIR: [/root/jon-server-3.3.0.ER02/jbossas/standalone] SERVER_IDENTITY: [ibm-x3550m3-11.lab.eng.brq.redhat.com] SERVER_INSTALL_DIR: [/root/jon-server-3.3.0.ER02] SERVER_LOCAL_TIME: [16 September 2014 13:48:56 o'clock CEST] SERVER_TIMEZONE: [Central European Time] SERVER_VERSION: [4.12.0.JON330ER02] SchedulesPerMinute: [66] ServerCount: [5] ServiceCount: [292] Storage_Node ibm-x3550m3-11.lab.eng.brq.redhat.com: [storageNode.addresss=ibm-x3550m3-11.lab.eng.brq.redhat.com, hostname=ibm-x3550m3-11.lab.eng.brq.redhat.com, beginTime=1410839337053, beginTime=1410839337053, unackAlerts=0, heapUsed=null, heapPercentageUsed=Min: 0.08552215835960973, Max: 0.22246770424190973, Avg: 0.1450845587534133 (%), load=null, dataUsedPercentage=null, dataDiskUsed=null, tokens=null, actuallyOwns=null] TRAIT_PURGE: [31536000000] Version: [3.3.0.ER02] |