Bug 1155822
Summary: | rhqctl reports storage node as running when it is not due to an empty or corrupt PID file
---|---
Product: | [JBoss] JBoss Operations Network
Component: | Core Server, Launch Scripts
Reporter: | Larry O'Leary <loleary>
Assignee: | Michael Burman <miburman>
QA Contact: | Armine Hovsepyan <ahovsepy>
CC: | ahovsepy, lzoubek, mfoley, miburman, mshirley
Status: | CLOSED CURRENTRELEASE
Severity: | high
Priority: | unspecified
Version: | JON 3.2.1
Target Milestone: | ER01
Target Release: | JON 3.3.1
Hardware: | Unspecified
OS: | Unspecified
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2015-02-27 19:58:24 UTC
Description
Larry O'Leary
2014-10-22 22:58:34 UTC
This was already fixed in BZ 980076. From current master:

[michael@miranda bin]$ cat ../rhq-storage/bin/cassandra.pid
cat: ../rhq-storage/bin/cassandra.pid: No such file or directory
[michael@miranda bin]$ touch ../rhq-storage/bin/cassandra.pid
[michael@miranda bin]$ ./rhqctl status
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128M; support was removed in 8.0
10:52:38,757 INFO [org.jboss.modules] JBoss Modules version 1.3.0.Final-redhat-2
RHQ Storage Node (no pid file) is ✘down
RHQ Server (no pid file) is ✘down
JBossAS Java VM child process (no pid file) is ✘down
RHQ Agent (no pid file) is ✘down
[michael@miranda bin]$ ./rhqctl start --storage
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128M; support was removed in 8.0
10:56:26,132 INFO [org.jboss.modules] JBoss Modules version 1.3.0.Final-redhat-2
INFO 10:56:26,497 Logging initialized
[michael@miranda bin]$ ./rhqctl status
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128M; support was removed in 8.0
10:56:35,680 INFO [org.jboss.modules] JBoss Modules version 1.3.0.Final-redhat-2
RHQ Storage Node (pid 5898 ) is ✔running
RHQ Server (no pid file) is ✘down
JBossAS Java VM child process (no pid file) is ✘down
RHQ Agent (no pid file) is ✘down
[michael@miranda bin]$ kill -9 5898
[michael@miranda bin]$ ./rhqctl status
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128M; support was removed in 8.0
10:56:40,518 INFO [org.jboss.modules] JBoss Modules version 1.3.0.Final-redhat-2
RHQ Storage Node (no pid file) is ✘down
RHQ Server (no pid file) is ✘down
JBossAS Java VM child process (no pid file) is ✘down
RHQ Agent (no pid file) is ✘down
[michael@miranda bin]$

Mike, could you please double check this. I looked at the BZ you reference and its commit was already in 3.2.1 which is where this bug is being reported. As such, it still exists even with the fix you mention for BZ 980076.
Hi, I retested on another machine and can see a difference between Fedora 21 and RHEL7. My previous test was on Fedora 21, where the bug does not show up, while on RHEL7 the output is this:

[hudson@miburman-jon33 bin]$ ./rhqctl status
17:39:04,947 INFO [org.jboss.modules] JBoss Modules version 1.3.3.Final-redhat-1
RHQ Storage Node (pid 12019 ) is ✔running
RHQ Server (pid 13787 ) is ✔running
JBossAS Java VM child process (pid 13906 ) is ✔running
RHQ Agent (pid 14818 ) is ✔running
[hudson@miburman-jon33 bin]$ kill -9 12019
[hudson@miburman-jon33 bin]$ cat ../rhq-storage/bin/cassandra.pid
12019[hudson@miburman-jon33 bin]$ ./rhqctl status
17:39:32,742 INFO [org.jboss.modules] JBoss Modules version 1.3.3.Final-redhat-1
RHQ Storage Node (no pid file) is ✘down
RHQ Server (pid 13787 ) is ✔running
JBossAS Java VM child process (pid 13906 ) is ✔running
RHQ Agent (pid 14818 ) is ✔running
[hudson@miburman-jon33 bin]$ rm -f ../rhq-storage/bin/cassandra.pid
[hudson@miburman-jon33 bin]$ touch ../rhq-storage/bin/cassandra.pid
[hudson@miburman-jon33 bin]$ ./rhqctl status
17:40:51,219 INFO [org.jboss.modules] JBoss Modules version 1.3.3.Final-redhat-1
RHQ Storage Node (pid ) is ✔running
RHQ Server (pid 13787 ) is ✔running
JBossAS Java VM child process (pid 13906 ) is ✔running
RHQ Agent (pid 14818 ) is ✔running
[hudson@miburman-jon33 bin]$

The return code of kill -0 without a PID argument also differs between RHEL and Fedora:

RHEL7:
[hudson@miburman-jon33 bin]$ kill -0
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[hudson@miburman-jon33 bin]$ echo $?
1
[hudson@miburman-jon33 bin]$

Fedora 21:
[michael@miranda bin]$ kill -0
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
[michael@miranda bin]$ echo $?
2
[michael@miranda bin]$

So yes, it seems this is a platform-dependent bug.
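The sessions above suggest that a status check becomes unreliable once kill -0 is handed an empty argument, since its exit status then differs by platform. A minimal shell sketch of a more defensive check follows. It is not the rhqctl code; the function names is_numeric_pid and pid_file_status and the status strings are hypothetical, chosen only for illustration. The idea is to validate the PID file contents before ever calling kill -0.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from rhqctl.

# Succeeds only for a non-empty string of decimal digits with no leading zero.
is_numeric_pid() {
    case "$1" in
        ''|*[!0-9]*|0*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints one of: "no pid file", "invalid pid file", "running", "down".
pid_file_status() {
    local pid_file="$1" pid
    if [ ! -f "$pid_file" ]; then
        echo "no pid file"
        return
    fi
    # Strip whitespace so a file without a trailing newline still parses.
    pid="$(tr -d '[:space:]' < "$pid_file")"
    if ! is_numeric_pid "$pid"; then
        # Empty or corrupt file: never reaches kill -0.
        echo "invalid pid file"
        return
    fi
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    # Caveat: it also fails for a live process owned by another user.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "down"
    fi
}
```

With an empty file such as the one created by touch in the RHEL7 session, this sketch would print "invalid pid file" instead of depending on whatever exit status kill -0 returns for a missing argument.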
Changed on master:

commit 58cdc87e9178ec73e277cbbc3c80d3b9d3516181
Author: Michael Burman <miburman>
Date: Thu Nov 27 14:31:12 2014 +0200

[BZ 1155822] Validate that pid is a numeric value

branch: release/jon3.3.x
link: https://github.com/rhq-project/rhq/commit/8de7831d8
time: 2015-01-08 23:44:47 +0100
commit: 8de7831d880d440ec371c2a37fc518b32bac89d5
author: Michael Burman - miburman
message: [BZ 1155822] Validate that pid is a numeric value
(cherry picked from commit 58cdc87e9178ec73e277cbbc3c80d3b9d3516181)
Signed-off-by: Libor Zoubek <lzoubek>

Moving to ON_QA as available for test with the latest 3.3.1.ER01 bits from here:
http://download.devel.redhat.com/brewroot/packages/org.jboss.on-jboss-on-parent/3.3.0.GA/12/maven/org/jboss/on/jon-server-patch/3.3.0.GA/jon-server-patch-3.3.0.GA.zip

Created attachment 984763: fed20_status.log
Created attachment 984764: rhel6-status.log
Created attachment 984765: rhel7_status.log

Verified on RHEL 6, RHEL 7 and Fedora 20. Logs attached.