Bug 1033667

Summary: rhqctl upgrade of agent across file system boundaries causes agent upgrade to fail
Product: [Other] RHQ Project
Reporter: John Mazzitelli <mazz>
Component: Installer
Assignee: John Mazzitelli <mazz>
Status: CLOSED CURRENTRELEASE
QA Contact: Mike Foley <mfoley>
Severity: unspecified
Priority: unspecified
Version: 4.9
CC: hrupp
Target Milestone: GA   
Target Release: RHQ 4.10   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Last Closed: 2014-04-23 12:31:55 UTC
Type: Bug

Description John Mazzitelli 2013-11-22 15:01:56 UTC
If you have an old RHQ server install on one file system (say /opt) and you unzip a new server distro upgrade to another file system (say /home, which is on a different partition), the agent upgrade will fail.

This happens because a true "mv" is not possible across partitions: when we detect that the move crosses a partition boundary, we do a "copy" instead. That copy carries over only the file names and content; it does not retain file permissions, so the execute bit is lost. When the rhqctl upgrader then tries to restart the agent, it fails because the agent wrapper startup script is no longer executable.
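
For illustration, here is a minimal sketch of a file copy that keeps the permission bits, so a script such as rhq-agent-wrapper.sh stays executable after a cross-partition copy. This is not the code from the actual fix: the class and method names are hypothetical, and it assumes Java 7's java.nio.file API is available.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFileAttributeView;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

// Hypothetical helper, not part of the RHQ code base.
class PermissionPreservingCopy {

    /** Copies one file and then mirrors the source's permissions onto the copy. */
    static void copyKeepingPermissions(Path source, Path target) throws IOException {
        // Copy the file content, overwriting any existing target.
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);

        // The POSIX view is null on file systems that do not support it.
        PosixFileAttributeView sourceView =
                Files.getFileAttributeView(source, PosixFileAttributeView.class);
        PosixFileAttributeView targetView =
                Files.getFileAttributeView(target, PosixFileAttributeView.class);

        if (sourceView != null && targetView != null) {
            // Carry over the full rwx permission set, including the execute bit.
            Set<PosixFilePermission> permissions = sourceView.readAttributes().permissions();
            targetView.setPermissions(permissions);
        } else {
            // Fallback for non-POSIX file systems: mirror only the execute flag.
            target.toFile().setExecutable(source.toFile().canExecute());
        }
    }
}
```

Applying a helper like this to every file in the copied tree, instead of copying content alone, would leave the agent wrapper script runnable in the new location.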

Replication:

1. Install RHQ Server/StorageNode/Agent via rhqctl on one partition (/part1/rhq).
2. Ensure everything is installed and running correctly.
3. Stop everything (rhqctl stop). Now you have an old install in /part1/rhq.
4. Take an RHQ distro and unzip it on another partition (/part2/rhq).
5. Upgrade the original via rhqctl upgrade --from-server-dir=/part1/rhq/rhq-server-#.#.#
6. Boom. The upgrade will fail when it tries to restart the upgraded agent. You will see this:

09:52:12,107 INFO  [org.rhq.server.control.command.Upgrade] Starting RHQ agent...
09:52:12,108 ERROR [org.rhq.server.control.command.Upgrade] An error occurred while starting the agent: Cannot run program "./rhq-agent-wrapper.sh" (in directory "/part2/rhq/rhq-agent/bin"): error=13, Permission denied
09:52:12,110 ERROR [org.rhq.server.control.RHQControl] error=13, Permission denied: java.io.IOException: error=13, Permission denied
        at java.lang.UNIXProcess.forkAndExec(Native Method) [rt.jar:1.7.0_11]
...
        at org.rhq.server.control.command.AbstractInstall.startAgent(AbstractInstall.java:303) [rhq-server-control-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
        at org.rhq.server.control.command.Upgrade.exec(Upgrade.java:210) [rhq-server-control-4.10.0-SNAPSHOT.jar:4.10.0-SNAPSHOT]
...

Comment 1 John Mazzitelli 2013-11-22 17:45:51 UTC
git commit to master: 3c816b28513a6cd1b5bd7c45bc32ca456ca91055

Comment 2 Heiko W. Rupp 2014-04-23 12:31:55 UTC
Bulk closing of 4.10 issues.

If an issue is not solved for you, please open a new BZ (or clone the existing one) with a version designator of 4.10.