Bug 876103 - Haproxy process is dead for scalable jboss apps after server upgrade and migrate
Summary: Haproxy process is dead for scalable jboss apps after server upgrade and migrate
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Bill DeCoste
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-11-13 10:25 UTC by Meng Bo
Modified: 2015-05-14 23:02 UTC (History)
2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-19 19:26:17 UTC
Target Upstream Version:
Embargoed:


Attachments
upgrade log (111.84 KB, application/octet-stream)
2012-11-19 10:16 UTC, Jianwei Hou
no flags Details
/var/log/httpd/error_log (8.96 MB, application/octet-stream)
2012-11-19 10:17 UTC, Jianwei Hou
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 874445 0 unspecified CLOSED Can't move idled scalable app 2021-02-22 00:41:40 UTC

Internal Links: 874445

Description Meng Bo 2012-11-13 10:25:54 UTC
Description of problem:
Launch devenv-stage_249, create some apps on it, then upgrade to devenv_2458 and run migration 2.0.20. After that, try to access the scalable jbossas and jbosseap apps; they cannot be accessed. SSH into the app and check the processes with ps -ef: the haproxy process cannot be found.

Version-Release number of selected component (if applicable):
From devenv-stage_249 to devenv_2458

How reproducible:
always

Steps to Reproduce:
1.Launch devenv-stage_249
2.Create scalable jbossas and jbosseap app
3.Upgrade server to devenv_2458 and do migration
4.Check the two apps
  
Actual results:
1.App cannot be accessed from web browser. (503 error)
2.SSH login to the app and cannot find the haproxy process
[jbosseap1s-bmeng1dev.dev.rhcloud.com ~]> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
529       5543  5421  0 05:02 ?        00:00:00 sshd: 82d5b5a92d8747c3b559c55b04651eeb@pts/1
529       5581  5543  0 05:02 ?        00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i
529       7361  5581  0 05:04 ?        00:00:00 ps -ef
529      11561     1  0 01:58 ?        00:00:00 /bin/sh /var/lib/openshift/82d5b5a92d8747c3b559c55b04651eeb//jbosseap-6.0/jbosseap-6.0/bin/standalone.sh
529      12175 11561  0 01:58 ?        00:00:53 /usr/lib/jvm/jre-1.7.0/bin/java -D[Standalone] -client -Xmx256m -XX:MaxPermSize=128m -XX:+AggressiveOpts -Dorg.apache.tomcat.util.LOW_MEMORY=t


Expected results:
App works fine.

Additional info:
After restarting the app, the haproxy process starts and the app can be accessed again.
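The workaround above can be sketched as a quick check (a hypothetical helper; the captured `ps` line and the restart hint are illustrative, not part of the actual tooling):

```shell
#!/bin/sh
# Hypothetical check: grep a gear's process list for haproxy and suggest
# a restart when it is missing. Feeding in captured `ps -ef` output keeps
# the sketch self-contained; on a real gear you would pipe `ps -ef` in
# directly.
has_haproxy() {
  # $1: `ps -ef` output; returns 0 if an haproxy process is listed.
  # The [h] trick avoids matching the grep process itself.
  printf '%s\n' "$1" | grep -q '[h]aproxy'
}

# Sample output with no haproxy line, as seen in this bug.
PS_OUT='529 11561 1 0 01:58 ? 00:00:00 /bin/sh .../standalone.sh'
if has_haproxy "$PS_OUT"; then
  echo "haproxy running"
else
  echo "haproxy missing: restart the app (rhc app restart)"
fi
```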

Comment 1 Bill DeCoste 2012-11-13 22:28:26 UTC
How do I upgrade to devenv_2458?

Comment 2 Meng Bo 2012-11-14 01:17:21 UTC
Upgrade to devenv_2458
1.Launch devenv-stage_249 and devenv_2458
2.Copy the /root/devenv-local and /etc/yum.repos/local.repo from devenv_2458 to devenv-stage_249
3.Modify the /etc/yum.repos/devenv.repo to use the candidate mirrors on devenv-stage_249
4.Prepare test data on devenv-stage_249
5.Do upgrade with devenv-local repo enabled
#yum update --enablerepo devenv-local
6.Reboot instance
7.Do migration 2.0.20
#rhc-admin-migrate --version 2.0.20
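Steps 5-7 above can be sketched as a script (a sketch only; the DRY_RUN guard, which defaults to printing each command instead of running it, is my addition and not part of the real procedure):

```shell
#!/bin/sh
# Sketch of the upgrade + migration sequence from the steps above.
# DRY_RUN (an assumption, not part of the documented procedure) defaults
# to 1 so the destructive commands are only echoed for inspection.
set -e
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run yum update --enablerepo devenv-local   # step 5: upgrade with the devenv-local repo
run reboot                                 # step 6: reboot the instance
run rhc-admin-migrate --version 2.0.20     # step 7: run the 2.0.20 migration
```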

Comment 3 Meng Bo 2012-11-14 08:28:16 UTC
This also occurs for scalable jbossews apps.

Comment 4 Bill DeCoste 2012-11-14 14:16:31 UTC
Migration itself is broken after the update from ami-ac67e3c5. I'm not even seeing the 2 gears on the box. I'll retest once that's resolved.

[root@ip-10-202-27-64 ~]# rhc-admin-migrate --version 2.0.20

**Notice: C extension not loaded. This is required for optimum MongoDB Ruby driver performance.
  You can install the extension as follows:
  gem install bson_ext

  If you continue to receive this message after installing, make sure that the
  bson_ext gem is in your load path and that the bson_ext and mongo gems are of the same version.

Mocha deprecation warning: Test::Unit or MiniTest must be loaded *before* Mocha.
Mocha deprecation warning: If you're integrating with another test library, you should probably require 'mocha_standalone' instead of 'mocha'
/opt/rh/ruby193/root/usr/share/gems/gems/systemu-1.2.0/lib/systemu.rb:28: Use RbConfig instead of obsolete and deprecated Config.
Getting all RHLogins...
Gathering gears for user: bdecoste76c with uuid: 64a7cce91cc9418183e96c1fa4b2d347
RHLogins.length: 1
#####################################################
#####################################################
#####################################################
Summary:
# of users: 1
# of gears: 0
# of failures: 0
Gear counts per thread: []
Additional timings:
Time gathering users: 0.005s
Total execution time: 0.005s
#####################################################

Comment 5 Jhon Honce 2012-11-14 16:58:32 UTC
Fixed on master https://github.com/openshift/origin-server/pull/865

Comment 6 Meng Bo 2012-11-15 07:40:02 UTC
Upgrade from devenv-stage_249 to devenv_2475.

This issue is still reproduced for scalable jbossas and jbosseap apps.
The haproxy process does not exist for these two kinds of cartridge.
The web page returns a 503 error.

And the migration script only handled the jenkins gear this time.

Comment 7 Meng Bo 2012-11-15 09:32:52 UTC
Tested the upgrade again from devenv-stage_249 to devenv_2476.
The problem cannot be reproduced.

Please assign the bug back and I will verify it.

Comment 8 Meng Bo 2012-11-16 09:50:36 UTC
Checked on devenv_2486.
Moving the bug to verified.

Comment 9 Jianwei Hou 2012-11-19 06:00:58 UTC
This bug is reproduced when upgrading instance from devenv-stage_249 to devenv-stage_254

Upgrade steps:
1. Launch devenv-stage_249
2. Create test data against instance
3. SSH into instance, do yum update (no need to modify or enable any repos since this upgrade is performed from devenv-stage_249 to devenv-stage_254)
4. Reboot instance after upgrade
5. rhc-admin-migrate --version 2.0.20
6. After migrate, access all existing apps.

My scalable jbossas and jbosseap applications return a 503 error when I try to access their URLs.
After sshing into the haproxy gear, there was no haproxy process running.

[jbosseap1s-jhou1.dev.rhcloud.com ~]> ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
524      11553     1  0 00:30 ?        00:00:00 /bin/sh /var/lib/openshift/9897ba8450524da6b19663dacdb0bc36//jbosseap-6.0/jbosseap-6.0/bin/standalone.sh
524      12089 11553  1 00:30 ?        00:00:12 /usr/lib/jvm/jre-1.7.0/bin/java -D[Standalone] -client -Xmx256m -XX:MaxPermSize=128m -XX:+AggressiveOpts -Dorg.apache.tomca
524      24015 23984  0 00:42 ?        00:00:00 sshd: 9897ba8450524da6b19663dacdb0bc36@pts/2
524      24016 24015  0 00:42 pts/2    00:00:00 /bin/bash --init-file /usr/bin/rhcsh -i
524      24710 24016  0 00:44 pts/2    00:00:00 ps -ef

Both apps became available after being restarted (rhc app restart).
Reopening this bug since the problem is still seen.

Additional info:
I also have a scalable jbossews application, which didn't have this issue.

Comment 10 Jianwei Hou 2012-11-19 10:14:48 UTC
Reproduced it once again.
Actually, this reproduces as soon as the upgrade is performed, so it's not related to the migration (which migrates only jenkins in sprint 2.0.20).

The haproxy process is missing for jbossas/jbosseap applications.

The weird thing is that not all jbossas/jbosseap apps have this issue; to reproduce it more reliably, create multiple apps (3 jbossas apps and 3 jbosseap apps will be enough).

I have attached the upgrade log and httpd/error_log in order to dig further.

My app's internal IP is 127.0.252.1, and I discovered the following errors in /var/log/httpd/error_log:
[Mon Nov 19 04:25:30 2012] [error] proxy: ap_get_scoreboard_lb(162) failed in child 14150 for worker http://127.0.252.1:18001/swydws/
[Mon Nov 19 04:25:30 2012] [error] proxy: ap_get_scoreboard_lb(163) failed in child 14150 for worker http://127.0.252.1:8080/
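Entries like these can be pulled out of the error_log with a filter along these lines (the sample line is copied from above; the pattern is an assumption about the log format and would need the gear's actual internal IP substituted in):

```shell
#!/bin/sh
# Extract the failing worker URLs from Apache proxy scoreboard errors.
# A single captured log line stands in for the real file so the sketch
# is self-contained; on the node you would grep /var/log/httpd/error_log.
LOG='[Mon Nov 19 04:25:30 2012] [error] proxy: ap_get_scoreboard_lb(162) failed in child 14150 for worker http://127.0.252.1:18001/swydws/'
printf '%s\n' "$LOG" | grep -o 'http://127\.0\.252\.1:[0-9]*[^ ]*'
```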

Comment 11 Jianwei Hou 2012-11-19 10:16:38 UTC
Created attachment 647662 [details]
upgrade log

Comment 12 Jianwei Hou 2012-11-19 10:17:41 UTC
Created attachment 647663 [details]
/var/log/httpd/error_log

Comment 13 Bill DeCoste 2012-11-19 15:27:34 UTC
What size ami are you creating?

Comment 14 Bill DeCoste 2012-11-19 17:10:19 UTC
Have you added SwitchYard configuration to your application? The 18001 port and swydws context are SwitchYard.

Comment 15 Bill DeCoste 2012-11-19 17:23:12 UTC
I can't recreate this issue using devenv-stage_249 upgraded to devenv-stage_254. After a reboot it can take a while for the scaled EAP to come up (haproxy is started after both EAP instances), but it does come up.

Need to confirm that:
1) SwitchYard is or isn't deployed to the eap app used to test
2) There was a yum clean all before the yum update

Comment 16 Bill DeCoste 2012-11-19 17:31:02 UTC
I see the ProxyPass in zzzzz_proxy.conf. I didn't realize this had been added.
ProxyPass /swydws/ http://$IP:18001/swydws/ status=I
ProxyPassReverse /swydws/ http://$IP:18001/swydws/

Comment 17 Bill DeCoste 2012-11-19 21:29:32 UTC
Need to confirm the ami size, a yum clean all, and whether haproxy comes up several minutes after an upgrade and reboot.

Comment 18 Jianwei Hou 2012-11-20 03:13:44 UTC
(In reply to comment #17)
> Need to confirm the ami size, a yum clean all, and if haproxy comes up after
> several minutes after an upgrade and reboot.

ami size: Medium

SwitchYard: I didn't deploy SwitchYard to either the jbossas or the jbosseap application.

yum clean all: I didn't run yum clean all when I reopened this bug. Tried it again with yum clean all, and all my jboss apps seem fine after yum update and reboot.

Haproxy: haproxy comes up several minutes after the upgrade and reboot.

I didn't reproduce this again. Maybe the reason is that I didn't run "yum clean all" before the upgrade, or maybe haproxy just hadn't come up yet when I tested on that ami.

This issue was never reproduced on INT or STG.
So, I'm moving it to verified.

