Created attachment 511684 [details]
ruby script to check services

BUG: Ensure that *ALL* the aeolus services are started after a reboot

Recreate:
1. Get a build, install, configure...
2. Ensure everything is working well..
3. Reboot the server

Notice on reboot the following services do *NOT* start properly.
1. dbomatic
2. conductor-delayed_job
3. deltacloud-mock
4. imagefactory
5. aeolus-connector

Hrm.. OK.. this does not appear to be consistent... On another box after rebooting, other services are *NOT* coming up.
1. deltacloud-mock * dup
2. iwhd * new
3. mongodb * new

The other services listed above are running.

Adding a test script so you can check the services before and after reboot.
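For anyone without access to the attachment, a check of this kind can be sketched roughly as below. This is not the attached script. It is a minimal sketch that assumes each service's init script answers `service NAME status` with exit code 0 when the service is running. The names `SERVICES`, `service_status`, `format_result` and `check_services` are made up for illustration, and the service list is only a subset of the names mentioned in this bug.

```ruby
require 'open3'

# Hypothetical subset of the init script names mentioned in this bug.
SERVICES = %w[
  aeolus-conductor aeolus-connector conductor-dbomatic conductor-delayed_job
  deltacloud-mock imagefactory iwhd mongod
].freeze

# Run `service NAME status` and return [combined output, success flag].
# Assumption: exit code 0 means the service is running.
def service_status(name)
  output, status = Open3.capture2e('service', name, 'status')
  [output.strip, status.success?]
rescue Errno::ENOENT => e
  # The `service` command itself is missing on this host.
  [e.message, false]
end

# Build one report line per service.
def format_result(name, output, ok)
  "Checking #{name} ... #{ok ? 'Success' : 'FAILURE'}: #{output}"
end

# Check every named service. A block can be passed to replace the real
# status probe, which makes the function usable without root or init scripts.
def check_services(names, &probe)
  probe ||= method(:service_status)
  names.map do |name|
    output, ok = probe.call(name)
    { name: name, ok: ok, line: format_result(name, output, ok) }
  end
end
```

Running `check_services(SERVICES).each { |r| puts r[:line] }` once before and once after the reboot would give two listings that can be diffed.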
some debug info..

IWHD

[root@dell-pe2950-01 ~]#
[root@dell-pe2950-01 ~]# /etc/init.d/mongod start
Starting mongod: [ OK ]
[root@dell-pe2950-01 ~]# /etc/init.d/iwhd start
waiting for mongod to listen on localhost:27017[FAILED]
[root@dell-pe2950-01 ~]# telnet localhost 27017
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
[root@dell-pe2950-01 ~]# /etc/init.d/mongod restart
Stopping mongod: [FAILED]
Starting mongod: [ OK ]
[root@dell-pe2950-01 ~]# telnet localhost 27017
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
[root@dell-pe2950-01 ~]# /etc/init.d/iwhd start
waiting for mongod to listen on localhost:27017[FAILED]
[root@dell-pe2950-01 ~]#

This is caused by the mongodb lock file not getting cleaned up before shutdown. The init script reports [ OK ] on start, but mongod never listens on 27017, so iwhd fails as well. We probably want to ensure the lock file gets cleaned up, even though it's not our responsibility.

[root@dell-pe2950-01 ~]# rm -Rf /var/lib/mongodb/mongod.lock
[root@dell-pe2950-01 ~]# /etc/init.d/mongod start
Starting mongod: [ OK ]
[root@dell-pe2950-01 ~]# ps -ef | grep mongodb
mongodb 15321 1 0 08:38 ? 00:00:00 /usr/bin/mongod --quiet -f /etc/mongodb.conf run
root 15329 14663 0 08:38 pts/0 00:00:00 grep --color=auto mongodb
[root@dell-pe2950-01 ~]#
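A leftover lock file like this could be detected before mongod is started, instead of being removed by hand. The following is only a sketch under one assumption: the lock file holds the PID of the mongod that created it, and is empty or absent after a clean shutdown. The helper names `process_alive?` and `stale_lock?` are hypothetical and are not part of any aeolus or mongodb script.

```ruby
# True when a process with this PID currently exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence
  true
rescue Errno::ESRCH
  false # no such process
rescue Errno::EPERM
  true # process exists but belongs to another user
end

# True when the lock file names a PID and no such process is running.
# Assumption: the file content is the PID of the owning mongod.
def stale_lock?(path)
  return false unless File.file?(path)

  content = File.read(path).strip
  return false if content.empty? # clean shutdown leaves nothing to check

  pid = content[/\A\d+\z/]&.to_i
  # Unparseable content is left for a human to inspect.
  return false unless pid && pid.positive?

  !process_alive?(pid)
end
```

On the box above, `stale_lock?('/var/lib/mongodb/mongod.lock')` would be the call to make before `/etc/init.d/mongod start`. Whether it is safe to delete the file automatically, rather than just report it, is a separate decision.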
On the other box, once I started dbomatic, the other scripts' processes started.
moving to on_qa.. be sure to set selinux to permissive before restarting.
working... in

[root@sgi-xe310-02 ~]# ruby /root/checkServices.rb
Checking aeolus-conductor ...
Success: (pid 2130) is running...
Checking aeolus-connector ...
Success: image_factory_connector (pid 1944) is running...
Checking condor ...
Success: condor_master (pid 2115) is running...
Checking conductor-dbomatic ...
Success: dbomatic (pid 2272) is running...
Checking conductor-delayed_job ...
Success: delayed_job (pid 2318) is running...
Checking deltacloud-ec2-us-east-1 ...
Success: deltacloudd (pid 1949) is running...
Checking deltacloud-ec2-us-west-1 ...
Success: deltacloudd (pid 1962) is running...
Checking deltacloud-mock ...
FAILURE: deltacloudd dead but pid file exists
Checking httpd ...
Success: httpd (pid 1845) is running...
Checking imagefactory ...
Success: imagefactory (pid 2333) is running...
Checking iwhd ...
Success: iwhd (pid 1685) is running...
Checking libvirtd ...
Success: libvirtd (pid 1985) is running...
Checking mongod ...
Success: mongod (pid 1592) is running...
Checking ntpd ...
Success: ntpd (pid 1758) is running...
Checking postgresql ...
Success: postmaster (pid 1789) is running...
Checking qpidd ...
Success: qpidd (pid 1875) is running...
Checking production solr ...
Success: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 1698 root 64u IPv6 13930 0t0 TCP *:8983 (LISTEN)
Checking connector ...
Success: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
image_fac 1944 root 12u IPv4 16297 0t0 TCP localhost:cfinger (LISTEN)
Checking condor_q ...
Success:
-- Submitter: sgi-xe310-02.rhts.eng.bos.redhat.com : <10.16.65.19:51801> : sgi-xe310-02.rhts.eng.bos.redhat.com
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
0 jobs; 0 idle, 0 running, 0 held
Checking condor_status ...
Success:

[root@sgi-xe310-02 ~]# rpm -qa | grep aeolus
aeolus-conductor-0.3.0-0.el6.20110708135911gitdb1097c.noarch
rubygem-aeolus-cli-0.0.1-1.el6.20110708135911gitdb1097c.noarch
aeolus-all-0.3.0-0.el6.20110708135911gitdb1097c.noarch
aeolus-configure-2.0.1-0.el6.20110707131907gitfaa220b.noarch
aeolus-conductor-doc-0.3.0-0.el6.20110708135911gitdb1097c.noarch
aeolus-conductor-daemons-0.3.0-0.el6.20110708135911gitdb1097c.noarch
[root@sgi-xe310-02 ~]#
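The solr and connector checks in the output above look for a listening TCP socket rather than an init script status. A comparable probe can be sketched with a plain connection attempt. This is an illustration only and does not reproduce how checkServices.rb does it (the output suggests lsof). The function name `port_open?` and the one second timeout are assumptions.

```ruby
require 'socket'

# True when something accepts TCP connections on host:port.
# The timeout value is an arbitrary choice for this sketch.
def port_open?(host, port, timeout = 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::EHOSTUNREACH, SocketError
  false
end
```

With this, `port_open?('localhost', 8983)` would cover the solr check, and `port_open?('localhost', 27017)` would catch the case seen earlier in this bug where mongod reports [ OK ] on start but refuses connections.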
removing from tracker
release pending...
release pending.. 2
perm close
closing out old bugs