Created attachment 1053919 [details] Satellite Foreman-Debug Description of problem: Satellite and Capsule upgrade is failing in step: Upgrade Step: start_httpd... Upgrade step start_httpd failed. Check logs for more information. , on Rhel 66 only. Version-Release number of selected component (if applicable): Satellite GA Snap 13 Compose 2. How reproducible: Pre-requisites: Satellite 6.0.8 configured with Capsule and at least one host. Steps to Reproduce: 1. Upgrade Sat 6.0.8 to latest Sat Ga snap 13 C2. 2. Observe the CLI outputs. Actual results: Upgrading... Upgrade Step: stop_services... Upgrade Step: start_mongo... Upgrade Step: start_httpd... Upgrade step start_httpd failed. Check logs for more information. Expected results: No Failure in upgrade. Additional info: On retrying the satellite upgrade command, the satellite upgrade is successful.
Created attachment 1053920 [details] Katello-installer Logs
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.
[ INFO 2015-07-21 04:26:08 main] Upgrading... [ INFO 2015-07-21 04:26:08 main] Upgrade Step: stop_services... [DEBUG 2015-07-21 04:26:10 main] celery multi v3.1.11 (Cipater) > resource_manager.lab.eng.bos.redhat.com: DOWN celery multi v3.1.11 (Cipater) > reserved_resource_worker-0.lab.eng.bos.redhat.com: DOWN > reserved_resource_worker-1.lab.eng.bos.redhat.com: DOWN > reserved_resource_worker-2.lab.eng.bos.redhat.com: DOWN > reserved_resource_worker-3.lab.eng.bos.redhat.com: DOWN Stopping Qpid AMQP daemon: [FAILED] celery init v10.0. Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat Stopping pulp_celerybeat... NOT RUNNING Stopping tomcat6: [ OK ] Stopping foreman-proxy: [ OK ] celery init v10.0. Using config script: /etc/default/pulp_resource_manager celery init v10.0. Using config script: /etc/default/pulp_workers Stopping elasticsearch: [FAILED] Stopping foreman-tasks: [FAILED] Success! [ INFO 2015-07-21 04:26:10 main] Upgrade Step: start_mongo... [DEBUG 2015-07-21 04:26:12 main] mongod (pid 7146) is running... listening on 127.0.0.1:27017 connection test successful [DEBUG 2015-07-21 04:26:13 main] mongod (pid 7146) is running... listening on 127.0.0.1:27017 connection test successful [ INFO 2015-07-21 04:26:13 main] Upgrade Step: start_httpd... [ERROR 2015-07-21 04:26:13 main] httpd is stopped [DEBUG 2015-07-21 04:26:13 main] Starting httpd: [Tue Jul 21 04:26:13 2015] [warn] module passenger_module is already loaded, skipping [ OK ] [ERROR 2015-07-21 04:26:13 main] httpd dead but subsys locked [ERROR 2015-07-21 04:26:13 main] Upgrade step start_httpd failed. Check logs for more information. [DEBUG 2015-07-21 04:26:13 main] Exit with status code: 1 (signal was 1) [ERROR 2015-07-21 04:26:13 main] Repeating errors encountered during run: [ERROR 2015-07-21 04:26:13 main] httpd is stopped [ERROR 2015-07-21 04:26:13 main] httpd dead but subsys locked [ERROR 2015-07-21 04:26:13 main] Upgrade step start_httpd failed. Check logs for more information. [DEBUG 2015-07-21 04:26:13 main] Cleaning /etc/katello-installer/d20150721-25215-y8clcl [DEBUG 2015-07-21 04:26:13 main] Cleaning /tmp/default_values.yaml
If you re-run the upgrade then it proceed ahead. So issue appears only when you run the upgrade first time on rhel66.
From capsule: ============ [root@cloud-qe-4 ~]# capsule-installer --upgrade --certs-tar cloud-qe-4.idmqe.lab.eng.bos.redhat.com-certs.tar Upgrading... Upgrade Step: stop_services... Upgrade Step: start_mongo... Upgrade Step: start_httpd... Upgrade step start_httpd failed. Check logs for more information. [ INFO 2015-07-21 05:40:12 main] Upgrade Step: start_mongo... [DEBUG 2015-07-21 05:40:13 main] mongod (pid 24708) is running... listening on 127.0.0.1:27017 connection test successful [DEBUG 2015-07-21 05:40:14 main] mongod (pid 24708) is running... listening on 127.0.0.1:27017 connection test successful [ INFO 2015-07-21 05:40:14 main] Upgrade Step: start_httpd... [ERROR 2015-07-21 05:40:14 main] httpd is stopped [DEBUG 2015-07-21 05:40:15 main] Starting httpd: [Tue Jul 21 05:40:14 2015] [warn] module passenger_module is already loaded, skipping [ OK ] [ERROR 2015-07-21 05:40:15 main] httpd dead but subsys locked [ERROR 2015-07-21 05:40:15 main] Upgrade step start_httpd failed. Check logs for more information. [DEBUG 2015-07-21 05:40:15 main] Exit with status code: 1 (signal was 1) [ERROR 2015-07-21 05:40:15 main] Repeating errors encountered during run: [ERROR 2015-07-21 05:40:15 main] httpd is stopped [ERROR 2015-07-21 05:40:15 main] httpd dead but subsys locked [ERROR 2015-07-21 05:40:15 main] Upgrade step start_httpd failed. Check logs for more information. [
This is resolved via 1228694 Marking POST to indicate this.
Reproducible with upgrade from sat6.0.8 to sat6.1.1 GA snap15 on rhel66 --- [root@cloud-qe-19 yum.repos.d]# katello-installer --upgrade Upgrading... Upgrade Step: stop_services... Upgrade Step: start_mongo... Upgrade Step: migrate_pulp... Upgrade Step: start_httpd... Upgrade step start_httpd failed. Check logs for more information. --- [root@cloud-qe-19 yum.repos.d]# rpm -qa | grep katello-installer katello-installer-base-2.3.17-1.el6_6sat.noarch katello-installer-2.3.17-1.el6_6sat.noarch [root@cloud-qe-19 yum.repos.d]#
Error from logs: =============== ==> /var/log/katello-installer/katello-installer.log <== [DEBUG 2015-07-30 01:55:57 main] Mongo database for connection is version 2.4.6 Loading content types. Loading type descriptors [iso_support.json, puppet.json, docker.json, rpm_support.json, nodes.json] Parsing type descriptors Validating type descriptor syntactic integrity Validating type descriptor semantic integrity Updating the database with types [iso, puppet_module, docker_image, distribution, drpm, erratum, package_group, package_category, package_environment, rpm, srpm, yum_repo_metadata_file, repository, node] /usr/lib/python2.6/site-packages/pulp/server/db/connection.py:167: DeprecationWarning: The safe parameter is deprecated. Please use write concern options instead. return method(*args, **kwargs) Content types loaded. Ensuring the admin role and user are in place. Admin role and user are in place. Beginning database migrations. Applying pulp.server.db.migrations version 10 Migration to pulp.server.db.migrations version 10 complete. Applying pulp.server.db.migrations version 11 Migration to pulp.server.db.migrations version 11 complete. Applying pulp.server.db.migrations version 12 Migration to pulp.server.db.migrations version 12 complete. Migration package pulp_puppet.plugins.migrations is up to date at version 2 Applying pulp_rpm.plugins.migrations version 18 Migration to pulp_rpm.plugins.migrations version 18 complete. Applying pulp_rpm.plugins.migrations version 19 Migration to pulp_rpm.plugins.migrations version 19 complete. Database migrations complete. [ INFO 2015-07-30 01:55:57 main] Upgrade Step: start_httpd... [ERROR 2015-07-30 01:55:57 main] httpd is stopped [DEBUG 2015-07-30 01:55:57 main] Starting httpd: [Thu Jul 30 01:55:57 2015] [warn] module passenger_module is already loaded, skipping [ OK ] [ERROR 2015-07-30 01:55:57 main] httpd dead but subsys locked [ERROR 2015-07-30 01:55:57 main] Upgrade step start_httpd failed. Check logs for more information. [DEBUG 2015-07-30 01:55:57 main] Exit with status code: 1 (signal was 1) [ERROR 2015-07-30 01:55:57 main] Repeating errors encountered during run: [ERROR 2015-07-30 01:55:57 main] <NilClass> nil [ERROR 2015-07-30 01:55:57 main] httpd is stopped [ERROR 2015-07-30 01:55:57 main] httpd dead but subsys locked [ERROR 2015-07-30 01:55:57 main] Upgrade step start_httpd failed. Check logs for more information. [DEBUG 2015-07-30 01:55:57 main] Cleaning /etc/katello-installer/d20150730-5336-10crh6a [DEBUG 2015-07-30 01:55:57 main] Cleaning /tmp/default_values.yaml Also some qpid failures too, not sure if they are relevant ==> /var/log/messages <== Jul 30 01:56:52 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:ERROR: connect: qpid+ssl://cloud-qe-19.idmqe.lab.eng.bos.redhat.com:5671, failed: [Errno 111] Connection refused Jul 30 01:56:52 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:INFO: retry in 20 seconds Jul 30 01:57:12 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:ERROR: connect: qpid+ssl://cloud-qe-19.idmqe.lab.eng.bos.redhat.com:5671, failed: [Errno 111] Connection refused Jul 30 01:57:12 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:INFO: retry in 24 seconds Jul 30 01:57:37 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:ERROR: connect: qpid+ssl://cloud-qe-19.idmqe.lab.eng.bos.redhat.com:5671, failed: [Errno 111] Connection refused Jul 30 01:57:37 cloud-qe-19 pulp: gofer.messaging.adapter.qpid.connection:INFO: retry in 29 seconds
Immediately after above error, I checked the httpd status and katello-service status [root@cloud-qe-19 yum.repos.d]# service httpd status httpd (pid 6084) is running... [root@cloud-qe-19 yum.repos.d]# katello-service status qpidd is stopped celery init v10.0. Using config script: /etc/default/pulp_resource_manager node resource_manager is stopped... mongod (pid 4071) is running... listening on 127.0.0.1:27017 connection test successful celery init v10.0. Using config script: /etc/default/pulp_workers node reserved_resource_worker-0 is stopped... node reserved_resource_worker-1 is stopped... node reserved_resource_worker-2 is stopped... node reserved_resource_worker-3 is stopped... elasticsearch is stopped foreman-proxy is stopped celery init v10.0. Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat pulp_celerybeat is stopped. tomcat6 is stopped[ OK ] httpd (pid 6084) is running... dynflow_executor is not running. dynflow_executor_monitor is not running. Some services failed: qpidd,pulp_resource_manager,pulp_workers,elasticsearch,foreman-proxy,pulp_celerybeat,tomcat6,foreman-tasks [root@cloud-qe-19 yum.repos.d]#
Created attachment 1057531 [details] foreman-debug logs from upgrade with snap15
We're checking the status too soon: [root@satellite ~]# service httpd stop; service httpd start; service httpd status Starting httpd: [Thu Jul 30 11:20:33 2015] [warn] module passenger_module is already loaded, skipping [ OK ] httpd dead but subsys locked 1 second later... [root@satellite ~]# service httpd status httpd (pid 3799) is running...
Created redmine issue http://projects.theforeman.org/issues/11261 from this bug
Ok..Good news. upgraded sat 6.0.8 -> sat6.1.1 GA snap15 c3. Satellite server upgrade completed successfully on rhel66. [root@cloud-qe-9 yum.repos.d]# katello-installer --upgrade Upgrading... Upgrade Step: stop_services... Upgrade Step: start_mongo... Upgrade Step: migrate_pulp... Upgrade Step: start_httpd... Upgrade Step: migrate_candlepin... Upgrade Step: migrate_foreman... Upgrade Step: Running installer... Installing Done [100%] [..................................................................] The full log is at /var/log/katello-installer/katello-installer.log Upgrade Step: restart_services... Upgrade Step: db_seed... Upgrade Step: errata_import (this may take a while) ... Upgrade Step: update_gpg_urls (this may take a while) ... Upgrade Step: update_repository_metadata (this may take a while) ... Katello upgrade completed! [root@cloud-qe-9 yum.repos.d]# service httpd status httpd (pid 21303) is running... [root@cloud-qe-9 yum.repos.d]# ^C [root@cloud-qe-9 yum.repos.d]# cat /etc/redhat-release Red Hat Enterprise Linux Server release 6.7 (Santiago) [root@cloud-qe-9 yum.repos.d]# However I can see an error at end of upgrade in installer.log: [ INFO 2015-07-31 05:58:35 main] Katello upgrade completed! [DEBUG 2015-07-31 05:58:35 main] Hook /usr/share/katello-installer/hooks/post/30-upgrade.rb returned [<Logging::Logger:0x3fea5e411530 name="main">, <Logging::Logger:0x3fea5e38abac name="fatal">] [ INFO 2015-07-31 05:58:35 main] All hooks in group post finished [DEBUG 2015-07-31 05:58:35 main] Exit with status code: 2 (signal was 2) [ERROR 2015-07-31 05:58:35 main] Repeating errors encountered during run: [ERROR 2015-07-31 05:58:35 main] <NilClass> nil [ERROR 2015-07-31 05:58:35 main] httpd is stopped For this another bz is filed 1245998
Ok, Now I tried capsule upgrade from "sat 6.0.8 -> sat6.1.1 GA snap15 c3." Upgrade completed successfully. [root@cloud-qe-6 ~]# capsule-installer --upgrade --certs-tar cloud-qe-6.idmqe.lab.eng.bos.redhat.com-certs.tar Upgrading... Upgrade Step: stop_services... Upgrade Step: start_mongo... Upgrade Step: migrate_pulp... Upgrade Step: start_httpd... Upgrade Step: Running installer... Installing --- /etc/foreman-proxy/settings.ymlngs_fi2015-07-31 04: [99%] [................................................................Installing Done [100%] [..................................................................] The full log is at /var/log/capsule-installer/capsule-installer.log Upgrade Step: restart_services... Katello upgrade completed! However, I got this error at start_httpd upgrade step [ INFO 2015-07-31 06:23:55 main] Upgrade Step: start_httpd... [ERROR 2015-07-31 06:23:55 main] httpd is stopped And at last "upgrade_step: restart_services" httpd seems ok.. [ INFO 2015-07-31 06:25:59 main] Upgrade Step: restart_services... [DEBUG 2015-07-31 06:26:30 main] celery multi v3.1.11 (Cipater) > Stopping nodes... > reserved_resource_worker-3.lab.eng.bos.redhat.com: QUIT -> 23298 > reserved_resource_worker-2.lab.eng.bos.redhat.com: QUIT -> 23269 > reserved_resource_worker-1.lab.eng.bos.redhat.com: QUIT -> 23240 > reserved_resource_worker-0.lab.eng.bos.redhat.com: QUIT -> 23217 > Waiting for 4 nodes -> 23298, 23269, 23240, 23217........ > reserved_resource_worker-3.lab.eng.bos.redhat.com: OK > Waiting for 3 nodes -> 23269, 23240, 23217.... > reserved_resource_worker-2.lab.eng.bos.redhat.com: OK > Waiting for 2 nodes -> 23240, 23217.... > reserved_resource_worker-1.lab.eng.bos.redhat.com: OK > Waiting for 1 node -> 23217.... > reserved_resource_worker-0.lab.eng.bos.redhat.com: OK celery multi v3.1.11 (Cipater) > Starting nodes... > reserved_resource_worker-0.lab.eng.bos.redhat.com: OK > reserved_resource_worker-1.lab.eng.bos.redhat.com: OK > reserved_resource_worker-2.lab.eng.bos.redhat.com: OK > reserved_resource_worker-3.lab.eng.bos.redhat.com: OK celery multi v3.1.11 (Cipater) > Stopping nodes... > resource_manager.lab.eng.bos.redhat.com: QUIT -> 23025 > Waiting for 1 node -> 23025..... > resource_manager.lab.eng.bos.redhat.com: OK celery multi v3.1.11 (Cipater) > Starting nodes... > resource_manager.lab.eng.bos.redhat.com: OK [Fri Jul 31 06:26:25 2015] [warn] module passenger_module is already loaded, skipping Stopping mongod: [ OK ] Starting mongod: [ OK ] Waiting for mongod to become available: [ OK ] Shutting down qdrouterd services: [ OK ] Starting qdrouterd services: [ OK ] Stopping Qpid AMQP daemon: [ OK ] Starting Qpid AMQP daemon: [ OK ] celery init v10.0. Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat Restarting celery periodic task scheduler Stopping pulp_celerybeat... OK Starting pulp_celerybeat... celery init v10.0. Using config script: /etc/default/pulp_workers Stopping foreman-proxy: [ OK ] Starting foreman-proxy: [ OK ] celery init v10.0. Using config script: /etc/default/pulp_resource_manager Stopping httpd: [ OK ] Starting httpd: [ OK ] Success!
@Stephen: Hey, Could you please take a look at comment 17. I can see error at upgrade step: start_httpd. But later on restart services step, everything seems OK. I guess that's not harmful, as at later step all services seems ok. But still would like to confirm with you. Thanks
Ah that's the same issue as https://bugzilla.redhat.com/show_bug.cgi?id=1245998. `service httpd status` will return 3 if the service is already stopped, and kafo sees that. It's not a problem other than the log entry it creates.
Ok.. Thanks for reply. As per comments 16-19, looks like the reported issue has been resolved. Upgrade is completed successfully on console. However we got httpd error and for that bz 1245998 is already filed. Moving this to verified. thanks
This bug was fixed in Satellite 6.1.1 which was delivered on 12 August, 2015.