| Summary: | undercloud upgrade failed due to failed docker service | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Ola Pavlenko <opavlenk> | ||||
| Component: | rhosp-director | Assignee: | Sofer Athlan-Guyot <sathlang> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Arik Chernetsky <achernet> | ||||
| Severity: | urgent | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 8.0 (Liberty) | CC: | aschultz, dbecker, jpeeler, mburns, mcornea, morazi, opavlenk, rhel-osp-director-maint, sathlang | ||||
| Target Milestone: | --- | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2017-01-30 09:56:27 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
Description
Ola Pavlenko
2016-04-20 12:07:16 UTC
Did you actually reproduce this error? I would bet that something else breaks the next time a shutdown is attempted during an upgrade. I wish the output of "rpm -qV docker-registry" had been collected before reinstalling the RPM.

If the docker-registry.service file was actually empty, then there's no way for the service to know how to start, and I wouldn't think this could ever happen other than by interrupting the upgrade process violently. (It appears that systemd treats a service with an empty service file as masked, so that would explain that particular odd detail.)

My opinion is that if power failure is a scenario that is supposed to be handled gracefully, some sort of RPM transaction verification would need to be done for everything installed on the system.

I'd try a yum-complete-transaction to see if that fixes things.

If it does, then I think this is notabug since this is really just a "how to recover from a power outage" action.

(In reply to Mike Burns from comment #3)
> I'd try a yum-complete-transaction to see if that fixes things.
>
> If it does, then I think this is notabug since this is really just a "how to
> recover from a power outage" action.

Unfortunately I've reprovisioned the env already. I'll reproduce the issue and try the yum-complete-transaction.

Created attachment 1150987 [details]
after_undercloud_upgrade_inerrupt.log
Error during rerun of the upgrade:
Error: Could not start Service[docker-registry]: Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.
Wrapped exception:
Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.
Error: /Stage[main]/Main/Service[docker-registry]/ensure: change from stopped to running failed: Could not start Service[docker-registry]: Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.
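The "masked" wording in the errors above fits the empty unit file reported later in this bug: per systemd.unit(5), a unit file that is zero-length or symlinked to /dev/null is loaded as masked. A minimal sketch of a check for that state follows; `looks_masked` is a hypothetical helper name used only for illustration and is not part of systemd or the director tooling.

```python
import os


def looks_masked(unit_path: str) -> bool:
    """Return True if the unit file at unit_path would load as masked.

    Two cases are covered: a symlink that resolves to /dev/null (what
    `systemctl mask` creates) and a zero-length regular file (what an
    interrupted package update can leave behind).
    """
    target = os.path.realpath(unit_path)
    if target == os.devnull:
        return True
    # A missing file is not "masked"; it is simply absent.
    return os.path.isfile(target) and os.path.getsize(target) == 0
```

On the affected node this could be called as `looks_masked("/usr/lib/systemd/system/docker-registry.service")` before rerunning the upgrade.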
The undercloud upgrade was interrupted by shutting down the instack node.
Rerunning the undercloud upgrade ends with:
+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-04-26 09:51:53,773] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 6]
[2016-04-26 09:51:53,774] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 815, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 699, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 370, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1
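The `'[' 6 '!=' 2 -a 6 '!=' 0 ']'` test in the log above suggests that puppet was run with `--detailed-exitcodes`, although the log does not show the full command line, so that is an assumption. Under that option, exit code 6 means the run both applied changes and hit failures. A small sketch of the same check, with hypothetical names `puppet_run_ok` and `describe`:

```python
# Documented meanings of `puppet apply --detailed-exitcodes` return values.
DETAILED_EXIT_CODES = {
    0: "no changes, no failures",
    1: "run failed",
    2: "changes applied, no failures",
    4: "some resources failed",
    6: "changes applied and some resources failed",
}


def puppet_run_ok(exit_code: int) -> bool:
    """Mirror the shell test in the log: only 0 and 2 count as success."""
    return exit_code in (0, 2)


def describe(exit_code: int) -> str:
    """Return a short description for a detailed exit code."""
    return DETAILED_EXIT_CODES.get(exit_code, "unknown exit code")
```

For the run above, `puppet_run_ok(6)` is false, which matches the script exiting with 6 and os-refresh-config aborting.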
Tried running:
[stack@instack ~]$ sudo yum-complete-transaction
No unfinished transactions left.
Rerunning the upgrade afterwards ends with the same output.
Attached file: after_undercloud_upgrade_inerrupt.log
/usr/lib/systemd/system/docker-registry.service was empty
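An empty packaged file like this is the kind of damage that RPM verification reports, which is why the `rpm -V` output was asked for earlier in the thread. The sketch below parses verify output into records so missing or size-changed files can be counted. It assumes the usual layout of an attribute string (or the word `missing`), an optional file-type marker, and a path; `parse_rpm_verify` and `VerifyResult` are hypothetical names, and no real output from this environment is shown.

```python
import re
from typing import List, NamedTuple


class VerifyResult(NamedTuple):
    path: str
    missing: bool
    flags: str  # e.g. "S.5....T."; empty when the file is missing


# Attribute string (8 or 9 characters) or "missing", then an optional
# marker such as "c" for config files, then an absolute path.
_VERIFY_LINE = re.compile(
    r"^(?P<flags>missing|[SM5DLUGTP.?]{8,9})\s+"
    r"(?:(?P<attr>[cdglr])\s+)?"
    r"(?P<path>/.*)$"
)


def parse_rpm_verify(output: str) -> List[VerifyResult]:
    """Parse text in `rpm -V` format, skipping lines that do not match."""
    results = []
    for line in output.splitlines():
        match = _VERIFY_LINE.match(line.strip())
        if match is None:
            continue
        flags = match.group("flags")
        missing = flags == "missing"
        results.append(
            VerifyResult(match.group("path"), missing, "" if missing else flags)
        )
    return results


def size_changed(result: VerifyResult) -> bool:
    """True when the verify flags report a size mismatch ("S")."""
    return "S" in result.flags
```

The text passed in would come from something like `rpm -V docker-registry` captured before the package is reinstalled; a truncated unit file would be expected to show up with the size flag set.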
When exactly is the undercloud node being shut down? I'm curious how much (if any) running a "sync" before shutting down would help.

I'm unsure how an RPM transaction is listed as complete when, as you have shown, the files are not properly present on disk. By my count it looks like 147 of 164 files didn't make it.

(In reply to Jeff Peeler from comment #6)
> When exactly is the undercloud node being shutdown? I'm curious how much (if
> any) running a "sync" before shutting down would help.
>
> I'm unsure how an RPM transaction is listed as complete yet, as you have
> shown, the files are not properly present on disk. By my count it looks like
> 147 of 164 files didn't make it.

The shutdown is done during the "openstack undercloud upgrade", at step #3 of http://etherpad.corp.redhat.com/osp-d-upgrade-ohochman. I let it run for a couple of seconds and then shut down the node using virsh.

Hi Ola,

I'm closing this issue as it hasn't had any activity for a while. Don't hesitate to re-open it if the issue is still relevant.

Regards,