Description of problem:
"openstack undercloud upgrade" fails to complete after simulating a power outage in the middle of the undercloud upgrade. It fails to start the "docker-registry" service. /usr/lib/systemd/system/docker-registry.service is empty.

Workaround: reinstall the docker-registry rpm and re-run the upgrade.

Error: Could not start Service[docker-registry]: Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.

Even after successful completion of the undercloud upgrade, the service is down.

Version-Release number of selected component (if applicable):
[stack@instack ~]$ rpm -qa | grep rhos
rhos-release-1.0.39-1.noarch
[stack@instack ~]$ rpm -qa | grep docker
docker-registry-0.9.1-7.el7.x86_64

How reproducible:

Steps to Reproduce:
1. Install OSPd 7 GA, deploy an overcloud and populate it.
2. Run rhos-release -P 8-director and yum update on the undercloud node.
3. Run openstack undercloud upgrade.
4. During step 3, shut down the undercloud node.
5. Start the undercloud node and repeat step 3.

Actual results:
The undercloud upgrade fails.

Expected results:
The undercloud upgrade succeeds.

Additional info:
Reinstalled the docker rpm and ran the upgrade command again with a successful result.
Did you actually reproduce this error? I would bet that something else breaks the next time a shutdown is attempted during an upgrade. I wish the output of "rpm -qV docker-registry" was collected before reinstalling the RPM. If the docker-registry.service file was actually empty, then there's no way for the service to know how to start. And I wouldn't think this could ever happen other than interrupting the upgrade process violently. (It appears that systemd masks the service with an empty service file, so that would explain that particular odd detail.) My opinion is that if power failure is a scenario that is supposed to be handled gracefully, some sort of RPM transaction verification would need to be done for everything installed on the system.
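The kind of verification suggested above could be scripted roughly as follows. This is a minimal sketch, not something that was run on the affected node: it assumes the usual `rpm -V` output layout (a status field such as `S.5....T.` or `missing`, an optional attribute marker, then the file path), and the helper names `parse_rpm_verify` and `verify_package` are hypothetical.

```python
import subprocess


def parse_rpm_verify(output):
    """Split `rpm -V` style output into (status, path) pairs.

    Assumes each file line holds a status field, an optional one-letter
    attribute marker, and then an absolute path.
    """
    results = []
    for line in output.splitlines():
        idx = line.find("/")
        if idx <= 0:
            continue  # no path on this line, or nothing before it
        fields = line[:idx].split()
        if not fields:
            continue
        results.append((fields[0], line[idx:].strip()))
    return results


def verify_package(name):
    """Run `rpm -V <name>` and return the files that failed verification.

    rpm exits non-zero when it finds discrepancies, so the exit status is
    deliberately not treated as an error here.
    """
    proc = subprocess.run(
        ["rpm", "-V", name],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_rpm_verify(proc.stdout)
```

Calling `verify_package("docker-registry")` before reinstalling the package would have listed which of its files differed from what the RPM database expects.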
I'd try a yum-complete-transaction to see if that fixes things. If it does, then I think this is notabug since this is really just a "how to recover from a power outage" action.
(In reply to Mike Burns from comment #3)
> I'd try a yum-complete-transaction to see if that fixes things.
>
> If it does, then I think this is notabug since this is really just a "how to
> recover from a power outage" action.

Unfortunately I've reprovisioned the env already. I'll reproduce the issue and will try the yum-complete-transaction.
Created attachment 1150987 [details]
after_undercloud_upgrade_inerrupt.log

Error during rerun of the upgrade:

Error: Could not start Service[docker-registry]: Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.
Wrapped exception:
Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.
Error: /Stage[main]/Main/Service[docker-registry]/ensure: change from stopped to running failed: Could not start Service[docker-registry]: Execution of '/bin/systemctl start docker-registry' returned 1: Failed to start docker-registry.service: Unit docker-registry.service is masked.

The undercloud upgrade was interrupted by shutting down the instack node. Rerunning the undercloud upgrade ends with:

+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-04-26 09:51:53,773] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 6]
[2016-04-26 09:51:53,774] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 815, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 699, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 370, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1

Tried yum-complete-transaction:

[stack@instack ~]$ sudo yum-complete-transaction
No unfinished transactions left.

Rerunning the upgrade after that ends with the same output.

Attached file: after_undercloud_upgrade_inerrupt.log

/usr/lib/systemd/system/docker-registry.service was empty.
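Since systemd reports a zero-length unit file as masked, one way to spot other units damaged the same way is to scan the unit directory for empty regular files. The following is a rough sketch only; the function name `find_empty_unit_files` is hypothetical, and the default directory is simply the path mentioned in this report.

```python
import os


def find_empty_unit_files(unit_dir="/usr/lib/systemd/system"):
    """Return paths of zero-length regular files directly under unit_dir.

    Symlinks are skipped: a unit linked to /dev/null is a deliberate mask,
    whereas an empty regular file more likely points to a truncated write.
    """
    empty = []
    for name in sorted(os.listdir(unit_dir)):
        path = os.path.join(unit_dir, name)
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        if os.path.getsize(path) == 0:
            empty.append(path)
    return empty
```

Any path this returns could then be mapped back to its owning package with `rpm -qf <path>` so that the package can be reinstalled.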
When exactly is the undercloud node being shut down? I'm curious how much (if any) running a "sync" before shutting down would help. I'm unsure how an RPM transaction can be listed as complete when, as you have shown, the files are not properly present on disk. By my count it looks like 147 of 164 files didn't make it.
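For context on why a "sync" might matter: data written to a file can sit in the page cache for a while, so an abrupt power-off may leave a file empty or truncated even though the writing process already finished. The sketch below shows the general write, fsync, rename pattern that makes a single file durable. It is an illustration of the concept only and makes no claim about what rpm or yum do internally; `durable_write` is a hypothetical helper.

```python
import os


def durable_write(path, data):
    """Write data to path so that it survives a sudden power loss.

    Steps: write to a temporary file, fsync it, atomically rename it over
    the target, then fsync the containing directory so the rename itself
    reaches the disk.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to flush to stable storage
    os.replace(tmp_path, path)

    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync calls, the same code would usually appear to work, and the difference would only show up after an unclean shutdown like the one simulated here.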
(In reply to Jeff Peeler from comment #6)
> When exactly is the undercloud node being shutdown? I'm curious how much (if
> any) running a "sync" before shutting down would help.
>
> I'm unsure how an RPM transaction is listed as complete yet, as you have
> shown, the files are not properly present on disk. By my count it looks like
> 147 of 164 files didn't make it.

The shutdown is done during the "openstack undercloud upgrade", i.e. during step #3 in http://etherpad.corp.redhat.com/osp-d-upgrade-ohochman. I let it run for a couple of seconds and then shut down the node using virsh.
Hi Ola, I'm closing this issue as it didn't have any activity for a while. Don't hesitate to re-open it if that issue is still relevant. Regards,