Bug 981584 - 'snapshot restore' cannot find the uploaded archive when it's restored to a scaled application
Status: CLOSED CURRENTRELEASE
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified, OS: Unspecified
Priority: high, Severity: high
Assigned To: Paul Morie
QA Contact: libra bugs
Reported: 2013-07-05 04:04 EDT by Qiushui Zhang
Modified: 2015-05-14 19:23 EDT

Doc Type: Bug Fix
Last Closed: 2013-08-07 18:54:54 EDT
Type: Bug

Description Qiushui Zhang 2013-07-05 04:04:37 EDT
Description of problem:
PostgreSQL on the application is destroyed when a postgresql-9.2 snapshot is restored to a postgresql-8.4 app.

Version-Release number of selected component (if applicable):
devenv_3450

How reproducible:
Always

Steps to Reproduce:
1. Create an app with postgresql-9.2, for example: rhc app create pl510s perl-5.10 postgresql-9.2 -s. Then insert some data into the database.
2. Create a snapshot of it: rhc snapshot save pl510s
3. Delete the app. 
4. Create a new app with postgresql-8.4. 
rhc app create pl510ss perl-5.10 postgresql-8.4 -s
5. Restore the snapshot to this new app.
rhc snapshot restore -a pl510ss -f pl510s.tar.gz

Actual results:
The restore process fails. If the user logs in to the new app pl510ss, "psql" cannot be launched.

Expected results:
The PostgreSQL instance on the app should not be destroyed. If the snapshot restore fails, the app should be rolled back correctly.

Additional info:
Adding the failure log here:

[openshift@localhost tmp]$ rhc snapshot restore -a pl510ss -f pl510s.tar.gz 
Restoring from snapshot pl510s.tar.gz...
Removing old git repo: ~/git/pl510ss.git/
Removing old data dir: ~/app-root/data/*
Restoring ~/git/pl510ss.git and ~/app-root/data
Restoring snapshot for postgresql-8.4 gear
cat: postgresql-8.4.tar.gz: No such file or directory
Removing old data dir: ~/app-root/data/*
Restoring ~/app-root/data

gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
/bin/tar: Error is not recoverable: exiting now
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:131:in `block (2 levels) in oo_spawn': Shell command '/bin/tar --strip=2 --overwrite -xmz ./*/app-root/data --transform="s|${OPENSHIFT_GEAR_NAME}/data|app-root/data|" --transform="s|git/.*\.git|git/${OPENSHIFT_GEAR_NAME}.git|" --exclude="./*/app-root/runtime/data" --exclude="./*/postgresql/data" 1>&2' returned an error. rc=2 (OpenShift::Runtime::Utils::ShellExecutionException)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:94:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:94:in `block in oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:93:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:93:in `oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-container-selinux-0.0.3/lib/openshift/runtime/containerization/selinux_container.rb:284:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container.rb:611:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:195:in `extract_restore_archive'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:136:in `restore'
	from /usr/bin/gear:280:in `block (2 levels) in <main>'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in <top (required)>'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:131:in `block (2 levels) in oo_spawn': Shell command 'cat postgresql-8.4.tar.gz | /usr/bin/ssh -q -o 'BatchMode=yes' -o 'StrictHostKeyChecking=no' -i $OPENSHIFT_APP_SSH_KEY  51d66f96e2874e1137000002@51d66f96e2874e1137000002-qiuzhang.dev.rhcloud.com 'restore'' returned an error. rc=1 (OpenShift::Runtime::Utils::ShellExecutionException)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:94:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:94:in `block in oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:93:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/utils/shell_exec.rb:93:in `oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-container-selinux-0.0.3/lib/openshift/runtime/containerization/selinux_container.rb:284:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container.rb:611:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:216:in `block in handle_scalable_restore'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:212:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:212:in `handle_scalable_restore'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.11.4/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:139:in `restore'
	from /usr/bin/gear:280:in `block (2 levels) in <main>'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in <top (required)>'
Error in trying to restore snapshot. You can try to restore manually by running:
cat pl510s.tar.gz | ssh e504a5cae54011e2a8d512313d0895bf@pl510ss-qiuzhang.dev.rhcloud.com 'restore INCLUDE_GIT'
Comment 1 Hiro Asari 2013-07-05 15:52:23 EDT
Is this really 100% reproducible? I couldn't reproduce it.

Note that the actual error message is:

gzip: stdin: unexpected end of file

Is the tarball actually valid? (Could you verify with 'tar tzvf *.tar.gz'?)

Perhaps there were some network interruptions while the restore was under way.
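
The tarball check suggested above can be scripted so that a truncated download is caught before a restore is attempted. This is only a sketch: check_snapshot is a hypothetical helper name, not part of rhc, and it relies solely on the standard gzip -t and tar tzf commands.

```shell
# check_snapshot ARCHIVE
# Hypothetical helper (not part of rhc). Confirms that ARCHIVE is non-empty,
# is a complete gzip stream, and is a readable tar file.
check_snapshot() {
    archive="$1"

    if [ ! -s "$archive" ]; then
        echo "missing or empty: $archive" >&2
        return 1
    fi

    # gzip -t tests the compressed stream without extracting anything;
    # a cut-off file fails here with "unexpected end of file".
    if ! gzip -t "$archive" 2>/dev/null; then
        echo "truncated or corrupt gzip stream: $archive" >&2
        return 1
    fi

    # tar tzf only lists the members; a failure means the tar layer is damaged.
    if ! tar tzf "$archive" >/dev/null 2>&1; then
        echo "unreadable tar archive: $archive" >&2
        return 1
    fi

    echo "ok: $archive"
}
```

For example, check_snapshot pl510s.tar.gz could be run before rhc snapshot restore. It only tells you whether the file itself is intact, not whether its contents match the target application.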
Comment 2 Qiushui Zhang 2013-07-08 03:48:51 EDT
It happens every time for me,
following exactly the steps described in the bug.
I also checked the tarball. It is OK.

Did you use postgresql-8.4 for the second app? The bug I am describing is this:
Restoring a postgresql-9.2 app snapshot to a postgresql-8.4 app. To my mind, this may not be a valid operation. If the snapshot restore fails, a correct rollback is expected.
Comment 3 Hiro Asari 2013-07-08 22:34:16 EDT
I did.


rhc app create railsapp ruby-1.9 postgresql-9.2 --from-code https://github.com/BanzaiMan/openshift-rails-example-postgresql.git

cd railsapp

RAILS_ENV=production ./script/rails g scaffold articles title summary:text
git add .
git commit -m 'scaffold'
git push

visit the app (/articles), make a few records

bx bin/rhc snapshot save -a railsapp


rhc app create railsapp84 ruby-1.9 postgresql-8.4 --from-code https://github.com/BanzaiMan/openshift-rails-example-postgresql.git

bx bin/rhc snapshot restore -a railsapp84 -f railsapp.tar.gz



OUTPUT:
Restoring from snapshot railsapp.tar.gz...
Warning: Permanently added 'railsapp84-fooooooooooo.dev.rhcloud.com,54.227.85.179' (RSA) to the list of known hosts.
Removing old git repo: ~/git/railsapp84.git/
Removing old data dir: ~/app-root/data/*
Restoring ~/git/railsapp84.git and ~/app-root/data
/opt/rh/ruby193/root/usr/bin/ruby /var/lib/openshift/186068841760597924118528/app-root/runtime/repo/vendor/bundle/ruby/1.9.1/bin/rake assets:precompile:all RAILS_ENV=production RAILS_GROUPS=assets

RESULT:
Success
Comment 4 Hiro Asari 2013-07-08 23:00:27 EDT
I think I know what the problem is.
Comment 5 Hiro Asari 2013-07-08 23:08:58 EDT
On second thought, I don't think there is anything here that is specific to the PostgreSQL cartridge. The problem should happen on *any* scaled application.

I'm adjusting the summary again, and sending this to CLI.
Comment 6 Clayton Coleman 2013-07-09 13:40:29 EDT
Why is this CLI? We invoke 'restore INCLUDE_GIT' on the node and send the file over. There is nothing on our end.
Comment 7 Paul Morie 2013-07-10 17:18:53 EDT
I was not able to reproduce this with the cucumber test that covers it (openshift-test/tests/runtime-cartridge-postgresql.feature:195) or manually.
Comment 8 Qiushui Zhang 2013-07-11 04:26:36 EDT
Hi,
I'm not quite clear about Hiro's comments. So far, I still have the following concerns:

If I try to restore a snapshot to a mismatched app, I would expect the restore process to fail and the app to work correctly after the rollback.

Unfortunately, if I restore a snapshot (generated from an app with postgresql-9.2) to an app with postgresql-8.4, postgresql 8.4 fails to launch afterwards.
Similarly, I tried to restore a snapshot of an app WITHOUT mysql-5.1 to an app with mysql-5.1; the latter can no longer launch mysql-5.1.


I suggest we implement a correct rollback here, so I am changing the defect status to "assigned".
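
The rollback behaviour requested above could look roughly like the following. This is a sketch under stated assumptions, not how the node code works: restore_with_rollback and the ".rollback" suffix are hypothetical, and it only covers a plain data directory (a real database restore would also need the database stopped and restarted around it).

```shell
# restore_with_rollback DATA_DIR ARCHIVE
# Hypothetical sketch. Sets the current DATA_DIR aside, extracts ARCHIVE into
# a fresh DATA_DIR, and puts the old directory back if extraction fails.
restore_with_rollback() {
    data_dir="$1"
    archive="$2"
    backup="${data_dir}.rollback"   # assumed naming, for illustration only

    rm -rf "$backup"
    mv "$data_dir" "$backup" || return 1
    mkdir "$data_dir" || return 1

    if tar xzf "$archive" -C "$data_dir" 2>/dev/null; then
        # Extraction succeeded, so the saved copy is no longer needed.
        rm -rf "$backup"
        echo "restored"
    else
        # Extraction failed: discard the partial result and reinstate the copy.
        rm -rf "$data_dir"
        mv "$backup" "$data_dir"
        echo "rolled back"
        return 1
    fi
}
```

The design choice here is to move the old directory aside rather than delete it up front, so that a failed extraction never leaves the gear without its previous data.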
Comment 9 Paul Morie 2013-07-11 14:54:30 EDT
I am adding the UpcomingRelease tag to this issue; the plan is to provide a warning when restoring between versions of postgresql.

Regarding the other issues raised in this bug:

1. Restoring a dump without mysql to an application, scaled or unscaled, with a mysql cartridge, does not adversely affect mysql as far as I can determine.

2. We do not currently have metadata in the snapshot that we can use to determine whether the cartridge layout of a snapshot is compatible with the current state of the gear.
Comment 10 Hiro Asari 2013-07-11 14:56:41 EDT
I can reproduce this with devenv_3487.

Steps: https://gist.github.com/BanzaiMan/5978175
Comment 11 Hiro Asari 2013-07-11 15:05:54 EDT
The same procedure above would succeed if the target application has postgresql-9.2 instead.
Comment 12 Hiro Asari 2013-07-17 15:00:50 EDT
Restoring an 8.4 snapshot to an app with the 9.2 cartridge also fails similarly.
Comment 13 Paul Morie 2013-07-26 12:56:13 EDT
PR submitted to add a guard against restoring between 8.4 and 9.2.  Here's how this should be working now:

1. Going forward, snapshots of postgresql will contain a marker indicating which version of postgresql created the snapshot.
2. If the snapshot contains a version marker, it will be validated against OPENSHIFT_POSTGRESQL_VERSION.  If the versions do not match, the database restore will be skipped and the database restarted.
3. If the snapshot does not contain a version marker:
  a.  If OPENSHIFT_POSTGRESQL_VERSION is 8.4, the restore will proceed.
  b.  Otherwise, the restore will be skipped, the database restarted, and a warning returned to the user.
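
The three rules above amount to a small decision table, sketched here in shell. The function name decide_pg_restore and the way the marker is passed in are assumptions made for illustration; the actual PR may store and read the marker differently. The second argument stands in for OPENSHIFT_POSTGRESQL_VERSION.

```shell
# decide_pg_restore SNAPSHOT_VERSION GEAR_VERSION
# Hypothetical helper mirroring rules 2, 3a and 3b.
# SNAPSHOT_VERSION is the marker found in the snapshot, or "" if there is none.
# GEAR_VERSION stands in for OPENSHIFT_POSTGRESQL_VERSION.
# Prints "restore" or "skip" on stdout; warnings go to stderr.
decide_pg_restore() {
    snapshot_version="$1"
    gear_version="$2"

    if [ -n "$snapshot_version" ]; then
        # Rule 2: a marker exists, so it must match the gear's version.
        if [ "$snapshot_version" = "$gear_version" ]; then
            echo "restore"
        else
            echo "skip"
        fi
    elif [ "$gear_version" = "8.4" ]; then
        # Rule 3a: no marker and the gear runs 8.4, so the restore proceeds.
        echo "restore"
    else
        # Rule 3b: no marker and the gear is not 8.4, so skip and warn.
        echo "warning: snapshot has no version marker; skipping database restore" >&2
        echo "skip"
    fi
}
```

With this sketch, decide_pg_restore 9.2 8.4 prints "skip" and decide_pg_restore "" 8.4 prints "restore". In both "skip" cases the description above says the database is restarted, which the sketch leaves to the caller.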
Comment 15 Qiushui Zhang 2013-07-29 01:32:54 EDT
Tested on devenv_3569. I still got a failure with the same process:
1. rhc app create pl1s perl-5.10 postgresql-9.2 -s
2. rhc snapshot save pl1s
3. rhc app create pl2s perl-5.10 postgresql-8.4 -s

When doing "rhc snapshot restore -a pl2s -f pl1s.tar.gz", I got the following:

Restoring from snapshot pl1s.tar.gz...
Removing old git repo: ~/git/pl2s.git/
Removing old data dir: ~/app-root/data/*
Restoring ~/git/pl2s.git and ~/app-root/data
Restoring snapshot for postgresql-8.4 gear
cat: postgresql-8.4.tar.gz: No such file or directory
Removing old data dir: ~/app-root/data/*
Restoring ~/app-root/data

gzip: stdin: unexpected end of file
/bin/tar: Child returned status 1
/bin/tar: Error is not recoverable: exiting now
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:131:in `block (2 levels) in oo_spawn': Shell command '/bin/tar --strip=2 --overwrite -xmz ./*/app-root/data --transform="s|${OPENSHIFT_GEAR_NAME}/data|app-root/data|" --transform="s|git/.*\.git|git/${OPENSHIFT_GEAR_NAME}.git|" --exclude="./*/app-root/runtime/data" --exclude="./*/postgresql/data" 1>&2' returned an error. rc=2 (OpenShift::Runtime::Utils::ShellExecutionException)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:94:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:94:in `block in oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:93:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:93:in `oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-container-selinux-0.1.3/lib/openshift/runtime/containerization/selinux_container.rb:288:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container.rb:595:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:196:in `extract_restore_archive'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:137:in `restore'
	from /usr/bin/gear:306:in `block (2 levels) in <main>'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in <top (required)>'
/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:131:in `block (2 levels) in oo_spawn': Shell command 'cat postgresql-8.4.tar.gz | /usr/bin/ssh -q -o 'BatchMode=yes' -o 'StrictHostKeyChecking=no' -i $OPENSHIFT_APP_SSH_KEY  795328795491401579102208@795328795491401579102208-qiuzhang.dev.rhcloud.com 'restore'' returned an error. rc=1 (OpenShift::Runtime::Utils::ShellExecutionException)
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:94:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:94:in `block in oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:93:in `pipe'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/utils/shell_exec.rb:93:in `oo_spawn'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-container-selinux-0.1.3/lib/openshift/runtime/containerization/selinux_container.rb:288:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container.rb:595:in `run_in_container_context'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:217:in `block in handle_scalable_restore'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:213:in `each'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:213:in `handle_scalable_restore'
	from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.12.3/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:140:in `restore'
	from /usr/bin/gear:306:in `block (2 levels) in <main>'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'
	from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in <top (required)>'
Error in trying to restore snapshot. You can try to restore manually by running:
cat pl1s.tar.gz | ssh 552608053280479020843008@pl2s-qiuzhang.dev.rhcloud.com 'restore INCLUDE_GIT'
Comment 16 openshift-github-bot 2013-07-30 14:26:16 EDT
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/b15682eb692db142058f33cd7f7ffa7936a700de
Fix bug 981584: skip restore for secondary gear group in scalable app if there is no appropriate snapshot
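
The idea in the commit message, skipping a secondary gear when the snapshot holds no archive for it, can be illustrated in shell as follows. The real change lives in the Ruby node code, so this is only a sketch: restore_secondary_gear is a hypothetical name, and the CART_NAME.tar.gz naming is inferred from the "cat: postgresql-8.4.tar.gz" line in the failure log.

```shell
# restore_secondary_gear CART_NAME RESTORE_CMD [ARGS...]
# Hypothetical sketch. Looks for CART_NAME.tar.gz in the current directory and
# feeds it to RESTORE_CMD on stdin only if the file exists. A missing archive
# is reported and treated as a skip instead of piping an empty stream.
restore_secondary_gear() {
    cart_name="$1"
    shift
    archive="${cart_name}.tar.gz"

    if [ ! -f "$archive" ]; then
        echo "Unable to restore ${cart_name} because it appears there is no snapshot for that type"
        return 0
    fi

    # In the failing runs RESTORE_CMD was the ssh ... 'restore' call to the gear.
    "$@" < "$archive"
}
```

Without the existence check, a missing file leaves the remote restore reading empty input, which matches the "gzip: stdin: unexpected end of file" error in the logs.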
Comment 17 Qiushui Zhang 2013-07-30 23:02:31 EDT
Verified on devenv_3588

I got the following output:

Restoring from snapshot pq1s.tar.gz...
Removing old git repo: ~/git/pq2s.git/
Removing old data dir: ~/app-root/data/*
Restoring ~/git/pq2s.git and ~/app-root/data
Unable to restore postgresql-8.4 because it appears there is no snapshot for that type

RESULT:
Success


Mark the defect as verified.
