Bug 1414815 - 5.6.2.2-1 to 5.7.0.17-1 upgrade and running bundle corrupts gemset Gemfile.lock
Summary: 5.6.2.2-1 to 5.7.0.17-1 upgrade and running bundle corrupts gemset Gemfile.lock
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.7.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: GA
Target Release: 5.9.0
Assignee: Gregg Tanzillo
QA Contact: Dave Johnson
URL:
Whiteboard: upgrade
Duplicates: 1483659 1485741
Depends On:
Blocks: 1416998
Reported: 2017-01-19 14:31 UTC by Felix Dewaleyne
Modified: 2020-12-14 08:01 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1416998 (view as bug list)
Environment:
Last Closed: 2017-09-08 19:15:52 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:



Description Felix Dewaleyne 2017-01-19 14:31:57 UTC
Created attachment 1242507 [details]
output_yumupdate.txt

Description of problem:
upgrade from 5.6.2.2-1 to 5.7.0.17-1 fails due to a json gem

Version-Release number of selected component (if applicable):
5.6.2.2-1 to 5.7.0.17-1

How reproducible:
customer environment

Steps to Reproduce:
1. follow steps of https://access.redhat.com/documentation/en/red-hat-cloudforms/4.2/single/migrating-to-red-hat-cloudforms-42/
2. reach yum upgrade and finish it
3.

Actual results:
after yum upgrade :
[root@sample ~]# vmdb
[root@sample vmdb]# bin/rake db:migrate
Your bundle is locked to json (2.0.2), but that version could not be found in any of the sources listed in your Gemfile. If you haven't changed sources, that means the author of json (2.0.2) has removed it. You'll need to update your bundle to a different version of json (2.0.2) that hasn't been removed in order to install.
Run `bundle install` to install missing gems.
[root@sample vmdb]# bundle install
/opt/rh/rh-ruby23/root/usr/bin/ruby: error while loading shared libraries: libruby.so.2.3: cannot open shared object file: No such file or directory


Expected results:
db:migrate completes

Additional info:
reproduction in progress
no additional gems installed for customisations according to the customer.
output of yum update and error attached to the case.

Comment 2 Felix Dewaleyne 2017-01-19 14:32:40 UTC
additional note : the upgrade was rolled back.

Comment 4 Joe Rafaniello 2017-01-19 15:11:15 UTC
Did we log out and log back in after the upgrade?  Upgrading the packages changes the ruby environment, but the current shell still holds the old one, so we need to exit and start a new shell to reload it.

https://access.redhat.com/documentation/en/red-hat-cloudforms/4.2/paged/migrating-to-red-hat-cloudforms-42/

We need ruby -v to say 2.3.  
step 3

1.4.1. Migrating VMDB Appliances from CFME 5.6 to 5.7

Perform the following steps on your CFME 5.6 VMDB appliances to migrate to CFME 5.7. Ensure you wait for each command to finish before going to the next step:

1. Connect to the appliance using SSH.
2. Update packages:

# yum update
3. Log out of the appliance, then log back in to fully reload Ruby.


Let me know if this is not what happened.  Please provide the rpm package list from the log directory, and tell us what 'ruby -v' shows when this error occurs.
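
The 'ruby -v' check requested above can be scripted. This is only a sketch: the function name ruby_is_23 is made up for illustration, and it assumes the usual "ruby 2.3.x..." output format of 'ruby -v'.

```shell
# Succeeds (exit 0) when the given `ruby -v` output reports a 2.3.x interpreter.
# ruby_is_23 is a hypothetical helper name, not something shipped with CFME.
ruby_is_23() {
  case "$1" in
    "ruby 2.3."*) return 0 ;;
    *)            return 1 ;;
  esac
}

# Usage on the appliance after logging back in (not executed here):
#   ruby_is_23 "$(ruby -v 2>&1)" || echo "Ruby 2.3 is not active; log out and log back in."
```

Capturing stderr as well means a stale shell, where ruby itself fails to start, is also reported as "not 2.3".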

Comment 5 Felix Dewaleyne 2017-01-19 17:21:14 UTC
(In reply to Joe Rafaniello from comment #4)
> Did we logout and log back in after upgrade?  The ruby environment in the
> current shell is upgraded when upgrading the packages so we need to reload
> the shell by exiting and reloading the shell.
> 
> https://access.redhat.com/documentation/en/red-hat-cloudforms/4.2/paged/
> migrating-to-red-hat-cloudforms-42/
> 
> We need ruby -v to say 2.3.  
> step 3
> 
> 1.4.1. Migrating VMDB Appliances from CFME 5.6 to 5.7
> 
> Perform the following steps on your CFME 5.6 VMDB appliances to migrate to
> CFME 5.7. Ensure you wait for each command to finish before going to the
> next step:
> 
> 1. Connect to the appliance using SSH.
> 2. Update packages:
> 
> # yum update
> 3. Log out of the appliance, then log back in to fully reload Ruby.
> 
> 
> Let me know if this is not what happened.  Please provide the rpm packages
> list in the log directory, what does 'ruby -v' show when this error occurs.

I believe the customer did not do that, as the attachment suggests.

Comment 6 Felix Dewaleyne 2017-01-19 17:25:00 UTC
Wouldn't it be possible to avoid those issues with a source command that forces the session to reload?

Comment 7 Joe Rafaniello 2017-01-19 18:39:43 UTC
Hi Felix,

I know we tried several ways to force reloading the shell after rpm upgrade but were unable to get the whole SCL environment to "reload" in that current shell cleanly.  Can you open an RFE for us to research other solutions to avoid this problem?  I'm sure there's a way, I'm just hoping it's something we can do seamlessly.

I'm marking this as a duplicate of case 1378400 since the reported problems are the same.  We can re-open if this is not the case.

Thanks!
Joe

*** This bug has been marked as a duplicate of bug 1378400 ***

Comment 8 Joe Rafaniello 2017-01-19 22:18:58 UTC
Re-opening this bug since it's slightly different.  Failing to logoff/login in the shell can lead to the "Your bundle is locked..." message, but just running bundle install can also corrupt the cfme-gemset rpm's files, and logoff/login won't fix that.

Summary: If you run bundle install and then get the "Your bundle is locked..." error, you can repair this after logoff/login using:

yum reinstall cfme-gemset


This recipe can cause this problem:

yum update...
vmdb
bin/rake db:migrate (fails due to forgetting to logout/login to reload the ruby environment)
bundle install (since the failure above asks you to bundle)
logout/login (since we forgot)
bundle install (since it failed before)... still fails with "Your bundle is locked..."

If we look, the gemset rpm is modified:

[root@localhost vmdb]# rpm -V cfme-gemset
S.5....T.    /opt/rh/cfme-gemset/vmdb/.bundle/config
S.5....T.    /opt/rh/cfme-gemset/vmdb/Gemfile.lock


yum reinstall cfme-gemset fixes this problem.

We need to make this less error prone.
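
As a sketch of how the check and repair above could be combined: the helper below reads 'rpm -V' output on stdin and succeeds only if the gemset's Gemfile.lock is flagged as modified. The name gemset_lock_modified is invented for this example; the path is the one shown in the rpm -V output above.

```shell
# Reads `rpm -V cfme-gemset` output on stdin.
# Succeeds (exit 0) if the gemset's Gemfile.lock is listed as modified.
# gemset_lock_modified is a hypothetical helper name.
gemset_lock_modified() {
  grep -Eq '[[:space:]]/opt/rh/cfme-gemset/vmdb/Gemfile\.lock$'
}

# Usage on the appliance (not executed here):
#   rpm -V cfme-gemset | gemset_lock_modified && yum reinstall cfme-gemset
```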

Comment 9 Joe Rafaniello 2017-01-19 22:37:26 UTC
At the very least, the documentation should say not to run bundle since we use yum to install gems.

Comment 10 Joe Rafaniello 2017-01-19 23:06:06 UTC
I wonder if we can try the --frozen option in the gemset so bundle install will refuse to touch the Gemfile.lock: http://bundler.io/v1.13/man/bundle-install.1.html

Comment 11 Joe Rafaniello 2017-01-20 19:10:26 UTC
We have two proposed changes that need further research and testing to see whether they fix this issue, where bundle modifies/corrupts the Gemfile.lock in a way that can only be repaired by reinstalling the cfme-gemset rpm.

override_gem is a feature upstream that allows us to not modify the Gemfile while allowing us to use slightly different gems upstream/downstream, such as for CVEs or backports of bugfixes.  That should cause a `bundle install` or just `bundle` to leave the Gemfile.lock as it is with no changes.  Currently, there are changes in the dependency constraints for our downstream Gemfile.  override_gem should allow us to have these changes but not change the Gemfile.lock.

The bundler frozen feature will lock our Gemfile.lock so even if there are changes to be done, it will refuse to do them.  This is necessary since all of our dependencies downstream are done through rpms and not through gems on rubygems.org.
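
To illustrate the frozen idea, here is a rough sketch of a check for whether a .bundle/config already records the setting. It assumes Bundler stores it as a BUNDLE_FROZEN key with a value of "true" or "1"; the helper name bundle_is_frozen is hypothetical, and how the gemset rpm would actually ship the setting is still open.

```shell
# Succeeds (exit 0) if the given .bundle/config file has the frozen setting on.
# Assumption: the setting is stored as BUNDLE_FROZEN with value true or 1,
# optionally quoted. bundle_is_frozen is a hypothetical helper name.
bundle_is_frozen() {
  grep -Eq "^[[:space:]]*BUNDLE_FROZEN:[[:space:]]*[\"']?(true|1)[\"']?[[:space:]]*\$" "$1" 2>/dev/null
}

# Usage from the vmdb directory (not executed here):
#   bundle_is_frozen .bundle/config || echo "bundle is not frozen"
# One possible way to record the setting, assuming the installed bundler
# accepts it as a config key:
#   bundle config --local frozen true
```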

Comment 12 Nick LaMuro 2017-01-23 17:07:19 UTC
These are just some comments based on the above:


Not sure about the override_gem, and I will have to do some research to confirm, but I am pretty sure that the `override_gem` feature we have:

https://github.com/ManageIQ/manageiq/blob/master/Gemfile#L152-L156

Is going to just modify the dependency list before it resolves them, allowing a different dependency to be used when one already exists.  As far as I can tell, `override_gem` isn't going to change how the Gemfile.lock is changed to cause this bug in the first place, but could be used in a way to generate the gemset RPM in a way that doesn't conflict with upstream.  But maybe I am interpreting the usages incorrectly.


On the other hand, the `--frozen` feature might be a good way of handling this, and prevent bundler from modifying the gemset by mistake.

Comment 14 Felix Dewaleyne 2017-01-25 10:42:00 UTC
Should this case become the RFE on the topic, e.g. "prevent gem bundle from overriding the gems required by cloudforms" or something similar? Or do you mean to open an entirely new case to deal with console reloads in the first place? I wonder whether running a new instance of the shell would be enough.

Comment 15 Joe Rafaniello 2017-01-25 18:15:57 UTC
Thanks Felix.  I marked the needinfo before we found the circumstances in which a user could easily run into this.  I have updated the description to reflect that during upgrade it's easy to be told to run bundle, which will invalidate the Gemfile.lock shipped by our gemset rpm.

We'll fix that issue in this BZ.  The hope is we will no longer have resolution differences to warrant a .lock file and also pass the --frozen option so that bundler will refuse to modify this file.

Comment 16 Joe Rafaniello 2017-01-25 18:20:38 UTC
Reassigning to build, although this spans both the appliance and build areas.

The goal of this BZ is to lock the bundle we ship in the gemset so that bundler will refuse to modify it.  We'll still need users to logoff/logon after upgrade to load the new ruby environment.

Comment 17 Josh Carter 2017-04-03 17:32:58 UTC
*** Bug 1436398 has been marked as a duplicate of this bug. ***

Comment 18 Satoe Imaishi 2017-07-28 17:30:33 UTC
I tried following the upgrade scenario mentioned in Comment 8, but I'm not able to reproduce the issue where Gemfile.lock gets corrupted.  After performing all the steps, the only file modified was .bundle/config and Gemfile.lock stayed intact.

There is one 'issue' I see regarding .bundle/config, but this isn't specific to migration and should be tracked separately. On CFME appliances (migrated or not), running 'bundle install' adds:

  BUNDLE_DISABLE_SHARED_GEMS: "true"

to .bundle/config, which is the expected 'bundle' behavior.

However, that's problematic for us because our gems are in multiple locations, and with BUNDLE_DISABLE_SHARED_GEMS=true some gems will not be found. If you run 'bundle check' at this point, it will complain that some gems are missing, and if you try to run appliance_console, it will fail.

If you remove the BUNDLE_DISABLE_SHARED_GEMS line, you can run 'bundle check' and appliance_console again. Even if you set the value to 'false', running 'bundle install' again will put the value back to 'true' - that's what 'bundle' has always done and is expected to do. More about this can be found in https://bugzilla.redhat.com/show_bug.cgi?id=1225662#c15.
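
A minimal sketch of the line removal described above, assuming the setting sits on its own line in .bundle/config as shown. The function name remove_disable_shared_gems is made up for this example.

```shell
# Deletes any BUNDLE_DISABLE_SHARED_GEMS line from the given .bundle/config.
# The file is rewritten in place via a temporary copy so its permissions
# and ownership are kept. remove_disable_shared_gems is a hypothetical name.
remove_disable_shared_gems() {
  config_file="$1"
  tmp_file="$(mktemp)" || return 1
  if sed '/^[[:space:]]*BUNDLE_DISABLE_SHARED_GEMS:/d' "$config_file" > "$tmp_file"; then
    cat "$tmp_file" > "$config_file"
    status=$?
  else
    status=1
  fi
  rm -f "$tmp_file"
  return "$status"
}

# Usage from the vmdb directory (not executed here):
#   remove_disable_shared_gems .bundle/config && bundle check
```

This is a workaround only: the next 'bundle install' puts the line back.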

We should probably close this BZ and open a new BZ to disable(?) running the bundle install/update commands.  Sending back to Appliance team so they can decide what to do...

Comment 21 Dave Johnson 2017-09-08 19:15:52 UTC
I agree and am closing.

Comment 22 Yuri Rudman 2017-10-04 20:38:57 UTC
*** Bug 1485741 has been marked as a duplicate of this bug. ***

Comment 23 Joe Rafaniello 2018-02-27 21:05:06 UTC
*** Bug 1483659 has been marked as a duplicate of this bug. ***

