Bug 1715898 - Disk space check during mongo storage upgrade to wiredtiger failing and dropping database [NEEDINFO]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Installer
Version: Unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.6.0
Assignee: Chris Roberts
QA Contact: Devendra Singh
URL: https://projects.theforeman.org/issue...
Whiteboard:
Duplicates: 1715960
Depends On:
Blocks:
 
Reported: 2019-05-31 14:50 UTC by Mike McCune
Modified: 2019-11-27 10:04 UTC
CC: 8 users

Fixed In Version: foreman-installer-1.22.0.12-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-22 12:47:37 UTC
Target Upstream Version:
desingh: needinfo? (chrobert)




Links
System ID Priority Status Summary Last Updated
Foreman Issue Tracker 27826 High Closed Disk space check during mongo storage upgrade to wiredtiger failing and dropping database 2019-12-03 15:39:58 UTC
Github theforeman foreman-installer pull 388 None closed Fixes #27826 - Update Mongo upgrade recovery logic 2019-12-03 15:39:58 UTC
Red Hat Knowledge Base (Solution) 4308411 None None None 2019-07-30 00:37:38 UTC
Red Hat Product Errata RHSA-2019:3172 None None None 2019-10-22 12:47:46 UTC

Description Mike McCune 2019-05-31 14:50:29 UTC
Running:

satellite-installer --upgrade-mongo-storage-engine

When this happens:

2019-05-29T15:59:51.497-0400    Failed: error writing data for collection `pulp_database.units_rpm` to disk: error writing to file: write /var/tmp/mongodb_engine_upgrade/pulp_database/units_rpm.bson: no space left on device
mongodump --host localhost --out /var/tmp/mongodb_engine_upgrade failed! Check the output for error!

Then immediately followed by:

mv: error writing ‘/var/tmp/mongodb_backup/diagnostic.data/metrics.2019-05-28T14-59-39Z-00000’: No space left on device
mv: failed to extend ‘/var/tmp/mongodb_backup/diagnostic.data/metrics.2019-05-28T14-59-39Z-00000’: No space left on device
mv /var/lib/mongodb/* /var/tmp/mongodb_backup failed! Check the output for error!
sed -i.bak -e 's/mmapv1/wiredTiger/g' /etc/opt/rh/rh-mongodb34/mongod.conf finished successfully!
mv: cannot create regular file ‘/var/tmp/mongodb_backup/mongod.conf.bak’: No space left on device
mv /etc/opt/rh/rh-mongodb34/mongod.conf.bak /var/tmp/mongodb_backup failed! Check the output for error!

Then:

mongorestore --host localhost --db=pulp_database --drop --quiet --dir=/var/tmp/mongodb_engine_upgrade/pulp_database failed! Check the output for error!

Result is no database.

Expectations:

1. There should be a check and warning that space is required in /var/tmp/mongodb_backup to hold the database
2. A mongorestore with --drop should ONLY be attempted if the prior steps complete successfully
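The two expectations could be sketched roughly as follows. This is a hedged illustration in the installer's language, not the shipped fix; the helper names (`free_kib`, `dir_kib`, `check_space!`) and the hard-coded paths are illustrative assumptions:

```ruby
# Free space in KiB on the filesystem holding `path`, parsed from POSIX `df -Pk`
# (column 4 of the last output line is "Available").
def free_kib(path)
  `df -Pk #{path}`.lines.last.split[3].to_i
end

# Size of a directory in KiB via `du -sk`, or 0 if it does not exist.
def dir_kib(path)
  return 0 unless File.directory?(path)
  `du -sk #{path}`.split[0].to_i
end

# Expectation 1: fail fast, before dumping, when /var/tmp cannot hold the DB.
def check_space!(db_dir: '/var/lib/mongodb', tmp: '/var/tmp')
  need = dir_kib(db_dir)
  if free_kib(tmp) < need
    abort "Not enough free space in #{tmp} (need #{need} KiB), exiting."
  end
end

# Expectation 2: the destructive restore runs ONLY if the dump succeeded.
def dump_then_restore
  check_space!
  system('mongodump', '--host', 'localhost',
         '--out', '/var/tmp/mongodb_engine_upgrade') or
    abort 'mongodump failed! Not running mongorestore --drop.'
  system('mongorestore', '--host', 'localhost', '--db=pulp_database', '--drop',
         '--dir=/var/tmp/mongodb_engine_upgrade/pulp_database')
end
```

The key ordering is that `abort` sits between the dump and the restore, so a failed `mongodump` can never be followed by a `--drop` of the live database.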

Where are you experiencing the behavior?  What environment?

Satellite 6.5 upgrading to wiredtiger

Comment 3 Andrew Schofield 2019-06-03 20:23:04 UTC
Error is in /usr/share/katello-installer-base/hooks/pre_validations/30-mongo_storage_engine.rb (from the satellite installer):


MONGO_DIR = '/var/lib/mongodb/'

and later:

mongo_size = File.directory?(MONGO_DIR) ? `du -s  #{@MONGO_DIR}`.split[0].to_i : 0
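The `@` prefix makes `@MONGO_DIR` an undefined instance variable, which interpolates to an empty string in the backtick command, so `du -s` silently measures the current working directory instead of the Mongo data directory. A minimal corrected sketch (referencing the constant, not the shipped patch):

```ruby
MONGO_DIR = '/var/lib/mongodb/'

# Size of the MongoDB data directory in KiB (`du -sk`), or 0 if it is absent.
mongo_size = File.directory?(MONGO_DIR) ? `du -sk #{MONGO_DIR}`.split[0].to_i : 0
```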

Comment 4 Chris Roberts 2019-06-21 18:20:57 UTC
*** Bug 1715960 has been marked as a duplicate of this bug. ***

Comment 10 Devendra Singh 2019-09-25 12:19:13 UTC
I have verified the space-related issue during the MongoDB storage engine upgrade and got a message like the one below.

# satellite-installer --upgrade-mongo-storage-engine
Starting disk space check for upgrade
There is not enough free space 219571076, the size of MongoDB database is 227753604, please add additional space to /var/tmp and try again, exiting.

Kindly confirm: is this what we expect?

Comment 11 Chris Roberts 2019-10-08 00:11:51 UTC
Hi Devendra,

Sorry for the late reply, I have been catching back up on things after the Katello meetup. 

I put this in the commit message upstream to show what we are solving with this:

- Removed moving the contents to the backup dir, since we already have a dump; this was causing large Mongo databases to quickly fill up /var/tmp

- Added better recovery steps in case of failure

The goal is to never get to a point where a customer has to worry about --drop, so we didn't account for it. One of the reasons a lot of customers were hitting the issue was that we were taking up 3x the disk space on the filesystem: (1) Mongo itself, (2) we were moving all the files over instead of running an rm -rf, and (3) we were doing a mongodump. This could easily run a customer out of disk space: a 500 GB MongoDB would need 1.5 TB of space. We have now done better in that area.
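The disk-pressure arithmetic above can be made concrete (the 500 GB figure is the illustrative size from this comment):

```ruby
db_gb = 500
# Before the fix: live data files + the mv'd backup copy + the mongodump output.
old_peak_gb = db_gb * 3
# After the fix: the backup move is gone, so only live data + the dump remain.
new_peak_gb = db_gb * 2
puts "old peak: #{old_peak_gb} GB, new peak: #{new_peak_gb} GB"
```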

The --drop was brought up because this bug was created from a copy of the customer's comment in a support case, but with better recovery we do not need to change the --drop. There was a reason it was needed as well; I would really have to dig through the history to find it, but that is why it was there in the first place.

Comment 12 Devendra Singh 2019-10-09 09:37:48 UTC
As per your comment, we now run the very first check, whether the system is ready for an upgrade, at the very beginning, which verifies the first expectation.
There is also an improvement in disk-space occupancy during the upgrade, which prevents this operation from consuming huge amounts of space.

Do we need to verify something related to mongorestore and the "--drop" option? If yes, what is the procedure?

If not, then I consider the verification of this bug complete.

Comment 13 Devendra Singh 2019-10-10 10:54:56 UTC
As discussed with Chris, the verification is already covered by the disk space check, which was the main fix of this bug.
I am now marking this issue as verified.

Comment 15 errata-xmlrpc 2019-10-22 12:47:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3172

