Bug 1715898

Summary:	Disk space check during mongo storage upgrade to wiredtiger failing and dropping database
Product:	Red Hat Satellite	Reporter:	Mike McCune <mmccune>
Component:	Installation	Assignee:	Chris Roberts <chrobert>
Status:	CLOSED ERRATA	QA Contact:	Devendra Singh <desingh>
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	Unspecified	CC:	andrew.schofield, chrobert, gpadholi, inecas, mbacovsk, pdwyer, saydas, zhunting
Target Milestone:	6.6.0	Keywords:	Triaged, Upgrades
Target Release:	Unused
Hardware:	Unspecified
OS:	Unspecified
URL:	https://projects.theforeman.org/issues/27826
Whiteboard:
Fixed In Version:	foreman-installer-1.22.0.12-1	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2019-10-22 12:47:37 UTC	Type:	Bug

Description Mike McCune 2019-05-31 14:50:29 UTC

Running:

satellite-installer --upgrade-mongo-storage-engine

When this happens:

2019-05-29T15:59:51.497-0400 Failed: error writing data for collection `pulp_database.units_rpm` to disk: error writing to file: write /var/tmp/mongodb_engine_upgrade/pulp_database/units_rpm.bson: no space left on device
mongodump --host localhost --out /var/tmp/mongodb_engine_upgrade failed! Check the output for error!

Then immediately followed by:

mv: error writing ‘/var/tmp/mongodb_backup/diagnostic.data/metrics.2019-05-28T14-59-39Z-00000’: No space left on device
mv: failed to extend ‘/var/tmp/mongodb_backup/diagnostic.data/metrics.2019-05-28T14-59-39Z-00000’: No space left on device
mv /var/lib/mongodb/* /var/tmp/mongodb_backup failed! Check the output for error!
sed -i.bak -e 's/mmapv1/wiredTiger/g' /etc/opt/rh/rh-mongodb34/mongod.conf finished successfully!
mv: cannot create regular file ‘/var/tmp/mongodb_backup/mongod.conf.bak’: No space left on device
mv /etc/opt/rh/rh-mongodb34/mongod.conf.bak /var/tmp/mongodb_backup failed! Check the output for error!

Then:

mongorestore --host localhost --db=pulp_database --drop --quiet --dir=/var/tmp/mongodb_engine_upgrade/pulp_database failed! Check the output for error!

Result is no database.

Expectations:

1. There should be a check and warning that space is required in /var/tmp/mongodb_backup to hold the database
2. A mongorestore with --drop should ONLY be attempted if the prior steps complete successfully
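
The two expectations above can be sketched roughly as follows. This is only an illustration, not the installer's code: enough_space? and run_steps are hypothetical names, and the runner is passed in so the logic can be exercised without touching a real database.

```ruby
# Hypothetical sketch; none of these names come from the installer.

# Expectation 1: only proceed when the free space (in KB) on the target
# filesystem is larger than the size of the database (in KB).
def enough_space?(free_kb, db_kb)
  free_kb > db_kb
end

# Expectation 2: run the steps in order and stop at the first failure, so a
# later destructive step (such as mongorestore --drop) is never attempted
# after an earlier step has failed. Returns the steps that were attempted.
def run_steps(steps, runner)
  attempted = []
  steps.each do |step|
    attempted << step
    return attempted unless runner.call(step)
  end
  attempted
end
```

In real use the runner could be something like ->(cmd) { system(cmd) }, which returns a falsy value when the command exits non-zero or cannot be run.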

Where are you experiencing the behavior? What environment?

Satellite 6.5 upgrading to wiredtiger

Comment 3 Andrew Schofield 2019-06-03 20:23:04 UTC

Error is in /usr/share/katello-installer-base/hooks/pre_validations/30-mongo_storage_engine.rb (from the satellite installer):


MONGO_DIR = '/var/lib/mongodb/'

and later:

mongo_size = File.directory?(MONGO_DIR) ? `du -s  #{@MONGO_DIR}`.split[0].to_i : 0
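
To illustrate why that line is a problem, here is a minimal sketch. It assumes nothing in the hook ever assigns @MONGO_DIR (only the two lines above are quoted from the hook); the method names are hypothetical. In Ruby an unassigned instance variable evaluates to nil, which interpolates as an empty string, so the command loses its path argument, and du with no path argument reports on the current working directory instead of /var/lib/mongodb/.

```ruby
MONGO_DIR = '/var/lib/mongodb/'

# As quoted from the hook: @MONGO_DIR is an instance variable that was never
# assigned, so it is nil and interpolates as an empty string.
def buggy_command
  "du -s  #{@MONGO_DIR}"
end

# Referencing the constant keeps the path in the command.
def fixed_command
  "du -s #{MONGO_DIR}"
end

puts buggy_command.inspect
puts fixed_command.inspect
```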

Comment 4 Chris Roberts 2019-06-21 18:20:57 UTC

*** Bug 1715960 has been marked as a duplicate of this bug. ***

Comment 10 Devendra Singh 2019-09-25 12:19:13 UTC

I have verified the space-related issue during the MongoDB storage engine upgrade and now get the message below.

# satellite-installer --upgrade-mongo-storage-engine
Starting disk space check for upgrade
There is not enough free space 219571076, the size of MongoDB database is 227753604, please add additional space to /var/tmp and try again, exiting.

Kindly confirm: is this what we expect?

Comment 11 Chris Roberts 2019-10-08 00:11:51 UTC

Hi Devendra,

Sorry for the late reply, I have been catching back up on things after the Katello meetup. 

I put this in the commit message upstream to show what we are solving with this:

- Removed moving the contents to the backup dir since we did a dump, this was causing large mongo databases to quickly fill up /var/tmp

- Added better recovery steps in case of failure

The goal is to never reach a point where a customer has to worry about --drop, so we didn't account for it. One of the reasons many customers were hitting the issue was that we were taking up three times the disk space on the filesystem: (1) MongoDB itself, (2) we were moving all the files over instead of running an rm -rf, and (3) we were doing a mongodump. This could easily run a customer out of disk space: a 500 GB MongoDB would need 1.5 TB of space. We now do better in that area.
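
As a rough worked example of that arithmetic (a sketch only, assuming as the paragraph above does that every copy is about the same size as the database; peak_usage_gb is a hypothetical helper):

```ruby
# Rough peak disk usage in GB, assuming each copy is about as large as the
# database itself. In practice a mongodump may differ in size from the
# on-disk data, so treat this as an estimate only.
def peak_usage_gb(db_gb, copies)
  db_gb * copies
end

# Old behaviour: live data + files moved to the backup dir + mongodump.
before_fix = peak_usage_gb(500, 3)

# With the move to the backup dir removed: live data + mongodump.
after_fix = peak_usage_gb(500, 2)

puts "before: #{before_fix} GB, after: #{after_fix} GB"
```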

The --drop was brought up because the description is a copy of the customer comment in the case from which this bug was created, but with better recovery we do not need to change the --drop. There was also a reason it was needed. I would really have to dig through the history to find it, but that is why it was there in the first place.

Comment 12 Devendra Singh 2019-10-09 09:37:48 UTC

As per your comment, the check on whether the system is ready for an upgrade now runs at the very beginning, which verifies the first expectation.
The other improvement reduces disk-space usage during the upgrade, so the operation no longer consumes a huge amount of space.

Do we need to verify anything related to mongorestore and the "--drop" option? If yes, what is the procedure?

If not, then I consider the verification of this bug complete.

Comment 13 Devendra Singh 2019-10-10 10:54:56 UTC

As discussed with Chris, the verification is already covered by the disk space check, which was the main fix for this bug.
Now I am marking this issue as verified.

Comment 15 errata-xmlrpc 2019-10-22 12:47:37 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3172