Bug 1703278
| Field | Value |
|---|---|
| Summary | Scheduled NFS backups are not written directly to the NFS location; they are first staged in /tmp and then moved to NFS storage. |
| Product | Red Hat CloudForms Management Engine |
| Component | Appliance |
| Version | 5.10.3 |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | high |
| Target Milestone | GA |
| Target Release | 5.11.0 |
| Fixed In Version | 5.11.0.5 |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Nikhil Gupta <ngupta> |
| Assignee | Nick LaMuro <nlamuro> |
| QA Contact | Jaroslav Henner <jhenner> |
| Docs Contact | Red Hat CloudForms Documentation <cloudforms-docs> |
| CC | abellott, dmetzger, jhenner, jocarter, michael.moir, mshriver, nlamuro, obarenbo, sigbjorn, simaishi |
| Keywords | TestOnly, ZStream |
| Type | Bug |
| Cloudforms Team | CFME Core |
| Doc Type | If docs needed, set a value |
| Clones | 1717025 (view as bug list) |
| Bug Depends On | 1732808 |
| Bug Blocks | 1704905, 1717025 |
| Last Closed | 2019-12-13 14:55:27 UTC |
| Attachments | evm.log (attachment 1606635) |
Description (Nikhil Gupta, 2019-04-26 01:22:32 UTC)
Comment from Nick LaMuro:

Have a proposed fix for this issue pushed: https://github.com/ManageIQ/manageiq/pull/18745/files

Working on validating this fixes things and getting it reviewed.

-Nick

Comment from Jerry:

Dennis, I'm not sure why this was reassigned to me, since it appears from reading through all the comments that Nick L already has a fix merged and he is just waiting for that PR to be backported before this BZ can be moved to POST.

Jerry

Comment from Jaroslav Henner:

I tried to use inotifywait to watch the events happening in the /tmp dir on cfme-5.11.0.15-1.el8cf.x86_64. I did find that some files are still created in /tmp:

```
# inotifywait -mr /tmp
...
/tmp/ CREATE,ISDIR miq_20190723-7895-1vb9rj4
/tmp/ OPEN,ISDIR miq_20190723-7895-1vb9rj4
/tmp/ ACCESS,ISDIR miq_20190723-7895-1vb9rj4
/tmp/ CLOSE_NOWRITE,CLOSE,ISDIR miq_20190723-7895-1vb9rj4
/tmp/ CREATE 20190723-7895-k9a36r
/tmp/ OPEN 20190723-7895-k9a36r
/tmp/ CREATE 20190723-7895-10k4lz1
/tmp/ OPEN 20190723-7895-10k4lz1
/tmp/ OPEN 20190723-7895-k9a36r
/tmp/ MODIFY 20190723-7895-10k4lz1
/tmp/ CLOSE_WRITE,CLOSE 20190723-7895-10k4lz1
/tmp/ CLOSE_WRITE,CLOSE 20190723-7895-k9a36r
/tmp/ OPEN 20190723-7895-10k4lz1
/tmp/ ACCESS 20190723-7895-10k4lz1
/tmp/ CLOSE_NOWRITE,CLOSE 20190723-7895-10k4lz1
/tmp/ DELETE 20190723-7895-10k4lz1
/tmp/ CLOSE_WRITE,CLOSE 20190723-7895-k9a36r
/tmp/ DELETE 20190723-7895-k9a36r
/tmp/miq_20190723-7895-1vb9rj4/ DELETE_SELF
/tmp/ DELETE,ISDIR miq_20190723-7895-1vb9rj4
```

But it is true that on cfme-5.10.7.1-1.el7cf.x86_64 there are more events:

```
[root@dhcp-8-198-222 ~]# inotifywait -mr /tmp/ | tee files
Setting up watches. Beware: since -r was given, this may take a while!
Watches established.
/tmp/ CREATE,ISDIR miq_20190723-12303-ba57n7
/tmp/ OPEN,ISDIR miq_20190723-12303-ba57n7
/tmp/ CLOSE_NOWRITE,CLOSE,ISDIR miq_20190723-12303-ba57n7
/tmp/ CREATE 20190723-12303-131ag0m
/tmp/ OPEN 20190723-12303-131ag0m
/tmp/ CREATE 20190723-12303-1ugpteq
/tmp/ OPEN 20190723-12303-1ugpteq
/tmp/ OPEN 20190723-12303-131ag0m
/tmp/ MODIFY 20190723-12303-131ag0m
/tmp/ ACCESS 20190723-12303-131ag0m
/tmp/ MODIFY 20190723-12303-131ag0m
/tmp/ ACCESS 20190723-12303-131ag0m
...
*** many more times ***
...
/tmp/ MODIFY 20190723-12303-131ag0m
/tmp/ ACCESS 20190723-12303-131ag0m
/tmp/ CLOSE_WRITE,CLOSE 20190723-12303-1ugpteq
/tmp/ CLOSE_WRITE,CLOSE 20190723-12303-131ag0m
/tmp/ DELETE 20190723-12303-1ugpteq
/tmp/ CLOSE_WRITE,CLOSE 20190723-12303-131ag0m
/tmp/ DELETE 20190723-12303-131ag0m
/tmp/miq_20190723-12303-ba57n7/ DELETE_SELF
/tmp/ DELETE,ISDIR miq_20190723-12303-ba57n7
```

Maybe I am too strict here, but while the inotify test is certainly not bad, it is not completely conclusive, because I am not sure what file name to search for in the inotify output to prove the file is not created there. So to really test this, I need to get a big DB, restore it on some appliance, and then schedule the backup. Then I will see whether it fails by filling /tmp completely.

Comment from Nick LaMuro:

First off: I am all for you testing with a big database. I didn't have one available when I was doing the refactoring for this last summer, so it would be nice for this to get a proper stress test.

That said, the only thing that should be in the `/tmp` dir is a FIFO file that is in charge of streaming the data from the output of `pg_dump`/`pg_basebackup` to the input of the file it is being sent to. This was done in Ruby to allow it to be streamed in the same fashion across all backup endpoints. So what you might be seeing is the FIFO being hit a bunch, but nothing should be committed to disk long term.
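To make the FIFO approach concrete, here is a minimal Ruby sketch of the streaming pattern Nick describes. It is an illustration under assumptions only: the destination path, database name, and `pg_dump` invocation are placeholders, not the actual ManageIQ implementation (which lives behind the file-storage interfaces visible in the stack traces below).

```ruby
require "tmpdir"

# Sketch: stream a pg_dump through a FIFO so the dump bytes never
# persist in /tmp; only the pipe node itself lives there briefly.
Dir.mktmpdir("miq_") do |tmp|          # yields a dir like /tmp/miq_20190723-...
  fifo = File.join(tmp, "backup_fifo")
  File.mkfifo(fifo)                    # a named pipe, not a regular file

  writer = Thread.new do
    # pg_dump writes into the FIFO instead of a regular file
    system("pg_dump", "--format", "custom", "--file", fifo, "vmdb_production")
  end

  # The reader end copies straight to the final destination (NFS here),
  # so no intermediate copy of the dump is ever committed to local disk.
  File.open(fifo, "rb") do |src|
    File.open("/mnt/nfs_backup/region_34.backup", "wb") do |dest|
      IO.copy_stream(src, dest)
    end
  end
  writer.join
end
```

Under this pattern, running `find /tmp -type p` during a backup (or `ls -lF`, which suffixes FIFOs with `|`) would confirm that the entry under `/tmp/miq_*` is a pipe rather than a growing regular file, which matches Nick's point that inotify hits on `/tmp` need not mean data is being committed there.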
Comment from Jaroslav Henner:

But I am not an expert with `inotifywait`, so unsure.

It didn't work. I got:

```
[----] W, [2019-08-21T11:35:22.876992 #9628:2afa1dfbca24] WARN -- : MIQ(EvmDatabaseOps.validate_free_space) Destination location: [/tmp/miq_20190821-9628-1vfpo3o/db_backup/region_34/test/region_34_20190821_153522.backup], does not have enough free disk space: [9221177344 bytes] for database of size: [12932101607 bytes]
...
/var/www/miq/vmdb/lib/evm_database_ops.rb:41:in `validate_free_space': Destination location: [/tmp/miq_20190821-9628-1vfpo3o/db_backup/region_34/test/region_34_20190821_153522.backup], does not have enough free disk space: [9221177344 bytes] for database of size: [12932101607 bytes] (MiqException::MiqDatabaseBackupInsufficientSpace)
...
[----] E, [2019-08-21T11:35:23.017358 #9628:2afa176a85c4] ERROR -- : MIQ(MiqQueue#deliver) Message id: [34000043942339], Error: [undefined method `path' for nil:NilClass]
[----] E, [2019-08-21T11:35:23.017836 #9628:2afa176a85c4] ERROR -- : [NoMethodError]: undefined method `path' for nil:NilClass  Method:[block (2 levels) in <class:LogProxy>]
[----] E, [2019-08-21T11:35:23.018249 #9628:2afa176a85c4] ERROR -- : /opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/mount/miq_generic_mount_session.rb:493:in `source_for_log'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/mount/miq_generic_mount_session.rb:265:in `rescue in add'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/mount/miq_generic_mount_session.rb:261:in `add'
/var/www/miq/vmdb/lib/evm_database_ops.rb:160:in `block in with_file_storage'
/opt/rh/cfme-gemset/bundler/gems/cfme-gems-pending-ca1c762f8036/lib/gems/pending/util/miq_file_storage.rb:31:in `with_interface_class'
/var/www/miq/vmdb/lib/evm_database_ops.rb:133:in `with_file_storage'
/var/www/miq/vmdb/lib/evm_database_ops.rb:57:in `backup'
/var/www/miq/vmdb/app/models/database_backup.rb:53:in `_backup'
/var/www/miq/vmdb/app/models/database_backup.rb:37:in `backup'
/var/www/miq/vmdb/app/models/database_backup.rb:14:in `backup'
/var/www/miq/vmdb/app/models/miq_queue.rb:479:in `block in dispatch_method'
/usr/share/ruby/timeout.rb:93:in `block in timeout'
/usr/share/ruby/timeout.rb:33:in `block in catch'
/usr/share/ruby/timeout.rb:33:in `catch'
```

Created attachment 1606635 [details]
evm.log
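For context, here is a rough sketch of what a free-space guard like the `EvmDatabaseOps.validate_free_space` in the warning above could look like. The method name and exception class come from the trace; the body, the `sys-filesystem` gem, and the sample arguments are assumptions, not the real `lib/evm_database_ops.rb` code.

```ruby
require "sys/filesystem"  # assumption: any statvfs wrapper would work here

module MiqException  # stub so the sketch stands alone
  MiqDatabaseBackupInsufficientSpace = Class.new(StandardError)
end

# Fail fast when the dump destination cannot hold a dump of the whole DB.
def validate_free_space(destination, db_size_bytes)
  stat = Sys::Filesystem.stat(File.dirname(destination))
  free = stat.block_size * stat.blocks_available
  return if free >= db_size_bytes

  raise MiqException::MiqDatabaseBackupInsufficientSpace,
        "Destination location: [#{destination}], does not have enough free " \
        "disk space: [#{free} bytes] for database of size: [#{db_size_bytes} bytes]"
end

validate_free_space("/tmp/db_backup/region_34.backup", 12_932_101_607)
```

Note that the failing destination in the log resolves under `/tmp/miq_*`, so the 9,221,177,344-byte free-space figure appears to describe the appliance's local disk rather than the NFS share; Nick's questions below are aimed at exactly that distinction.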
Comment from Nick LaMuro:

Jaroslav,

Can you provide some additional information about the error?

1. The command that was being run
2. The output of `df -P LOCATION_OF_DB_DUMP_DIR`, or similar
3. The type of mount (I assume NFS, but unsure)
4. The version of MIQ/CFME you are now testing with

Based on the error you provided above, it seems to be working as expected: only about 9 GB are free on the location it is targeting, while the DB requires about 12 GB. I just don't know whether that location is `/tmp` or the share.

Thanks,

-Nick
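A small hedged sketch of how items 2 and 3 could be answered, i.e. how to tell whether the dump directory lives on `/tmp` or on the share; both paths below are illustrative assumptions:

```ruby
# Hypothetical paths: adjust to the appliance being debugged.
dump_dir  = "/tmp"        # where the failed backup staged its files
nfs_mount = "/mnt/nfs"    # assumption: wherever the backup share is mounted

puts `df -P #{dump_dir}`  # the output Nick asks for in item 2

# Comparing device IDs distinguishes /tmp from the share without parsing df:
# different IDs mean the dump directory is on a different filesystem.
if File.stat(dump_dir).dev == File.stat(nfs_mount).dev
  puts "dump dir and share are on the same filesystem"
else
  puts "dump dir and share are on different filesystems"
end
```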