Description of problem:

When running the engine-backup tool with scope=all (which includes the DWH db), the operation fails if /tmp has too little space for the process. No warning about this is provided to the user in the engine-backup log.

The /tmp folder is used as the temporary folder in which the dump file is created. In our case:

1. The DWH db size on the DWH VM is 33GB:

[root@rhev-red-01-dwh ~]# du -sh /var/lib/pgsql/data/
33G /var/lib/pgsql/data/

2. The /tmp folder size on the engine is 2.1GB:

[root@rhev-red-01 ~]# df -Th /tmp/
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/ovirt-tmp xfs   2.0G   47M  2.0G   3% /tmp

3. We have checked that in this case, with a DWH db size of 33GB, the minimum required space for the /tmp folder is 4.1GB.

The engine-backup output gives no relevant error about low space for this operation:

engine-backup --scope=all --mode=backup --file=rhev-red-01_backup_03102021_all.tar --log=rhev-red-01_backup_03102021_all.log
Start of engine-backup with mode 'backup'
scope: all
archive file: rhev-red-01_backup_03102021_all.tar
log file: rhev-red-01_backup_03102021_all.log
Backing up:
Notifying engine
- Files
- Engine database 'engine'
- DWH database 'ovirt_engine_history'
Notifying engine
FATAL: Database ovirt_engine_history backup failed

From the backup log:

2021-10-03 09:12:21 3945450: Start of engine-backup mode backup scope all file rhev-red-01_backup_03102021_all.tar
2021-10-03 09:12:21 3945450: OUTPUT: Start of engine-backup with mode 'backup'
2021-10-03 09:12:21 3945450: OUTPUT: scope: all
2021-10-03 09:12:21 3945450: OUTPUT: archive file: rhev-red-01_backup_03102021_all.tar
2021-10-03 09:12:21 3945450: OUTPUT: log file: rhev-red-01_backup_03102021_all.log
2021-10-03 09:12:21 3945450: OUTPUT: Backing up:
2021-10-03 09:12:21 3945450: Generating pgpass
2021-10-03 09:12:21 3945450: OUTPUT: Notifying engine
2021-10-03 09:12:21 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('files', now(), 0, 'engine-backup: Backup Started, scope=files, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:12:21 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('db', now(), 0, 'engine-backup: Backup Started, scope=db, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:12:21 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('dwhdb', now(), 0, 'engine-backup: Backup Started, scope=dwhdb, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:12:21 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('cinderlib', now(), 0, 'engine-backup: Backup Started, scope=cinderlib, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:12:21 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('grafanadb', now(), 0, 'engine-backup: Backup Started, scope=grafanadb, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:12:21 3945450: Creating temp folder /tmp/engine-backup.nRZ4XZ7gmn/tar
2021-10-03 09:12:21 3945450: OUTPUT: - Files
2021-10-03 09:12:21 3945450: Backing up files to /tmp/engine-backup.nRZ4XZ7gmn/tar/files
2021-10-03 09:12:41 3945450: OUTPUT: - Engine database 'engine'
2021-10-03 09:12:41 3945450: Backing up database to /tmp/engine-backup.nRZ4XZ7gmn/tar/db/engine_backup.db
2021-10-03 09:12:41 3945450: pg_cmd running: pg_dump -w -U engine -h localhost -p 5432 engine -E UTF8 --disable-dollar-quoting --disable-triggers --format=custom
2021-10-03 09:12:50 3945450: OUTPUT: - DWH database 'ovirt_engine_history'
2021-10-03 09:12:50 3945450: Backing up dwh database to /tmp/engine-backup.nRZ4XZ7gmn/tar/db/dwh_backup.db
2021-10-03 09:12:50 3945450: pg_cmd running: pg_dump -w -U ovirt_engine_history -h 172.29.91.192 -p 5432 ovirt_engine_history -E UTF8 --disable-dollar-quoting --disable-triggers --format=custom
2021-10-03 09:16:12 3945450: FATAL: Database ovirt_engine_history backup failed
2021-10-03 09:16:12 3945450: OUTPUT: Notifying engine
2021-10-03 09:16:12 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('files', now(), -1, 'engine-backup: Database ovirt_engine_history backup failed, scope=files, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:16:12 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('db', now(), -1, 'engine-backup: Database ovirt_engine_history backup failed, scope=db, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:16:12 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('dwhdb', now(), -1, 'engine-backup: Database ovirt_engine_history backup failed, scope=dwhdb, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:16:12 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('cinderlib', now(), -1, 'engine-backup: Database ovirt_engine_history backup failed, scope=cinderlib, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');
2021-10-03 09:16:12 3945450: pg_cmd running: psql -w -U engine -h localhost -p 5432 engine -t -c SELECT LogEngineBackupEvent('grafanadb', now(), -1, 'engine-backup: Database ovirt_engine_history backup failed, scope=grafanadb, log=/rhev-red-01_backup_03102021_all.log', 'rhev-red-01.rdu2.scalelab.redhat.com', '/rhev-red-01_backup_03102021_all.log');

Version-Release number of selected component (if applicable):
rhv-release-4.4.9-4-001.noarch

Additional info:
The workaround is to run engine-backup with --dirtmp pointing to a path with more space.
This bug is not very easy to fix, because the pg_dump log goes to the same temporary space as the dump itself, so if this space is exhausted, the log will have no indication either. You can already see this in the example in comment 0 - there are no errors from it. In theory we can try various complex things like keeping it in memory, or elsewhere, or pre-allocating space for the log, but I am not sure it's worth it.

Some other options:

- We can add a generic error message, always.
- We can check free space on the tmpdir after failures, and error if it's (close to) full.
- We can guess that the free space needed is, say, at least 30% of the db size or so, and warn/err/abort if it's not enough. That's just a guess, though - the dump is compressed, and the compression ratio depends on the actual data. We can check the size using 'pg_database_size' (so also from remote), which in the case referenced in comment 0 returned around 19GB.
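To make the 30% guess concrete, here is a minimal sketch of such a pre-flight check. It is written in Python for illustration only and is not engine-backup's actual code. The names `required_tmp_bytes`, `check_tmp_space`, `free_bytes_in` and `ASSUMED_DUMP_RATIO` are hypothetical, and the ratio is the guess mentioned above, not a measured value. The db size is assumed to have been fetched separately, for example with `SELECT pg_database_size('ovirt_engine_history')`.

```python
import shutil
import tempfile

# Assumption: the "at least 30% of the db size" guess, not a measured ratio.
ASSUMED_DUMP_RATIO = 0.30


def required_tmp_bytes(db_size_bytes: int, ratio: float = ASSUMED_DUMP_RATIO) -> int:
    """Estimate how much temporary space a compressed dump might need."""
    return int(db_size_bytes * ratio)


def check_tmp_space(db_size_bytes: int, free_bytes: int,
                    ratio: float = ASSUMED_DUMP_RATIO) -> str:
    """Return 'ok' or a warning string. Pure function, so it is easy to test."""
    needed = required_tmp_bytes(db_size_bytes, ratio)
    if free_bytes >= needed:
        return "ok"
    return (f"WARNING: estimated {needed} bytes needed for the dump, "
            f"only {free_bytes} bytes free")


def free_bytes_in(path: str) -> int:
    """Free space on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


if __name__ == "__main__":
    GIB = 1024 ** 3
    # 19 GiB approximates the pg_database_size value quoted above.
    tmpdir = tempfile.gettempdir()
    print(check_tmp_space(19 * GIB, free_bytes_in(tmpdir)))
```

Because the real compression ratio depends on the data, a check like this can only warn. It may raise false alarms or miss a real shortage.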
*** Bug 2010075 has been marked as a duplicate of this bug. ***
(In reply to Yedidyah Bar David from comment #1)
> which in the case referenced in comment 0 returned around 19GB.

The original size of the DWH db was 33GB when I opened the BZ. After that, I ran "vacuum full analyze" on the DWH db "ovirt_engine_history" in order to decrease its size.
(In reply to Tzahi Ashkenazi from comment #3)
> (In reply to Yedidyah Bar David from comment #1)
> > which in the case referenced in comment 0 returned around 19GB.
>
> the original size of the DWH db was 33GB when I opened the BZ
> after I have run on the DWH table
> "ovirt_engine_history"
> vacuum full
> analyze
> in order to decrease the size of the DWH db.

This makes sense - I suppose after a full vacuum, 'du' and 'pg_database_size' should be quite similar.
We can go with either a generic message or just check free space on error and issue the specific error message.
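A rough sketch of the "check free space on error" variant follows. It is Python for illustration only, not the code that was actually merged. `failure_message`, `diagnose` and the 5% `NEAR_FULL_FRACTION` threshold are hypothetical. It also assumes the check runs before the temporary dump files are cleaned up, since otherwise the space would already have been released.

```python
import shutil
import tempfile

# Assumption: treat less than 5% free as "(close to) full".
NEAR_FULL_FRACTION = 0.05


def failure_message(total_bytes: int, free_bytes: int, tmpdir: str,
                    near_full: float = NEAR_FULL_FRACTION) -> str:
    """Pick the specific low-space error when tmpdir is (nearly) full,
    otherwise fall back to a generic hint."""
    if total_bytes > 0 and free_bytes / total_bytes < near_full:
        return (f"Backup failed and {tmpdir} is almost full "
                f"({free_bytes} of {total_bytes} bytes free). "
                "Retry with a temporary directory that has more space.")
    return f"Backup failed. Check the log and the free space in {tmpdir}."


def diagnose(tmpdir: str) -> str:
    """Inspect the filesystem holding tmpdir and build the message."""
    usage = shutil.disk_usage(tmpdir)
    return failure_message(usage.total, usage.free, tmpdir)


if __name__ == "__main__":
    print(diagnose(tempfile.gettempdir()))
```

The generic fallback keeps the first option from the list as well: even when the disk does not look full at check time, the user is still pointed at free space as a possible cause.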
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.