Description of problem:
Running Sat 6.1.3 as a VMware guest (12 vCPU, 24G RAM, RHEL 7.1, XFS, storage does about 300MB/s). /var/lib/pulp is 64GB big. Running "katello-backup /mnt/backup" takes about 66 minutes (backup target on the same storage as /var/lib/pulp), of which around 63 minutes are spent backing up Pulp.

Version-Release number of selected component (if applicable):
Satellite 6.1.3

How reproducible:
always

Steps to Reproduce:
1. Install Sat 6.1.3
2. Sync content
3. Run katello-backup

Actual results:
Plain katello-backup on a fresh Sat 6.1.3 with ~64GB of content synced, one content view, one attached system:

real    66m32.761s
user    51m49.001s
sys     7m48.724s

Expected results:
Faster :-) Copying that amount of data in 10-20 minutes seems possible.

Additional info:
I removed "-v" and "-z" from the tar call for /var/lib/pulp in katello-backup and got the following time:

real    19m59.646s
user    2m11.093s
sys     7m18.084s

Compressing RPMs is pointless, as they are already compressed (the initial backup compressed the 64G down to only 61G). Running tar in verbose mode also slows things down, as tar then has to print every single file to the terminal.
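For illustration, the change boils down to the following; the original tar call is quoted verbatim from the script in a later comment, while dropping the .gz suffix from the output file once compression is gone is my assumption:

# Original call in katello-backup: compress (-z) and list every file (-v)
tar --selinux -czvf pulp_data.tar.gz /var/lib/pulp/ /var/www/pub/

# Modified call: plain archive, no compression, no per-file output
tar --selinux -cf pulp_data.tar /var/lib/pulp/ /var/www/pub/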
We would like to get this out to the customers fast. Due to this issue:

- we are wasting CPU resources
- the backup of /var/lib/pulp/ and /var/www/pub/ is limited by the speed of a single CPU, which gets fully occupied by gzip
- the backup is very slow
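To illustrate the single-CPU bottleneck: tar funnels everything through one gzip process, so on this 12-vCPU guest only one core ever does compression work. If compression were to be kept at all, a parallel compressor such as pigz could be piped in; this is only a sketch of an alternative, not the fix pursued in this bug:

# gzip is single-threaded, so this pegs exactly one core:
tar --selinux -czf pulp_data.tar.gz /var/lib/pulp/ /var/www/pub/

# pigz spreads the compression across all available cores:
tar --selinux -cf - /var/lib/pulp/ /var/www/pub/ | pigz > pulp_data.tar.gz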
This was merged upstream. Does anybody want a backported patch for the version in Sat 6.1.4?
VERIFIED:

# rpm -qa | grep foreman
dell-pe1950-05.rhts.eng.brq.redhat.com-foreman-client-1.0-1.noarch
dell-pe1950-05.rhts.eng.brq.redhat.com-foreman-proxy-1.0-1.noarch
ruby193-rubygem-foreman_hooks-0.3.7-2.el7sat.noarch
puppet-foreman_scap_client-0.3.3-10.el7sat.noarch
foreman-vmware-1.7.2.50-1.el7sat.noarch
rubygem-hammer_cli_foreman_tasks-0.0.3.5-1.el7sat.noarch
foreman-ovirt-1.7.2.50-1.el7sat.noarch
ruby193-rubygem-foreman_gutterball-0.0.1.9-1.el7sat.noarch
foreman-1.7.2.50-1.el7sat.noarch
ruby193-rubygem-foreman_docker-1.2.0.24-1.el7sat.noarch
ruby193-rubygem-foreman-tasks-0.6.15.7-1.el7sat.noarch
rubygem-hammer_cli_foreman_bootdisk-0.1.2.7-1.el7sat.noarch
rubygem-hammer_cli_foreman_docker-0.0.3.10-1.el7sat.noarch
foreman-debug-1.7.2.50-1.el7sat.noarch
foreman-proxy-1.7.2.8-1.el7sat.noarch
dell-pe1950-05.rhts.eng.brq.redhat.com-foreman-proxy-client-1.0-1.noarch
foreman-discovery-image-3.0.5-3.el7sat.noarch
foreman-libvirt-1.7.2.50-1.el7sat.noarch
ruby193-rubygem-foreman_openscap-0.3.2.10-1.el7sat.noarch
foreman-gce-1.7.2.50-1.el7sat.noarch
rubygem-hammer_cli_foreman-0.1.4.15-1.el7sat.noarch
ruby193-rubygem-foreman_discovery-2.0.0.23-1.el7sat.noarch
foreman-selinux-1.7.2.17-1.el7sat.noarch
foreman-postgresql-1.7.2.50-1.el7sat.noarch
foreman-compute-1.7.2.50-1.el7sat.noarch
ruby193-rubygem-foreman-redhat_access-0.2.4-1.el7sat.noarch
rubygem-hammer_cli_foreman_discovery-0.0.1.10-1.el7sat.noarch
ruby193-rubygem-foreman_bootdisk-4.0.2.14-1.el7sat.noarch

Steps: after syncing content:

# time katello-backup /tmp/katello_backup
Success!
** BACKUP Complete, contents can be found in: /tmp/katello_backup

real    11m57.641s
user    6m41.736s
sys     0m23.281s
I was about to open a new BZ, but found this. I'd like to ask why katello-backup compresses Pulp data:

echo "Backing up Pulp data... "
tar --selinux -czvf pulp_data.tar.gz /var/lib/pulp/ /var/www/pub/
echo "Done."

Aren't RPMs already compressed? Binary data like this generally does not benefit from gzip compression. This is in 6.1.5.
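This is easy to verify with any RPM (the source path below is just a placeholder); the payload inside an RPM is already compressed, so gzip gains almost nothing while still burning CPU:

cp /var/cache/yum/some.rpm /tmp/sample.rpm    # any RPM will do, path is hypothetical
gzip -c /tmp/sample.rpm > /tmp/sample.rpm.gz  # -c leaves the original untouched
ls -l /tmp/sample.rpm /tmp/sample.rpm.gz      # sizes will be nearly identical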
The backup of /var/lib/pulp should probably use rsync instead of tar. Not only does rsync utilize multiple processors better than tar, but subsequent runs will benefit greatly: if an artifact has already been archived, there is no need to copy it again.
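A minimal sketch of what such an rsync-based run might look like (the destination paths are hypothetical, and this is the suggestion, not what katello-backup currently does):

# -a preserves ownership, permissions and timestamps; -X keeps extended
# attributes, covering the SELinux contexts that tar handles via --selinux.
# --delete keeps the destination an exact mirror; unchanged files are
# skipped entirely on subsequent runs.
rsync -aX --delete /var/lib/pulp/ /mnt/backup/pulp_data/
rsync -aX --delete /var/www/pub/  /mnt/backup/pub_data/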
Abel, when using "rsync" we would get a complete filesystem structure replicated. That is a whole different concept: the destination filesystem would then also have to support ownerships, permissions and so on. With the current strategy of creating a single file, we can also back up to e.g. a vfat filesystem, and we get one easy-to-handle file.

> Aren't rpms already compressed? Generally binary data, not going
> to benefit from gzip compression.

Correct; we are addressing exactly that issue in this BZ.
If this bug requires doc text for errata release, please provide draft text in the doc text field in the following format:

Cause:
Consequence:
Fix:
Result:

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to '-'.
I think we do not need any doc text here.
(In reply to Evgeni Golov from comment #11)
> I think we do not need any doc text here.

When I read release notes or changelogs, I am grateful to see info like "In the past, backups of the packages were also compressed. With this fix, no compression is attempted any more." There are probably guidelines for deciding whether doc text should be written or not.
The doc text currently reads:

"Katello backups were slow due to compressing rpm data which is already compressed. The backup script was fixed to not do redundant backups."

Does that mean the RPM data is not backed up, or that it is not compressed before it is backed up (which I thought was the actual issue)? I'd like to get this clarified before I approve the doc text. Apologies if this is apparent in the bug but I missed it. Thanks.
> "Katello backups were slow due to compressing rpm data which is already > compressed. The backup script was fixed to not do redundant backups." That is wrong, I suggest "Katello backups were slow due to compressing rpm data which is already compressed. The backup script was fixed to not do redundant compression of the rpm data." > Does that mean the RPM data is not backed up, or it is not compressed > before it is backed up (which I thought was the actual issue)? I'd like > to get this clarified before I approve the doc text. Apologies if this > is apparent in the bug but I missed it. The latter one. Can be verified in the description; not the directories to backup are changed, but compression and verbosity options removed: > I removed "-v" and "-z" from the tar call for /var/lib/pulp in > katello-backup and got the following time: [..]
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:0052