Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1461132

Summary: Cinder Backup Ceph incremental backup support
Product: Red Hat OpenStack
Reporter: Gregory Charot <gcharot>
Component: openstack-cinder
Assignee: Eric Harney <eharney>
Status: CLOSED DUPLICATE
QA Contact: Avi Avraham <aavraham>
Severity: unspecified
Docs Contact: Don Domingo <ddomingo>
Priority: high
Version: 10.0 (Newton)
CC: ddomingo, eharney, geguileo, jobernar, pgrist, scohen, srevivo, wesun
Target Milestone: ---
Keywords: FutureFeature
Target Release: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
When using Red Hat Ceph Storage as a Block Storage backend for both Cinder volume and Cinder backup, any attempts to perform an incremental backup will result in a full backup instead, without any warning. This is a known issue.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-10 18:52:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Gregory Charot 2017-06-13 14:57:16 UTC
Description of problem:

When backing up a cinder volume with Ceph as a backend, the operation terminates successfully but Cinder backup logs an error about rbd import-diff.

The result is full volume backup dumps instead of incrementals.

Version-Release number of selected component (if applicable):
10 - Might affect 11 too; haven't tried.

How reproducible:
Always

Steps to Reproduce:
1. Create a 2G cinder volume
cinder create --name vol2 2

2. Backup the volume (volume is available)
openstack volume backup create vol2

3. Create an incremental backup (volume is available)
openstack volume backup create (--incremental) vol2
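For convenience, the steps above can be bundled into one script. This is a sketch, not part of the original report: the volume name and size are taken from the steps above, the pool name "backups" from the du listings below, and the script only attempts the reproduction when the openstack CLI is actually present.

```shell
#!/usr/bin/env bash
# Reproduce: full backup followed by an incremental backup of a
# Ceph-backed cinder volume, then inspect the backup images.
set -eu

reproduce_backup_issue() {
    local vol="vol2"

    # 1. Create a 2G cinder volume
    openstack volume create --size 2 "$vol"

    # 2. Full backup while the volume is available
    openstack volume backup create "$vol"

    # 3. Incremental backup of the same volume
    openstack volume backup create --incremental "$vol"

    # Inspect the backup images on the Ceph side
    rbd du -p backups
}

# Only attempt the reproduction when the client tooling is available.
if command -v openstack >/dev/null 2>&1; then
    reproduce_backup_issue
else
    echo "openstack CLI not found; commands above are for reference only"
fi
```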

Actual results:

Cinder backup fails with:

/var/log/cinder/backup.log
Importing image diff: 0% complete...failed.backup.drivers.ceph [req-08d753ea-8ce1-44e6-ac3b-72e4c7d6ab01 b1362809137541ce8a3de5ff55eedc27 9fd36bcb07884224b3ecd2e22d730931 - default default] RBD diff op failed - (ret=33 stderr=
rbd: import-diff failed: (33) Numerical argument out of domain

The error occurs during both backup steps (full and incremental).

If we look at the rbd images, they all have the full cinder volume size (2G):

# rbd du -p backups
NAME                                                                                    PROVISIONED  USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.44977689-8287-4cdb-a551-6b15bf661e50       2048M 2048M
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.f7cd81fa-0c55-4ee5-9b52-bdf29b1ea76d       2048M 2044M
<TOTAL>                                                                                       4096M 4092M

Expected results:

Backups should use the rbd diff mechanism; this bug has an impact in terms of backup time and space used.

The full backup should look like:
NAME                                                                                                                   PROVISIONED   USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497363445.68       2048M 32768k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base                                                                      2048M      0
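Under the hood, the incremental path of the Ceph backup driver is built on RBD diffs between snapshots: only the extents changed since the previous backup snapshot are streamed and replayed onto the backup image. A hedged sketch of the equivalent manual rbd commands; the pool, image, and snapshot names here are placeholders, not values from this report:

```shell
#!/usr/bin/env bash
# Illustrative sketch of an RBD-based incremental backup: export only the
# delta between two snapshots of the source image and apply it to the
# backup image. All names below are placeholders.
set -eu

rbd_incremental_backup() {
    local src="volumes/volume-SRC"              # source image (cinder volume)
    local dst="backups/volume-SRC.backup.base"  # backup base image
    local prev_snap="backup.snap.1"             # snapshot from the last backup
    local new_snap="backup.snap.2"              # snapshot taken now

    # Snapshot the source at the current point in time
    rbd snap create "${src}@${new_snap}"

    # Stream only the changed extents since the previous snapshot and
    # replay them onto the backup image; import-diff is the step that
    # fails with "(33) Numerical argument out of domain" in this bug.
    rbd export-diff --from-snap "$prev_snap" "${src}@${new_snap}" - \
        | rbd import-diff - "$dst"
}

if ! command -v rbd >/dev/null 2>&1; then
    echo "rbd CLI not found; commands above are for reference only"
fi
```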

Additional info:

Searching for upstream bugs, I found one[1] that matches this issue on Mitaka, with users still reporting it on Newton/Ocata.

Using the patch from comment #18 on file /usr/lib/python2.7/site-packages/os_brick/initiator/connectors/rbd.py seems to solve the issue. Redoing the full backup plus one incremental gives:


rbd du -p backups
warning: fast-diff map is not enabled for volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base. operation may be slow.
NAME                                                                                                                   PROVISIONED   USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497361566.5        2048M 32768k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497361619.98       2048M 16384k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base                                                                      2048M      0
<TOTAL>                                                                                                                      2048M 49152k

However, I still get the same error if the volume is in use or when using --snapshot.
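After applying a fix, one way to confirm that an incremental actually happened is to check both the backup record and the space used on the Ceph side. A possible check, assuming the backup ID is known; the is_incremental field and the "backups" pool name follow the default configuration described in this report:

```shell
#!/usr/bin/env bash
# Check whether a finished cinder backup was actually stored as an
# incremental, from both the Cinder and the Ceph side.
set -eu

check_incremental() {
    local backup_id="$1"

    # Cinder side: the backup record flags incrementals explicitly.
    openstack volume backup show "$backup_id" -c is_incremental -c size

    # Ceph side: an incremental should only add a small snapshot to the
    # base image, not a second full-size image (compare the du listings).
    rbd du -p backups
}

if ! command -v openstack >/dev/null 2>&1; then
    echo "openstack CLI not found; run check_incremental <backup-id> on a real deployment"
fi
```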

[1] https://bugs.launchpad.net/cinder/+bug/1578036

Comment 1 Gregory Charot 2017-06-13 17:53:20 UTC
I came across https://bugzilla.redhat.com/show_bug.cgi?id=1375207 which is similar but when using a different Ceph cluster for backups.

In this BZ the backups and the volumes are in the same cluster (default config), so this bug affects both cases (same and different clusters).

Comment 5 Sean Cohen 2017-08-10 18:52:30 UTC

*** This bug has been marked as a duplicate of bug 1375207 ***

Comment 6 wesun 2017-10-17 19:28:15 UTC
We are running into the same issue and this is a separate bug from 1375207.