Bug 1461132 - Cinder Backup Ceph incremental backup support
Summary: Cinder Backup Ceph incremental backup support
Keywords:
Status: CLOSED DUPLICATE of bug 1375207
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-cinder
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: 13.0 (Queens)
Assignee: Eric Harney
QA Contact: Avi Avraham
Docs Contact: Don Domingo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-06-13 14:57 UTC by Gregory Charot
Modified: 2017-10-17 19:28 UTC
CC: 8 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
When using Red Hat Ceph Storage as a Block Storage backend for both Cinder volume and Cinder backup, any attempts to perform an incremental backup will result in a full backup instead, without any warning. This is a known issue.
Clone Of:
Environment:
Last Closed: 2017-08-10 18:52:30 UTC
Target Upstream Version:
Embargoed:



Description Gregory Charot 2017-06-13 14:57:16 UTC
Description of problem:

When backing up a Cinder volume with Ceph as the backend, the operation terminates successfully but cinder-backup logs an error about rbd import-diff.

The result is full volume backup dumps instead of incrementals.
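
For reference, a minimal sketch of the cinder.conf backup settings assumed by this setup; the option names are the standard Ceph backup driver options, and the values shown are illustrative defaults rather than taken from this environment:

[DEFAULT]
backup_driver = cinder.backup.drivers.ceph
backup_ceph_conf = /etc/ceph/ceph.conf
backup_ceph_user = cinder
backup_ceph_pool = backups
backup_ceph_chunk_size = 134217728
backup_ceph_stripe_unit = 0
backup_ceph_stripe_count = 0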

Version-Release number of selected component (if applicable):
10 - might affect 11 as well; not tried yet

How reproducible:
Always

Steps to Reproduce:
1. Create a 2G cinder volume
cinder create --name vol2 2

2. Back up the volume (volume is available)
openstack volume backup create vol2

3. Create an incremental backup (volume is available)
openstack volume backup create --incremental vol2
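
To confirm whether step 3 actually produced an incremental backup, the backup record and the rbd usage in the backup pool can be checked along these lines (a sketch: <backup-id> is a placeholder and the pool name assumes the default "backups"):

openstack volume backup show <backup-id> -c is_incremental -c size
rbd du -p backups

An incremental backup should report is_incremental=True and only consume space for the changed extents.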

Actual results:

Cinder backup fails with:

/var/log/cinder/backup.log
Importing image diff: 0% complete...failed.backup.drivers.ceph [req-08d753ea-8ce1-44e6-ac3b-72e4c7d6ab01 b1362809137541ce8a3de5ff55eedc27 9fd36bcb07884224b3ecd2e22d730931 - default default] RBD diff op failed - (ret=33 stderr=
rbd: import-diff failed: (33) Numerical argument out of domain

The error occurs during both backup steps (full and incremental).

Looking at the rbd images, they all use the full Cinder volume size (2G):

# rbd du -p backups
NAME                                                                                     PROVISIONED  USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.44977689-8287-4cdb-a551-6b15bf661e50       2048M 2048M
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.f7cd81fa-0c55-4ee5-9b52-bdf29b1ea76d       2048M 2044M
<TOTAL>                                                                                        4096M 4092M

Expected results:

Backups should use the rbd diff mechanism; this bug has an impact on both time to back up and space used.

The full backup should look like:

NAME                                                                         PROVISIONED   USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497363445.68        2048M 32768k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base                           2048M      0
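
For context, the incremental path is based on RBD snapshot diffs; a rough manual equivalent of what the Ceph backup driver is expected to do looks like the following (image, pool, and snapshot names are illustrative, not taken from this environment):

# full backup: export the whole image up to the first backup snapshot
rbd export-diff volumes/volume-<id>@backup.snap.1 - | rbd import-diff - backups/volume-<id>.backup.base
# incremental: export only the extents changed since the previous snapshot
rbd export-diff --from-snap backup.snap.1 volumes/volume-<id>@backup.snap.2 - | rbd import-diff - backups/volume-<id>.backup.base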

Additional info:

Searching upstream bugs, I found one [1] that matches this issue for Mitaka, with users still reporting it on Newton/Ocata.

Applying the patch from comment #18 to /usr/lib/python2.7/site-packages/os_brick/initiator/connectors/rbd.py seems to solve the issue. Redoing the full backup plus one incremental gives:


rbd du -p backups
warning: fast-diff map is not enabled for volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base. operation may be slow.
NAME                                                                         PROVISIONED   USED
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497361566.5         2048M 32768k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base.snap.1497361619.98        2048M 16384k
volume-389ce208-9c67-483c-8128-46c43adcf436.backup.base                           2048M      0
<TOTAL>                                                                            2048M 49152k

However, I still get the same error if the volume is in use or when using --snapshot.
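
For completeness, the in-use and snapshot-based cases that still fail can be driven with the cinder client roughly as follows (a sketch: <snapshot-id> is a placeholder, and --force is what allows backing up an attached volume):

cinder backup-create --force vol2
cinder backup-create --incremental --snapshot-id <snapshot-id> vol2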

[1] https://bugs.launchpad.net/cinder/+bug/1578036

Comment 1 Gregory Charot 2017-06-13 17:53:20 UTC
I came across https://bugzilla.redhat.com/show_bug.cgi?id=1375207, which is similar, but in that case a separate Ceph cluster is used for backups.

In this BZ, the backups and the volumes are in the same cluster (the default config), so this bug affects both cases (same and different clusters).

Comment 5 Sean Cohen 2017-08-10 18:52:30 UTC

*** This bug has been marked as a duplicate of bug 1375207 ***

Comment 6 wesun 2017-10-17 19:28:15 UTC
We are running into the same issue and this is a separate bug from 1375207.

