Bug 1062848 - [RHS-RHOS] Root disk corruption on a nova instance booted from a cinder volume after a remove-brick/rebalance
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: distribute
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Nithya Balachandran
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1286133
 
Reported: 2014-02-08 08:56 UTC by shilpa
Modified: 2015-11-27 11:43 UTC
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned As: 1286133
Environment:
Last Closed: 2015-11-27 11:43:02 UTC
Embargoed:


Attachments
Log messages from VM instance (80.46 KB, image/png), 2014-02-08 08:58 UTC, shilpa

Description shilpa 2014-02-08 08:56:04 UTC
Description of problem:
When a nova instance is rebooted while a rebalance is in progress on the gluster volume, the root filesystem is mounted read-only after the instance comes back up, and corruption messages are seen.


Version-Release number of selected component (if applicable):
glusterfs-3.4.0.59rhs-1.el6_4.x86_64

How reproducible: Always


Steps to Reproduce:
1. Create two 6*2 distribute-replicate volumes called glance-vol and cinder-vol for glance images and cinder volumes respectively.

2. Tag the volumes with group virt:
   # gluster volume set glance-vol group virt
   # gluster volume set cinder-vol group virt

3. Set the storage.owner-uid and storage.owner-gid of glance-vol to 161:
   # gluster volume set glance-vol storage.owner-uid 161
   # gluster volume set glance-vol storage.owner-gid 161

4. On the RHOS machine, mount the RHS glance-vol on /mnt/gluster/glance/images and start the glance-api service. Also configure glance so that nova instances use the gluster glance-vol for images.
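
For reference, the mount and glance configuration on the RHOS node typically look something like the following (rhs-vm1 is an illustrative hostname; adjust to the actual setup):

# mkdir -p /mnt/gluster/glance/images
# mount -t glusterfs rhs-vm1:/glance-vol /mnt/gluster/glance/images

and in /etc/glance/glance-api.conf:

default_store = file
filesystem_store_datadir = /mnt/gluster/glance/images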

5. Mount RHS cinder-vol on /var/lib/cinder/volumes and configure RHOS to use RHS volume for cinder storage.
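
A typical GlusterFS backend configuration for cinder in this release looks something like the following (hostname illustrative):

# cat /etc/cinder/shares.conf
rhs-vm1:/cinder-vol

and in /etc/cinder/cinder.conf:

volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config = /etc/cinder/shares.conf
glusterfs_mount_point_base = /var/lib/cinder/volumes

(Note that the pathinfo output in step 7 shows the volume file under /var/lib/cinder/mnt, so the mount point base on the test setup may differ.)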

6. Create a glance image, create a cinder volume, and copy the image to the volume.

# cinder create --display-name vol3 --image-id dfac4c39-7946-4baa-9fb3-444ec6348a88 10
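
The glance image referenced by --image-id above would have been created with something like the following (image name and source file are illustrative):

# glance image-create --name rhel6 --disk-format qcow2 --container-format bare --is-public True --file rhel6.qcow2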

7. Boot a nova instance from the bootable cinder volume.

# nova boot --flavor 2 --boot-volume 71973975-7952-4d66-a3d8-3cd38de18431 instance-5

# getfattr -d -etext -m. -n trusted.glusterfs.pathinfo /var/lib/cinder/mnt/4db90e5492997091a102ba6ad764dade/volume-71973975-7952-4d66-a3d8-3cd38de18431
getfattr: Removing leading '/' from absolute path names
# file: var/lib/cinder/mnt/4db90e5492997091a102ba6ad764dade/volume-71973975-7952-4d66-a3d8-3cd38de18431
trusted.glusterfs.pathinfo="(<DISTRIBUTE:cinder-vol-dht> (<REPLICATE:cinder-vol-replicate-0> <POSIX(/rhs/brick1/c2):rhs-vm2:/rhs/brick1/c2/volume-71973975-7952-4d66-a3d8-3cd38de18431> <POSIX(/rhs/brick1/c1):rhs-vm1:/rhs/brick1/c1/volume-71973975-7952-4d66-a3d8-3cd38de18431>))"

8. Now run remove-brick on the bricks identified in the pathinfo output above.

# gluster v remove-brick cinder-vol 10.70.37.180:/rhs/brick1/c1 10.70.37.120:/rhs/brick1/c2 start
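
The file migration can be monitored with remove-brick status; the reboot in the next step must happen while migration is still in progress:

# gluster volume remove-brick cinder-vol 10.70.37.180:/rhs/brick1/c1 10.70.37.120:/rhs/brick1/c2 status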

9. While volume-71973975-7952-4d66-a3d8-3cd38de18431 is being migrated, reboot the instance (instance-5) that was created from this volume.
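
For example, issuing a soft reboot from the controller (instance name from step 7):

# nova reboot instance-5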

10. Check the instance console once it has rebooted and look for corruption error messages. Once the instance is up, the rootfs /dev/vda is mounted R/O. Running fsck manually to correct the errors did not help; the instance is rendered unusable.
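
The corruption messages can also be captured from the controller, e.g.:

# nova console-log instance-5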

Expected results:

The rootfs should be mounted R/W after the reboot, and no corruption messages should be seen.


Additional info:

Sosreports and a VM screenshot are attached.

Comment 1 shilpa 2014-02-08 08:58:17 UTC
Created attachment 860851
Log messages from VM instance

Comment 2 shilpa 2014-02-08 09:07:39 UTC
sosreports in http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1062848/

Comment 4 Susant Kumar Palai 2015-11-27 11:43:02 UTC
Cloning this to 3.1. To be fixed in a future release.

