Description of problem:
We saw a kernel panic while rebooting a nova instance after installing RHEL 6.4 from ISO on it. The nova instance was booted from a bootable cinder volume hosted on a RHS cluster.

Version-Release number of selected component (if applicable):
OpenStack: http://download.lab.bos.redhat.com/rel-eng/OpenStack/Grizzly/2013-07-08.1/
openstack-cinder-2013.1.2-3.el6ost.noarch
openstack-nova-compute-2013.1.2-4.el6ost.noarch
RHS: glusterfs-3.4.0.24rhs-1.el6rhs.x86_64

How reproducible:
Saw it for the first time.

Steps to Reproduce:
1. Create a 2x2 Distributed-Replicate RHS volume, cinder-vol (see Additional info 3 below for a sketch of the assumed create commands).
2. Tag the volume with the virt group:
   # gluster volume set cinder-vol group virt
3. Set the owner uid and gid on the volume:
   # gluster volume set cinder-vol storage.owner-uid 165
   # gluster volume set cinder-vol storage.owner-gid 165
4. gluster volume info and status:

Volume Name: cinder-vol
Type: Distributed-Replicate
Volume ID: 25b9729b-b326-4eb8-9068-961c67ee25c6
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhshdp01.lab.eng.blr.redhat.com:/cinder1/s1
Brick2: rhshdp02.lab.eng.blr.redhat.com:/cinder1/s2
Brick3: rhshdp03.lab.eng.blr.redhat.com:/cinder1/s3
Brick4: rhshdp04.lab.eng.blr.redhat.com:/cinder1/s4
Options Reconfigured:
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
storage.owner-uid: 165
storage.owner-gid: 165

# gluster volume status cinder-vol
Status of volume: cinder-vol
Gluster process                                        Port   Online  Pid
------------------------------------------------------------------------------
Brick rhshdp01.lab.eng.blr.redhat.com:/cinder1/s1      49153  Y       11027
Brick rhshdp02.lab.eng.blr.redhat.com:/cinder1/s2      49153  Y       20438
Brick rhshdp03.lab.eng.blr.redhat.com:/cinder1/s3      49157  Y       763
Brick rhshdp04.lab.eng.blr.redhat.com:/cinder1/s4      49157  Y       24021
NFS Server on localhost                                2049   Y       20502
Self-heal Daemon on localhost                          N/A    Y       20510
NFS Server on rhshdp03.lab.eng.blr.redhat.com          2049   Y       825
Self-heal Daemon on rhshdp03.lab.eng.blr.redhat.com    N/A    Y       833
NFS Server on rhshdp04.lab.eng.blr.redhat.com          2049   Y       850
Self-heal Daemon on rhshdp04.lab.eng.blr.redhat.com    N/A    Y       859
NFS Server on 10.70.36.116                             2049   Y       20366
Self-heal Daemon on 10.70.36.116                       N/A    Y       20375

There are no active volume tasks

5. Configure cinder to use the glusterfs volume:
   a. # openstack-config --set /etc/cinder/cinder.conf DEFAULT volume_driver cinder.volume.drivers.glusterfs.GlusterfsDriver
      # openstack-config --set /etc/cinder/cinder.conf DEFAULT glusterfs_shares_config /etc/cinder/shares.conf
      # openstack-config --set /etc/cinder/cinder.conf DEFAULT glusterfs_mount_point_base /var/lib/cinder/volumes
   b. # cat /etc/cinder/shares.conf
      rhshdp01.lab.eng.blr.redhat.com:cinder-vol
   c. # for i in api scheduler volume; do sudo service openstack-cinder-${i} restart; done
6. Create a 2x2 Distributed-Replicate volume for glance (glance-vol) with the virt tag and with uid and gid set to 161:

# gluster volume info glance-vol
Volume Name: glance-vol
Type: Distributed-Replicate
Volume ID: c3fe0412-9fec-4914-8fcc-648dc8632a2e
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: rhshdp01.lab.eng.blr.redhat.com:/glance1/s1
Brick2: rhshdp02.lab.eng.blr.redhat.com:/glance1/s2
Brick3: rhshdp03.lab.eng.blr.redhat.com:/glance1/s3
Brick4: rhshdp04.lab.eng.blr.redhat.com:/glance1/s4
Options Reconfigured:
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
storage.owner-uid: 161
storage.owner-gid: 161

7. Mount the RHS glance volume on /var/lib/glance/images (see Additional info 4 below).
8. Upload the RHEL 6.4 ISO through the OpenStack Horizon dashboard:
   http://download.eng.blr.redhat.com/pub/rhel/released/RHEL-6/6.4/Server/x86_64/iso/RHEL6.4-20130130.0-Server-x86_64-DVD1.iso
9. Create a cinder volume of 10G (see Additional info 5 below).
10. Launch a nova instance with the cinder volume as boot volume and the RHEL 6.4 ISO.
11. During the installation, bring down 2 bricks (see Additional info 6 below):
    rhshdp02.lab.eng.blr.redhat.com:/cinder1/s2 (kill -9 pid)
    rhshdp03.lab.eng.blr.redhat.com:/cinder1/s3 (kill -9 pid)
12. After the installation completed, reboot the instance. While it is rebooting, bring back up all the bricks that were down using "gluster volume start cinder-vol force".

Actual results:
Kernel panic in the instance.

Expected results:
The instance should boot up fine.

Additional info:
1) df output on the OpenStack host machine:

# df -h
Filesystem                                  Size  Used Avail Use% Mounted on
rhshdp01.lab.eng.blr.redhat.com:glance-vol  200G   11G  190G   6% /var/lib/glance/images
rhshdp01.lab.eng.blr.redhat.com:cinder-vol  200G  1.7G  199G   1% /var/lib/cinder/volumes/2b0d90354f56d251613926a47374f77b
rhshdp01.lab.eng.blr.redhat.com:cinder-vol  200G  1.7G  199G   1% /var/lib/nova/mnt/2b0d90354f56d251613926a47374f77b

2) # cinder list
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  | Status | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
| ede3931f-6332-460e-a9b4-544d715241e8 | in-use |   vol_nova   |  10  |     None    |  false   | 6c39d39c-2517-426e-bf7a-e7de12436a99 |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
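3) The exact volume-creation commands for step 1 are not captured above; a minimal sketch consistent with the brick layout shown in step 4 (replica pairs s1/s2 and s3/s4) would be:

# gluster volume create cinder-vol replica 2 \
    rhshdp01.lab.eng.blr.redhat.com:/cinder1/s1 rhshdp02.lab.eng.blr.redhat.com:/cinder1/s2 \
    rhshdp03.lab.eng.blr.redhat.com:/cinder1/s3 rhshdp04.lab.eng.blr.redhat.com:/cinder1/s4
# gluster volume start cinder-vol

glance-vol (step 6) is assumed to have been created the same way, using the /glance1/s* bricks.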
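4) The mount command for step 7 is likewise not captured verbatim; assuming a standard glusterfs FUSE mount on the glance host:

# mount -t glusterfs rhshdp01.lab.eng.blr.redhat.com:/glance-vol /var/lib/glance/images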
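5) A sketch of steps 9 and 10 using the Grizzly-era CLIs. The volume name and ID are taken from the cinder list output above; <flavor> and <iso-image-id> are placeholders for values not recorded in this report, and the instance name is hypothetical:

# cinder create --display-name vol_nova 10
# nova boot --flavor <flavor> --image <iso-image-id> \
    --block-device-mapping vda=ede3931f-6332-460e-a9b4-544d715241e8:::0 \
    rhel64-bootvol-test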
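6) For step 11, the brick processes were killed by pid. The pids below are taken from the "gluster volume status cinder-vol" output in step 4, and the commands are assumed to have been run on the respective brick hosts:

[root@rhshdp02]# kill -9 20438    # brick process for /cinder1/s2
[root@rhshdp03]# kill -9 763      # brick process for /cinder1/s3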
Created attachment 792037: screenshot of the panic
Sosreports and statedumps: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1002863/
Amar, this bug has been identified as a known issue. Please provide CCFR information in the Doc Text field.
Divya, as of now the RCA for this bug is not done; hence, the summary of the bug itself serves as the CCFR.
Tested on RHOS 4.0 with RHS 2.1 (glusterfs-3.4.0.59rhs-1.el6_4.x86_64). With client-quorum enabled on the latest RHS version, I brought down only the second brick of each replica pair in the cluster. Could not reproduce this issue.
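(For reference, client-quorum is presumably enabled with something like:
# gluster volume set cinder-vol cluster.quorum-type auto
With replica 2 and quorum-type auto, quorum is kept only while the first brick of a pair is up, so only the second brick of each pair can be brought down without the volume going read-only, which matches the test described above.)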
Comment #6 was an update meant for a different BZ; please disregard it.