Bug 1405302 - vm does not boot up when first data brick in the arbiter volume is killed.
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: arbiter
Version: 3.2
Hardware: x86_64 Linux
Priority: unspecified    Severity: high
Target Milestone: ---
Target Release: RHGS 3.2.0
Assigned To: Ravishankar N
QA Contact: RamaKasturi
Depends On:
Blocks: Gluster-HC-2 1351528
Reported: 2016-12-16 02:05 EST by RamaKasturi
Modified: 2017-03-23 01:58 EDT
CC: 7 users

See Also:
Fixed In Version: glusterfs-3.8.4-10
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-03-23 01:58:12 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---




External Trackers
Tracker ID: Red Hat Product Errata RHSA-2017:0486
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update
Last Updated: 2017-03-23 05:18:45 EDT

Description RamaKasturi 2016-12-16 02:05:53 EST
Description of problem:
Bring down the first data brick in the arbiter volume and create a VM. Once the VM installation finishes, power off the node. Bring the first brick back up and start the VM. The VM does not boot and drops to the grub prompt.

As suggested by Vijay, when I started the VM without bringing the first brick back up, the VM booted without any issues.

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-8.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a 1 x (2 + 1) arbiter volume.
2. Now bring down the first data brick in the volume.
3. Create a vm.
4. Once the VM installation finishes, power off the VM.
5. Bring the first brick, which was down, back up and start the VM (a shell sketch of these steps follows).
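
For reference, the brick-level steps above map to the gluster CLI roughly as follows. This is a minimal sketch assuming the vmstore layout shown in comment 2; the VM creation and power-off happen on the hypervisor (RHV) side and are only indicated as comments.

# Step 1: create the 1 x (2 + 1) arbiter volume (brick paths as in comment 2).
gluster volume create vmstore replica 3 arbiter 1 \
    10.70.36.82:/rhgs/brick2/vmstore \
    10.70.36.83:/rhgs/brick2/vmstore \
    10.70.36.84:/rhgs/brick2/vmstore
gluster volume start vmstore

# Step 2: bring down the first data brick by killing its brick process on 10.70.36.82.
gluster volume status vmstore      # note the PID listed for Brick1
kill <PID-of-Brick1>               # placeholder; substitute the actual PID

# Steps 3-4: create the VM on the volume and power it off after installation (via RHV).

# Step 5: bring the downed brick back online, then start the VM.
gluster volume start vmstore force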

Actual results:
The VM does not boot and drops to the grub prompt.

Expected results:
The VM should boot without any issues.

Additional info:
Comment 2 RamaKasturi 2016-12-16 02:16:16 EST
gluster volume info on vmstore:
====================================
[root@rhsqa-grafton4 ~]# gluster volume info vmstore
 
Volume Name: vmstore
Type: Replicate
Volume ID: 3d67c0ad-5084-4190-a4b5-c468994ca084
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.82:/rhgs/brick2/vmstore
Brick2: 10.70.36.83:/rhgs/brick2/vmstore
Brick3: 10.70.36.84:/rhgs/brick2/vmstore (arbiter)
Options Reconfigured:
performance.readdir-ahead: on
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: off
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
features.shard: on
features.shard-block-size: 512MB
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
network.ping-timeout: 30
user.cifs: off
performance.strict-o-direct: on
client.ssl: on
server.ssl: on
auth.ssl-allow: 10.70.36.84,10.70.36.82,10.70.36.83
cluster.granular-entry-heal: enable
cluster.use-compound-fops: on
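
For context, options like the ones listed above are applied per volume with "gluster volume set"; a minimal sketch using a few of the values shown (standard gluster CLI, not taken from the sosreports):

gluster volume set vmstore features.shard on
gluster volume set vmstore features.shard-block-size 512MB
gluster volume set vmstore performance.strict-o-direct on
gluster volume set vmstore network.ping-timeout 30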
Comment 3 RamaKasturi 2016-12-16 02:16:40 EST
sosreports can be found at the link below:
=============================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1405302/
Comment 6 Ravishankar N 2016-12-20 01:54:24 EST
From the errors in rhev-data-center-mnt-glusterSD-10.70.36.82\:_vmstore.log on grafton5, it looks like the same problem as BZ 1404982 (comment #5). I am providing Kasturi a test build with the same fix applied on top of the latest downstream code (HEAD at tag v3.8.4-9, origin/rhgs-3.2.0, rhgs-3.2.0; commit "protocol/client: fix op_errno handling, was unused variable") to see if it fixes the issue.
Comment 7 Atin Mukherjee 2016-12-20 04:52:30 EST
Upstream mainline patch http://review.gluster.org/#/c/16205/ posted for review.
Comment 8 RamaKasturi 2016-12-20 09:03:10 EST
Hi Ravi,

   Tested this issue with glusterfs-3.8.4-5.el7rhgs.x86_64. I tried the steps mentioned in the description three times but was not able to hit the issue.

Thanks
kasturi
Comment 9 Ravishankar N 2016-12-20 10:45:29 EST
(In reply to RamaKasturi from comment #8)
> Hi Ravi,
> 
>    Tested this issue with glusterfs-3.8.4-5.el7rhgs.x86_64. I tried the
> steps mentioned in the description thrice. But i was not able to hit the
> issue.
> 
> Thanks
> kasturi

Thanks Kasturi. If we are able to hit the issue with glusterfs-3.8.4-6, then we have a reasonably small number of fixes to git bisect through to find the offending commit. Please give it a try on v3.8.4-6 as well.
Thanks!
Ravi
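
For context on the bisect workflow mentioned above: git bisect walks the commit range between a known-good and a known-bad build to pinpoint the offending commit. A minimal sketch, assuming v3.8.4-5 turns out to be good and v3.8.4-8 bad (per the reported version and comment 8); each step needs a rebuild and a retest of the scenario:

git bisect start
git bisect bad v3.8.4-8       # build where the VM fails to boot
git bisect good v3.8.4-5      # build where the issue could not be hit
# build and test the commit git checks out, then mark it:
git bisect good               # or: git bisect bad
# repeat until git prints the first bad commit, then clean up:
git bisect reset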
Comment 10 Ravishankar N 2016-12-22 01:26:16 EST
Just for the record: after comment #9, Kasturi tried a couple of test builds (thanks a lot, Kasturi!), and with some modifications to the original patch posted in comment #7 we were no longer able to hit the issue.
Comment 11 Ravishankar N 2016-12-22 01:27:49 EST
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/93560/
Comment 13 RamaKasturi 2016-12-29 02:23:55 EST
I will verify this bug once the fix for https://bugzilla.redhat.com/show_bug.cgi?id=1400057 lands.
Comment 14 RamaKasturi 2017-01-13 04:50:41 EST
Verified; works fine with build glusterfs-3.8.4-11.el7rhgs.x86_64.

Brought the first brick down in the volume, created a VM, and installed the OS. Once the VM was installed, I powered it off, brought the first brick back up, and the VM booted successfully.
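
As a side note, pending self-heals after bringing a brick back can be checked from the CLI before starting the VM; a minimal sketch (these commands were not captured as part of this verification run):

gluster volume start vmstore force   # make sure the downed brick is running again
gluster volume heal vmstore info     # list entries still pending heal on each brick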
Comment 15 RamaKasturi 2017-01-13 04:51:13 EST
Moving this to verified state.
Comment 16 RamaKasturi 2017-01-13 04:52:25 EST
[root@rhsqa-grafton4 ~]# gluster volume info vmstore
 
Volume Name: vmstore
Type: Replicate
Volume ID: 2f8938c2-26d3-4912-a6e0-bc12b76146d0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.36.82:/rhgs/brick1/vmstore
Brick2: 10.70.36.83:/rhgs/brick1/vmstore
Brick3: 10.70.36.84:/rhgs/brick1/vmstore (arbiter)
Options Reconfigured:
auth.ssl-allow: 10.70.36.84,10.70.36.82,10.70.36.83
server.ssl: on
client.ssl: on
cluster.granular-entry-heal: on
user.cifs: off
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
performance.low-prio-threads: 32
features.shard-block-size: 4MB
storage.owner-gid: 36
storage.owner-uid: 36
cluster.data-self-heal-algorithm: full
features.shard: on
cluster.server-quorum-type: server
cluster.quorum-type: auto
network.remote-dio: off
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
Comment 18 errata-xmlrpc 2017-03-23 01:58:12 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
