Bug 1247933

Summary: RFE: qemu-kvm-rhev: support multiple volume hosts for gluster volumes
Product: Red Hat Enterprise Linux 7
Reporter: Peter Krempa <pkrempa>
Component: qemu-kvm-rhev
Assignee: Prasanna Kumar Kalever <prasanna.kalever>
Status: CLOSED ERRATA
QA Contact: jingzhao <jinzhao>
Severity: unspecified
Priority: high
Version: 7.1
CC: ahino, amureini, areis, bmcclain, bugzilla.redhat.com, chayang, huding, jcody, jdenemar, jinzhao, jsuchane, juzhang, knoel, libvirt-maint, lmiksik, michen, pcuzner, prasanna.kalever, rbalakri, rcyriac, sabose, sankarshan, sherold, smohan, ssaha, v.astafiev, virt-bugs, virt-maint, xfu
Target Milestone: rc
Keywords: FutureFeature
Hardware: Unspecified
OS: Unspecified
Fixed In Version: qemu-kvm-rhev-2.6.0-20.el7
Doc Type: Enhancement
Clone Of: 1247521
Clones: 1323593
Last Closed: 2016-11-07 20:29:59 UTC
Type: Bug
Bug Depends On: 1260561
Bug Blocks: 1022961, 1247521, 1277939, 1288337, 1298558, 1313485, 1322852, 1323593

Description Peter Krempa 2015-07-29 09:26:52 UTC
qemu doesn't allow specifying multiple volfile servers for a gluster volume.

Multiple calls to glfs_set_volfile_server() make it possible to register a list of servers, but such a list cannot be represented in the URI syntax that is currently used.
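For reference, the URI-based syntax accepted today can carry only a single volfile server, e.g. (host, port and volume names illustrative):

-drive file=gluster://server1:24007/testvol/a.img,if=none,id=drive-virtio-disk0

so there is no place in that form to list fallback servers.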

Version-Release number of selected component (if applicable):
all, including upstream

+++ This bug was initially created as a clone of Bug #1247521 +++

Description of problem:
Passing multiple hosts to libvirt makes it fail with the following error:
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 731, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 1902, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3427, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: internal error: Expected exactly 1 host for the gluster volume


This limitation exists because qemu does not support specifying multiple volume servers for a gluster volume.
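A minimal sketch of the multi-server syntax this RFE asks for, as exercised in the verification comments below (host names and paths illustrative):

-drive driver=qcow2,file.driver=gluster,file.volume=testvol,file.path=/a.qcow2,file.server.0.type=tcp,file.server.0.host=HOST1,file.server.0.port=24007,file.server.1.type=tcp,file.server.1.host=HOST2,file.server.1.port=24007,file.server.2.type=unix,file.server.2.socket=/var/run/glusterd.socket,if=none,id=drive-virtio-disk0

If the first server is unreachable, gfapi falls back to the next entry in the list.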

Comment 1 Prasanna Kumar Kalever 2015-09-08 13:24:49 UTC
I have sent a patch to qemu and am waiting for review.

Thank you.

Comment 2 Jiri Denemark 2015-09-08 17:45:39 UTC
MODIFIED means the patches were backported for RHEL, acked, applied, and an official package was built with them.

Comment 22 Miroslav Rezanina 2016-08-10 18:54:28 UTC
Fix included in qemu-kvm-rhev-2.6.0-20.el7

Comment 24 jingzhao 2016-08-17 05:42:44 UTC
Trying to test it on the QE side.

gluster server info:
[root@intel-e5530-8-2 gluster]# gluster volume info
 
Volume Name: test-volume
Type: Replicate
Volume ID: 803aaeae-a0a0-43ee-bd22-41b151123328
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.73.196.191:/home/brick
Brick2: 10.66.144.41:/home/brick
Options Reconfigured:
performance.readdir-ahead: on

10.73.196.191 info:
[root@intel-e5530-8-2 gluster]# ll
total 0
srwxr-xr-x 1 root root  0 Aug 16 21:59 5ba1c3d087f6121d879d82a0d9cadb86.socket
srwxr-xr-x 1 root root  0 Aug 16 21:59 6df3895ae062126d6015cceb44f64a20.socket
srwxr-xr-x 1 root root  0 Aug 16 21:35 changelog-1567e90295180d1d26b80f98a7802f5f.sock
srwxr-xr-x 1 root root  0 Aug 16 21:59 changelog-f8bde5c5c2615e5fa883edf704543aad.sock
srwxr-xr-x 1 root root  0 Aug 16 21:59 ef55086647657158ae233ea4b30ede3e.socket
drwxr-xr-x 2 root root 40 Aug 16 21:32 snaps
[root@intel-e5530-8-2 gluster]# pwd
/var/run/gluster

10.66.144.41 info:
[root@hp-z800-01 gluster]# ll
total 0
srwxr-xr-x. 1 root root  0 Aug 16 21:59 4788422b92d0dc6702ef7f9fac490cfb.socket
srwxr-xr-x. 1 root root  0 Aug 16 21:59 4e30cd62f8fea4af95974876e1b0df8e.socket
srwxr-xr-x. 1 root root  0 Aug 16 21:59 cd8c17e291d55ce272314befdb72be42.socket
srwxr-xr-x. 1 root root  0 Aug 16 21:35 changelog-1567e90295180d1d26b80f98a7802f5f.sock
srwxr-xr-x. 1 root root  0 Aug 16 21:59 changelog-f8bde5c5c2615e5fa883edf704543aad.sock
drwxr-xr-x. 2 root root 40 Aug 16 21:31 snaps
[root@hp-z800-01 gluster]# pwd
/var/run/gluster


Boot guest on client:
[root@localhost bug]# sh pc.sh
qemu-kvm: -drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/home/brick/rhel.img,file.server.0.host=10.73.196.191,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.transport=unix,file.server.1.socket=/var/run/4788422b92d0dc6702ef7f9fac490cfb.socket,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop: Parameter 'type' is missing
hint: check in 'server' array index '1'
Usage: -drive driver=qcow2,file.driver=gluster,file.volume=testvol,file.path=/path/a.qcow2[,file.debug=9],file.server.0.type=tcp,file.server.0.host=1.2.3.4,file.server.0.port=24007,file.server.1.transport=unix,file.server.1.socket=/var/run/glusterd.socket ...


[root@localhost bug]# cat pc.sh
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.3 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/bug/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp:0:6666,server,nowait \
-device VGA,id=video \
-vnc :2 \
-drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/home/brick/rhel.img,file.server.0.host=10.73.196.191,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.transport=unix,file.server.1.socket=/var/run/4788422b92d0dc6702ef7f9fac490cfb.socket,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 \
-monitor stdio \


Could you share more details on the usage for this bz? I tried the above command but it also failed.


Thanks
Jing Zhao

Comment 26 jingzhao 2016-08-18 02:19:47 UTC
qemu-kvm-rhev-2.6.0-20.el7.x86_64
host kernel:3.10.0-492.el7.x86_64

Tested the bz with the following scenarios:

1. Check the "file=gluster[+tcp]://server1[:port]/testvol/a.img"
 
1) Boot the guest with the following command:

/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.3 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/bug/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp:0:6666,server,nowait \
-device VGA,id=video \
-vnc :2 \
-drive file=gluster://10.73.196.191/test-volume/rhel.img,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 \
-monitor stdio \

2) guest can boot up successfully

3) (qemu) info block
drive-virtio-disk0 (#block125): gluster://10.73.196.191/test-volume/rhel.img (qcow2)
    Cache mode:       writeback, direct

2. Check the new scenario of the bz

1) Boot the guest with the following command:
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.3 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/bug/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp:0:6666,server,nowait \
-device VGA,id=video \
-vnc :2 \
-drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/rhel.img,file.server.0.host=10.73.196.191,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.type=unix,file.server.1.socket=/var/run/glusterd.socket,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 \
-monitor stdio \

2) Guest boots up successfully.

3) (qemu) info block
drive-virtio-disk0 (#block164): json:{"driver": "qcow2", "file": {"server.0.host": "10.73.196.191", "driver": "gluster", "path": "/rhel.img", "server.0.type": "tcp", "server.1.type": "unix", "server.1.socket": "/var/run/glusterd.socket", "server.0.port": "24007", "volume": "test-volume"}} (qcow2)
    Cache mode:       writethrough, direct

4) cp large file to the guest

5) Delete one brick on the gluster server.

gluster server side:
gluster volume remove-brick test-volume replica 1 10.73.196.191:/home/brick force

client side:

(qemu) [2016-08-18 01:46:53.980786] W [socket.c:589:__socket_rwv] 0-test-volume-client-2: writev on 10.73.196.191:49153 failed (Connection reset by peer)
[2016-08-18 01:46:53.981715] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:51.986428 (xid=0x3131)
[2016-08-18 01:46:53.981729] W [MSGID: 114031] [client-rpc-fops.c:907:client3_3_writev_cbk] 0-test-volume-client-2: remote operation failed [Transport endpoint is not connected]
[2016-08-18 01:46:53.985018] I [socket.c:3309:socket_submit_request] 0-test-volume-client-2: not connected (priv->connected = 0)
[2016-08-18 01:46:53.985027] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-test-volume-client-2: failed to submit rpc-request (XID: 0x3137 Program: GlusterFS 3.3, ProgVers: 330, Proc: 16) to rpc-transport (test-volume-client-2)
[2016-08-18 01:46:53.985033] W [MSGID: 114031] [client-rpc-fops.c:1031:client3_3_fsync_cbk] 0-test-volume-client-2: remote operation failed [Transport endpoint is not connected]
[2016-08-18 01:46:53.985044] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [Transport endpoint is not connected]
[2016-08-18 01:46:53.985172] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:52.026509 (xid=0x3132)
[2016-08-18 01:46:53.985274] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:52.030421 (xid=0x3133)
[2016-08-18 01:46:53.985369] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:52.109687 (xid=0x3134)
[2016-08-18 01:46:53.985461] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:53.968152 (xid=0x3135)
[2016-08-18 01:46:53.985566] E [rpc-clnt.c:362:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f2caeb89b42] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f2cae9548de] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f2cae9549ee] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f2cae95637a] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f2cae956ba8] ))))) 0-test-volume-client-2: forced unwinding frame type(GlusterFS 3.3) op(WRITE(13)) called at 2016-08-18 01:46:53.970623 (xid=0x3136)
[2016-08-18 01:46:53.985588] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-2: disconnected from test-volume-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2016-08-18 01:46:54.062780] I [graph.c:269:gf_add_cmdline_options] 0-test-volume-write-behind: adding option 'resync-failed-syncs-after-fsync' for volume 'test-volume-write-behind' with value 'on'
[2016-08-18 01:46:54.063034] I [graph.c:269:gf_add_cmdline_options] 0-test-volume-write-behind: adding option 'resync-failed-syncs-after-fsync' for volume 'test-volume-write-behind' with value 'on'
[2016-08-18 01:46:54.064349] I [MSGID: 104045] [glfs-master.c:95:notify] 0-gfapi: New graph 6c6f6361-6c68-6f73-742e-6c6f63616c64 (2) coming up
[2016-08-18 01:46:54.064358] I [MSGID: 114020] [client.c:2106:notify] 2-test-volume-client-1: parent translators are ready, attempting connect on transport
[2016-08-18 01:46:54.069080] E [MSGID: 114058] [client-handshake.c:1524:client_query_portmap_cbk] 0-test-volume-client-2: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2016-08-18 01:46:54.069104] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-2: disconnected from test-volume-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2016-08-18 01:46:54.098505] W [MSGID: 114010] [client-callback.c:33:client_cbk_fetchspec] 2-test-volume-client-1: this function should not be called
[2016-08-18 01:46:55.155279] W [MSGID: 114061] [client-rpc-fops.c:4510:client3_3_fsync] 0-test-volume-client-2:  (359d9931-6342-47f6-ad28-08f266440536) remote_fd is -1. EBADFD [File descriptor in bad state]
[2016-08-18 01:46:55.155304] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [File descriptor in bad state]
[2016-08-18 01:46:55.155606] W [MSGID: 114061] [client-rpc-fops.c:4510:client3_3_fsync] 0-test-volume-client-2:  (359d9931-6342-47f6-ad28-08f266440536) remote_fd is -1. EBADFD [File descriptor in bad state]
[2016-08-18 01:46:55.155620] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [File descriptor in bad state]
[2016-08-18 01:46:55.231622] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 2-test-volume-client-1: changing port to 49153 (from 0)
[2016-08-18 01:46:55.251863] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 2-test-volume-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2016-08-18 01:46:55.262105] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 2-test-volume-client-1: Connected to test-volume-client-1, attached to remote volume '/home/brick'.
[2016-08-18 01:46:55.262114] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 2-test-volume-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2016-08-18 01:46:55.270889] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 2-test-volume-client-1: Server lk version = 1
[2016-08-18 01:46:55.278852] W [MSGID: 114061] [client-rpc-fops.c:4510:client3_3_fsync] 0-test-volume-client-2:  (359d9931-6342-47f6-ad28-08f266440536) remote_fd is -1. EBADFD [File descriptor in bad state]
[2016-08-18 01:46:55.278871] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [File descriptor in bad state]
[2016-08-18 01:46:55.279090] W [MSGID: 114061] [client-rpc-fops.c:4510:client3_3_fsync] 0-test-volume-client-2:  (359d9931-6342-47f6-ad28-08f266440536) remote_fd is -1. EBADFD [File descriptor in bad state]
[2016-08-18 01:46:55.279101] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [File descriptor in bad state]
[2016-08-18 01:46:55.279833] W [MSGID: 114061] [client-rpc-fops.c:4510:client3_3_fsync] 0-test-volume-client-2:  (359d9931-6342-47f6-ad28-08f266440536) remote_fd is -1. EBADFD [File descriptor in bad state]
[2016-08-18 01:46:55.279844] W [MSGID: 108035] [afr-transaction.c:1611:afr_changelog_fsync_cbk] 0-test-volume-replicate-0: fsync(359d9931-6342-47f6-ad28-08f266440536) failed on subvolume test-volume-client-2. Transaction was WRITE [File descriptor in bad state]
[2016-08-18 01:46:55.312778] W [MSGID: 114031] [client-rpc-fops.c:1917:client3_3_fxattrop_cbk] 0-test-volume-client-2: remote operation failed
The message "W [MSGID: 114031] [client-rpc-fops.c:1917:client3_3_fxattrop_cbk] 0-test-volume-client-2: remote operation failed" repeated 5 times between [2016-08-18 01:46:55.312778] and [2016-08-18 01:46:55.314011]
[2016-08-18 01:46:55.315998] E [MSGID: 114031] [client-rpc-fops.c:1676:client3_3_finodelk_cbk] 0-test-volume-client-2: remote operation failed [Transport endpoint is not connected]

6) qemu and the guest hang and do not respond to anything.
From the host, pinging the guest IP fails:
[root@localhost ~]# ping 10.66.5.11
PING 10.66.5.11 (10.66.5.11) 56(84) bytes of data.
From 10.66.6.246 icmp_seq=1 Destination Host Unreachable
From 10.66.6.246 icmp_seq=2 Destination Host Unreachable

The qemu process must be killed manually on the host.

Additional info:
Before deleting the brick:
[root@intel-e5530-8-2 home]# gluster volume status
Status of volume: test-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.73.196.191:/home/brick             49154     0          Y       14267
Brick 10.66.144.41:/home/brick              49154     0          Y       17571
NFS Server on localhost                     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       17137
NFS Server on 10.66.144.41                  2049      0          Y       17665
Self-heal Daemon on 10.66.144.41            N/A       N/A        Y       17673
 
Task Status of Volume test-volume
------------------------------------------------------------------------------
There are no active volume tasks


[root@intel-e5530-8-2 home]# gluster volume info
 
Volume Name: test-volume
Type: Replicate
Volume ID: 803aaeae-a0a0-43ee-bd22-41b151123328
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.73.196.191:/home/brick
Brick2: 10.66.144.41:/home/brick
Options Reconfigured:
performance.readdir-ahead: on

After deleting the brick:

[root@hp-z800-01 home]# gluster volume status
Status of volume: test-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.66.144.41:/home/brick              49153     0          Y       17165
NFS Server on localhost                     2049      0          Y       17349
NFS Server on intel-e5530-8-2.lab.eng.pek2.
redhat.com                                  N/A       N/A        N       N/A  


Questions:

  1. Is this test sufficient?
  2. qemu and the guest hang after one brick is deleted. QE expected qemu and the guest to keep running after one brick is deleted, switching over to another brick, so I think the bz is not fixed.
  3. How can we distinguish the state before the brick is deleted from the state after it is deleted? I did not find any difference between the two. In other words, how can QE confirm that the block device really switched to another gluster brick when one gluster brick stops?

Thanks
Jing Zhao

Comment 27 Prasanna Kumar Kalever 2016-09-06 09:54:56 UTC
Hi jingzhao,

> 1. Is this test sufficient?
> 2. qemu and the guest hang after one brick is deleted. QE expected qemu and the guest to keep running after one brick is deleted, switching over to another brick, so I think the bz is not fixed.

This is certainly the wrong way to test it.

> 3. How can we distinguish the state before the brick is deleted from the state after it is deleted? I did not find any difference between the two. In other words, how can QE confirm that the block device really switched to another gluster brick when one gluster brick stops?

Use netstat for this.


1. I recommend using a replica 3 volume for this testing; it also saves you from copying the files around extensively.
2. Brick processes are responsible for I/O; they have nothing to do with the management connection. In all your experiments you should bring the glusterd processes down, not the glusterfsd (brick) processes.

Some hints to test the patch
1. gluster vol create sample replica 3 HOST1:/b1 HOST2:/b2 HOST3:/b3 force

2. gluster vol start sample, then copy the VM image (say a.qcow2) into the volume

3. qemu-system-x86_64 -drive driver=qcow2,file.driver=gluster,file.volume=sample,file.path=${pathfromvolume}/a.qcow2,file.debug=9,file.server.0.type=tcp,file.server.0.host=WRONGHOST,file.server.0.port=24007,file.server.1.type=tcp,file.server.1.host=HOST1,file.server.1.port=24007,file.server.2.type=unix,file.server.2.socket=/var/run/glusterd.socket,file.server.3.type=tcp,file.server.3.host=HOST2,file.server.3.port=24007,file.server.4.type=tcp,file.server.4.host=HOST3,file.server.4.port=24007

I have given file.server.2.type=unix assuming HOST2 is used as the client (where the qemu command is run).
This should boot; check '# netstat -tanp | grep gluster' and look at the ESTABLISHED connections (it should show HOST1).

4. Now login to HOST1 (since that was connected) and kill glusterd (# pkill -9 glusterd)

5. Check with 'gluster vol status' to confirm.

6. '# netstat -x | grep gluster' should show a unix domain connection from HOST2; if you are not using a unix domain socket, '# netstat -tanp | grep gluster' should instead show an ESTABLISHED connection to HOST2.


From above we conclude:
1. The registration of 3 nodes is working
2. During initial bootup, gluster skips WRONGHOST and connects to HOST1 (this shows that even if one node is down at boot time, the patch manages to switch to another host instead of simply failing as before)
3. After killing glusterd on HOST1, gluster should switch the connection to the next available server (so runtime failures of nodes will not suspend/stop qemu's storage, as the management daemon on the next available storage node continues to serve the volfile)

I think the above mentioned approach for testing this fix should be fine.

Happy testing and Good Luck!

Comment 28 jingzhao 2016-09-08 01:49:57 UTC
Tested it on qemu-kvm-rhev-2.6.0-22.el7.x86_64 and host-kernel-3.10.0-503.el7.x86_64 with rhel7.3 guest

1. Boot guest with gluster backend

/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.3 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/bug/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp:0:6666,server,nowait \
-device VGA,id=video \
-vnc :2 \
-drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/rhel.qcow2,file.server.0.host=10.66.144.26,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.port=24007,file.server.1.type=tcp,file.server.1.host=10.66.4.211,file.server.2.type=tcp,file.server.2.host=10.66.144.41,file.server.2.port=24007,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 \
-monitor stdio \

2.(qemu) info block
drive-virtio-disk0 (#block182): json:{"driver": "qcow2", "file": {"server.0.host": "10.66.144.26", "server.1.host": "10.66.4.211", "server.2.host": "10.66.144.41", "driver": "gluster", "path": "/rhel.qcow2", "server.0.type": "tcp", "server.1.type": "tcp", "server.2.type": "tcp", "server.0.port": "24007", "server.1.port": "24007", "server.2.port": "24007", "volume": "test-volume"}} (qcow2)
    Cache mode:       writeback, direct

3. Log in to 10.66.144.26 and pkill -9 glusterd:
(qemu) [2016-09-08 01:17:19.448603] W [socket.c:589:__socket_rwv] 0-gfapi: readv on 10.66.144.26:24007 failed (No data available)
[2016-09-08 01:17:29.480994] E [socket.c:2279:socket_connect_finish] 0-gfapi: connection to 10.66.144.26:24007 failed (Connection refused)

And the info on 10.66.144.26:
[root@ibm-x3100m4-02 home]# netstat -tanp |grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      25765/glusterfsd    
tcp        0      0 10.66.144.26:49152      10.66.144.41:1020       ESTABLISHED 25765/glusterfsd    
tcp        0      0 10.66.144.26:1020       10.66.144.26:49152      ESTABLISHED 25792/glusterfs     
tcp        0      0 10.66.144.26:49152      10.66.6.246:1015        ESTABLISHED 25765/glusterfsd    
tcp        0      0 10.66.144.26:49152      10.66.4.211:1020        ESTABLISHED 25765/glusterfsd    
tcp        0      0 10.66.144.26:49152      10.66.144.41:1012       ESTABLISHED 25765/glusterfsd    
tcp        0      0 10.66.144.26:1013       10.66.144.41:49152      ESTABLISHED 25792/glusterfs     
tcp        0      0 10.66.144.26:1014       10.66.4.211:49152       ESTABLISHED 25792/glusterfs     
tcp        0      0 10.66.144.26:49152      10.66.144.26:1020       ESTABLISHED 25765/glusterfsd    
[root@ibm-x3100m4-02 home]# gluster volume info
Connection failed. Please check if gluster daemon is operational.


4. In HMP:
(qemu) info block
drive-virtio-disk0 (#block182): json:{"driver": "qcow2", "file": {"server.0.host": "10.66.144.26", "server.1.host": "10.66.4.211", "server.2.host": "10.66.144.41", "driver": "gluster", "path": "/rhel.qcow2", "server.0.type": "tcp", "server.1.type": "tcp", "server.2.type": "tcp", "server.0.port": "24007", "server.1.port": "24007", "server.2.port": "24007", "volume": "test-volume"}} (qcow2)
    Cache mode:       writeback, direct
(qemu) sy
system_powerdown  system_reset      system_wakeup     
(qemu) system_reset 

The guest can reboot successfully and login works.

5. Now log in to 10.66.4.211 and pkill -9 glusterd:
[root@localhost brick1]# netstat -tanp |grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      4554/glusterfsd     
tcp        0      0 10.66.4.211:49152       10.66.144.41:1019       ESTABLISHED 4554/glusterfsd     
tcp        0      0 10.66.4.211:1020        10.66.144.26:49152      ESTABLISHED 4581/glusterfs      
tcp        0      0 10.66.4.211:49152       10.66.144.41:1010       ESTABLISHED 4554/glusterfsd     
tcp        0      0 10.66.4.211:1014        10.66.144.41:49152      ESTABLISHED 4581/glusterfs      
tcp        0      0 10.66.4.211:49152       10.66.4.211:1019        ESTABLISHED 4554/glusterfsd     
tcp        0      0 10.66.4.211:1019        10.66.4.211:49152       ESTABLISHED 4581/glusterfs      
tcp        0      0 10.66.4.211:49152       10.66.6.246:1014        ESTABLISHED 4554/glusterfsd     
tcp        0      0 10.66.4.211:49152       10.66.144.26:1014       ESTABLISHED 4554/glusterfsd     
[root@localhost brick1]# gluster volume info
Connection failed. Please check if gluster daemon is operational.

6. Log in to host3 (10.66.144.41) and check the gluster info:
[root@hp-z800-01 brick1]# netstat -tanp |grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      20859/glusterfsd    
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 0.0.0.0:38465           0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 0.0.0.0:38466           0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 0.0.0.0:38468           0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 0.0.0.0:38469           0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      20479/glusterd      
tcp        0      0 0.0.0.0:713             0.0.0.0:*               LISTEN      20881/glusterfs     
tcp        0      0 10.66.144.41:1020       10.66.144.26:49152      ESTABLISHED 20886/glusterfs     
tcp        0      0 10.66.144.41:49152      10.66.144.41:1015       ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:1019       10.66.4.211:49152       ESTABLISHED 20886/glusterfs     
tcp        0      0 10.66.144.41:49152      10.66.144.41:1009       ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:24007      10.66.144.41:1021       ESTABLISHED 20479/glusterd      
tcp        0      0 10.66.144.41:1009       10.66.144.41:49152      ESTABLISHED 20881/glusterfs     
tcp        0      0 127.0.0.1:24007         127.0.0.1:1019          ESTABLISHED 20479/glusterd      
tcp        0      0 10.66.144.41:1021       10.66.144.41:24007      ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:49152      10.66.144.26:1013       ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:1010       10.66.4.211:49152       ESTABLISHED 20881/glusterfs     
tcp        0      0 127.0.0.1:24007         127.0.0.1:1020          ESTABLISHED 20479/glusterd      
tcp        0      0 10.66.144.41:1012       10.66.144.26:49152      ESTABLISHED 20881/glusterfs     
tcp        0      0 127.0.0.1:1019          127.0.0.1:24007         ESTABLISHED 20886/glusterfs     
tcp        0      0 10.66.144.41:49152      10.66.6.246:1013        ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:49152      10.66.4.211:1014        ESTABLISHED 20859/glusterfsd    
tcp        0      0 10.66.144.41:1015       10.66.144.41:49152      ESTABLISHED 20886/glusterfs     
tcp        0      0 127.0.0.1:1020          127.0.0.1:24007         ESTABLISHED 20881/glusterfs     

The guest reboots successfully and login works.

7. Log in to server1 and server2 and pkill -9 glusterfsd; the guest paused with an I/O error.

(qemu) [2016-09-08 01:25:49.796839] W [socket.c:589:__socket_rwv] 0-test-volume-client-0: readv on 10.66.144.26:49152 failed (No data available)
[2016-09-08 01:25:49.796871] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-0: disconnected from test-volume-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2016-09-08 01:25:59.811482] E [socket.c:2279:socket_connect_finish] 0-test-volume-client-0: connection to 10.66.144.26:24007 failed (Connection refused)
[2016-09-08 01:26:12.003521] W [socket.c:589:__socket_rwv] 0-test-volume-client-1: readv on 10.66.4.211:49152 failed (No data available)
[2016-09-08 01:26:12.003553] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-1: disconnected from test-volume-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2016-09-08 01:26:12.003564] W [MSGID: 108001] [afr-common.c:4089:afr_notify] 0-test-volume-replicate-0: Client-quorum is not met
[2016-09-08 01:26:22.839528] E [socket.c:2279:socket_connect_finish] 0-test-volume-client-1: connection to 10.66.4.211:24007 failed (Connection refused)

(qemu) [2016-09-08 01:27:33.653530] I [MSGID: 114021] [client.c:2115:notify] 0-test-volume-client-0: current graph is no longer active, destroying rpc_client 
[2016-09-08 01:27:33.653558] I [MSGID: 114021] [client.c:2115:notify] 0-test-volume-client-1: current graph is no longer active, destroying rpc_client 
[2016-09-08 01:27:33.653565] I [MSGID: 114021] [client.c:2115:notify] 0-test-volume-client-2: current graph is no longer active, destroying rpc_client 
[2016-09-08 01:27:33.653636] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-2: disconnected from test-volume-client-2. Client process will keep trying to connect to glusterd until brick's port is available
[2016-09-08 01:27:33.653658] E [MSGID: 108006] [afr-common.c:4045:afr_notify] 0-test-volume-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2016-09-08 01:27:33.653882] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-gfapi: size=84 max=1 total=1
[2016-09-08 01:27:33.654066] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-gfapi: size=156 max=2 total=2
[2016-09-08 01:27:33.654181] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-gfapi: size=108 max=2 total=2055
[2016-09-08 01:27:33.654189] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-client-0: size=1300 max=28 total=15030
[2016-09-08 01:27:33.654195] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-client-1: size=1300 max=27 total=608
[2016-09-08 01:27:33.654200] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-client-2: size=1300 max=26 total=619
[2016-09-08 01:27:33.654206] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-replicate-0: size=10524 max=55 total=15215
[2016-09-08 01:27:33.654303] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-dht: size=1148 max=0 total=0
[2016-09-08 01:27:33.654351] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-dht: size=2284 max=28 total=14892
[2016-09-08 01:27:33.654436] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-read-ahead: size=188 max=0 total=0
[2016-09-08 01:27:33.654441] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-readdir-ahead: size=52 max=0 total=0
[2016-09-08 01:27:33.654446] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-io-cache: size=68 max=256 total=265
[2016-09-08 01:27:33.654451] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-test-volume-io-cache: size=252 max=14 total=2648
[2016-09-08 01:27:33.654459] I [io-stats.c:2951:fini] 0-test-volume: io-stats translator unloaded
[2016-09-08 01:27:33.655128] I [MSGID: 101191] [event-epoll.c:663:event_dispatch_epoll_worker] 0-epoll: Exited thread with index 1
[2016-09-08 01:27:33.655150] I [MSGID: 101191] [event-epoll.c:663:event_dispatch_epoll_worker] 0-epoll: Exited thread with index 2

(qemu) info status
VM status: paused (io-error)

According to QE's understanding, the guest should keep running successfully because host3 is still available.

QE wants to confirm the following with you:
1. Are the test steps right?
2. Are the test results right, i.e. are they the expected results?
3. Is the result of step 7 expected?

Thanks
Jing Zhao

Comment 29 Prasanna Kumar Kalever 2016-09-08 07:39:59 UTC
Sorry, but the terminology used above is messy; I recommend keeping it as point-wise statements and providing output to support each statement.

If I have to summarize the test plan above:

1. booted the VM by mentioning 3 storage servers
naming:
       HOST1: 10.66.144.26
       HOST2: 10.66.4.211 
       HOST3: 10.66.144.41

Observation: boot successful, netstat shows connected to server HOST1


2. Login to HOST1, kill glusterd

Observation: boot successful, netstat shows connected to server HOST2


3. Login to HOST2, kill glusterd

Observation: boot successful, netstat shows connected to server HOST3

4. Login to HOST1 & HOST2, kill glusterfsd (brick processes)

Observation: VM Paused, IO Error

Assuming the steps summarized above are right, moving on to answer the questions:

> QE wants to confirm the following with you:

> 1. Are the test steps right?

Perfect!

> 2. Are the test results right, i.e. are they the expected results?

Yes they are right and expected

> 3. Is the result of step 7 expected?

Yes, it is expected, because of the client-side quorum factor.

From Logs:
[2016-09-08 01:26:12.003564] W [MSGID: 108001] [afr-common.c:4089:afr_notify] 0-test-volume-replicate-0: Client-quorum is not met

AFR, which is responsible for maintaining data consistency in the replica, wants at least 2 of the 3 nodes (brick processes) to be up and running; otherwise it ends up in split-brain situations (read more at [1]).

It is not recommended to change the quorum behavior, but for testing purposes you can set "# gluster vol set <VOLNAME> cluster.quorum-count 1" after you create the volume and repeat the testing; this time you should hopefully not see the VM pause with an I/O error.
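A minimal sketch of that relaxation, assuming the test-volume name used above (the fixed count only takes effect when cluster.quorum-type is set to fixed):

# gluster vol set test-volume cluster.quorum-type fixed
# gluster vol set test-volume cluster.quorum-count 1

Reset both options after the test to return to the default quorum behavior.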


[1] https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html

Cheers

Comment 30 jingzhao 2016-09-08 09:00:47 UTC
(In reply to Prasanna Kumar Kalever from comment #29)
> Sorry, but the terminology used above is messy; I recommend keeping it as point-wise statements and providing output to support each statement.
> 
> If I have to summarize the test plan above:
> 
> 1. booted the VM by mentioning 3 storage servers
> naming:
>        HOST1: 10.66.144.26
>        HOST2: 10.66.4.211 
>        HOST3: 10.66.144.41
> 
> Observation: boot successful, netstat shows connected to server HOST1
> 
> 
> 2. Login to HOST1, kill glusterd
> 
> Observation: boot successful, netstat shows connected to server HOST2
> 
> 
> 3. Login to HOST2, kill glusterd
> 
> Observation: boot successful, netstat shows connected to server HOST3
> 
> 4. Login to HOST1 & HOST2, kill glusterfsd (brick processes)
> 
> Observation: VM Paused, IO Error
> 
> Assuming the steps summarized above are right, moving on to answer the questions:

Yes, that is correct. The gluster server info:

[root@ibm-x3100m4-02 ~]# gluster volume info
 
Volume Name: test-volume
Type: Replicate
Volume ID: 5c7ff685-00c5-4267-ba19-962c3048894c
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.66.144.26:/home/brick1
Brick2: 10.66.4.211:/home/brick1
Brick3: 10.66.144.41:/home/brick1
Options Reconfigured:
performance.readdir-ahead: on

> 
> > QE wants to confirm the following with you:
> 
> > 1. Are the test steps right?
> 
> Perfect!
> 
> > 2. Are the test results right, i.e. are they the expected results?
> 
> Yes they are right and expected
> 
> > 3. Is the result of step 7 expected?
> 
> Yes, it is expected, because of the client-side quorum factor.
> 
> From Logs:
> [2016-09-08 01:26:12.003564] W [MSGID: 108001]
> [afr-common.c:4089:afr_notify] 0-test-volume-replicate-0: Client-quorum is
> not met
> 
> AFR, which is responsible for maintaining data consistency in the replica, wants at least 2 of the 3 nodes (brick processes) to be up and running; otherwise it ends up in split-brain situations (read more at [1]).

Checked it: with only pkill -9 glusterfsd (keeping host2 and host3 up), the guest can reboot successfully and there is no I/O error.

> 
> It is not recommended to change the quorum behavior, but for testing purposes you can set "# gluster vol set <VOLNAME> cluster.quorum-count 1" after you create the volume and repeat the testing; this time you should hopefully not see the VM pause with an I/O error.
> 
> 
> [1] https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3/html/Administration_Guide/sect-Managing_Split-brain.html
> 
> Cheers

How can we make sure the guest is using the correct backend?
For example, after pkill -9 glusterd on host1 and host2, I want to confirm that the guest is running against host3.
Is it from the port number 24007? I did not find any difference in the host3 output between before and after the backend switches to host3.

Before pkill -9 glusterd on host2, check host3:
[root@hp-z800-01 gluster]# netstat -tanp |grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      1834/glusterfsd     
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38465           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38466           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38468           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38469           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1810/glusterd       
tcp        0      0 0.0.0.0:755             0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 10.66.144.41:1008       10.66.4.211:49152       ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:1022       10.66.144.26:24007      ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1017          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1018          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:1017          127.0.0.1:24007         ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:1018       10.66.144.26:49152      ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.4.211:1022        ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:24007      10.66.144.26:1022       ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:49152      10.66.144.41:1017       ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1020       10.66.144.41:24007      ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1013       10.66.4.211:49152       ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.144.26:1019       ESTABLISHED 1834/glusterfsd     
tcp        0      0 127.0.0.1:1018          127.0.0.1:24007         ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.4.211:1019        ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1010       10.66.144.26:49152      ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:1021       10.66.4.211:24007       ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:1017       10.66.144.41:49152      ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:1007       10.66.144.41:49152      ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.144.41:1007       ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:24007      10.66.144.41:1020       ESTABLISHED 1810/glusterd  

And after the pkill -9 glusterd, check host3 again:

[root@hp-z800-01 gluster]# netstat -tanp |grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      1834/glusterfsd     
tcp        0      0 0.0.0.0:2049            0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38465           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38466           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38468           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:38469           0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1810/glusterd       
tcp        0      0 0.0.0.0:755             0.0.0.0:*               LISTEN      1843/glusterfs      
tcp        0      0 10.66.144.41:1008       10.66.4.211:49152       ESTABLISHED 1843/glusterfs      
tcp        0      0 127.0.0.1:24007         127.0.0.1:1017          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1018          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:1017          127.0.0.1:24007         ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:1018       10.66.144.26:49152      ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.144.41:1017       ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1020       10.66.144.41:24007      ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1013       10.66.4.211:49152       ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.144.26:1019       ESTABLISHED 1834/glusterfsd     
tcp        0      0 127.0.0.1:1018          127.0.0.1:24007         ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.4.211:1019        ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:49152      10.66.6.246:1017        ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:1010       10.66.144.26:49152      ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:1017       10.66.144.41:49152      ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:1007       10.66.144.41:49152      ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:49152      10.66.144.41:1007       ESTABLISHED 1834/glusterfsd     
tcp        0      0 10.66.144.41:24007      10.66.144.41:1020       ESTABLISHED 1810/glusterd       


Thanks
Jing

Comment 31 Prasanna Kumar Kalever 2016-09-09 07:49:10 UTC
Jing,

You need to look at the client side for this part, i.e. on the host where you run the qemu command.

Check the glusterd and glusterfsd connections before drawing a conclusion. In our case it is the glusterd connection that matters (that is where the volfile is fetched from).

On the client machine you should see something like:

# pidof "qemu-system-x86_64"
1234

# netstat -tanp | grep "1234"

Ideally it will list one glusterd connection and multiple brick (glusterfsd) connections, depending on how many are alive.

Look for the glusterd connection; it tells you which HOST is currently serving as the management node.
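A one-line sketch of that check, assuming a single qemu process (the process name is qemu-kvm in the outputs below):

# netstat -tanp | grep "$(pidof qemu-kvm)" | grep 24007

An ESTABLISHED line to <HOST>:24007 identifies the volfile/management server currently in use; the connections to the 491xx ports are the brick (glusterfsd) connections.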

HTH,
Good Luck!

Comment 32 jingzhao 2016-09-09 09:23:49 UTC
Consolidating the test steps, since the above got messy.

Prepare the environment:
[root@ibm-x3100m4-02 ~]# gluster volume info
 
Volume Name: test-volume
Type: Replicate
Volume ID: 5c7ff685-00c5-4267-ba19-962c3048894c
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.66.144.26:/home/brick1
Brick2: 10.66.4.211:/home/brick1
Brick3: 10.66.144.41:/home/brick1
Options Reconfigured:
performance.readdir-ahead: on


1. Check the "file=gluster[+tcp]://server1[:port]/testvol/a.img"
a. Boot up the guest with a command like the following:
/usr/libexec/qemu-kvm \
-M pc \
..........
-drive file=gluster://10.66.144.26/test-volume/rhel.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,disable-legacy=off,disable-modern=off,bootindex=1 \
........
b. Guest boots up successfully.
(qemu) info block
drive-virtio-disk0 (#block169): gluster://10.66.144.26/test-volume/rhel.qcow2 (qcow2)
    Cache mode:       writeback, direct

2. Check the multi-server case
a. Boot up the guest:
.............
-drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/rhel.qcow2,file.server.0.host=10.66.144.26,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.port=24007,file.server.1.type=tcp,file.server.1.host=10.66.4.211,file.server.2.type=tcp,file.server.2.host=10.66.144.41,file.server.2.port=24007,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
...........
b. Create a file such as /home/test in the guest.
c. On host1 (10.66.144.26), pkill -9 glusterd and pkill -9 glusterfsd:
(qemu) [2016-09-09 09:01:51.764969] W [socket.c:589:__socket_rwv] 0-gfapi: readv on 10.66.144.26:24007 failed (No data available)

(qemu) [2016-09-09 09:02:03.186882] E [socket.c:2279:socket_connect_finish] 0-gfapi: connection to 10.66.144.26:24007 failed (Connection refused)

(qemu) [2016-09-09 09:05:51.109791] W [socket.c:589:__socket_rwv] 0-test-volume-client-0: readv on 10.66.144.26:49152 failed (No data available)
[2016-09-09 09:05:51.109824] I [MSGID: 114018] [client.c:2030:client_rpc_notify] 0-test-volume-client-0: disconnected from test-volume-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2016-09-09 09:06:01.346856] E [socket.c:2279:socket_connect_finish] 0-test-volume-client-0: connection to 10.66.144.26:24007 failed (Connection refused)

d. Reboot the guest and check that the file created in step b still exists in the guest.
e. On host2 (10.66.4.211), pkill -9 glusterd.
f. Reboot the guest and check that the file still exists in the guest.
h. On the gluster client:
[root@jinzhao bug]# pidof qemu-kvm
19938
[root@jinzhao bug]# netstat -anpt |grep "19938"
tcp        0      0 0.0.0.0:6667            0.0.0.0:*               LISTEN      19938/qemu-kvm      
tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      19938/qemu-kvm      
tcp        0      0 10.66.6.246:5902        10.66.7.102:39147       ESTABLISHED 19938/qemu-kvm      
tcp        0      0 10.66.6.246:1015        10.66.4.211:49152       ESTABLISHED 19938/qemu-kvm      
tcp        0      0 10.66.6.246:1014        10.66.144.41:49152      ESTABLISHED 19938/qemu-kvm  

According to comment 31, Prasanna said "Ideally it will list one glusterd connection and multiple brick (glusterfsd) connections", so I think step h is OK.

Comment 33 jingzhao 2016-09-09 09:26:38 UTC
Add more info for comment 32
gluster client:
  host-kernel:3.10.0-503.el7.x86_64
  qemu-kvm-rhev-2.6.0-23.el7.x86_64
gluster server:
  glusterfs-server-3.7.9-12.el7rhgs.x86_64

Comment 34 jingzhao 2016-09-09 10:08:23 UTC
prasanna, thanks so much.

I have summarized the test scenarios and steps in comment 32 and comment 33. Did I miss any other scenario? Do you agree?
If it is OK, I will change the status to verified.

Thanks 
Jing Zhao

Comment 35 Prasanna Kumar Kalever 2016-09-09 11:11:00 UTC
As long as you see the glusterd connection switch on the client side, it is fine for me.

I don't see port 24007 in the '# netstat -anpt | grep "19938"' output in comment 32; you can also check '# netstat -tanp | grep "24007"' on the client.

One more thing missing from the test scenarios was a test of unix domain sockets.
You can pick one of the 3 gluster hosts as the client, use transport type unix, and verify that it works.
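A sketch of such a server entry, assuming the qemu command runs on the chosen gluster host so that the local glusterd socket is reachable (socket path as in the earlier commands, N being the entry index):

file.server.N.type=unix,file.server.N.socket=/var/run/glusterd.socket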


Overall it looks good to me.

Comment 36 jingzhao 2016-09-12 05:27:06 UTC
Verified it with the unix-socket backend on qemu-kvm-rhev-2.6.0-23.el7.x86_64.
Prepare:
[root@ibm-x3100m4-02 run]# gluster volume info
 
Volume Name: test-volume
Type: Replicate
Volume ID: 5c7ff685-00c5-4267-ba19-962c3048894c
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.66.144.26:/home/brick1
Brick2: 10.66.4.211:/home/brick1
Brick3: 10.66.144.41:/home/brick1
Options Reconfigured:
performance.readdir-ahead: on

1. Boot the guest with the following command:
/usr/libexec/qemu-kvm \
-M pc \
-cpu SandyBridge \
-nodefaults -rtc base=utc \
-m 4G \
-smp 2,sockets=2,cores=1,threads=1 \
-enable-kvm \
-name rhel7.3 \
-uuid 990ea161-6b67-47b2-b803-19fb01d30d12 \
-smbios type=1,manufacturer='Red Hat',product='RHEV Hypervisor',version=el6,serial=koTUXQrb,uuid=feebc8fd-f8b0-4e75-abc3-e63fcdb67170 \
-k en-us \
-nodefaults \
-serial unix:/tmp/serial0,server,nowait \
-boot menu=on \
-bios /usr/share/seabios/bios.bin \
-chardev file,path=/home/seabios.log,id=seabios \
-device isa-debugcon,chardev=seabios,iobase=0x402 \
-qmp tcp:0:6667,server,nowait \
-device VGA,id=video \
-vnc :2 \
-drive driver=qcow2,file.driver=gluster,file.volume=test-volume,file.path=/rhel.qcow2,file.server.0.host=10.66.144.26,file.server.0.type=tcp,file.server.0.port=24007,file.server.1.port=24007,file.server.1.type=tcp,file.server.1.host=10.66.4.211,file.server.2.type=unix,file.server.2.socket=/var/run/glusterd.socket,if=none,id=drive-virtio-disk0,cache=none,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
-device virtio-net-pci,netdev=tap10,mac=9a:6a:6b:6c:6d:6e -netdev tap,id=tap10 \
-monitor stdio \

2. Log in to the guest and create a file:
dd if=/dev/zero of=/home/tmp bs=1M count=1024
The tmp file is created successfully.

3. On the gluster client, check the connections:
[root@hp-z800-01 ~]# pidof qemu-kvm
21167
[root@hp-z800-01 ~]# netstat -tanp |grep "21167"
tcp        0      0 0.0.0.0:6667            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 10.66.144.41:1012       10.66.4.211:49152       ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1018       10.66.144.26:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1021       10.66.144.26:24007      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1010       10.66.144.41:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:5902       10.66.7.102:46130       ESTABLISHED 21167/qemu-kvm      
[root@hp-z800-01 ~]# netstat -tanp |grep "24007"
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1810/glusterd       
tcp        0      0 10.66.144.41:1022       10.66.144.26:24007      ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:1023       10.66.4.211:24007       ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1017          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1018          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:1017          127.0.0.1:24007         ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.4.211:1022        ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:24007      10.66.144.26:1021       ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:1021       10.66.144.26:24007      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1020       10.66.144.41:24007      ESTABLISHED 1834/glusterfsd     
tcp        0      0 127.0.0.1:1018          127.0.0.1:24007         ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.144.41:1020       ESTABLISHED 1810/glusterd       

4. Log in to host1 (10.66.144.26) and pkill -9 glusterd.
5. On the gluster client, in HMP:
(qemu) info block
drive-virtio-disk0 (#block103): json:{"driver": "qcow2", "file": {"server.0.host": "10.66.144.26", "server.1.host": "10.66.4.211", "driver": "gluster", "path": "/rhel.qcow2", "server.0.type": "tcp", "server.1.type": "tcp", "server.2.type": "unix", "server.2.socket": "/var/run/glusterd.socket", "server.0.port": "24007", "server.1.port": "24007", "volume": "test-volume"}} (qcow2)
    Cache mode:       writethrough, direct
(qemu) [2016-09-12 05:17:16.711175] W [socket.c:701:__socket_rwv] 0-gfapi: readv on 10.66.144.26:24007 failed (No data available)

(qemu) [2016-09-12 05:17:28.257302] E [socket.c:2395:socket_connect_finish] 0-gfapi: connection to 10.66.144.26:24007 failed (Connection refused)
sy
system_powerdown  system_reset      system_wakeup     
(qemu) system_reset 

6. Guest reboots successfully and the tmp file still exists.
7. On the gluster client, check the connections:
[root@hp-z800-01 ~]# netstat -tanp |grep "21167"
tcp        0      0 0.0.0.0:6667            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 10.66.144.41:1012       10.66.4.211:49152       ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1018       10.66.144.26:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1010       10.66.144.41:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:5902       10.66.7.102:46130       ESTABLISHED 21167/qemu-kvm      
[root@hp-z800-01 ~]# netstat -tanp |grep "24007"
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1810/glusterd       
tcp        0      0 10.66.144.41:1023       10.66.4.211:24007       ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1017          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1018          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:1017          127.0.0.1:24007         ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.4.211:1022        ESTABLISHED 1810/glusterd       
tcp        0      0 10.66.144.41:1020       10.66.144.41:24007      ESTABLISHED 1834/glusterfsd     
tcp        0      0 127.0.0.1:1018          127.0.0.1:24007         ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.144.41:1020       ESTABLISHED 1810/glusterd       

8. Then log in to host2 (10.66.4.211) and pkill -9 glusterd.
9. Reboot the guest; it reboots successfully and the tmp file still exists.
10. On the gluster client, check the connections:
[root@hp-z800-01 ~]# netstat -tanp |grep "21167"
tcp        0      0 0.0.0.0:6667            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 0.0.0.0:5902            0.0.0.0:*               LISTEN      21167/qemu-kvm      
tcp        0      0 10.66.144.41:1012       10.66.4.211:49152       ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1018       10.66.144.26:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:1010       10.66.144.41:49152      ESTABLISHED 21167/qemu-kvm      
tcp        0      0 10.66.144.41:5902       10.66.7.102:46130       ESTABLISHED 21167/qemu-kvm      
[root@hp-z800-01 ~]# netstat -tanp |grep "24007"
tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1017          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:24007         127.0.0.1:1018          ESTABLISHED 1810/glusterd       
tcp        0      0 127.0.0.1:1017          127.0.0.1:24007         ESTABLISHED 1851/glusterfs      
tcp        0      0 10.66.144.41:1020       10.66.144.41:24007      ESTABLISHED 1834/glusterfsd     
tcp        0      0 127.0.0.1:1018          127.0.0.1:24007         ESTABLISHED 1843/glusterfs      
tcp        0      0 10.66.144.41:24007      10.66.144.41:1020       ESTABLISHED 1810/glusterd       

11. Then in HMP:
(qemu) system_powerdown 
12. Guest shuts down successfully.

Thanks
Jing Zhao

Comment 37 jingzhao 2016-09-12 05:34:50 UTC
According to comment 32 and comment 36, marking the bz as verified.

Comment 39 errata-xmlrpc 2016-11-07 20:29:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html