Bug 1095078 - Migration results in block I/O errors when using glusterfs storage backends
Summary: Migration results in block I/O errors when using glusterfs storage backends
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glusterfs
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Vijay Bellur
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-05-07 05:18 UTC by Sibiao Luo
Modified: 2015-03-26 11:34 UTC
CC List: 21 users

Fixed In Version: glusterfs-3.6.0.41-1.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-26 11:34:36 UTC
Target Upstream Version:
Embargoed:



Description Sibiao Luo 2014-05-07 05:18:35 UTC
Description of problem:
Migration produces block I/O errors when using a glusterfs storage backend; the same migration with a local NFS storage backend image does not hit this issue.

Version-Release number of selected component (if applicable):
src host info(rhel7.0):
# uname -r && rpm -q qemu-kvm
3.10.0-123.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64
dest host info(rhel7.0-z):
# uname -r && rpm -q qemu-kvm
3.10.0-123.el7.x86_64
qemu-kvm-1.5.3-60.el7_0.1.x86_64
guest info:
# uname -r
3.10.0-123.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot a KVM guest on the src rhel7.0 host using a glusterfs storage backend.
e.g:...-drive file=gluster://10.66.106.6/sluo_volume/RHEL-7.0-20140505.1_Server_x86_64.raw,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x4,bus=pci.0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-system-disk,id=system-disk,bootindex=1
2. Start the same guest on the dest rhel7.0-z host with '-incoming ' appended to the command line.
3. Migrate from the rhel7.0 host to the rhel7.0-z host.
(qemu) migrate -d tcp:10.66.106.5:5888
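
For reference, a minimal sketch of the full flow (the exact -incoming argument was not recorded in step 2; the listening address below is an assumption inferred from the migrate command, with 10.66.106.5 being the dest host):
dest rhel7.0-z host: start the same qemu-kvm command line as step 1 and append
-incoming tcp:0:5888
src rhel7.0 host, in the QEMU monitor:
(qemu) migrate -d tcp:10.66.106.5:5888
(qemu) info migrate
For the ping-pong case mentioned in comment #3, repeat the same steps in the reverse direction once the first migration completes.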

Actual results:
After step 3, once migration completes, the dest QEMU reports block I/O errors when using the glusterfs storage backend.
(qemu) block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)

(qemu) info status 
VM status: paused (io-error)
(qemu) cont
(qemu) block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)
block I/O error in device 'drive-system-disk': Input/output error (5)

Expected results:
No I/O errors should occur.

Additional info:
/usr/libexec/qemu-kvm -M pc -cpu Opteron_G4 -enable-kvm -m 4G -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -no-kvm-pit-reinjection -usb -device usb-tablet,id=input0 -name sluo -uuid 990ea161-6b67-41b2-b803-19fb01d30d30 -rtc base=localtime,clock=host,driftfix=slew -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0,bus=pci.0,addr=0x3 -chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait -device virtserialport,chardev=channel1,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port1 -chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait -device virtserialport,chardev=channel2,name=com.redhat.rhevm.vdsm,bus=virtio-serial0.0,id=port2 -drive file=gluster://10.66.106.6/sluo_volume/RHEL-7.0-20140505.1_Server_x86_64.raw,if=none,id=drive-system-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop -device virtio-scsi-pci,id=scsi0,addr=0x4,bus=pci.0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-system-disk,id=system-disk,bootindex=1 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=00:01:02:B6:40:11,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x6 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -serial unix:/tmp/ttyS0,server,nowait -k en-us -boot menu=on -qmp tcp:0:4444,server,nowait -vnc :1 -monitor stdio

Comment 1 Sibiao Luo 2014-05-07 05:20:48 UTC
My glusterfs version and volume info. 
# rpm -q glusterfs
glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

# gluster
gluster> volume info
 
Volume Name: sluo_volume
Type: Distribute
Volume ID: 25ec85a1-e90b-499e-86e6-12db3e02b745
Status: Started
Snap Volume: no
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.66.106.6:/home/brick1
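
For completeness, a minimal sketch of how a single-brick Distribute volume like the one above could be created on the gluster server (these commands are an assumption, not taken from this report; the volume name and brick path come from the volume info above, and 'force' may need to be appended if the brick sits on the root partition):
# gluster volume create sluo_volume 10.66.106.6:/home/brick1
# gluster volume start sluo_volume
# gluster volume info sluo_volume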

Comment 2 Sibiao Luo 2014-05-07 05:22:17 UTC
# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    1
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          4
Vendor ID:             AuthenticAMD
CPU family:            16
Model:                 9
Model name:            AMD Opteron(tm) Processor 6128
Stepping:              1
CPU MHz:               2000.154
BogoMIPS:              4000.07
Virtualization:        AMD-V
L1d cache:             64K
L1i cache:             64K
L2 cache:              512K
L3 cache:              5118K
NUMA node0 CPU(s):     0,2,4,6
NUMA node1 CPU(s):     8,10,12,14
NUMA node2 CPU(s):     9,11,13,15
NUMA node3 CPU(s):     1,3,5,7

Comment 3 Sibiao Luo 2014-05-07 05:43:20 UTC
(In reply to Sibiao Luo from comment #0)
> How reproducible:
> always
Not 100% reproducible with a single migration, but 100% reproducible with ping-pong migration, according to my testing.

BTW, using glusterfs(fuse) instead of glusterfs(native) hits another problem after ping-pong migration (a sketch of the fuse-based setup follows the output below). Please let me know if I should file a separate bug if this is not the same issue, thanks.

block I/O error in device 'drive-virtio-disk': File descriptor in bad state (77)

(qemu) info status 
VM status: paused (io-error)
(qemu) cont
(qemu) block I/O error in device 'drive-virtio-disk': File descriptor in bad state (77)
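
For reference, a minimal sketch of the glusterfs(fuse) access referred to above, as opposed to the native gluster:// (libgfapi) URL used elsewhere in this bug (the mount point and the trimmed -drive options are assumptions for illustration, reusing the image name from comment #0):
# mount -t glusterfs 10.66.106.6:/sluo_volume /mnt/sluo_volume
and point -drive at the image through the mount instead of the gluster:// URL, e.g.:
-drive file=/mnt/sluo_volume/RHEL-7.0-20140505.1_Server_x86_64.raw,if=none,id=drive-virtio-disk,format=raw,werror=stop,rerror=stop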

Best Regards,
sluo

Comment 4 Sibiao Luo 2014-05-07 06:52:10 UTC
1. I tested both virtio-blk and virtio-scsi; both hit this issue (a sketch of the virtio-blk variant is at the end of this comment).

2. Migrating from rhel7.0 (qemu-kvm-1.5.3-60.el7.x86_64) to rhel7.0 (qemu-kvm-1.5.3-60.el7.x86_64) also hits it.

3. Using a glusterfs server on a rhel6.6 host as the storage backend did not hit this issue.

glusterfs server rhel6 side:
glusterfs-3.5qa2-0.340.gitc193996.el7.x86_64

glusterfs client rhel7 side:
glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

So I think this is a glusterfs rhel7 package issue. Please help move the bug to the right component if I have misfiled it, thanks.
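
For point 1, a minimal sketch of the virtio-blk variant of the command line in comment #0 (this device line is an assumption; it simply replaces the virtio-scsi-pci/scsi-hd pair while keeping the same gluster:// -drive and drive id):
-device virtio-blk-pci,drive=drive-system-disk,id=system-disk,bus=pci.0,addr=0x4,bootindex=1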

Best Regards,
sluo

Comment 5 Sibiao Luo 2014-05-07 07:25:01 UTC
(In reply to Sibiao Luo from comment #4)
> 1. I tested both virtio-blk and virtio-scsi; both hit this issue.
> 
> 2. Migrating from rhel7.0 (qemu-kvm-1.5.3-60.el7.x86_64) to
> rhel7.0 (qemu-kvm-1.5.3-60.el7.x86_64) also hits it.
> 
> 3. Using a glusterfs server on a rhel6.6 host as the storage backend did not
> hit this issue.
> glusterfs server rhel6 side:
> glusterfs-3.5qa2-0.340.gitc193996.el7.x86_64
> 
> glusterfs client rhel7 side:
> glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

To clear it up, here are the combinations I tested:

########### I: Does not hit this issue.
glusterfs server:
rhel6 host
glusterfs-3.4.0.59rhs-1.el6rhs.x86_64
glusterfs client:
rhel7 host
glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

########### II: Does not hit this issue.
glusterfs server:
rhel7 host
glusterfs-3.5qa2-0.340.gitc193996.el7.x86_64
glusterfs client:
rhel7 host
glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

########### III: Hits this issue.
glusterfs server:
rhel7 host
glusterfs-3.5qa2-0.340.gitc193996.el7.x86_64
glusterfs client:
rhel7 host
glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

> So I think this is a glusterfs rhel7 package issue. Please help move the bug
> to the right component if I have misfiled it, thanks.
>

Comment 7 Sibiao Luo 2014-05-07 07:29:57 UTC
(In reply to Sibiao Luo from comment #5)
> ########### III: Hits this issue.
> glusterfs server:
> rhel7 host
> glusterfs-3.5qa2-0.340.gitc193996.el7.x86_64
  glusterfs-3.5qa2-0.425.git9360107.el7.x86_64
> glusterfs client:
> rhel7 host
> glusterfs-3.5qa2-0.425.git9360107.el7.x86_64

Comment 11 SATHEESARAN 2015-01-21 07:24:55 UTC
Tested live migration of a guest (RHEL 7.0) from one RHEL 7.1 host to another RHEL 7.1 host, where the guest uses a glusterfs shared storage domain via the libgfapi access mechanism.

Guest migration was successful and the guest VM was running healthy post migration.
Repeated the steps a couple of times and could not reproduce this issue.

Tested with :
RHEL 7.1 Nightly :
http://download.devel.redhat.com/composes/nightly/RHEL-7.1-20150116.n.0/

Glusterfs RPMS   : glusterfs-3.6.0.42-1.el7rhs
http://download.devel.redhat.com/brewroot/packages/glusterfs/3.6.0.42/1.el7rhs/x86_64/

qemu-kvm         : 1.5.3-60.el7_0.11.x86_64
qemu-kvm-1.5.3-60.el7_0.11.x86_64
qemu-kvm-common-1.5.3-60.el7_0.11.x86_64
qemu-kvm-tools-1.5.3-60.el7_0.11.x86_64

Comment 12 SATHEESARAN 2015-01-21 09:05:20 UTC
Marking this bug as VERIFIED based on the test results in comment 11.

