Bug 1663431
Summary: | Get error "Could not read qcow2 header" when reading a qcow2 file with glusterfs | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | gaojianan <jgao> | |
Component: | libgfapi | Assignee: | Michael Adam <madam> | |
Status: | CLOSED WORKSFORME | QA Contact: | Vivek Das <vdas> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | rhgs-3.4 | CC: | dyuan, hhan, h.moeller, jgao, jthottan, kdhananj, lmen, madam, moagrawa, ndevos, pasik, pkarampu, rgowdapp, rhs-bugs, skoduri, storage-qa-internal, vbellur, xuzhang, yafu, yalzhang, yisun | |
Target Milestone: | --- | Keywords: | ZStream | |
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1775512 (view as bug list) | Environment: | ||
Last Closed: | 2019-12-03 07:15:33 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1775512 | |||
Bug Blocks: | ||||
Attachments: |
Hit the same issue:

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# rpm -qa | grep gluster
libvirt-daemon-driver-storage-gluster-4.5.0-19.module+el8+2712+4c318da1.x86_64
glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
qemu-kvm-block-gluster-2.12.0-59.module+el8+2714+6d9351dd.x86_64
glusterfs-cli-3.12.2-32.1.el8.x86_64
glusterfs-libs-3.12.2-32.1.el8.x86_64
glusterfs-api-3.12.2-32.1.el8.x86_64
glusterfs-3.12.2-32.1.el8.x86_64
glusterfs-fuse-3.12.2-32.1.el8.x86_64

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# qemu-img info gluster://10.66.7.98/gluster-vol1/aaa.qcow2
qemu-img: Could not open 'gluster://10.66.7.98/gluster-vol1/aaa.qcow2': Could not read L1 table: Input/output error

And this blocks VMs from using the gluster disk, so escalating the priority.

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# cat gdisk
<disk device="disk" type="network"><driver cache="none" name="qemu" type="qcow2" /><target bus="virtio" dev="vdb" /><source name="gluster-vol1/aaa.qcow2" protocol="gluster"><host name="10.66.7.98" port="24007" /></source></disk>

(.libvirt-ci-venv-ci-runtest-sBLGCJ) [root@hp-dl380g9-02 virtual_disks]# virsh attach-device avocado-vt-vm1 gdisk
error: Failed to attach device from gdisk
error: internal error: unable to execute QEMU command 'device_add': Property 'virtio-blk-device.drive' can't find value 'drive-virtio-disk1'

Moving bug to Krutika as she is more experienced with Virt workloads. Meanwhile, looking at the glusterfs version, these are RHGS 3.4 builds.

Could you share the following two pieces of information -

1. output of `gluster volume info $VOLNAME`
2. Are the glusterfs client and server running the same version of gluster/RHGS?

-Krutika

(In reply to Krutika Dhananjay from comment #4)
> Could you share the following two pieces of information -
>
> 1. output of `gluster volume info $VOLNAME`
> 2. Are the glusterfs client and server running the same version of
> gluster/RHGS?
Let me clarify why I'm asking about the versions - the bug's "Description" section says this -

gluster server: glusterfs-3.12.2-19.el7rhgs.x86_64
client: glusterfs-3.12.2-15.4.el8.x86_64

but comment 2 lists the client package as glusterfs-client-xlators-3.12.2-32.1.el8.x86_64. I want to be sure about the exact versions being used so I can recreate it. (Looked at the logs, not much clue there.)

-Krutika

(In reply to Krutika Dhananjay from comment #4)
> Could you share the following two pieces of information -
>
> 1. output of `gluster volume info $VOLNAME`
> 2. Are the glusterfs client and server running the same version of
> gluster/RHGS?
>
> -Krutika

1. `gluster volume info $VOLNAME`

[root@node1 ~]# gluster volume info gv1

Volume Name: gv1
Type: Distribute
Volume ID: de5d9272-e237-4a4e-8a30-a7c737f393db
Status: Started
Snapshot Count: 0
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.66.4.119:/br2
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

2. Server version:

[root@node1 ~]# rpm -qa | grep gluster
libvirt-daemon-driver-storage-gluster-4.5.0-10.el7_6.3.x86_64
glusterfs-api-devel-3.12.2-19.el7rhgs.x86_64
pcp-pmda-gluster-4.1.0-4.el7.x86_64
glusterfs-3.12.2-19.el7rhgs.x86_64
python2-gluster-3.12.2-19.el7rhgs.x86_64
glusterfs-server-3.12.2-19.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-19.el7rhgs.x86_64
glusterfs-api-3.12.2-19.el7rhgs.x86_64
glusterfs-devel-3.12.2-19.el7rhgs.x86_64
glusterfs-debuginfo-3.12.2-18.el7.x86_64
glusterfs-libs-3.12.2-19.el7rhgs.x86_64
glusterfs-cli-3.12.2-19.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-19.el7rhgs.x86_64
glusterfs-fuse-3.12.2-19.el7rhgs.x86_64
glusterfs-rdma-3.12.2-19.el7rhgs.x86_64
glusterfs-events-3.12.2-19.el7rhgs.x86_64
samba-vfs-glusterfs-4.8.3-4.el7.x86_64

Client version:

[root@nssguest ~]# rpm -qa | grep gluster
qemu-kvm-block-gluster-3.1.0-18.module+el8+2834+fa8bb6e2.x86_64
glusterfs-3.12.2-32.1.el8.x86_64
glusterfs-client-xlators-3.12.2-32.1.el8.x86_64
libvirt-daemon-driver-storage-gluster-5.0.0-6.virtcov.el8.x86_64
glusterfs-libs-3.12.2-32.1.el8.x86_64
glusterfs-cli-3.12.2-32.1.el8.x86_64
glusterfs-api-3.12.2-32.1.el8.x86_64

(In reply to gaojianan from comment #6)
> 1. `gluster volume info $VOLNAME`
> [root@node1 ~]# gluster volume info gv1
> [snip - full volume info and server/client package lists quoted above]

Thanks.

I tried the same set of steps with the same versions of gluster client and server, and the test works for me every time. Perhaps the ONLY difference between your configuration and mine is that my gluster client is also on RHEL 7, unlike yours, where you're running RHEL 8 on the client machine. Also the qemu-img versions could be different.

Are you hitting this issue even with a fuse mount, i.e., when you run `qemu-img info` this way - `qemu-img info $FUSE_MOUNT_PATH/aaa.qcow2`?

If yes, could you run both the `qemu-img create` and `qemu-img info` commands with strace for a fresh file:

# strace -ff -T -v -o /tmp/qemu-img-create.out qemu-img create -f qcow2 $IMAGE_PATH 10M
# strace -ff -T -v -o /tmp/qemu-img-info.out qemu-img info $IMAGE_PATH_OVER_FUSE_MOUNT

and share all of the resultant output files having the format qemu-img-create.out* and qemu-img-info.out*?

-Krutika

Created attachment 1545120 [details]
The output of `qemu-img create` and `qemu-img info`
(In reply to Krutika Dhananjay from comment #7)
> Are you hitting this issue even with fuse mount, i.e., when you run
> `qemu-img info` this way - `qemu-img info $FUSE_MOUNT_PATH/aaa.qcow2`?
> [snip]

I think this bug only happens when we create a file on the mounted path and check it with `qemu-img info gluster://$ip/filename`; `qemu-img info $FUSE_MOUNT_PATH/filename` works well.

OK, I took a look at the traces. Unfortunately, in the libgfapi-access case we need ltrace output instead of strace, since all the calls are made in userspace. I did test the ltrace command before sharing it with you just to be sure it works, but I see that the arguments to the library calls are not printed as symbols.

Since you're seeing this issue only with gfapi, I'm passing this issue over to the gfapi experts for a faster resolution.

Poornima/Soumya/Jiffin,
Could one of you help?

-Krutika

To start with, getting the logs exclusive to gfapi access, and a tcpdump taken while the below command is run, would be helpful -

qemu-img info gluster://$ip/filename

Setting needinfo on the reporter to get the info requested in comment 11.

Created attachment 1546788 [details]
The gfapi log and tcpdump are in the attachment.
Status?

@Soumya

Did you get a chance to analyze the logs and tcpdump?

Thanks,
Mohit Agrawal

(In reply to Mohit Agrawal from comment #16)
> @Soumya
>
> Did you get a chance to analyze the logs and tcpdump?
>
> Thanks,
> Mohit Agrawal

Hi,

I just looked at the files uploaded. The tcpdump doesn't have gluster traffic captured. Please ensure the command was issued on the right machine (where the command is being executed) and verify the filters (for the right interface, IP, etc.)

From the logs, I see there is a failure for the SEEK() fop -

[2019-03-22 06:47:34.557047] T [MSGID: 0] [dht-hashfn.c:94:dht_hash_compute] 0-gv1-dht: trying regex for test.img
[2019-03-22 06:47:34.557059] D [MSGID: 0] [dht-common.c:3675:dht_lookup] 0-gv1-dht: Calling fresh lookup for /test.img on gv1-client-0
[2019-03-22 06:47:34.557067] T [MSGID: 0] [dht-common.c:3679:dht_lookup] 0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-dht to gv1-client-0
[2019-03-22 06:47:34.557079] T [rpc-clnt.c:1496:rpc_clnt_record] 0-gv1-client-0: Auth Info: pid: 10233, uid: 0, gid: 0, owner:
[2019-03-22 06:47:34.557086] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 420, payload: 348, rpc hdr: 72
[2019-03-22 06:47:34.557110] T [rpc-clnt.c:1699:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0xb Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.557513] T [rpc-clnt.c:675:rpc_clnt_reply_init] 0-gv1-client-0: received rpc message (RPC XID: 0xb Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.557536] T [MSGID: 0] [client-rpc-fops.c:2873:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x55ce03dd1720, gv1-client-0 returned 0
[2019-03-22 06:47:34.557549] D [MSGID: 0] [dht-common.c:3228:dht_lookup_cbk] 0-gv1-dht: fresh_lookup returned for /test.img with op_ret 0

>> LOOKUP on /test.img was successful

[2019-03-22 06:47:34.563416] T [MSGID: 0] [defaults.c:2927:default_seek] 0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-read-ahead to gv1-write-behind
[2019-03-22 06:47:34.563424] T [MSGID: 0] [defaults.c:2927:default_seek] 0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-write-behind to gv1-dht
[2019-03-22 06:47:34.563432] T [MSGID: 0] [defaults.c:2927:default_seek] 0-stack-trace: stack-address: 0x55ce03dd1720, winding from gv1-dht to gv1-client-0
[2019-03-22 06:47:34.563443] T [rpc-clnt.c:1496:rpc_clnt_record] 0-gv1-client-0: Auth Info: pid: 10233, uid: 0, gid: 0, owner:
[2019-03-22 06:47:34.563451] T [rpc-clnt.c:1353:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 112, payload: 40, rpc hdr: 72
[2019-03-22 06:47:34.563478] T [rpc-clnt.c:1699:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0xc Program: GlusterFS 3.3, ProgVers: 330, Proc: 48) to rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.563990] T [rpc-clnt.c:675:rpc_clnt_reply_init] 0-gv1-client-0: received rpc message (RPC XID: 0xc Program: GlusterFS 3.3, ProgVers: 330, Proc: 48) from rpc-transport (gv1-client-0)
[2019-03-22 06:47:34.564008] W [MSGID: 114031] [client-rpc-fops.c:2156:client3_3_seek_cbk] 0-gv1-client-0: remote operation failed [No such device or address]
[2019-03-22 06:47:34.564028] D [MSGID: 0] [client-rpc-fops.c:2160:client3_3_seek_cbk] 0-stack-trace: stack-address: 0x55ce03dd1720, gv1-client-0 returned -1 error: No such device or address [No such device or address]
[2019-03-22 06:47:34.564041] D [MSGID: 0] [defaults.c:1531:default_seek_cbk] 0-stack-trace: stack-address: 0x55ce03dd1720, gv1-io-threads returned -1 error: No such device or address [No such device or address]
[2019-03-22 06:47:34.564051] D [MSGID: 0] [io-stats.c:2548:io_stats_seek_cbk] 0-stack-trace: stack-address: 0x55ce03dd1720, gv1 returned -1 error: No such device or address [No such device or address]

client3_seek_cbk() received '-1'. We may first need to check why the fop failed on the server.
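A side note on the "No such device or address" errors in the trace above: that string is ENXIO, which is what lseek(2) returns when SEEK_DATA is requested at or past the end of data. qemu-img uses SEEK_DATA/SEEK_HOLE to probe block allocation, so an ENXIO reply from the SEEK fop is not necessarily a bug by itself. A minimal Linux-only sketch of that behaviour on a plain local file (not gluster):

```python
import errno
import os
import tempfile


def seek_data(offset, size=1024 * 1024, data=4096):
    """Run os.lseek(fd, offset, SEEK_DATA) on a sparse temp file of
    `size` bytes whose first `data` bytes are written; return the
    resulting offset, or the negated errno on failure."""
    with tempfile.TemporaryFile() as f:
        f.write(b"x" * data)
        f.flush()
        f.truncate(size)
        try:
            return os.lseek(f.fileno(), offset, os.SEEK_DATA)
        except OSError as e:
            return -e.errno


# Inside the written region: SEEK_DATA succeeds.
print(seek_data(0))                  # 0
# Past end-of-file: ENXIO, rendered as "No such device or address".
print(seek_data(2 * 1024 * 1024) == -errno.ENXIO)
```

This only explains the errno string; the later comments show the actual failure in this bug is a different error (EBADFD) on the client side.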
If it's reproducible, it should be fairly easy to check.

@gaojianan
Can you share the data asked for by Soumya, and share the brick logs along with that data (client logs and tcpdump)?

Created attachment 1638327 [details]
tcpdump log and gfapi log of the client

(In reply to Mohit Agrawal from comment #18)
> @gaojianan
> Can you share the data asked by Soumya and share the brick logs along with
> data(client-logs and tcpdump)?

client version:
glusterfs-client-xlators-6.0-20.el8.x86_64
glusterfs-libs-6.0-20.el8.x86_64
qemu-kvm-block-gluster-4.1.0-13.module+el8.1.0+4313+ef76ec61.x86_64
glusterfs-fuse-6.0-20.el8.x86_64
libvirt-daemon-driver-storage-gluster-5.6.0-7.module+el8.1.1+4483+2f45aaa2.x86_64
glusterfs-api-6.0-20.el8.x86_64
glusterfs-cli-6.0-20.el8.x86_64
glusterfs-6.0-20.el8.x86_64

Tried again with the steps from comment 1.

Steps to Reproduce:
1. Mount the gluster directory to local /tmp/gluster
# mount.glusterfs 10.66.85.243:/jgao-vol1 /tmp/gluster

2. Create a new qcow2 file
# qemu-img create -f qcow2 /tmp/gluster/test.img 100M

3. Check it with qemu-img over gluster
[root@localhost ~]# qemu-img info gluster://10.66.85.243/jgao-vol1/test.img
qemu-img: Could not open 'gluster://10.66.85.243/jgao-vol1/test.img': Could not read L1 table: Input/output error

More detailed info is in the attachment. If there are any other questions, you can needinfo me again.

(In reply to gaojianan from comment #19)
> Created attachment 1638327 [details]
> tcpdump log and gfapi log of the client

The tcpdump file contains too much other protocol data. It is better to use a filter to capture only glusterfs-related network traffic.

BTW, I have a question: what ports are used by glusterfs by default for gluster-server-6.0.x? 24007-24009? 49152?

> [snip - client versions and reproduction steps quoted above]

What's more, please upload the brick logs as comment 18 said. That log is located in /var/log/glusterfs/bricks/ on the glusterfs server.

Created attachment 1638331 [details]
Updated brick log and tcpdump log for the last log file
In the brick log, "gluster-vol1" is the same volume as "jgao-vol1" in the other two files, because I destroyed my environment and set it up again.
@Soumya
Please check the latest logs and tcpdump.

Thanks,
Mohit Agrawal

From the latest debug.log provided, I see this error -

[2019-11-21 06:34:15.127610] D [MSGID: 0] [client-helpers.c:427:client_get_remote_fd] 0-jgao-vol1-client-0: not a valid fd for gfid: 59ca8bf2-f75a-427f-857e-98843a85dbac [Bad file descriptor]
[2019-11-21 06:34:15.127620] W [MSGID: 114061] [client-common.c:1288:client_pre_seek] 0-jgao-vol1-client-0: (59ca8bf2-f75a-427f-857e-98843a85dbac) remote_fd is -1. EBADFD [File descriptor in bad state]
[2019-11-21 06:34:15.127628] D [MSGID: 0] [client-rpc-fops.c:5949:client3_3_seek] 0-stack-trace: stack-address: 0x5625eed41b08, jgao-vol1-client-0 returned -1 error: File descriptor in bad state [File descriptor in bad state]
[2019-11-21 06:34:15.127636] D [MSGID: 0] [defaults.c:1617:default_seek_cbk] 0-stack-trace: stack-address: 0x5625eed41b08, jgao-vol1-io-threads returned -1 error: File descriptor in bad state [File descriptor in bad state]

The client3_seek fop got an EBADFD error. The fd used in the fop may have been flushed and is no longer valid. On further code reading, I found that there is a bug in the glfs_seek() fop: there is a missing ref on the glfd, which may have led to this issue. I will send a patch to fix that.

However, I am unable to reproduce this issue to test it. On my system the test always passes -

[root@dhcp35-198 ~]# qemu-img create -f qcow2 /fuse-mnt/test.img 100M
Formatting '/fuse-mnt/test.img', fmt=qcow2 size=104857600 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16
[root@dhcp35-198 ~]# qemu-img info gluster://localhost/rep_vol/test.img
[2019-11-22 06:36:43.703941] E [MSGID: 108006] [afr-common.c:5322:__afr_handle_child_down_event] 0-rep_vol-replicate-0: All subvolumes are down. Going offline until at least one of them comes back up.
[2019-11-22 06:36:43.705035] I [io-stats.c:4027:fini] 0-rep_vol: io-stats translator unloaded
image: gluster://localhost/rep_vol/test.img
file format: qcow2
virtual size: 100M (104857600 bytes)
disk size: 193K
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
[root@dhcp35-198 ~]#

I am using the latest master branch of gluster. I shall post the fix for the bug in glfs_seek mentioned above, but if someone could test it, that would be helpful.

We can share the test build if Jianan agrees to test the same.

@gaojianan

Would it be possible for you to test the patch? Can you please confirm if you are able to reproduce the issue on RHGS 3.5?

Thanks,
Mohit Agrawal

https://review.gluster.org/#/c/glusterfs/+/23739/ is the patch posted for the fix in glfs_seek.

(In reply to Mohit Agrawal from comment #25)
> We can share the test build if Jianan agrees to test the same.
>
> @gaojianan
>
> Would it be possible for you to test the patch?
> Can you please confirm if you are able to reproduce the issue on rhgs 3.5?
>
> Thanks,
> Mohit Agrawal

OK, I will try it as soon as possible.
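To illustrate the class of bug described above with the missing ref on the glfd, here is a hypothetical Python model (NOT the actual gfapi code): if glfs_seek() does not take its own reference on the fd handle, a concurrent release can tear the handle down mid-operation, and the client then sees remote_fd == -1 (the EBADFD in the log). Holding a ref across the fop prevents that:

```python
# Hypothetical model of the lifecycle bug: a handle whose backing
# remote fd is released once its refcount drops to zero, while a
# seek operation may still be using it.
class Handle:
    def __init__(self):
        self.refs = 1
        self.valid = True   # True while the remote fd is alive

    def ref(self):
        self.refs += 1

    def unref(self):
        self.refs -= 1
        if self.refs == 0:
            self.valid = False   # remote fd released; later use -> EBADFD


def seek(handle, hold_ref):
    # With hold_ref=True (the fix), the seek path takes its own ref
    # before issuing the fop, so the concurrent release simulated
    # below cannot destroy the handle mid-operation.
    if hold_ref:
        handle.ref()
    handle.unref()           # simulate a concurrent release of the fd
    ok = handle.valid        # buggy path observes an invalid fd here
    if hold_ref:
        handle.unref()
    return ok


print(seek(Handle(), hold_ref=False))  # False: the EBADFD failure mode
print(seek(Handle(), hold_ref=True))   # True: with the missing ref added
```

This is only a sketch of the refcounting pattern; the real fix is in the patch linked above.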
Created attachment 1518327 [details]
gluster server log and local glusterfs log

Description of problem:
Get error "Could not read qcow2 header" when reading a qcow2 file in glusterfs

Version-Release number of selected component (if applicable):
gluster server: glusterfs-3.12.2-19.el7rhgs.x86_64
client: glusterfs-3.12.2-15.4.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Mount the gluster directory to local /mnt
# mount.glusterfs 10.66.4.119:/gv0 /mnt/

2. Create a new qcow2 file
# qemu-img create -f qcow2 /mnt/qcow2mnt.img 10M

3. Check it with qemu-img over gluster
[root@localhost ~]# qemu-img info gluster://10.66.4.119/gv0/qcow2mnt.img
qemu-img: Could not open 'gluster://10.66.4.119/gv0/qcow2mnt.img': Could not read L1 table: Input/output error

Actual results:
As above

Expected results:
Can get the correct info of the qcow2 file.

Additional info:
1. A "raw" image is OK in this scenario.
2. qemu-img info /mnt/qcow2mnt.img works well.
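For background on what qemu-img is doing when it reports "Could not read qcow2 header" or "Could not read L1 table": it first reads the fixed-size big-endian header at offset 0, then seeks to the l1_table_offset taken from that header; an I/O error on either read produces the messages seen in this bug. A small sketch of the v2 header fields (layout per the published qcow2 specification; the sample values below are made up):

```python
import struct

# qcow2 v2 header: first 72 bytes, big-endian, per the qcow2 spec.
# Fields: magic, version, backing_file_offset, backing_file_size,
# cluster_bits, size, crypt_method, l1_size, l1_table_offset,
# refcount_table_offset, refcount_table_clusters, nb_snapshots,
# snapshots_offset.
QCOW2_HDR = struct.Struct(">4sIQIIQIIQQIIQ")


def parse_qcow2_header(buf):
    (magic, version, _bf_off, _bf_len, cluster_bits, size,
     _crypt, l1_size, l1_table_offset, _rc_off, _rc_clusters,
     _nb_snaps, _snaps_off) = QCOW2_HDR.unpack_from(buf)
    if magic != b"QFI\xfb":
        raise ValueError("could not read qcow2 header: bad magic")
    return {"version": version, "size": size,
            "cluster_bits": cluster_bits,
            "l1_size": l1_size, "l1_table_offset": l1_table_offset}


# A synthetic header for a 100 MiB image with 64 KiB clusters
# (cluster_bits=16), L1 table at a made-up offset of 196608.
hdr = QCOW2_HDR.pack(b"QFI\xfb", 2, 0, 0, 16, 100 << 20,
                     0, 1, 196608, 65536, 1, 0, 0)
print(parse_qcow2_header(hdr))
```

In this bug the header itself is intact on disk (fuse access works); it is the gfapi read path that fails, so the same header produces an I/O error only via gluster://.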