Bug 986812

Summary: [performance/write-behind] errno always set to EIO even though the brick returns ENOSPC/EDQUOT
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: pushpesh sharma <psharma>
Component: glusterfs
Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED EOL
QA Contact: amainkar
Severity: medium
Docs Contact:
Priority: medium
Version: 2.1
CC: david.macdonald, ppai, rhs-bugs, rwheeler, vbellur
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-03 17:18:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 985862

Description pushpesh sharma 2013-07-22 07:25:20 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:

1. Volume and brick layout; note in the df -h output below that /mnt/lv2, a brick of volume "test", is 100% full:

[root@dhcp207-210 ~]# gluster volume info
 
Volume Name: test
Type: Distribute
Volume ID: 7fdf46ad-9476-487f-8896-dfe94d554bde
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.65.207.210:/mnt/lv1/lv1
Brick2: 10.65.207.210:/mnt/lv2/lv2
Brick3: 10.65.207.210:/mnt/lv3/lv3
Brick4: 10.65.207.210:/mnt/lv4/lv4
 
Volume Name: test2
Type: Distribute
Volume ID: 32ce6cde-7e09-49fa-83a8-d08713883466
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.65.207.210:/mnt/lv5/lv5
Brick2: 10.65.207.210:/mnt/lv6/lv6
Brick3: 10.65.207.210:/mnt/lv7/lv7
Brick4: 10.65.207.210:/mnt/lv8/lv8
[root@dhcp207-210 ~]# 



[root@dhcp207-210 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp207210-lv_root
                       50G  2.0G   45G   5% /
tmpfs                 1.5G     0  1.5G   0% /dev/shm
/dev/vda1             485M   32M  428M   7% /boot
/dev/mapper/vg_dhcp207210-lv_home
                      5.0G  160M  4.6G   4% /home
/dev/mapper/vg_dhcp207210-lv1
                      4.0G   83M  4.0G   3% /mnt/lv1
/dev/mapper/vg_dhcp207210-lv2
                      4.0G  4.0G  440K 100% /mnt/lv2
/dev/mapper/vg_dhcp207210-lv3
                      4.0G   83M  4.0G   3% /mnt/lv3
/dev/mapper/vg_dhcp207210-lv4
                      4.0G   83M  4.0G   3% /mnt/lv4
/dev/mapper/vg_dhcp207210-lv5
                      4.0G   82M  4.0G   2% /mnt/lv5
/dev/mapper/vg_dhcp207210-lv6
                      4.0G   82M  4.0G   2% /mnt/lv6
/dev/mapper/vg_dhcp207210-lv7
                      4.0G   82M  4.0G   2% /mnt/lv7
/dev/mapper/vg_dhcp207210-lv8
                      4.0G   82M  4.0G   2% /mnt/lv8
/dev/mapper/vg_dhcp207210-lv9

2. [psharma@dhcp193-66 dummy_files]$ scp 5g.txt root.207.210:/mnt/gluster-object/test/dir1/5g.txt
reverse mapping checking getaddrinfo for dhcp207-210.lab.eng.pnq.redhat.com [10.65.207.210] failed - POSSIBLE BREAK-IN ATTEMPT!
root.207.210's password: 
5g.txt                                                                      100% 5120MB  46.6MB/s   01:50    
scp: /mnt/gluster-object/test/dir1/5g.txt: Bad file descriptor
[psharma@dhcp193-66 dummy_files]$ 
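A minimal standalone reproducer sketch of the same failure (the file name and block size below are assumptions; any large sequential write onto volume "test", whose brick /mnt/lv2 is already full, should hit the same path). Based on the fuse log in step 3, the failing write surfaces on the client as EIO (Input/output error) rather than ENOSPC (No space left on device):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main (void)
{
        /* Hypothetical target on the FUSE mount; adjust to the setup above. */
        const char *path = "/mnt/gluster-object/test/dir1/bigfile";
        static char block[1 << 20];     /* write 1 MiB of zeroes at a time */
        int fd = open (path, O_CREAT | O_WRONLY | O_TRUNC, 0644);

        if (fd < 0) {
                perror ("open");
                return 1;
        }

        /* Keep writing until the brick holding the file runs out of space,
         * then report the errno the client actually sees. */
        for (;;) {
                if (write (fd, block, sizeof (block)) < 0) {
                        printf ("write: %s (errno=%d)\n", strerror (errno), errno);
                        break;
                }
        }

        if (close (fd) < 0)
                printf ("close: %s (errno=%d)\n", strerror (errno), errno);

        return 0;
}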


3.[2013-07-20 21:42:01.697015] I [glusterfsd-mgmt.c:1544:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-22 05:49:22.434780] W [fuse-bridge.c:2631:fuse_writev_cbk] 0-glusterfs-fuse: 1030761: WRITE => -1 (Input/output error)
[2013-07-22 05:49:46.584326] W [fuse-bridge.c:1563:fuse_err_cbk] 0-glusterfs-fuse: 1030762: FLUSH() ERR => -1 (Bad file descriptor)
[2013-07-22 05:52:26.827174] W [fuse-bridge.c:2631:fuse_writev_cbk] 0-glusterfs-fuse: 1094900: WRITE => -1 (Input/output error)
[2013-07-22 05:52:47.887726] W [fuse-bridge.c:1563:fuse_err_cbk] 0-glusterfs-fuse: 1094901: FLUSH() ERR => -1 (Bad file descriptor)


4. [2013-07-17 10:46:42.315818] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-4025-2013/07/15-08:20:59:455906-test-client-1-0
[2013-07-17 10:46:42.337189] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-4025-2013/07/15-08:20:59:455906-test-client-1-0  
[2013-07-17 10:46:57.357146] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:48:38.728039] I [server.c:773:server_rpc_notify] 0-test-server: disconnecting connection from dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0, Number of pending operations: 1
[2013-07-17 10:48:38.728053] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0
[2013-07-17 10:48:38.728067] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0  
[2013-07-17 10:49:47.740998] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:50:11.500038] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5211-2013/07/17-10:50:11:458044-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:50:29.143459] I [server.c:773:server_rpc_notify] 0-test-server: disconnecting connection from dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0, Number of pending operations: 1
[2013-07-17 10:50:29.143487] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0
[2013-07-17 10:50:29.143505] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0  
[2013-07-20 21:42:01.595902] I [glusterfsd.c:1096:reincarnate] 0-glusterfsd: Fetching the volume file from server...
Actual results:


Expected results:

1. Is this a known limitation? I have never seen it documented; I may have overlooked it, so a reference would help. Otherwise it should be documented as a known limitation.

2. What should the error message be in this case? "Input/output error" does not make much sense.

3.   
Additional info:

Comment 2 Amar Tumballi 2013-09-27 12:26:59 UTC
> Expected results:
> 
> 1. Is this a known limitation? I have never seen it documented; I may have
> overlooked it, so a reference would help. Otherwise it should be documented
> as a known limitation.

Yes it is. I think we should document this.

> 
> 2. What should the error message be in this case? "Input/output error" does
> not make much sense.

The correct error in this case is ENOSPC (No space left on the device) and not EIO. That is the bug we should fix here.

Comment 3 Prashanth Pai 2013-10-22 09:52:17 UTC
In the presence of the write-behind translator, EIO is returned. With write-behind turned off, ENOSPC is returned as expected. This is most likely a bug in the write-behind translator. Refer to the following code snippet from

xlators/performance/write-behind/src/write-behind.c : wb_fulfill_cbk

    if (op_ret == -1) {
        wb_fulfill_err (head, op_errno);
    } else if (op_ret < head->total_size) {
        /*
         * We've encountered a short write, for whatever reason.
         * Set an EIO error for the next fop. This should be
         * valid for writev or flush (close).
         *
         * TODO: Retry the write so we can potentially capture
         * a real error condition (i.e., ENOSPC).
         */
        wb_fulfill_err (head, EIO);
    }
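As an illustration of the retry approach mentioned in that TODO (this is not GlusterFS code; write_all() is a hypothetical userspace helper): retrying the unwritten remainder after a short write lets the next write() fail outright, so errno carries the real cause such as ENOSPC or EDQUOT, which could then be handed to wb_fulfill_err() instead of a hardcoded EIO.

#include <errno.h>
#include <unistd.h>

/*
 * Illustration only (not GlusterFS code): after a short write, retry
 * the remaining bytes.  When the filesystem has genuinely run out of
 * space, the retried write() fails with -1 and errno set to the real
 * cause (ENOSPC, EDQUOT, ...), which the caller can propagate instead
 * of mapping every short write to a generic EIO.
 */
static ssize_t
write_all (int fd, const char *buf, size_t len)
{
        size_t done = 0;

        while (done < len) {
                ssize_t n = write (fd, buf + done, len - done);

                if (n < 0) {
                        if (errno == EINTR)
                                continue;
                        return -1;      /* errno now carries the real reason */
                }
                done += (size_t) n;
        }

        return (ssize_t) done;
}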

Comment 7 Vivek Agarwal 2015-12-03 17:18:40 UTC
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release against which it was reported is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.