986812 – [performance/write-behind] errno always set to EIO even though the brick returns ENOSPC/EDQUOT

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	glusterfs
Sub Component:
Version:	2.1
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Bug Updates Notification Mailing List
QA Contact:	amainkar
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	985862

Reported:	2013-07-22 07:25 UTC by pushpesh sharma
Modified:	2015-12-03 17:18 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-12-03 17:18:40 UTC
Embargoed:
Dependent Products:

Description pushpesh sharma 2013-07-22 07:25:20 UTC

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
 
1. [root@dhcp207-210 ~]# gluster volume info
 
Volume Name: test
Type: Distribute
Volume ID: 7fdf46ad-9476-487f-8896-dfe94d554bde
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.65.207.210:/mnt/lv1/lv1
Brick2: 10.65.207.210:/mnt/lv2/lv2
Brick3: 10.65.207.210:/mnt/lv3/lv3
Brick4: 10.65.207.210:/mnt/lv4/lv4
 
Volume Name: test2
Type: Distribute
Volume ID: 32ce6cde-7e09-49fa-83a8-d08713883466
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 10.65.207.210:/mnt/lv5/lv5
Brick2: 10.65.207.210:/mnt/lv6/lv6
Brick3: 10.65.207.210:/mnt/lv7/lv7
Brick4: 10.65.207.210:/mnt/lv8/lv8
[root@dhcp207-210 ~]# 



[root@dhcp207-210 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_dhcp207210-lv_root
                       50G  2.0G   45G   5% /
tmpfs                 1.5G     0  1.5G   0% /dev/shm
/dev/vda1             485M   32M  428M   7% /boot
/dev/mapper/vg_dhcp207210-lv_home
                      5.0G  160M  4.6G   4% /home
/dev/mapper/vg_dhcp207210-lv1
                      4.0G   83M  4.0G   3% /mnt/lv1
/dev/mapper/vg_dhcp207210-lv2
                      4.0G  4.0G  440K 100% /mnt/lv2
/dev/mapper/vg_dhcp207210-lv3
                      4.0G   83M  4.0G   3% /mnt/lv3
/dev/mapper/vg_dhcp207210-lv4
                      4.0G   83M  4.0G   3% /mnt/lv4
/dev/mapper/vg_dhcp207210-lv5
                      4.0G   82M  4.0G   2% /mnt/lv5
/dev/mapper/vg_dhcp207210-lv6
                      4.0G   82M  4.0G   2% /mnt/lv6
/dev/mapper/vg_dhcp207210-lv7
                      4.0G   82M  4.0G   2% /mnt/lv7
/dev/mapper/vg_dhcp207210-lv8
                      4.0G   82M  4.0G   2% /mnt/lv8
/dev/mapper/vg_dhcp207210-lv9

2. [psharma@dhcp193-66 dummy_files]$ scp 5g.txt root.207.210:/mnt/gluster-object/test/dir1/5g.txt
reverse mapping checking getaddrinfo for dhcp207-210.lab.eng.pnq.redhat.com [10.65.207.210] failed - POSSIBLE BREAK-IN ATTEMPT!
root.207.210's password: 
5g.txt                                                                      100% 5120MB  46.6MB/s   01:50    
scp: /mnt/gluster-object/test/dir1/5g.txt: Bad file descriptor
[psharma@dhcp193-66 dummy_files]$ 


3. [2013-07-20 21:42:01.697015] I [glusterfsd-mgmt.c:1544:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2013-07-22 05:49:22.434780] W [fuse-bridge.c:2631:fuse_writev_cbk] 0-glusterfs-fuse: 1030761: WRITE => -1 (Input/output error)
[2013-07-22 05:49:46.584326] W [fuse-bridge.c:1563:fuse_err_cbk] 0-glusterfs-fuse: 1030762: FLUSH() ERR => -1 (Bad file descriptor)
[2013-07-22 05:52:26.827174] W [fuse-bridge.c:2631:fuse_writev_cbk] 0-glusterfs-fuse: 1094900: WRITE => -1 (Input/output error)
[2013-07-22 05:52:47.887726] W [fuse-bridge.c:1563:fuse_err_cbk] 0-glusterfs-fuse: 1094901: FLUSH() ERR => -1 (Bad file descriptor)


4. [2013-07-17 10:46:42.315818] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-4025-2013/07/15-08:20:59:455906-test-client-1-0
[2013-07-17 10:46:42.337189] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-4025-2013/07/15-08:20:59:455906-test-client-1-0  
[2013-07-17 10:46:57.357146] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:48:38.728039] I [server.c:773:server_rpc_notify] 0-test-server: disconnecting connection from dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0, Number of pending operations: 1
[2013-07-17 10:48:38.728053] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0
[2013-07-17 10:48:38.728067] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-5026-2013/07/17-10:46:57:246503-test-client-1-0  
[2013-07-17 10:49:47.740998] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:50:11.500038] I [server-handshake.c:569:server_setvolume] 0-test-server: accepted client from dhcp207-210.lab.eng.pnq.redhat.com-5211-2013/07/17-10:50:11:458044-test-client-1-0 (version: 3.4.0.12rhs.beta3)
[2013-07-17 10:50:29.143459] I [server.c:773:server_rpc_notify] 0-test-server: disconnecting connection from dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0, Number of pending operations: 1
[2013-07-17 10:50:29.143487] I [server-helpers.c:752:server_connection_put] 0-test-server: Shutting down connection dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0
[2013-07-17 10:50:29.143505] I [server-helpers.c:585:server_log_conn_destroy] 0-test-server: destroyed connection of dhcp207-210.lab.eng.pnq.redhat.com-5135-2013/07/17-10:49:47:702632-test-client-1-0  
[2013-07-20 21:42:01.595902] I [glusterfsd.c:1096:reincarnate] 0-glusterfsd: Fetching the volume file from server...

Actual results:


Expected results:

1. Is this a known limitation? I have not seen it documented. I may have overlooked it; if so, please provide a reference. Otherwise, it should be documented as a known limitation.

2. What should the error message be in this case? An I/O error does not make much sense.

Additional info:

Comment 2 Amar Tumballi 2013-09-27 12:26:59 UTC

> Expected results:
> 
> 1. Is this a known limitation? I have not seen it documented. I may have
> overlooked it; if so, please provide a reference. Otherwise, it should be
> documented as a known limitation.

Yes it is. I think we should document this.

> 
> 2. What should the error message be in this case? An I/O error does not make much sense.

The correct error in this case is ENOSPC (No space left on device), not EIO. That is the bug we should fix here.
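
To illustrate why the distinction matters to a caller, here is a minimal sketch of how an application might classify the errno it receives from write(), fsync() or close(). The helper names classify_write_errno and is_capacity_error are hypothetical and are not part of GlusterFS; only the errno constants are standard. When EIO is reported in place of ENOSPC or EDQUOT, a check like this cannot tell a full brick from a real I/O failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper (not GlusterFS code): map an errno value from
 * write(), fsync() or close() to a coarse category.
 */
const char *classify_write_errno(int err)
{
    switch (err) {
    case 0:      return "ok";
    case ENOSPC: return "out-of-space";    /* brick file system is full */
    case EDQUOT: return "quota-exceeded";  /* quota limit was reached */
    case EIO:    return "io-error";        /* generic I/O failure */
    default:     return "other";
    }
}

/*
 * Returns 1 when the failure is a capacity problem (full device or
 * exceeded quota), 0 otherwise.
 */
int is_capacity_error(int err)
{
    const char *kind = classify_write_errno(err);

    assert(kind != NULL);
    return strcmp(kind, "out-of-space") == 0 ||
           strcmp(kind, "quota-exceeded") == 0;
}
```

A caller would pass the errno saved right after the failing call, for example is_capacity_error(errno) after write() returns -1.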

Comment 3 Prashanth Pai 2013-10-22 09:52:17 UTC

With the write-behind translator enabled, EIO is returned. With write-behind turned off, ENOSPC is returned as expected. This is most likely a bug in the write-behind translator. Refer to the following code snippet from wb_fulfill_cbk in xlators/performance/write-behind/src/write-behind.c:

    if (op_ret == -1) {
        wb_fulfill_err (head, op_errno);
    } else if (op_ret < head->total_size) {
        /*
         * We've encountered a short write, for whatever reason.
         * Set an EIO error for the next fop. This should be
         * valid for writev or flush (close).
         *
         * TODO: Retry the write so we can potentially capture
         * a real error condition (i.e., ENOSPC).
         */
        wb_fulfill_err (head, EIO);
    }
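
The TODO in the snippet above suggests retrying after a short write so that the real error can surface. The following is only a sketch of that idea in plain POSIX-style C, not a patch for write-behind.c: write_with_retry, write_fn, fake_write and fake_capacity are all hypothetical names introduced for illustration. The loop reissues the remaining bytes after a short write; when the underlying writer then fails, its errno (for example ENOSPC) is reported instead of a blanket EIO.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Same signature as write(2), so a test double can replace the real call. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t len);

/*
 * Write all len bytes, retrying the remainder after a short write.
 * Returns 0 on success. On failure returns -1, stores the errno of the
 * failing call in *err_out and the bytes written so far in *written.
 */
int write_with_retry(write_fn w, int fd, const void *buf, size_t len,
                     size_t *written, int *err_out)
{
    const char *p = buf;
    size_t done = 0;

    assert(w != NULL && written != NULL && err_out != NULL);
    *err_out = 0;

    while (done < len) {
        ssize_t n = w(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry the same range */
            *err_out = errno;   /* real cause, e.g. ENOSPC or EDQUOT */
            *written = done;
            return -1;
        }
        if (n == 0) {
            /* No progress and no errno: EIO only as a last resort. */
            *err_out = EIO;
            *written = done;
            return -1;
        }
        done += (size_t)n;      /* short write: loop and send the rest */
    }

    *written = done;
    return 0;
}

/* Test double: a "device" with fake_capacity bytes of free space. */
size_t fake_capacity = 0;

ssize_t fake_write(int fd, const void *buf, size_t len)
{
    size_t n;

    (void)fd;
    (void)buf;
    if (fake_capacity == 0) {
        errno = ENOSPC;         /* device is full */
        return -1;
    }
    n = len < fake_capacity ? len : fake_capacity;
    fake_capacity -= n;
    return (ssize_t)n;          /* may be a short write */
}
```

With fake_capacity set to 4 and a 10-byte buffer, the first call is a short write of 4 bytes and the retry fails with ENOSPC, which is what the caller then sees. Passing write from <unistd.h> as the first argument would exercise the same loop against a real file descriptor.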

Comment 7 Vivek Agarwal 2015-12-03 17:18:40 UTC

Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release that you asked us to review has reached End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.
