990397 – VMs are moving to paused state because file descriptors going bad

Bug 990397 - VMs are moving to paused state because file descriptors going bad

Summary: VMs are moving to paused state because file descriptors going bad

Keywords:
Status:	CLOSED DUPLICATE of bug 998352
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	replicate
Sub Component:
Version:	mainline
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Pranith Kumar K
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:	998352
Blocks:
TreeView+	depends on / blocked

Reported:	2013-07-31 06:29 UTC by Pranith Kumar K
Modified:	2014-07-11 18:18 UTC (History)
CC List:	6 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-07-11 18:18:47 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Logs which lead to paused vms. (56.37 KB, application/zip) 2013-07-31 06:30 UTC, Pranith Kumar K	no flags	Details
View All

Description Pranith Kumar K 2013-07-31 06:29:21 UTC

Description of problem:

This issue is reported by Samuli on bug 953887, copy pasting the discussion about this issue.

 Samuli Heinonen 2013-07-01 05:28:00 EDT

Are there some other patches needed for this? If I'm correct this patch should be included in GlusterFS 3.4 beta4.

I'm hitting this issue with oVirt 3.3 (nightly) and GlusterFS 3.4.0beta4 when doing rebalance. Self heal alone seems to work fine. My setup consists of two Gluster servers and two oVirt nodes.

Gluster volume has following configuration:
Volume Name: ovirtsas
Type: Distributed-Replicate
Volume ID: 238fabac-911f-4283-b98a-a44e18beb02f
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: boar2:/gluster/sas/brick1/ovirtsas
Brick2: boar1:/gluster/sas/brick1/ovirtsas
Brick3: boar1:/gluster/sas/brick3/ovirtsas
Brick4: boar2:/gluster/sas/brick3/ovirtsas
Options Reconfigured:
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 10
performance.quick-read: off
performance.io-cache: off
performance.stat-prefetch: off
network.remote-dio: enable
performance.client-io-threads: enable

Filesystem is mounted with following options:
/dev/mapper/sas--brick1-export1 on /gluster/sas/brick1 type xfs (rw,noatime,inode64,nobarrier)
/dev/mapper/sas--brick1-export3 on /gluster/sas/brick3 type xfs (rw,noatime,inode64,nobarrier)

Gluster volume has been configured as POSIX storage domain in oVirt with background-qlen=32768 mount option.

Steps to reproduce:
1. Create distributed-replicated Gluster volume
2. Create and start some virtual machines with oVirt
2. Add new bricks to Gluster volume
3. Start rebalance

Actual results:
- Virtual machines which image file is being rebalanced are paused because of unknown storage error. VMs have to be shutdown ungracefully before they can be started again. 
- On some cases mountpoint is showing "Transpoint endpoint is not connected" error and oVirt node has to be rebooted before oVirt is able to connect to it again.

Expected results:
Rebalance doesn't affect running virtual machines.

[reply] [−]
Private
Comment 4 Samuli Heinonen 2013-07-24 08:04:57 EDT

I'm still seeing this issue with GlusterFS 3.4 GA and oVirt 3.2.2 using fuse . All systems are running CentOS 6.4 and GlusterFS installed from http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/.

Is anyone else able to reproduce this?

[reply] [−]
Private
Comment 5 Pranith Kumar K 2013-07-24 09:40:19 EDT

Samuli,
     Could you provide the mount and rebalance and brick logs of this test run.
Just zip them and attach to this bug.
Let me take a look at the logs and we will take it from there.

Pranith.

[reply] [−]
Private
Comment 6 Samuli Heinonen 2013-07-24 11:51:05 EDT

Created attachment 777850 [details]
Gluster log files from servers and clients

Log file attached. Please let me know if you need anything else.

Amar Tumballi 2013-07-24 13:24:36 EDT
Status: MODIFIED → CLOSED
Fixed In Version: glusterfs-3.4.0
Resolution: --- → CURRENTRELEASE
Last Closed: 2013-07-24 13:24:36
[reply] [−]
Private
Comment 7 Samuli Heinonen 2013-07-26 02:48:49 EDT

I rested this without background-qlen mount option and issue exists also without it.

[reply] [−]
Private
Comment 8 Pranith Kumar K 2013-07-26 03:03:59 EDT

Samuli,
     I shall get back to you on this by tuesday.

Pranith.

[reply] [−]
Private
Comment 9 Pranith Kumar K 2013-07-31 02:25:04 EDT

Samuli,
   We found that there are logs as below which indicate the vms moving to paused state. We are yet to root cause what lead to file descriptors going bad. 

Root cause for this issue seems to be different than why this bug is opened. Will be opening a new bug and add the logs you posted to that bug. You will be put in CC list of the bug. Will post my results in 2-3 days. Appreciate your help in giving the logs.

Pranith.

[2013-07-24 09:51:36.882031] W [client-rpc-fops.c:873:client3_3_writev_cbk] 1-hiomo1-dev1-sas1-client-4: remote operation failed: Bad file descriptor
[2013-07-24 09:51:36.882135] W [client-rpc-fops.c:873:client3_3_writev_cbk] 1-hiomo1-dev1-sas1-client-5: remote operation failed: Bad file descriptor
[2013-07-24 09:51:52.031444] W [fuse-bridge.c:2127:fuse_writev_cbk] 0-glusterfs-fuse: 571386: WRITE => -1 (Bad file descriptor)



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Pranith Kumar K 2013-07-31 06:30:22 UTC

Created attachment 780939 [details]
Logs which lead to paused vms.

Comment 2 Caerie Houchins 2013-08-15 17:00:47 UTC

We are seeing this symptom as well with Ovirt 3.3 latest builds and glusterfs-3.4 on CentOS 6.4.

Any attempt at a volume rebalance leads to paused vm's that need to be force powered off then restarted to regain functionality.  They do not recover from paused status.

Will provide logs if requested.

This is using:
distributed replicated gluster volume
glusterfs-cli-3.4.0-8.el6.x86_64
glusterfs-3.4.0-8.el6.x86_64
glusterfs-fuse-3.4.0-8.el6.x86_64
glusterfs-server-3.4.0-8.el6.x86_64
glusterfs-libs-3.4.0-8.el6.x86_64
glusterfs-geo-replication-3.4.0-8.el6.x86_64

ovirt-engine-websocket-proxy-3.3.0-0.3.beta1.el6.noarch
ovirt-engine-setup-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-sdk-python-3.3.0.5-1.20130814.git988a6d3.el6.noarch
ovirt-engine-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-backend-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-image-uploader-3.3.0-0.2.master.20130715.git7674462.el6.noarch
ovirt-host-deploy-1.1.0-0.2.master.20130813.gitd813ae3.el6.noarch
ovirt-engine-userportal-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-restapi-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-cli-3.3.0.4-1.20130718.gite0f993f.el6.noarch
ovirt-engine-webadmin-portal-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-dbscripts-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-host-deploy-java-1.1.0-0.2.master.20130813.gitd813ae3.el6.noarch
ovirt-engine-tools-3.3.0-0.2.master.20130814135834.gitb3a5fe3.el6.noarch
ovirt-engine-lib-3.3.0-0.3.beta1.el6.noarch
ovirt-log-collector-3.3.0-0.2.master.20130723.git77829a0.el6.noarch
ovirt-iso-uploader-3.3.0-0.2.master.20130813.git0067d55.el6.noarch

Gluster volume info: 
 
Volume Name: vmstorage
Type: Distributed-Replicate
Volume ID: cb521898-912c-491f-adc7-3373b8b7d9a5
Status: Started
Number of Bricks: 15 x 2 = 30
Transport-type: tcp
Bricks:
Brick1: 192.168.12.108:/bricks/brick1
Brick2: 192.168.12.109:/bricks/brick1
Brick3: 192.168.12.108:/bricks/brick2
Brick4: 192.168.12.109:/bricks/brick2
Brick5: 192.168.12.110:/bricks/brick2
Brick6: 192.168.12.112:/bricks/brick2
Brick7: 192.168.12.108:/bricks/brick3
Brick8: 192.168.12.109:/bricks/brick3
Brick9: 192.168.12.110:/bricks/brick3
Brick10: 192.168.12.112:/bricks/brick3
Brick11: 192.168.12.108:/bricks/brick4
Brick12: 192.168.12.109:/bricks/brick4
Brick13: 192.168.12.110:/bricks/brick4
Brick14: 192.168.12.112:/bricks/brick4
Brick15: 192.168.12.108:/bricks/brick5
Brick16: 192.168.12.109:/bricks/brick5
Brick17: 192.168.12.110:/bricks/brick5
Brick18: 192.168.12.112:/bricks/brick5
Brick19: 192.168.12.108:/bricks/brick6
Brick20: 192.168.12.109:/bricks/brick6
Brick21: 192.168.12.110:/bricks/brick6
Brick22: 192.168.12.112:/bricks/brick6
Brick23: 192.168.12.108:/bricks/brick7
Brick24: 192.168.12.109:/bricks/brick7
Brick25: 192.168.12.110:/bricks/brick7
Brick26: 192.168.12.112:/bricks/brick7
Brick27: 192.168.12.108:/bricks/brick8
Brick28: 192.168.12.109:/bricks/brick8
Brick29: 192.168.12.110:/bricks/brick8
Brick30: 192.168.12.112:/bricks/brick8
Options Reconfigured:
performance.open-behind: off
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
server.allow-insecure: on
storage.owner-gid: 36
storage.owner-uid: 36

From Ovirt engine.log

2013-08-15 11:37:51,345 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-61) VM spwalkp1m2 52137001-54d0-47ba-b096-a84f1c9457e6 moved from Up --> Paused
2013-08-15 11:37:51,354 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-61) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM spwalkp1m2 has paused due to unknown storage error.
2013-08-15 11:40:04,307 INFO  [org.ovirt.engine.core.bll.RunVmCommand] (ajp--127.0.0.1-8702-1) [214acbfc] Lock Acquired to object EngineLock [exclusiveLocks= key: 52137001-54d0-47ba-b096-a84f1c9457e6 value: VM
, sharedLocks= ]

From gluster node nfsp14m2 bricks-brick8.log 

bricks/bricks-brick8.log:[2013-08-15 16:38:05.819315] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 749867008, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.819431] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 1019: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)
bricks/bricks-brick8.log:[2013-08-15 16:38:05.819530] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 352321536, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.819565] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 1020: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)
bricks/bricks-brick8.log:[2013-08-15 16:38:05.819629] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 352329728, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.819659] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 1021: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)

From gluster node nfsp15m2 bricks-brick8.log 

bricks/bricks-brick8.log:[2013-08-15 16:38:05.181461] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 749867008, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.181548] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 984: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)
bricks/bricks-brick8.log:[2013-08-15 16:38:05.181657] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 352321536, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.181691] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 985: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)
bricks/bricks-brick8.log:[2013-08-15 16:38:05.181756] E [posix.c:2135:posix_writev] 0-vmstorage-posix: write failed: offset 352329728, Bad file descriptor
bricks/bricks-brick8.log:[2013-08-15 16:38:05.181785] I [server-rpc-fops.c:1439:server_writev_cbk] 0-vmstorage-server: 986: WRITEV 0 (4ed70355-1ef5-4283-bfc9-d75399f08b0f) ==> (Bad file descriptor)

Comment 3 Pranith Kumar K 2013-09-06 07:12:49 UTC

COMMIT: http://review.gluster.org/5601 committed in master by Anand Avati (avati) 
------
commit 41fa8da33435b8ba05a7eddbccddd96cde1aa762
Author: Raghavendra Bhat <raghavendra>
Date:   Tue Aug 13 19:47:01 2013 +0530

    mount/fuse: save the basefd flags in the new fd
    
    Upon graph switch, the basefd's flags were not saved in the new fd created
    for the new graph upon which all the further requests for the open file
    would come. Thus posix was treating the fd as a read-only fd and was
    denying the write on the fds.
    
    Change-Id: I781b62b376a85d1a938c091559270c3f242f1a2a
    BUG: 998352
    Signed-off-by: Raghavendra Bhat <raghavendra>
    Reviewed-on: http://review.gluster.org/5601
    Reviewed-by: Amar Tumballi <amarts>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Hi,
     the patch above, which fixes the bug 998352 which has similar symptoms should address this issue as well. Moving this bug to MODIFIED State for now. Please feel free to re-open if the issue still persists.

Pranith.

Comment 4 Niels de Vos 2014-07-11 18:18:47 UTC


*** This bug has been marked as a duplicate of bug 998352 ***

Note You need to log in before you can comment on or make changes to this bug.