Bug 1627617 - SAS job aborts complaining about file doesn't exist
Summary: SAS job aborts complaining about file doesn't exist
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: fuse
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ---
: RHGS 3.4.z Batch Update 1
Assignee: Raghavendra G
QA Contact: nchilaka
URL:
Whiteboard:
: 1630735 (view as bug list)
Depends On:
Blocks: 1627620
TreeView+ depends on / blocked
 
Reported: 2018-09-11 04:32 UTC by Raghavendra G
Modified: 2018-10-31 08:47 UTC (History)
9 users (show)

Fixed In Version: glusterfs-3.12.2-20
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1627620 (view as bug list)
Environment:
Last Closed: 2018-10-31 08:46:17 UTC


Attachments (Terms of Use)


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:3432 None None None 2018-10-31 08:47:53 UTC

Description Raghavendra G 2018-09-11 04:32:15 UTC
Description of problem:

10SEP2018_18_47_24_refresh_mp.log..di_wrap_mg_process_unit.9.log:ERROR: A lock is not available for P_BF.PARTITION_416_594_1_1.DATA.
10SEP2018_18_47_24_refresh_mp.log..di_wrap_mg_process_unit.9.log:ERROR: File P_BF.PARTITION_416_594_1_1.DATA does not exist.
10SEP2018_18_47_24_refresh_mp.log..di_wrap_mg_process_unit.9.log:ERROR: File P_BF.PARTITION_416_594_1_1.DATA does not exist.

Note that the actual file created on glusterfs is /sasdata/bulked/base_forecast/partition_416_594_1_1.sas7bdat as SAS masks file name


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 Raghavendra G 2018-09-11 04:47:38 UTC

From logs, I can see that /sasdata/bulked/base_forecast/partition_416_594_1_1.sas7bdat is a destination of rename:

sasdata.log:[2018-09-09 02:40:10.772241] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (84ea03de-3032-44ff-81fc-8cd47cf14694) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null)) (hash=sasdata-replicate-1/cache=<nul>) 
sasdata.log:[2018-09-09 02:40:10.861406] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (cbb8c3dd-9317-4398-b92f-39146ab26bed) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat (84ea03de-3032-44ff-81fc-8cd47cf14694) (hash=sasdata-replicate-1/cache=sasdata-replicate-0) 
sasdata.log:[2018-09-10 13:43:16.713796] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (b65e9f74-3151-40ae-b87b-2450a95796dd) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null)) (hash=sasdata-replicate-1/cache=<nul>) 
sasdata.log:[2018-09-10 13:43:16.961512] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (c36b2bc3-eebd-413f-91c4-894caa18e89c) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat (b65e9f74-3151-40ae-b87b-2450a95796dd) (hash=sasdata-replicate-1/cache=sasdata-replicate-0) 

So, what's causing SAS to error out is lack of rename atomicity on glusterfs. Note that as per rename atomicity rules, operations on destination should never fail. However, here they are failing. So, its a bug in glusterfs.

Normally path based operations happen in two parts:
* path is resolved into a unique handle. This is done through lookups
* Actual operations like open happen on this handle

Now when rename overwrites the destination, the inode associated with destination changes, so the previous handle/inode returned by lookup on dst before rename are no longer present in fs. But, when open happens on old handle, it fails.

[root@rhs-client22 glusterfs]# grep OPEN sasdata.log*
sasdata.log:[2018-09-09 01:34:14.671548] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51854: OPEN a9d26828-24fe-4da9-b3a0-d1693bf21f34 resolution failed
sasdata.log:[2018-09-09 01:34:14.680534] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51873: OPEN e129abc7-7762-46e5-a704-99d7ea586c1e resolution failed
sasdata.log:[2018-09-09 01:57:37.204277] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 563046: OPEN 0eb6df58-5b25-4f4e-aabc-df1a0f615b93 resolution failed
sasdata.log:[2018-09-09 02:40:10.820746] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967246: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:10.937086] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967646: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.551310] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972175: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.604315] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972201: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.687723] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972240: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:41:27.403833] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982942: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.405453] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982946: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.407134] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982952: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-10 13:43:17.007100] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1011630: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.980324] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081799: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.982690] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081806: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.617909] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117549: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.748022] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117585: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed

VFS provides a way out of this by a mechanism of retrying (exactly once) the lookup and operation like open on receving an error with errno ESTALE. What might've caused the issue here is that the error returned by Glusterfs is ENOENT instead of ESTALE (note the error reported by SAS, it says "P_BF.PARTITION_416_594_1_1.DATA does not exist" not "stale file handle").

We've evidence for that from fuse-dump of /sasdata,
2018-09-10T19:19:45.73953758+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:1117585 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29342 Padding:0} {Flags:32770 Unused:0}
2018-09-10T19:19:45.748056325+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:1117585} ""
2018-09-10T19:19:45.617027131+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:1117549 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29330 Padding:0} {Flags:32768 Unused:0} 
2018-09-10T19:19:45.617932073+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:1117549} "" 

Note the error returned in above two calls its -2 (ENOENT) and not ESTALE. This would prevent kernel from retrying.

Note that earlier as part of readdirplus, partition_416_594_1_1.sas7bdat resolved to Nodeid:140296383928016
[{Nodeid:140296383928016 Generation:0 EntryValid:1 AttrValid:1 EntryValidNsec:0 AttrValidNsec:0 Attr:{Ino:13293258653971224285 Size:196608 Blocks:264 Atime:1536586996 Mtime:1536586997 Ctime:1536586997 Atimensec:836653129 Mtimensec:125644779 Ctimensec:126644750 Mode:33188 Nlink:1 Uid:0 Gid:0 Rdev:0 Blksize:131072 Padding:0}} {Ino:13293258653971224285 Off:1441 Namelen:30 Type:8 Name:[]} partition_416_594_1_1.sas7bdat]

Earlier, I had submitted a patch to convert ENOENT errors to ESTALE,
https://review.gluster.org/#/c/glusterfs/+/18521/

But this patch had missed out the codepath of open failing when inode resolution fails. As per the logs above, inode resolution in fact failed:
[root@rhs-client22 glusterfs]# grep OPEN sasdata.log*
sasdata.log:[2018-09-09 01:34:14.671548] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51854: OPEN a9d26828-24fe-4da9-b3a0-d1693bf21f34 resolution failed
sasdata.log:[2018-09-09 01:34:14.680534] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51873: OPEN e129abc7-7762-46e5-a704-99d7ea586c1e resolution failed
sasdata.log:[2018-09-09 01:57:37.204277] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 563046: OPEN 0eb6df58-5b25-4f4e-aabc-df1a0f615b93 resolution failed
sasdata.log:[2018-09-09 02:40:10.820746] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967246: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:10.937086] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967646: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.551310] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972175: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.604315] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972201: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.687723] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972240: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:41:27.403833] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982942: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.405453] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982946: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.407134] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982952: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-10 13:43:17.007100] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1011630: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.980324] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081799: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.982690] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081806: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.617909] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117549: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.748022] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117585: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed

So, patching up those codepaths to convert ENOENT to ESTALE will fix the issue. Also, its recommended to make sure all inode based operations (like GETATTR, SETATTR etc) never fail with ENOENT, but instead fail with ESTALE.

Comment 3 Raghavendra G 2018-09-11 05:06:04 UTC
https://review.gluster.org/#/c/glusterfs/+/21146

Comment 11 Raghavendra G 2018-10-09 16:15:18 UTC
[{Nodeid:139984058438544 Generation:0 EntryValid:1 AttrValid:1 EntryValidNsec:0 AttrValidNsec:0 Attr:{Ino:9581958107213933434 Size:29949952 Blocks:58376 Atime:1538
986217 Mtime:1538986217 Ctime:1538986217 Atimensec:306568917 Mtimensec:782562917 Ctimensec:792562791 Mode:33188 Nlink:1 Uid:0 Gid:0 Rdev:0 Blksize:131072 Padding:0}} {Ino:9581958107213933434 Off:24207 Namelen:25
 Type:8 Name:[]} partition_44_595.sas7bdat]

2018-10-08T13:41:21.757563122+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:13750097 Nodeid:139984058438544 Uid:0 Gid:0 Pid:18718 Padding:0} {Flags:32768 Unused:0} 
2018-10-08T13:41:21.799384956+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:13750097} "" 

open on partition_44_595.sas7bdat returned ENOENT. This is the cause for failure. I'll investigate why the error was ENOENT instead of ESTALE and post a fix shortly.

Comment 12 Raghavendra G 2018-10-09 16:30:39 UTC
[2018-10-08 08:11:21.790678] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13750097: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:23.607324] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13751444: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:24.655378] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13752230: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:25.678211] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13752750: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:26.682015] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13753576: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:27.688885] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13754078: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed

Looks like https://code.engineering.redhat.com/gerrit/#/c/150109/ is not present in this build.


[root@rhs-client1 glusterfs]# rpm -qa | grep -i glusterfs
glusterfs-libs-3.12.2-18.el7rhgs.x86_64
glusterfs-3.12.2-18.el7rhgs.x86_64
glusterfs-fuse-3.12.2-18.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-18.el7rhgs.x86_64

Isn't 3.12.2-18 corresponds to rhgs-3.4.0? I am pretty sure that this is not a rhgs-3.4.1 build

Comment 16 nchilaka 2018-10-18 13:22:28 UTC
Failqa'ing this bug as we are still seeing the issue, after discussing with Dev.

Comment 17 Raghavendra G 2018-10-20 01:12:50 UTC
(In reply to nchilaka from comment #16)
> Failqa'ing this bug as we are still seeing the issue, after discussing with
> Dev.

Though I am yet to find evidence for this, one likely hypothesis for lock failure would be a stat done before lk might've failed with ENOENT. I'll be taking a look at fusedumps of previous runs to find whether application has this I/O pattern.

Comment 18 Raghavendra G 2018-10-22 04:54:26 UTC
*** Bug 1630735 has been marked as a duplicate of this bug. ***

Comment 20 Raghavendra G 2018-10-23 04:30:43 UTC
(In reply to Raghavendra G from comment #17)
> (In reply to nchilaka from comment #16)
> > Failqa'ing this bug as we are still seeing the issue, after discussing with
> > Dev.
> 
> Though I am yet to find evidence for this, one likely hypothesis for lock
> failure would be a stat done before lk might've failed with ENOENT. I'll be
> taking a look at fusedumps of previous runs to find whether application has
> this I/O pattern.

I did check one of the older fuse-dumps. The dominant pattern is,

* lookup/readdirplus resolves the path to a nodeid.
* open is done on a nodeid
* getlk is done on fh got by open
* setlk is done on fh got by open

So, I didn't find any evidence to support the above hypothesis. Nevertheless patch https://review.gluster.org/#/c/glusterfs/+/21147/ is merged upstream as it fixes a genuine issue.

Comment 25 nchilaka 2018-10-24 12:26:43 UTC
raised a new bug to track part of issue which is yet to be fixed 1642442 - SAS job aborts complaining about file doesn't exist

Comment 28 errata-xmlrpc 2018-10-31 08:46:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3432


Note You need to log in before you can comment on or make changes to this bug.