Bug 1627617

| Summary: | SAS job aborts complaining about file doesn't exist | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Raghavendra G <rgowdapp> |
| Component: | fuse | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED ERRATA | QA Contact: | Nag Pavan Chilakam <nchilaka> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.4 | CC: | apaladug, bmarson, nchilaka, rgowdapp, rhs-bugs, sanandpa, sankarshan, sheggodu, storage-qa-internal |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.4.z Batch Update 1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.12.2-20 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1627620 (view as bug list) | Environment: | |
| Last Closed: | 2018-10-31 08:46:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1627620 | | |
Description

Raghavendra G, 2018-09-11 04:32:15 UTC
From the logs, I can see that /sasdata/bulked/base_forecast/partition_416_594_1_1.sas7bdat is the destination of a rename:

```
sasdata.log:[2018-09-09 02:40:10.772241] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (84ea03de-3032-44ff-81fc-8cd47cf14694) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null)) (hash=sasdata-replicate-1/cache=<nul>)
sasdata.log:[2018-09-09 02:40:10.861406] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (cbb8c3dd-9317-4398-b92f-39146ab26bed) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat (84ea03de-3032-44ff-81fc-8cd47cf14694) (hash=sasdata-replicate-1/cache=sasdata-replicate-0)
sasdata.log:[2018-09-10 13:43:16.713796] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (b65e9f74-3151-40ae-b87b-2450a95796dd) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat ((null)) (hash=sasdata-replicate-1/cache=<nul>)
sasdata.log:[2018-09-10 13:43:16.961512] I [MSGID: 109066] [dht-rename.c:2042:dht_rename] 0-sasdata-dht: renaming /bulked/base_forecast/partition_416_594_1_1.sas7bdat.lck (c36b2bc3-eebd-413f-91c4-894caa18e89c) (hash=sasdata-replicate-0/cache=sasdata-replicate-0) => /bulked/base_forecast/partition_416_594_1_1.sas7bdat (b65e9f74-3151-40ae-b87b-2450a95796dd) (hash=sasdata-replicate-1/cache=sasdata-replicate-0)
```

So what causes SAS to error out is the lack of rename atomicity in glusterfs. Per rename-atomicity rules, operations on the destination should never fail; here they are failing, so this is a bug in glusterfs.

Path-based operations normally happen in two parts:

* The path is resolved into a unique handle. This is done through lookups.
* The actual operation, such as open, happens on this handle.

Now, when a rename overwrites the destination, the inode associated with the destination changes, so the handle/inode returned by a lookup done on the destination before the rename is no longer present in the filesystem. An open on the old handle therefore fails:

```
[root@rhs-client22 glusterfs]# grep OPEN sasdata.log*
sasdata.log:[2018-09-09 01:34:14.671548] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51854: OPEN a9d26828-24fe-4da9-b3a0-d1693bf21f34 resolution failed
sasdata.log:[2018-09-09 01:34:14.680534] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 51873: OPEN e129abc7-7762-46e5-a704-99d7ea586c1e resolution failed
sasdata.log:[2018-09-09 01:57:37.204277] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 563046: OPEN 0eb6df58-5b25-4f4e-aabc-df1a0f615b93 resolution failed
sasdata.log:[2018-09-09 02:40:10.820746] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967246: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:10.937086] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 967646: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.551310] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972175: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.604315] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972201: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:40:16.687723] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 972240: OPEN d6e38db6-9fae-49e8-847e-bccdf15244a5 resolution failed
sasdata.log:[2018-09-09 02:41:27.403833] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982942: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.405453] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982946: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-09 02:41:27.407134] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 982952: OPEN 1c84e06c-8551-4f18-9777-2a1730919bcf resolution failed
sasdata.log:[2018-09-10 13:43:17.007100] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1011630: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.980324] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081799: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:47:50.982690] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1081806: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.617909] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117549: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
sasdata.log:[2018-09-10 13:49:45.748022] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 1117585: OPEN b65e9f74-3151-40ae-b87b-2450a95796dd resolution failed
```

VFS provides a way out of this: on receiving an error with errno ESTALE, it retries the lookup and the operation (such as open) exactly once. What likely caused the issue here is that the error returned by glusterfs is ENOENT instead of ESTALE (note the error reported by SAS: it says "P_BF.PARTITION_416_594_1_1.DATA does not exist", not "stale file handle").
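The retry-on-ESTALE rule described above can be illustrated with a small simulation. All names here (`FakeFS`, `vfs_open`) are hypothetical; the real logic lives in the kernel VFS and in glusterfs's fuse-bridge. The sketch shows why ESTALE lets the kernel recover transparently while ENOENT surfaces straight to the application:

```python
import errno

class FakeFS:
    """Toy stand-in for a FUSE filesystem in which a rename has just
    replaced the destination: nodeid 1 is gone, the path maps to nodeid 2.
    (Hypothetical model, not the actual glusterfs code.)"""

    def __init__(self, stale_errno):
        self.stale_errno = stale_errno  # errno returned for a dead nodeid
        self.paths = {"f": 2}           # path -> current nodeid
        self.live = {2}                 # nodeid 1 was overwritten by rename

    def lookup(self, path):
        return self.paths[path]

    def open(self, nodeid):
        if nodeid not in self.live:
            raise OSError(self.stale_errno, "dead nodeid")
        return "fh:%d" % nodeid

def vfs_open(fs, path, cached_nodeid):
    """Mimic the VFS rule: on ESTALE, redo the lookup and retry exactly
    once; any other error (including ENOENT) reaches the application."""
    try:
        return fs.open(cached_nodeid)
    except OSError as e:
        if e.errno == errno.ESTALE:
            return fs.open(fs.lookup(path))
        raise

# Correct behaviour: ESTALE triggers one transparent retry.
print(vfs_open(FakeFS(errno.ESTALE), "f", cached_nodeid=1))   # fh:2

# Buggy behaviour: ENOENT is final, so the application sees
# "file doesn't exist" -- exactly what SAS reported.
try:
    vfs_open(FakeFS(errno.ENOENT), "f", cached_nodeid=1)
except OSError as e:
    print(errno.errorcode[e.errno])                           # ENOENT
```

The design point is that ENOENT is a statement about a *path*, while an open on a cached nodeid is a *handle*-based operation; only ESTALE tells the kernel the handle (not the file) is the problem.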
We have evidence for that from the fuse-dump of /sasdata:

```
2018-09-10T19:19:45.73953758+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:1117585 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29342 Padding:0} {Flags:32770 Unused:0}
2018-09-10T19:19:45.748056325+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:1117585} ""
2018-09-10T19:19:45.617027131+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:1117549 Nodeid:140296383928016 Uid:0 Gid:0 Pid:29330 Padding:0} {Flags:32768 Unused:0}
2018-09-10T19:19:45.617932073+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:1117549} ""
```

Note the error returned in the two calls above: it is -2 (ENOENT), not ESTALE. This prevents the kernel from retrying. Note also that earlier, as part of readdirplus, partition_416_594_1_1.sas7bdat resolved to Nodeid:140296383928016:

```
[{Nodeid:140296383928016 Generation:0 EntryValid:1 AttrValid:1 EntryValidNsec:0 AttrValidNsec:0 Attr:{Ino:13293258653971224285 Size:196608 Blocks:264 Atime:1536586996 Mtime:1536586997 Ctime:1536586997 Atimensec:836653129 Mtimensec:125644779 Ctimensec:126644750 Mode:33188 Nlink:1 Uid:0 Gid:0 Rdev:0 Blksize:131072 Padding:0}} {Ino:13293258653971224285 Off:1441 Namelen:30 Type:8 Name:[]} partition_416_594_1_1.sas7bdat]
```

I had earlier submitted a patch to convert ENOENT errors to ESTALE, https://review.gluster.org/#/c/glusterfs/+/18521/, but that patch missed the codepath where open fails because inode resolution fails.
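The `Error:` field in a FUSE reply header carries a negated errno, so the dump's `Error:-2` can be decoded mechanically. A minimal helper (hypothetical name `decode_fuse_error`) makes the ENOENT-vs-ESTALE distinction in the dump explicit:

```python
import errno

def decode_fuse_error(err):
    """Map the negated errno carried in a FUSE reply header's Error field
    back to its symbolic name; Error:0 means success."""
    if err == 0:
        return "OK"
    return errno.errorcode.get(-err, "errno %d" % -err)

# The replies captured in the fuse-dump carry Error:-2:
print(decode_fuse_error(-2))              # ENOENT -- final, no retry
print(decode_fuse_error(-errno.ESTALE))   # ESTALE -- would permit one retry
```

(The numeric value of ESTALE is platform-dependent, 116 on Linux, which is why the symbolic `errno.ESTALE` is used rather than a literal.)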
As per the logs above, inode resolution did in fact fail (see the `grep OPEN sasdata.log*` output quoted earlier; the output here was identical). So patching those codepaths to convert ENOENT to ESTALE will fix the issue. It is also recommended to ensure that all inode-based operations (GETATTR, SETATTR, etc.) never fail with ENOENT, but instead fail with ESTALE.

```
[{Nodeid:139984058438544 Generation:0 EntryValid:1 AttrValid:1 EntryValidNsec:0 AttrValidNsec:0 Attr:{Ino:9581958107213933434 Size:29949952 Blocks:58376 Atime:1538986217 Mtime:1538986217 Ctime:1538986217 Atimensec:306568917 Mtimensec:782562917 Ctimensec:792562791 Mode:33188 Nlink:1 Uid:0 Gid:0 Rdev:0 Blksize:131072 Padding:0}} {Ino:9581958107213933434 Off:24207 Namelen:25 Type:8 Name:[]} partition_44_595.sas7bdat]
2018-10-08T13:41:21.757563122+05:30 "GLUSTER\xf5" OPEN {Len:48 Opcode:14 Unique:13750097 Nodeid:139984058438544 Uid:0 Gid:0 Pid:18718 Padding:0} {Flags:32768 Unused:0}
2018-10-08T13:41:21.799384956+05:30 "GLUSTER\xf5" {Len:16 Error:-2 Unique:13750097} ""
```

The open on partition_44_595.sas7bdat returned ENOENT. This is the cause of the failure. I'll investigate why the error was ENOENT instead of ESTALE and post a fix shortly.
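The recommendation above, that inode-based operations should fail with ESTALE rather than ENOENT, can be sketched as a small error-translating wrapper. This is a hypothetical illustration in Python (the actual fix lives in glusterfs's fuse-bridge.c, and `open_by_nodeid` is an invented stand-in for a handle-based fop whose inode resolution fails):

```python
import errno
import functools

def estale_on_missing_inode(fop):
    """For handle/inode-based operations, a missing inode means the
    *handle* is stale, not that the path is absent: translate ENOENT
    to ESTALE so the kernel VFS knows it may retry a fresh lookup."""
    @functools.wraps(fop)
    def wrapper(*args, **kwargs):
        try:
            return fop(*args, **kwargs)
        except OSError as e:
            if e.errno == errno.ENOENT:
                raise OSError(errno.ESTALE, "stale file handle") from e
            raise
    return wrapper

@estale_on_missing_inode
def open_by_nodeid(nodeid):
    # Inode resolution failed: naive code would surface ENOENT here,
    # which is exactly the bug described in this report.
    raise OSError(errno.ENOENT, "no inode for nodeid %d" % nodeid)

try:
    open_by_nodeid(42)
except OSError as e:
    print(errno.errorcode[e.errno])   # ESTALE
```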
```
[2018-10-08 08:11:21.790678] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13750097: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:23.607324] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13751444: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:24.655378] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13752230: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:25.678211] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13752750: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:26.682015] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13753576: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
[2018-10-08 08:11:27.688885] E [fuse-bridge.c:2236:fuse_open_resume] 0-glusterfs-fuse: 13754078: OPEN f5d1f79d-6f3d-494c-84f9-f42191292b7a resolution failed
```

Looks like https://code.engineering.redhat.com/gerrit/#/c/150109/ is not present in this build.

```
[root@rhs-client1 glusterfs]# rpm -qa | grep -i glusterfs
glusterfs-libs-3.12.2-18.el7rhgs.x86_64
glusterfs-3.12.2-18.el7rhgs.x86_64
glusterfs-fuse-3.12.2-18.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-18.el7rhgs.x86_64
```

Doesn't 3.12.2-18 correspond to rhgs-3.4.0? I am pretty sure that this is not an rhgs-3.4.1 build.

Failqa'ing this bug as we are still seeing the issue, after discussing with Dev.

(In reply to nchilaka from comment #16)
> Failqa'ing this bug as we are still seeing the issue, after discussing with Dev.

Though I am yet to find evidence for this, one likely hypothesis for the lock failure is that a stat done before lk might have failed with ENOENT. I'll take a look at fuse-dumps of previous runs to find out whether the application has this I/O pattern.

*** Bug 1630735 has been marked as a duplicate of this bug. ***

(In reply to Raghavendra G from comment #17)
> Though I am yet to find evidence for this, one likely hypothesis for the lock failure is that a stat done before lk might have failed with ENOENT. I'll take a look at fuse-dumps of previous runs to find out whether the application has this I/O pattern.

I did check one of the older fuse-dumps. The dominant pattern is:

* lookup/readdirplus resolves the path to a nodeid.
* open is done on the nodeid.
* getlk is done on the fh got by open.
* setlk is done on the fh got by open.

So I didn't find any evidence to support the above hypothesis. Nevertheless, patch https://review.gluster.org/#/c/glusterfs/+/21147/ was merged upstream, as it fixes a genuine issue.

Raised a new bug to track the part of the issue that is yet to be fixed: 1642442 - SAS job aborts complaining about file doesn't exist.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3432
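The lookup → open → getlk/setlk pattern described above, and the rename-over-destination semantics at the root of this bug, can be reproduced on any local POSIX filesystem. The sketch below uses hypothetical file names mirroring the .lck → .sas7bdat rename from the dht logs, and shows why operations on an already-obtained fd must keep working after the rename:

```python
import fcntl
import os
import tempfile

# Set up a destination file, then follow the application's pattern:
# resolve the path, open it, and lock the resulting fd.
d = tempfile.mkdtemp()
dst = os.path.join(d, "partition.sas7bdat")      # hypothetical names
with open(dst, "w") as f:
    f.write("old")

fd = os.open(dst, os.O_RDWR)    # lookup + open: the fd pins the inode
fcntl.lockf(fd, fcntl.LOCK_EX)  # setlk on the fh obtained from open

# The .lck -> .sas7bdat rename replaces the path->inode binding ...
src = os.path.join(d, "partition.sas7bdat.lck")
with open(src, "w") as f:
    f.write("new")
os.rename(src, dst)

# ... but per POSIX rename atomicity, the held fd (and its lock) still
# refer to the old inode, while a fresh open of the path sees the new one.
print(os.pread(fd, 3, 0).decode())   # old
with open(dst) as f:
    print(f.read())                  # new

fcntl.lockf(fd, fcntl.LOCK_UN)
os.close(fd)
```

This is exactly the guarantee the bug violated over FUSE: only an open done *after* the rename with a pre-rename cached handle should need the ESTALE-driven retry; fd-based operations should never fail at all.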