Bug 1331260
| Field | Value |
| --- | --- |
| Summary | Swift: GET on an object manifest with a certain byte range fails to show the content of the file |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | glusterfs |
| Version | rhgs-3.1 |
| Hardware | Unspecified |
| OS | Unspecified |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Reporter | surabhi <sbhaloth> |
| Assignee | Prashanth Pai <ppai> |
| QA Contact | surabhi <sbhaloth> |
| CC | asrivast, nlevinki, ppai, rcyriac, rgowdapp, rhinduja, rhs-bugs, sankarshan, sashinde, vbellur |
| Keywords | ZStream |
| Target Milestone | --- |
| Target Release | RHGS 3.1.3 |
| Fixed In Version | glusterfs-3.7.9-5 |
| Doc Type | Bug Fix |
| Clones | 1333023 (view as bug list) |
| Bug Depends On | 1333023, 1334700 |
| Bug Blocks | 1311817 |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Last Closed | 2016-06-23 05:20:06 UTC |
| Attachments | tcpdump (attachment 1152857) |
Description (surabhi, 2016-04-28 07:05:22 UTC)
Reset the component to glusterfs from gluster-swift, as it's a bug in the glusterfs client.

RCA so far: this is roughly the series of operations performed by the gluster-swift object server:

```
[pid 2937] open("/mnt/gluster-object/one/c1/manifest", O_RDONLY|O_CLOEXEC) = 10
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = 216
[pid 2937] fgetxattr(10, "user.swift.metadata", "{"Content-Length":"0","X-Object-Manifest":"c1/segments1/","ETag":"d41d8cd98f00b204e9800998ecf8427e","X-Timestamp":"1462164335.27522","X-Object-Type":"file","X-Type":"Object","Content-Type":"application/octet-stream"}", 216) = 216
[pid 2937] close(10) = 0
[pid 2937] close(8) = 0
[pid 2937] open("/mnt/gluster-object/one/c1/segments1/0", O_RDONLY|O_CLOEXEC) = 10
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = 189
[pid 2937] fgetxattr(10, "user.swift.metadata", "{"Content-Length":"3","ETag":"f97c5d29941bfb1b2fdab0874906ab82","X-Timestamp":"1461849857.48425","X-Object-Type":"file","X-Type":"Object","Content-Type":"application/x-www-form-urlencoded"}", 189) = 189
[pid 2937] close(10) = 0
[pid 2937] close(8) = 0
[pid 2937] open("/mnt/gluster-object/one/c1/segments1/1", O_RDONLY|O_CLOEXEC) = 10
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = 189
[pid 2937] fgetxattr(10, "user.swift.metadata", "{"Content-Length":"3","ETag":"b8a9f715dbb64fd5c56e7783c6820a61","X-Timestamp":"1461849867.95959","X-Object-Type":"file","X-Type":"Object","Content-Type":"application/x-www-form-urlencoded"}", 189) = 189
[pid 2937] close(10) = 0
[pid 2937] close(8) = 0
[pid 2937] open("/mnt/gluster-object/one/c1/manifest", O_RDONLY|O_CLOEXEC) = 10
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = -1 ENODATA (No data available)
[pid 2937] close(13) = 0
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = -1 ENODATA (No data available)
[pid 2937] fsetxattr(10, "user.swift.metadata", "{"Content-Length":0,"ETag":"d41d8cd98f00b204e9800998ecf8427e","X-Timestamp":"1462164335.35395","X-Object-Type":"file","X-Type":"Object","Content-Type":"application/octet-stream"}", 178, 0) = 0
[pid 2937] fgetxattr(10, "user.swift.metadata", 0x0, 0) = 178
[pid 2937] fgetxattr(10, "user.swift.metadata", "{"Content-Length":0,"ETag":"d41d8cd98f00b204e9800998ecf8427e","X-Timestamp":"1462164335.35395","X-Object-Type":"file","X-Type":"Object","Content-Type":"application/octet-stream"}", 178) = 178
[pid 2937] close(10) = 0
[pid 2937] close(8) = 0
```

As can be seen in the strace output above, in the second set of fgetxattr() operations the glusterfs client returns ENODATA to gluster-swift. This should never happen, as the xattr named user.swift.metadata always exists on the object on the backend bricks.

To summarize: glusterfs wrongly reports to gluster-swift that the xattr does not exist even though it exists on the backend. This triggers gluster-swift to re-create the xattr, overwriting the existing xattr and causing metadata loss. (A sketch of this read-or-create pattern follows below.)

The issue cannot be reproduced when either one of these volume options is turned off (the volume here is named "one"):

```
# gluster volume set one cache-swift-metadata off
```

OR

```
# gluster volume set one readdir-ahead off
```

Had a look at the tcpdump in Wireshark, and this is probably the RCA; it will have to be confirmed with further testing later:

1. md-cache caches the user.swift.metadata xattr on either the lookup_cbk or the getxattr_cbk path.
2. md-cache returns the xattr from its cache the first time gluster-swift asks for it.
3. When gluster-swift asks for the xattr a second time after a brief period, md-cache finds that it is not in its cache and reports back to gluster-swift with ENODATA.
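The read-or-create pattern that makes a spurious ENODATA destructive is easy to see in isolation. Below is a minimal Python sketch of it; the helper names and the fallback metadata are illustrative, not gluster-swift's actual code, but the control flow matches the strace above: a probe fgetxattr that fails with ENODATA makes the object server write fresh metadata.

```python
import errno
import json
import os

METADATA_KEY = "user.swift.metadata"  # xattr key used by gluster-swift

def read_metadata(path):
    """Return the object's Swift metadata dict, or None if the xattr is absent."""
    try:
        return json.loads(os.getxattr(path, METADATA_KEY))
    except OSError as err:
        if err.errno == errno.ENODATA:  # "no such xattr" on Linux
            return None
        raise

def ensure_metadata(path, fallback_meta):
    """Read-or-create, as the object server does (simplified sketch).

    If the filesystem client spuriously reports ENODATA, the fallback
    branch overwrites the real metadata on the bricks; keys such as
    X-Object-Manifest are lost, which is what broke ranged GETs here.
    """
    meta = read_metadata(path)
    if meta is None:
        os.setxattr(path, METADATA_KEY, json.dumps(fallback_meta).encode())
        meta = fallback_meta
    return meta
```

With a healthy client, the fallback branch is never taken for an existing object; in this bug, md-cache's spurious ENODATA sends the server down that branch on the second access to the manifest.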
Between operations 2 and 3 in the hypothesis above, readdirp_cbk would have updated the cache for the file, effectively resetting the xattr cache. But during this reset, the concerned xattr isn't present:

```
#0  mdc_inode_xatt_set (this=<value optimized out>, inode=<value optimized out>, dict=<value optimized out>) at md-cache.c:598
#1  0x00007f6f26ac975e in mdc_readdirp_cbk (frame=0x7f6f3308c678, cookie=<value optimized out>, this=0x7f6f20010630, op_ret=4, op_errno=2, entries=0x7f6f251dc850, xdata=0x7f6f32aa7b14) at md-cache.c:1998
#2  0x00007f6f26edc814 in qr_readdirp_cbk (frame=0x7f6f3308ca80, cookie=<value optimized out>, this=<value optimized out>, op_ret=4, op_errno=2, entries=0x7f6f251dc850, xdata=0x7f6f32aa7b14) at quick-read.c:520

$28 = (struct md_cache *) 0x7f6f20078090
$29 = {md_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', write = 1 '\001', exec = 1 '\001'}, group = {read = 1 '\001', write = 0 '\000', exec = 1 '\001'}, other = {read = 1 '\001', write = 0 '\000', exec = 1 '\001'}}, md_nlink = 1, md_uid = 0, md_gid = 0, md_atime = 1461997370, md_atime_nsec = 442368834, md_mtime = 1461997370, md_mtime_nsec = 442368834, md_ctime = 1461997370, md_ctime_nsec = 476368690, md_rdev = 0, md_size = 0, md_blocks = 0, xattr = 0x0, linkname = 0x0, ia_time = 1461997385, xa_time = 1461997385, lock = 1}

Breakpoint 1, mdc_inode_xatt_get (this=<value optimized out>, inode=<value optimized out>, dict=0x7f6f251dcac8) at md-cache.c:675
675             if (!mdc->xattr)
#0  mdc_inode_xatt_get (this=<value optimized out>, inode=<value optimized out>, dict=0x7f6f251dcac8) at md-cache.c:675
#1  0x00007f6f26aca4c1 in mdc_getxattr (frame=0x7f6f3308c9d4, this=0x7f6f20010630, loc=0x7f6f1c0493e0, key=0x7f6f1c048350 "user.swift.metadata", xdata=0x0) at md-cache.c:1808
#2  0x00007f6f268ac681 in io_stats_getxattr (frame=0x7f6f3308cb2c, this=0x7f6f200119e0, loc=0x7f6f1c0493e0, name=0x7f6f1c048350 "user.swift.metadata", xdata=0x0) at io-stats.c:2289

$97 = (struct md_cache *) 0x7f6f20078090
$98 = {md_prot = {suid = 0 '\000', sgid = 0 '\000', sticky = 0 '\000', owner = {read = 1 '\001', write = 1 '\001', exec = 1 '\001'}, group = {read = 1 '\001', write = 0 '\000', exec = 1 '\001'}, other = {read = 1 '\001', write = 0 '\000', exec = 1 '\001'}}, md_nlink = 1, md_uid = 0, md_gid = 0, md_atime = 1461997370, md_atime_nsec = 442368834, md_mtime = 1461997370, md_mtime_nsec = 442368834, md_ctime = 1461997370, md_ctime_nsec = 476368690, md_rdev = 0, md_size = 0, md_blocks = 0, xattr = 0x0, linkname = 0x0, ia_time = 1461997385, xa_time = 1461997385, lock = 0}
```

Note xattr = 0x0 in both dumps: the per-inode cache was refreshed without the xattr.

During readdirp, md-cache sets the xattr keys it wants to cache in xdata, and those xattrs are then fetched from the backend bricks. However, during the opendir_cbk fop, the readdir-ahead xlator internally issues readdirp() calls which do not have any of the keys to be cached set in xdata, so the xattrs are not fetched. On the next readdirp() call, readdir-ahead returns the dirent entries from its cache, and as these entries do not contain the xattrs, md-cache updates its cache with wrong information which is inconsistent with the state on the backend bricks.

Created attachment 1152857 [details]
tcpdump
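To make the readdir-ahead / md-cache interaction concrete, here is a toy Python model of the sequence described above. All class and function names are illustrative (the real translators are C xlators inside glusterfs); only the ordering matters: readdir-ahead prefetches dirents on opendir without md-cache's xdata keys, md-cache later refreshes its per-inode cache from those xattr-less entries, and the next getxattr is answered with ENODATA straight from the cache.

```python
# Toy model of the md-cache / readdir-ahead interaction; names are illustrative.

BACKEND_XATTRS = {"user.swift.metadata": b"{...}"}  # always present on the bricks

def backend_readdirp(requested_keys):
    """Bricks return dirents, attaching only the xattr keys asked for in xdata."""
    xattrs = {k: BACKEND_XATTRS[k] for k in requested_keys if k in BACKEND_XATTRS}
    return [{"name": "manifest", "xattrs": xattrs}]

class ReaddirAhead:
    """Prefetches dirents on opendir, but without md-cache's xdata keys."""
    def __init__(self):
        self.cache = None

    def opendir(self):
        # Internal readdirp: xdata carries no keys, so no xattrs come back.
        self.cache = backend_readdirp(requested_keys=[])

    def readdirp(self, requested_keys):
        # Serves from the prefetch cache, ignoring the keys md-cache wanted.
        return self.cache

class MdCache:
    """Caches per-inode xattrs, refreshed from every readdirp reply."""
    def __init__(self, lower):
        self.lower = lower
        self.inode_xattrs = {}

    def readdirp(self):
        entries = self.lower.readdirp(requested_keys=["user.swift.metadata"])
        for e in entries:
            # Bug: the reply has no xattrs, so the cache is reset to "none".
            self.inode_xattrs[e["name"]] = e["xattrs"]
        return entries

    def getxattr(self, name, key):
        cached = self.inode_xattrs.get(name)
        if cached is not None and key not in cached:
            return "ENODATA"  # answered from cache, never reaching the bricks
        return BACKEND_XATTRS[key]

ra = ReaddirAhead()
mdc = MdCache(ra)
ra.opendir()     # readdir-ahead prefetches without xattr keys
mdc.readdirp()   # md-cache "refreshes" its cache from xattr-less entries
print(mdc.getxattr("manifest", "user.swift.metadata"))  # -> ENODATA
```

Running the model prints ENODATA even though the backend xattr was never removed, which matches the strace; removing either translator from the sequence avoids the stale cache, which is why the two volume-option workarounds above are effective.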
Upstream changes on review:

* http://review.gluster.org/#/c/14213/
* http://review.gluster.org/#/c/14214/

Downstream backport on review here:

* https://code.engineering.redhat.com/gerrit/#/c/74100/

This does not require a doc text, as it's a regression.

```
testObjectManifest (test.functional.gluster_swift_tests.TestFile) ... ok
test_PUT_large_object (test.functional.gluster_swift_tests.TestFile) ... ok
testObjectManifest (test.functional.gluster_swift_tests.TestFileUTF8) ... ok
test_PUT_large_object (test.functional.gluster_swift_tests.TestFileUTF8) ... ok
testObjectContentFromMountPoint (test.functional.gluster_swift_tests.TestMultiProtocolAccess) ... ok
testObjectMetadataWhenFileModified (test.functional.gluster_swift_tests.TestMultiProtocolAccess) ... ok
testObjectsFromMountPoint (test.functional.gluster_swift_tests.TestMultiProtocolAccess) ... ok
```

Manual verification:

```
tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1

1 directory, 0 files
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/segments1/0 -d'one'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/segments1/0 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 3
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 3 out of 3 bytes
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:58:02 GMT
< Content-Length: 0
< Etag: f97c5d29941bfb1b2fdab0874906ab82
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx1d34c73fd2b24ad08e7c2-00574425a9
< Date: Tue, 24 May 2016 09:58:01 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    └── segments1
        └── 0

2 directories, 1 file
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/segments1/1 -d'two'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/segments1/1 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 3
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 3 out of 3 bytes
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:58:19 GMT
< Content-Length: 0
< Etag: b8a9f715dbb64fd5c56e7783c6820a61
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx2fe861a2c36341178268f-00574425ba
< Date: Tue, 24 May 2016 09:58:18 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    └── segments1
        ├── 0
        └── 1

2 directories, 2 files
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/segments1/2 -d'three'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/segments1/2 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 5
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 5 out of 5 bytes
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:58:30 GMT
< Content-Length: 0
< Etag: 35d6d33467aae9a2e3dccb4b6b027878
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx247729199a0d4f2ea0c29-00574425c5
< Date: Tue, 24 May 2016 09:58:29 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    └── segments1
        ├── 0
        ├── 1
        └── 2

2 directories, 3 files
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/segments1/3 -d'four'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/segments1/3 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 4
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 4 out of 4 bytes
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:58:39 GMT
< Content-Length: 0
< Etag: 8cbad96aced40b3838dd9f07f6ef5772
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: txf4d73f8d236f4cac9e80b-00574425ce
< Date: Tue, 24 May 2016 09:58:38 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    └── segments1
        ├── 0
        ├── 1
        ├── 2
        └── 3

2 directories, 4 files
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/segments1/4 -d'five'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/segments1/4 HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Content-Length: 4
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 4 out of 4 bytes
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:58:49 GMT
< Content-Length: 0
< Etag: 30056e1cab7a61d256fc8edd970d14f5
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx819ad1922bed43518fb1d-00574425d8
< Date: Tue, 24 May 2016 09:58:48 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    └── segments1
        ├── 0
        ├── 1
        ├── 2
        ├── 3
        └── 4

2 directories, 5 files
[root@dhcp46-62 swift]# curl -v -X PUT http://localhost:8080/v1/AUTH_test/c1/manifest -H 'X-Object-Manifest: c1/segments1/' -H 'Content-Length: 0'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> PUT /v1/AUTH_test/c1/manifest HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> X-Object-Manifest: c1/segments1/
> Content-Length: 0
>
< HTTP/1.1 201 Created
< Last-Modified: Tue, 24 May 2016 09:59:04 GMT
< Content-Length: 0
< Etag: d41d8cd98f00b204e9800998ecf8427e
< Content-Type: text/html; charset=UTF-8
< X-Trans-Id: tx9ffd58de02f242ca861cc-00574425e7
< Date: Tue, 24 May 2016 09:59:03 GMT
<
* Connection #0 to host localhost left intact
[root@dhcp46-62 swift]# tree /mnt/gluster-object/test
/mnt/gluster-object/test
└── c1
    ├── manifest
    └── segments1
        ├── 0
        ├── 1
        ├── 2
        ├── 3
        └── 4

2 directories, 6 files
[root@dhcp46-62 swift]# curl -v -X GET http://localhost:8080/v1/AUTH_test/c1/manifest
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /v1/AUTH_test/c1/manifest HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Length: 19
< Etag: "e154d45351dc4d2ed293cc6026714010"
< Accept-Ranges: bytes
< Last-Modified: Tue, 24 May 2016 09:59:04 GMT
< X-Object-Manifest: c1/segments1/
< X-Timestamp: 1464083943.33414
< Content-Type: application/octet-stream
< X-Trans-Id: tx12a7e1ef996d4245acab8-00574425f4
< Date: Tue, 24 May 2016 09:59:16 GMT
<
* Connection #0 to host localhost left intact
onetwothreefourfive[root@dhcp46-62 swift]# curl -v -X GET http://localhost:8080/v1/AUTH_test/c1/manifest -H 'Range: bytes=3-'
* About to connect() to localhost port 8080 (#0)
* Trying ::1...
* Connection refused
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /v1/AUTH_test/c1/manifest HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8080
> Accept: */*
> Range: bytes=3-
>
< HTTP/1.1 206 Partial Content
< Content-Length: 16
< Etag: "e154d45351dc4d2ed293cc6026714010"
< Accept-Ranges: bytes
< Last-Modified: Tue, 24 May 2016 09:59:04 GMT
< Content-Range: bytes 3-18/19
< X-Object-Manifest: c1/segments1/
< X-Timestamp: 1464083943.33414
< Content-Type: application/octet-stream
< X-Trans-Id: tx4e7a89071ba74e84a9f6c-0057442607
< Date: Tue, 24 May 2016 09:59:35 GMT
<
* Connection #0 to host localhost left intact
twothreefourfive[root@dhcp46-62 swift]#
```

Verified multiple times with different byte ranges; the test passes with the automation test suite as well as manually.

Marking the BZ verified with the following build: glusterfs-3.7.9-6.el7rhgs.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240