Bug 1351640 - [RHEL-7] Account and container server makes excessive and redundant stat() calls
Summary: [RHEL-7] Account and container server makes excessive and redundant stat() calls
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-swift
Version: rhgs-3.1
Hardware: All
OS: Linux
medium
medium
Target Milestone: Development Freeze
: RHGS 3.2.0
Assignee: Prashanth Pai
QA Contact: surabhi
URL:
Whiteboard:
Depends On: 1374302
Blocks: 1351522 1378424
TreeView+ depends on / blocked
 
Reported: 2016-06-30 13:11 UTC by Prashanth Pai
Modified: 2017-03-23 05:13 UTC (History)
4 users (show)

Fixed In Version: swiftonfile-2.3.0-5.el7rhgs
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1378424 (view as bug list)
Environment:
Last Closed: 2017-03-23 05:13:39 UTC
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0488 0 normal SHIPPED_LIVE Red Hat Gluster Storage 3.2.0 gluster-swift bug fix and enhancement update 2017-03-23 09:07:21 UTC

Description Prashanth Pai 2016-06-30 13:11:08 UTC
Description of problem:
Account server and container server processes make excessive and redundant stat() calls while serving GET requests


Version-Release number of selected component (if applicable):
RHGS 3.1.3
swiftonfile-2.3.0-3.el7rhgs


How reproducible:
Always. Do a GET on a container and watch syscalls made by the container server process.


Steps to Reproduce:
1. Create a container
2. Create some objects in the container
3. Do GET on the container while strace is attached to container server process

root# ps aux | grep container-server
root      1396  0.0  1.2 272164 25396 ?        Ss   18:10   0:00 /usr/bin/python /usr/bin/swift-container-server /etc/swift/container-server.conf
root      1429  0.0  1.0 272564 20976 ?        S    18:10   0:00 /usr/bin/python /usr/bin/swift-container-server /etc/swift/container-server.conf

root# strace -ff -p 1396,1429
Process 1396 attached
Process 1429 attached

[pid  1429] stat("/mnt/gluster-object/test/c1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1429] getxattr("/mnt/gluster-object/test/c1", "user.swift.metadata", 0x0, 0) = 148
[pid  1429] getxattr("/mnt/gluster-object/test/c1", "user.swift.metadata", "{\"X-Object-Count\":[0,0],\"X-Timestamp\":[\"1467290429.21123\",0],\"X-Type\":[\"container\",0],\"X-PUT-Timestamp\":[\"1467290429.34439\",0],\"X-Bytes-Used\":[0,0]}", 148) = 148
[pid  1429] stat("/mnt/gluster-object/test/c1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1429] openat(AT_FDCWD, "/mnt/gluster-object/test/c1", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 8
[pid  1429] getdents(8, /* 8 entries */, 32768) = 200
[pid  1429] getdents(8, /* 0 entries */, 32768) = 0
[pid  1429] close(8)                    = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/o1", {st_mode=S_IFREG|0755, st_size=1, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/o2", {st_mode=S_IFREG|0755, st_size=1, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/o3", {st_mode=S_IFREG|0755, st_size=1, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/o4", {st_mode=S_IFREG|0755, st_size=1, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/dirobject", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1429] stat("/mnt/gluster-object/test/c1/o5", {st_mode=S_IFREG|0755, st_size=1, ...}) = 0
[pid  1429] getxattr("/mnt/gluster-object/test/c1/dirobject", "user.swift.metadata", 0x0, 0) = 174
[pid  1429] getxattr("/mnt/gluster-object/test/c1/dirobject", "user.swift.metadata", "{\"Content-Length\":0,\"ETag\":\"d41d8cd98f00b204e9800998ecf8427e\",\"X-Object-Type\":\"dir\",\"X-Timestamp\":\"1467291058.57983\",\"X-Type\":\"Object\",\"Content-Type\":\"application/directory\"}", 174) = 174
[pid  1429] poll([{fd=4, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
[pid  1429] lstat("/mnt/gluster-object/test/c1/dirobject", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid  1429] openat(AT_FDCWD, "/mnt/gluster-object/test/c1/dirobject", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 8
[pid  1429] getdents(8, /* 2 entries */, 32768) = 48
[pid  1429] getdents(8, /* 0 entries */, 32768) = 0
[pid  1429] close(8)                    = 0


Actual results:
Lot of stat() calls observed while serving the container listing.


Expected results:
Container server should not do additional stat() to figure out if an entry is a file or directory. Container server should use d_type information returned by readdir call.

Comment 2 Prashanth Pai 2016-07-28 13:30:12 UTC
Upstream patches:
http://review.gluster.org/#/c/14543/
http://review.gluster.org/#/c/14460/

Comment 6 Prashanth Pai 2016-11-21 09:53:30 UTC
Downstream patches merged:
https://code.engineering.redhat.com/gerrit/#/q/topic:bug-1351640

Comment 7 surabhi 2016-12-15 05:44:17 UTC
Verified that the number of stat calls are reduced on a GET operation:

Created 10000 objects in a container and executed GET on the container while strace is attached to the container process:

curl -i http://localhost:8080/v1/AUTH_test/dir1/ -X GET
strace -ff -p 11858,11888

pid 11888] restart_syscall(<... resuming interrupted call ...>) = 1
[pid 11888] accept(4, {sa_family=AF_INET, sin_port=htons(54862), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
[pid 11888] fcntl(5, F_GETFL)           = 0x2 (flags O_RDWR)
[pid 11888] fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 11888] fcntl(5, F_GETFL)           = 0x802 (flags O_RDWR|O_NONBLOCK)
[pid 11888] fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid 11888] accept(4, 0x7fffa157a040, [16]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 11888] recvfrom(5, "GET /test/0/AUTH_test/dir1/ HTTP"..., 8192, 0, NULL, NULL) = 287
[pid 11888] getsockname(5, {sa_family=AF_INET, sin_port=htons(6011), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
[pid 11888] lstat("/mnt/gluster-object/test", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 11888] lstat("/mnt/gluster-object/test/..", {st_mode=S_IFDIR|0755, st_size=63, ...}) = 0
[pid 11888] stat("/mnt/gluster-object/test/dir1", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
[pid 11888] getxattr("/mnt/gluster-object/test/dir1", "user.swift.metadata", 0x0, 0) = 150
[pid 11888] getxattr("/mnt/gluster-object/test/dir1", "user.swift.metadata", "{"X-Object-Count":[500,0],"X-Timestamp":["1481622561.19635",0],"X-Type":["container",0],"X-PUT-Timestamp":["1481622561.11308",0],"X-Bytes-Used":[0,0]}", 150) = 150
[pid 11888] openat(AT_FDCWD, "/mnt/gluster-object/test/dir1", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 7
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 23 entries */, 32768) = 552
[pid 11888] getdents(7, /* 19 entries */, 32768) = 456
[pid 11888] getdents(7, /* 0 entries */, 32768) = 0
[pid 11888] close(7)                    = 0
[pid 11888] poll([{fd=4, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
[pid 11888] sendto(5, "HTTP/1.1 200 OK\r\nContent-Length:"..., 2864, 0, NULL, 0) = 2864
[pid 11888] shutdown(5, SHUT_RDWR)      = 0
[pid 11888] close(5)                    = 0
[pid 11888] poll([{fd=4, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 60000

Detailed performance testing is not been done , only verification of reduced stat calls is performed. Marking the BZ verified.

swiftonfile-2.3.0-5.el7rhgs.noarch

Comment 9 errata-xmlrpc 2017-03-23 05:13:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0488.html


Note You need to log in before you can comment on or make changes to this bug.