Created attachment 1399948 [details]
test for aligned and unaligned write with O_DIRECT

Description of problem:

I caught billions of "Invalid argument" write errors in a brick log:

[2018-02-23 14:57:37.624075] E [MSGID: 113072] [posix.c:3631:posix_writev] 0-ftp-pub-posix: write failed: offset 131072, [Invalid argument]
[2018-02-23 14:57:37.624260] E [MSGID: 115067] [server-rpc-fops.c:1407:server_writev_cbk] 0-ftp-pub-server: 18548605: WRITEV 2 (cda02ff8-011e-4ecc-9e22-86741aa9fee5), client: multi.office.etersoft.ru-31148-2018/02/22-14:44:24:479443-ftp-pub-client-2-0-0, error-xlator: ftp-pub-posix [Invalid argument]

In `strace -y -f -p` on the glusterfsd process it looks like this:

[pid 31198] pwrite64(28</var/local/eterglust/pub/.glusterfs/c1/a6/c1a6f57f-2082-466a-8f25-5430e281da58>, "libgl1-mesa-glx\nlibwine-vanilla\n", 32, 0) = -1 EINVAL (Invalid argument)

The line in xlators/storage/posix/src/posix.c where we get the error carries this comment:

    /* not sure whether writev works on O_DIRECT'd fd */
    retval = sys_pwrite (fd, buf, vector[idx].iov_len, internal_off);

I wrote a little test program (attached) and discovered that the error occurs with newer kernels (4.4.*) and that there are no problems with the 2.6.32 kernel. As far as I can see, both the buffer address and the buffer size must use aligned (512-byte) values.

On both 32-bit and 64-bit systems (glusterfs 3.12.5; kernels 2.6.32, 4.4.105), test result:
UNALIGNED address write: FAILED
ALIGNED address write: FAILED
UNALIGNED address with aligned size write: FAILED
ALIGNED address and size write: SUCCESSFUL

OpenVZ container result:
UNALIGNED address write: SUCCESSFUL
ALIGNED address write: SUCCESSFUL
UNALIGNED address with aligned size write: SUCCESSFUL
ALIGNED address and size write: SUCCESSFUL
Release 3.12 has been EOL'd and this bug was still found to be in the NEW state, hence moving the version to mainline so that it can be triaged and acted upon.
We have not noticed the problem with the later kernels in Fedora 29/30, etc. This needs to be tested again.
I'm seeing a similar issue on Gluster 6.6 with CentOS 7 (kernel 3.10.0-1062.4.3.el7.x86_64):

[2019-11-19 14:56:04.017381] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-ovirt-data-posix: write failed: offset 0, [Invalid argument]
[2019-11-19 14:56:04.017462] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-ovirt-data-server: 221969: WRITEV 0 (309c077f-8882-43f7-a95b-ca2c4d27d2b5), client: CTX_ID:b3c80b69-0651-4e87-96d1-ee767cb7e425-GRAPH_ID:10-PID:19184-HOST:lease-16.dc01.adsolutions-PC_NAME:ovirt-data-client-1-RECON_NO:-0, error-xlator: ovirt-data-posix [Invalid argument]
[2019-11-19 14:56:12.430962] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-ovirt-data-server: 219748: WRITEV 0 (921dfa09-b252-4087-9c7c-47eda2a6266d), client: CTX_ID:05f7b92c-8dd6-434b-b835-7254dae1d1bc-GRAPH_ID:4-PID:93937-HOST:lease-23.dc01.adsolutions-PC_NAME:ovirt-data-client-1-RECON_NO:-0, error-xlator: ovirt-data-posix [Invalid argument]
[2019-11-19 14:56:27.345631] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-ovirt-data-server: 203815: WRITEV 4 (981676ff-6dbe-4a4c-8478-6e4f991a04f4), client: CTX_ID:366e668d-91ba-4373-960e-82e56f1ed7af-GRAPH_ID:0-PID:22624-HOST:lease-08.dc01.adsolutions-PC_NAME:ovirt-data-client-1-RECON_NO:-0, error-xlator: ovirt-data-posix [Invalid argument]
[2019-11-19 14:56:45.491788] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-ovirt-data-server: 210249: WRITEV 2 (a27a81c0-de78-40ee-9855-a62b6be01ffe), client: CTX_ID:4472864a-0fec-4e2c-ad3f-b9684b0808f6-GRAPH_ID:0-PID:30323-HOST:lease-21.dc01.adsolutions-PC_NAME:ovirt-data-client-1-RECON_NO:-0, error-xlator: ovirt-data-posix [Invalid argument]

I also notice that CPU usage is very high when this error occurs.
The volume is configured with O_DIRECT:

Volume Name: ovirt-data
Type: Distributed-Replicate
Volume ID: 2775dc10-c197-446e-a73f-275853d38666
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x (2 + 1) = 12
Transport-type: tcp
Bricks:
Brick1: 10.201.0.5:/data5/gfs/bricks/brick1/ovirt-data
Brick2: 10.201.0.1:/data5/gfs/bricks/brick1/ovirt-data
Brick3: 10.201.0.9:/data0/gfs/bricks/bricka/ovirt-data (arbiter)
Brick4: 10.201.0.7:/data5/gfs/bricks/brick1/ovirt-data
Brick5: 10.201.0.9:/data5/gfs/bricks/brick1/ovirt-data
Brick6: 10.201.0.11:/data0/gfs/bricks/bricka/ovirt-data (arbiter)
Brick7: 10.201.0.6:/data5/gfs/bricks/brick1/ovirt-data
Brick8: 10.201.0.8:/data5/gfs/bricks/brick1/ovirt-data
Brick9: 10.201.0.12:/data0/gfs/bricks/bricka/ovirt-data (arbiter)
Brick10: 10.201.0.12:/data5/gfs/bricks/brick1/ovirt-data
Brick11: 10.201.0.11:/data5/gfs/bricks/brick1/ovirt-data
Brick12: 10.201.0.10:/data0/gfs/bricks/bricka/ovirt-data (arbiter)
Options Reconfigured:
performance.strict-o-direct: on
server.event-threads: 6
performance.cache-size: 384MB
performance.write-behind-window-size: 512MB
user.cifs: off
features.shard: on
cluster.shd-wait-qlength: 10000
cluster.shd-max-threads: 8
cluster.locking-scheme: granular
cluster.data-self-heal-algorithm: full
cluster.server-quorum-type: server
cluster.quorum-type: auto
cluster.eager-lock: enable
network.remote-dio: off
performance.low-prio-threads: 32
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
storage.owner-uid: 36
storage.owner-gid: 36
server.outstanding-rpc-limit: 1024
cluster.choose-local: off
cluster.brick-multiplex: on
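With `performance.strict-o-direct: on` and `network.remote-dio: off`, writes reach the brick with O_DIRECT set, so the kernel's alignment rules apply on the brick filesystem. A quick way to check the effect from the shell, as a sketch (the `/tmp` path and `/dev/sdX` device name are placeholders, not taken from this setup):

```shell
# Logical sector size that O_DIRECT transfers must align to
# (replace /dev/sdX with the brick's backing device):
# blockdev --getss /dev/sdX

# A 32-byte O_DIRECT write (size not sector-aligned) is expected
# to fail with EINVAL on filesystems that enforce alignment,
# matching the posix_writev errors above:
dd if=/dev/zero of=/tmp/odirect-repro.bin bs=32 count=1 oflag=direct || true

# A sector-aligned write of the same kind is expected to succeed
# (on filesystems that support O_DIRECT at all):
dd if=/dev/zero of=/tmp/odirect-repro.bin bs=512 count=1 oflag=direct || true

rm -f /tmp/odirect-repro.bin
```

`dd` page-aligns its buffer, so the unaligned-size case is the one this exercises.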
This bug is moved to https://github.com/gluster/glusterfs/issues/946, and will be tracked there from now on. Visit the GitHub issue URL for further details.