REVIEW: https://review.gluster.org/17644 (features/shard: Remove ctx from LRU in shard_forget) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)
Problem: There is a race when the following two commands are executed on the mount in parallel from two different terminals on a sharded volume, which leads to a use-after-free:

Terminal-1: while true; do dd if=/dev/zero of=file1 bs=1M count=4; done
Terminal-2: while true; do cat file1 > /dev/null; done

In the normal case this is the life-cycle of a shard inode:
1) The shard is added to the LRU list when it is first looked up
2) For every operation on the shard it is moved to the top of the LRU list
3) When the shard is unlinked, or the LRU limit is hit, it is removed from the LRU list

But we are seeing a race where the inode stays in the shard LRU list even after it is forgotten, which leads to a use-after-free and then memory corruption. These are the steps:

1) The shard is added to the LRU list when it is first looked up
2) For every operation on the shard it is moved to the top of the LRU list

Reader-handler:
1) The reader handler needs shard-x to be read.
2) In shard_common_resolve_shards(), inode_resolve() leads to a hit in the LRU list, so it calls __shard_update_shards_inode_list() to move the inode to the top of the LRU list.
3) When __shard_update_shards_inode_list() is called, it finds that the inode is no longer in the LRU list, so it adds it back to the LRU list.

Truncate-handler (racing with the reader, between its steps 2 and 3):
1) Truncate has just deleted shard-x.
2) shard-x gets unlinked from the itable and inode_forget(inode, 0) is called to make sure the inode can be purged upon the last unref.

Both operations complete and call inode_unref(shard-x), which leads to the inode getting freed and forgotten even though it is still in the shard LRU list. When more inodes are later added to the LRU list, the use-after-free happens, and it leads to undefined behavior.

ASAN trace:

root@dhcp35-190 - ~ 18:25:38 :) ⚡ /usr/local/sbin/glusterfs --volfile-server=localhost.localdomain --volfile-id=/r3 /mnt/r3 -N
==388==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
=================================================================
==388==ERROR: AddressSanitizer: heap-use-after-free on address 0x611001aa7940 at pc 0x7f187f51023c bp 0x7f187d647da0 sp 0x7f187d647d90
WRITE of size 8 at 0x611001aa7940 thread T8
    #0 0x7f187f51023b in list_del ../../../../libglusterfs/src/list.h:76
    #1 0x7f187f51037f in list_move_tail ../../../../libglusterfs/src/list.h:106
    #2 0x7f187f515850 in __shard_update_shards_inode_list /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:540
    #3 0x7f187f519a88 in shard_common_resolve_shards /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:653
    #4 0x7f187f51c9a9 in shard_refresh_dot_shard /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:884
    #5 0x7f187f556eed in shard_post_lookup_readv_handler /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:3550
    #6 0x7f187f520f27 in shard_lookup_base_file /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:1197
    #7 0x7f187f55861a in shard_readv /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:3609
    #8 0x7f187f2ea89d in wb_readv_helper /home/pk/workspace/rhs-glusterfs/xlators/performance/write-behind/src/write-behind.c:1689
    #9 0x7f1890e4a629 in call_resume_wind /home/pk/workspace/rhs-glusterfs/libglusterfs/src/call-stub.c:2039
    #10 0x7f1890e73bf9 in call_resume_keep_stub /home/pk/workspace/rhs-glusterfs/libglusterfs/src/call-stub.c:2578
    #11 0x7f187f2e8229 in wb_do_winds /home/pk/workspace/rhs-glusterfs/xlators/performance/write-behind/src/write-behind.c:1524
    #12 0x7f187f2e8516 in wb_process_queue /home/pk/workspace/rhs-glusterfs/xlators/performance/write-behind/src/write-behind.c:1558
    #13 0x7f187f2ea9b4 in wb_readv /home/pk/workspace/rhs-glusterfs/xlators/performance/write-behind/src/write-behind.c:1715
    #14 0x7f1890fa8ce2 in default_readv /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:2345
    #15 0x7f1890f8fec6 in default_readv_resume /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:1687
    #16 0x7f1890e4a629 in call_resume_wind /home/pk/workspace/rhs-glusterfs/libglusterfs/src/call-stub.c:2039
    #17 0x7f1890e7363f in call_resume /home/pk/workspace/rhs-glusterfs/libglusterfs/src/call-stub.c:2508
    #18 0x7f187eeb64dc in open_and_resume /home/pk/workspace/rhs-glusterfs/xlators/performance/open-behind/src/open-behind.c:245
    #19 0x7f187eeb8085 in ob_readv /home/pk/workspace/rhs-glusterfs/xlators/performance/open-behind/src/open-behind.c:401
    #20 0x7f187ec7e548 in io_stats_readv /home/pk/workspace/rhs-glusterfs/xlators/debug/io-stats/src/io-stats.c:2913
    #21 0x7f1890fa8ce2 in default_readv /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:2345
    #22 0x7f187ea1308d in meta_readv /home/pk/workspace/rhs-glusterfs/xlators/meta/src/meta.c:74
    #23 0x7f1884f1a1bd in fuse_readv_resume /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-bridge.c:2246
    #24 0x7f1884efb36d in fuse_fop_resume /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-bridge.c:556
    #25 0x7f1884ef5b4b in fuse_resolve_done /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:663
    #26 0x7f1884ef5ce7 in fuse_resolve_all /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:690
    #27 0x7f1884ef5b2c in fuse_resolve /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:654
    #28 0x7f1884ef5c94 in fuse_resolve_all /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:686
    #29 0x7f1884ef5d45 in fuse_resolve_continue /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:706
    #30 0x7f1884ef570e in fuse_resolve_fd /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:566
    #31 0x7f1884ef5ada in fuse_resolve /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:643
    #32 0x7f1884ef5bf1 in fuse_resolve_all /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:679
    #33 0x7f1884ef5daa in fuse_resolve_and_resume /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-resolve.c:718
    #34 0x7f1884f1a63c in fuse_readv /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-bridge.c:2281
    #35 0x7f1884f3ec79 in fuse_thread_proc /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-bridge.c:5071
    #36 0x7f188fbcd6c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
    #37 0x7f188f4a7f7e in clone (/lib64/libc.so.6+0x107f7e)

0x611001aa7940 is located 192 bytes inside of 240-byte region [0x611001aa7880,0x611001aa7970)
freed by thread T4 here:
    #0 0x7f18912f4b00 in free (/lib64/libasan.so.3+0xc6b00)

previously allocated by thread T6 here:
    #0 0x7f18912f5020 in calloc (/lib64/libasan.so.3+0xc7020)
    #1 0x7f1890e7c67b in __gf_calloc /home/pk/workspace/rhs-glusterfs/libglusterfs/src/mem-pool.c:117
    #2 0x7f187f51205a in __shard_inode_ctx_get /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:74
    #3 0x7f187f5123b4 in __shard_inode_ctx_set /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:112
    #4 0x7f187f513631 in shard_inode_ctx_set /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:172
    #5 0x7f187f52deab in shard_link_block_inode /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:1656
    #6 0x7f187f551306 in shard_common_mknod_cbk /home/pk/workspace/rhs-glusterfs/xlators/features/shard/src/shard.c:3246
    #7 0x7f187f87f3ac in dht_newfile_cbk /home/pk/workspace/rhs-glusterfs/xlators/cluster/dht/src/dht-common.c:5580
    #8 0x7f187fb83315 in afr_mknod_unwind /home/pk/workspace/rhs-glusterfs/xlators/cluster/afr/src/afr-dir-write.c:553
    #9 0x7f187fb7e65d in __afr_dir_write_cbk /home/pk/workspace/rhs-glusterfs/xlators/cluster/afr/src/afr-dir-write.c:265
    #10 0x7f187fb833ca in afr_mknod_wind_cbk /home/pk/workspace/rhs-glusterfs/xlators/cluster/afr/src/afr-dir-write.c:567
    #11 0x7f187ff2e8da in client3_3_mknod_cbk /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client-rpc-fops.c:240
    #12 0x7f1890b6a3ec in rpc_clnt_handle_reply /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-clnt.c:794
    #13 0x7f1890b6af04 in rpc_clnt_notify /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-clnt.c:987
    #14 0x7f1890b6172c in rpc_transport_notify /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-transport.c:538
    #15 0x7f18824ad582 in socket_event_poll_in /home/pk/workspace/rhs-glusterfs/rpc/rpc-transport/socket/src/socket.c:2306
    #16 0x7f18824ae3ed in socket_event_handler /home/pk/workspace/rhs-glusterfs/rpc/rpc-transport/socket/src/socket.c:2458
    #17 0x7f1890f006e6 in event_dispatch_epoll_handler /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event-epoll.c:572
    #18 0x7f1890f00d9c in event_dispatch_epoll_worker /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event-epoll.c:648
    #19 0x7f188fbcd6c9 in start_thread (/lib64/libpthread.so.0+0x76c9)

Thread T8 created by T6 here:
    #0 0x7f189125f488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x7f1890e26887 in gf_thread_create /home/pk/workspace/rhs-glusterfs/libglusterfs/src/common-utils.c:3733
    #2 0x7f1884f402fa in notify /home/pk/workspace/rhs-glusterfs/xlators/mount/fuse/src/fuse-bridge.c:5312
    #3 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #4 0x7f1890fb0814 in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3107
    #5 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #6 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #7 0x7f187ec97243 in notify /home/pk/workspace/rhs-glusterfs/xlators/debug/io-stats/src/io-stats.c:4150
    #8 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #9 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #10 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #11 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #12 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #13 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #14 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #15 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #16 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #17 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #18 0x7f187f8b81aa in dht_notify /home/pk/workspace/rhs-glusterfs/xlators/cluster/dht/src/dht-common.c:9391
    #19 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #20 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #21 0x7f187fc940c0 in afr_notify /home/pk/workspace/rhs-glusterfs/xlators/cluster/afr/src/afr-common.c:4833
    #22 0x7f187fc9ce0f in notify /home/pk/workspace/rhs-glusterfs/xlators/cluster/afr/src/afr.c:43
    #23 0x7f1890e02b79 in xlator_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/xlator.c:549
    #24 0x7f1890fb08af in default_notify /home/pk/workspace/rhs-glusterfs/libglusterfs/src/defaults.c:3113
    #25 0x7f187fef815e in client_notify_dispatch /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client.c:90
    #26 0x7f187fef7fad in client_notify_dispatch_uniq /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client.c:68
    #27 0x7f187ff7d8e1 in client_notify_parents_child_up /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client-handshake.c:137
    #28 0x7f187ff83a9c in client_post_handshake /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client-handshake.c:1059
    #29 0x7f187ff84950 in client_setvolume_cbk /home/pk/workspace/rhs-glusterfs/xlators/protocol/client/src/client-handshake.c:1228
    #30 0x7f1890b6a3ec in rpc_clnt_handle_reply /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-clnt.c:794
    #31 0x7f1890b6af04 in rpc_clnt_notify /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-clnt.c:987
    #32 0x7f1890b6172c in rpc_transport_notify /home/pk/workspace/rhs-glusterfs/rpc/rpc-lib/src/rpc-transport.c:538
    #33 0x7f18824ad582 in socket_event_poll_in /home/pk/workspace/rhs-glusterfs/rpc/rpc-transport/socket/src/socket.c:2306
    #34 0x7f18824ae3ed in socket_event_handler /home/pk/workspace/rhs-glusterfs/rpc/rpc-transport/socket/src/socket.c:2458
    #35 0x7f1890f006e6 in event_dispatch_epoll_handler /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event-epoll.c:572
    #36 0x7f1890f00d9c in event_dispatch_epoll_worker /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event-epoll.c:648
    #37 0x7f188fbcd6c9 in start_thread (/lib64/libpthread.so.0+0x76c9)

Thread T6 created by T0 here:
    #0 0x7f189125f488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x7f1890f010b4 in event_dispatch_epoll /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event-epoll.c:700
    #2 0x7f1890e7a8b6 in event_dispatch /home/pk/workspace/rhs-glusterfs/libglusterfs/src/event.c:124
    #3 0x40fddb in main /home/pk/workspace/rhs-glusterfs/glusterfsd/src/glusterfsd.c:2479
    #4 0x7f188f3c0400 in __libc_start_main (/lib64/libc.so.6+0x20400)

Thread T4 created by T0 here:
    #0 0x7f189125f488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x7f1890e26887 in gf_thread_create /home/pk/workspace/rhs-glusterfs/libglusterfs/src/common-utils.c:3733
    #2 0x7f1890eaa398 in syncenv_new /home/pk/workspace/rhs-glusterfs/libglusterfs/src/syncop.c:827
    #3 0x40fc7b in main /home/pk/workspace/rhs-glusterfs/glusterfsd/src/glusterfsd.c:2461
    #4 0x7f188f3c0400 in __libc_start_main (/lib64/libc.so.6+0x20400)

SUMMARY: AddressSanitizer: heap-use-after-free ../../../../libglusterfs/src/list.h:76 in list_del

Shadow bytes around the buggy address:
  0x0c228034ced0: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
  0x0c228034cee0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c228034cef0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228034cf00: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c228034cf10: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c228034cf20: fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fa fa
  0x0c228034cf30: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c228034cf40: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228034cf50: fd fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c228034cf60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c228034cf70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  Container overflow:    fc
  Array cookie:          ac
  Intra object redzone:  bb
  ASan internal:         fe
  Left alloca redzone:   ca
  Right alloca redzone:  cb

I added logs; they confirm that, after the inode is removed from the LRU list, it is added to the list again:

[2017-06-28 19:01:45.119877] I [shard.c:542:__shard_update_shards_inode_list] 0-r3-shard: 0e2f4fdd-4f7b-4d0e-8027-4247560507b0 is moved to top
...
0-r3-shard: 0e2f4fdd-4f7b-4d0e-8027-4247560507b0 is removed to list
[2017-06-28 19:01:45.136590] I [shard.c:509:__shard_update_shards_inode_list] 0-r3-shard: 0e2f4fdd-4f7b-4d0e-8027-4247560507b0 is added to list
[2017-06-28 19:01:45.144234] E [MSGID: 109040] [dht-helper.c:1197:dht_migration_complete_check_task] 0-r3-dht: 0e2f4fdd-4f7b-4d0e-8027-4247560507b0: failed to lookup the file on r3-dht [Stale file handle]
[2017-06-28 19:01:45.145572] I [shard.c:5030:shard_forget] 0-r3-shard: 0e2f4fdd-4f7b-4d0e-8027-4247560507b0 is freed
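The life-cycle and race described above can be sketched with a small, self-contained C model. This is an illustrative sketch, not the actual shard.c code: the kernel-style list helpers are simplified from libglusterfs's list.h, and `shard_ctx_t` and the `lru_*` functions are hypothetical stand-ins for `shard_inode_ctx_t` and `__shard_update_shards_inode_list()`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified kernel-style intrusive list, modeled on libglusterfs list.h. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h) {
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}

static void list_del_init(struct list_head *n) {
    n->prev->next = n->next; n->next->prev = n->prev;
    list_init(n);
}

static int list_empty(const struct list_head *h) { return h->next == h; }

/* Hypothetical stand-in for the per-inode shard context. */
typedef struct {
    struct list_head ilist; /* membership in the shard LRU list */
} shard_ctx_t;

/* 1) first lookup: add the shard to the LRU list */
static void lru_add(struct list_head *lru, shard_ctx_t *ctx) {
    list_add_tail(&ctx->ilist, lru);
}

/* 2) every access: move the shard to the hot end of the LRU list.
 * If a racing unlink already detached it, this re-adds it -- which is
 * exactly the problematic step 3 of the reader handler above. */
static void lru_touch(struct list_head *lru, shard_ctx_t *ctx) {
    list_del_init(&ctx->ilist); /* no-op if already detached */
    list_add_tail(&ctx->ilist, lru);
}

/* 3) unlink of the shard / LRU limit hit: drop it from the LRU list */
static void lru_remove(shard_ctx_t *ctx) {
    if (!list_empty(&ctx->ilist))
        list_del_init(&ctx->ilist);
}
```

In this model the reported interleaving is: `lru_add()` on first lookup, `lru_remove()` from the truncate path, then `lru_touch()` from the reader path, which puts the context back on the list just before the last unref frees it; the list is then left holding a pointer into freed memory, and the next `list_del()`/`list_move_tail()` on it is the write that ASAN flags.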
REVIEW: https://review.gluster.org/17644 (features/shard: Remove ctx from LRU in shard_forget) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)
COMMIT: https://review.gluster.org/17644 committed in master by Pranith Kumar Karampuri (pkarampu)
------
commit 97defef2375b911c7b6a3924c242ba8ef4593686
Author: Pranith Kumar K <pkarampu>
Date: Wed Jun 28 09:10:53 2017 +0530

    features/shard: Remove ctx from LRU in shard_forget

    Problem:
    There is a race when the following two commands are executed on the
    mount in parallel from two different terminals on a sharded volume,
    which leads to a use-after-free:

    Terminal-1: while true; do dd if=/dev/zero of=file1 bs=1M count=4; done
    Terminal-2: while true; do cat file1 > /dev/null; done

    In the normal case this is the life-cycle of a shard inode:
    1) The shard is added to the LRU list when it is first looked up
    2) For every operation on the shard it is moved to the top of the LRU list
    3) When the shard is unlinked, or the LRU limit is hit, it is removed
       from the LRU list

    But we are seeing a race where the inode stays in the shard LRU list
    even after it is forgotten, which leads to a use-after-free and then
    memory corruption. These are the steps:

    1) The shard is added to the LRU list when it is first looked up
    2) For every operation on the shard it is moved to the top of the LRU list

    Reader-handler:
    1) The reader handler needs shard-x to be read.
    2) In shard_common_resolve_shards(), inode_resolve() leads to a hit in
       the LRU list, so it calls __shard_update_shards_inode_list() to move
       the inode to the top of the LRU list.
    3) When __shard_update_shards_inode_list() is called, it finds that the
       inode is no longer in the LRU list, so it adds it back to the LRU list.

    Truncate-handler (racing with the reader, between its steps 2 and 3):
    1) Truncate has just deleted shard-x.
    2) shard-x gets unlinked from the itable and inode_forget(inode, 0) is
       called to make sure the inode can be purged upon the last unref.

    Both operations complete and call inode_unref(shard-x), which leads to
    the inode getting freed and forgotten even though it is still in the
    shard LRU list. When more inodes are later added to the LRU list, the
    use-after-free happens, and it leads to undefined behavior.
    Fix:
    I see that the inode can be removed from the LRU list even by protocol
    layers like gfapi/gNFS when the LRU limit is reached. So it is better
    to add a check in shard_forget() to remove the ctx from the LRU list
    if it is still present there.

    BUG: 1466037
    Change-Id: Ia79c0c5c9d5febc56c41ddb12b5daf03e5281638
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: https://review.gluster.org/17644
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Krutika Dhananjay <kdhananj>
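Per the commit subject, the fix removes the ctx from the LRU list inside shard_forget(). The idea can be sketched with a minimal, self-contained C model; this is an assumption-laden sketch, not the actual patch: `shard_ctx_t`, `shard_forget_sketch()`, and the list helpers are simplified stand-ins for the real `shard_inode_ctx_t` and the shard xlator's locking and list code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified kernel-style intrusive list, modeled on libglusterfs list.h. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h) {
    n->prev = h->prev; n->next = h;
    h->prev->next = n; h->prev = n;
}

static void list_del_init(struct list_head *n) {
    n->prev->next = n->next; n->next->prev = n->prev;
    list_init(n);
}

static int list_empty(const struct list_head *h) { return h->next == h; }

/* Hypothetical stand-in for shard_inode_ctx_t. */
typedef struct {
    struct list_head ilist; /* membership in the shard LRU list */
} shard_ctx_t;

/* Sketch of the fixed forget path: before the context is destroyed,
 * unhook it from the LRU list in case a racing resolve re-added it
 * there. After this, freeing the ctx cannot leave a dangling node
 * behind in the list. */
static void shard_forget_sketch(shard_ctx_t *ctx) {
    if (!list_empty(&ctx->ilist))
        list_del_init(&ctx->ilist);
    free(ctx);
}
```

Without the `list_empty()`/`list_del_init()` step, the `free()` would leave the LRU list pointing into the freed region, and the next list operation on it would be exactly the `list_del()` write that ASAN flags in the trace above.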
This bug is being closed because a release that should address the reported issue has been made available. If the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and on the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/