Log file for this session:

================================================================================
Version            : glusterfs 3.0.3 built on Mar 19 2010 11:55:08
git: v3.0.2-41-g029062c
Starting Time      : 2010-03-31 17:17:10
Command line       : /sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/glusterfs.vol /mnt/backups/
PID                : 3263
System name        : Linux
Nodename           : backup0
Kernel Release     : 2.6.30-2-686
Hardware Identifier: i686

Given volfile:
+------------------------------------------------------------------------------+
  1:
  2: #####################################
  3: ###  GlusterFS Client Volume File  ##
  4: #####################################
  5:
  6: #### CONFIG FILE RULES:
  7: ### "#" is comment character.
  8: ###  - Config file is case sensitive
  9: ###  - Options within a volume block can be in any order.
 10: ###  - Spaces or tabs are used as delimitter within a line.
 11: ###  - Each option should end within a line.
 12: ###  - Missing or commented fields will assume default values.
 13: ###  - Blank/commented lines are allowed.
 14: ###  - Sub-volumes should already be defined above before referring.
 15:
 16: # drives
 17: volume backup1-sda1
 18:   type protocol/client
 19:   option transport-type tcp
 20:   option remote-host 209.163.224.123
 21:   option transport.socket.nodelay on
 22:   option remote-subvolume sda1
 23: end-volume
 24:
 25: volume backup2-sda1
 26:   type protocol/client
 27:   option transport-type tcp
 28:   option remote-host 209.163.224.124
 29:   option transport.socket.nodelay on
 30:   option remote-subvolume sda1
 31: end-volume
 32:
 33: volume backup3-sda1
 34:   type protocol/client
 35:   option transport-type tcp
 36:   option remote-host 209.163.224.125
 37:   option transport.socket.nodelay on
 38:   option remote-subvolume sda1
 39: end-volume
 40:
 41: volume backup4-sda1
 42:   type protocol/client
 43:   option transport-type tcp
 44:   option remote-host 209.163.224.126
 45:   option transport.socket.nodelay on
 46:   option remote-subvolume sda1
 47: end-volume
 48:
 49: # replication
 50: volume b-1-2-sda1
 51:   type cluster/replicate
 52:   subvolumes backup1-sda1 backup2-sda1
 53: end-volume
 54:
 55: volume b-3-4-sda1
 56:   type cluster/replicate
 57:   subvolumes backup3-sda1 backup4-sda1
 58: end-volume
 59:
 60: # striping
 61: volume distribute
 62:   type cluster/distribute
 63:   subvolumes b-1-2-sda1 b-3-4-sda1
 64: end-volume
 65:
 66: #### Add readahead feature
 67: volume readahead
 68:   type performance/read-ahead
 69:   # option page-size 1MB      # unit in bytes
 70:   option page-count 4         # cache per file = (page-count x page-size)
 71:   subvolumes distribute
 72: end-volume
 73:
 74: #### Add IO-Cache feature
 75: volume iocache
 76:   type performance/io-cache
 77:   option cache-size `grep 'MemTotal' /proc/meminfo | awk '{print $2 * 0.2 / 1024}' | cut -f1 -d.`MB
 78:   option cache-timeout 1
 79:   subvolumes readahead
 80: end-volume
 81:
 82: #### Add quick-read feature
 83: volume quickread
 84:   type performance/quick-read
 85:   option cache-timeout 1
 86:   option max-file-size 64kB
 87:   subvolumes iocache
 88: end-volume
 89:
 90: ### JG: we seem to need this, without this, rsync crashes
 91: #volume statprefetch
 92: #  type performance/stat-prefetch
 93: #  subvolumes quickread
 94: #end-volume
 95:
 96: #### Add writeback feature
 97: #volume writeback
 98: #  type performance/write-behind
 99: #  option aggregate-size 1MB
100: #  option window-size 2MB
101: #  option flush-behind off
102: #  subvolumes quickread
103: #end-volume
104:
+------------------------------------------------------------------------------+
[2010-03-31 17:17:10] N [glusterfsd.c:1396:main] glusterfs: Successfully started
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup2-sda1: Connected to 209.163.224.124:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [afr.c:2627:notify] b-1-2-sda1: Subvolume 'backup2-sda1' came back up; going online.
[2010-03-31 17:17:10] N [fuse-bridge.c:2942:fuse_init] glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.13 kernel 7.11
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup2-sda1: Connected to 209.163.224.124:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [afr.c:2627:notify] b-1-2-sda1: Subvolume 'backup2-sda1' came back up; going online.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup1-sda1: Connected to 209.163.224.123:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup1-sda1: Connected to 209.163.224.123:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup3-sda1: Connected to 209.163.224.125:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [afr.c:2627:notify] b-3-4-sda1: Subvolume 'backup3-sda1' came back up; going online.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup4-sda1: Connected to 209.163.224.126:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup3-sda1: Connected to 209.163.224.125:6996, attached to remote volume 'sda1'.
[2010-03-31 17:17:10] N [client-protocol.c:6246:client_setvolume_cbk] backup4-sda1: Connected to 209.163.224.126:6996, attached to remote volume 'sda1'.
[2010-03-31 17:55:02] E [client-protocol.c:415:client_ping_timer_expired] backup1-sda1: Server 209.163.224.123:6996 has not responded in the last 42 seconds, disconnecting.
[2010-03-31 17:55:02] E [client-protocol.c:415:client_ping_timer_expired] backup2-sda1: Server 209.163.224.124:6996 has not responded in the last 42 seconds, disconnecting.
[2010-03-31 17:55:02] E [client-protocol.c:415:client_ping_timer_expired] backup3-sda1: Server 209.163.224.125:6996 has not responded in the last 42 seconds, disconnecting.
[2010-03-31 17:55:02] E [client-protocol.c:415:client_ping_timer_expired] backup4-sda1: Server 209.163.224.126:6996 has not responded in the last 42 seconds, disconnecting.
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup1-sda1: forced unwinding frame type(1) op(LOOKUP)
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup1-sda1: forced unwinding frame type(2) op(PING)
[2010-03-31 17:55:02] N [client-protocol.c:6994:notify] backup1-sda1: disconnected
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup2-sda1: forced unwinding frame type(1) op(LOOKUP)
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup2-sda1: forced unwinding frame type(2) op(PING)
[2010-03-31 17:55:02] N [client-protocol.c:6994:notify] backup2-sda1: disconnected
[2010-03-31 17:55:02] E [afr.c:2657:notify] b-1-2-sda1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup3-sda1: forced unwinding frame type(1) op(LOOKUP)
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup3-sda1: forced unwinding frame type(2) op(PING)
[2010-03-31 17:55:02] N [client-protocol.c:6994:notify] backup3-sda1: disconnected
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup4-sda1: forced unwinding frame type(1) op(LOOKUP)
[2010-03-31 17:55:02] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 1022: LOOKUP() / => -1 (Transport endpoint is not connected)
[2010-03-31 17:55:02] E [saved-frames.c:165:saved_frames_unwind] backup4-sda1: forced unwinding frame type(2) op(PING)
[2010-03-31 17:55:02] N [client-protocol.c:6994:notify] backup4-sda1: disconnected
[2010-03-31 17:55:02] E [afr.c:2657:notify] b-3-4-sda1: All subvolumes are down. Going offline until atleast one of them comes back up.
[2010-03-31 17:55:02] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 1023: LOOKUP() / => -1 (Transport endpoint is not connected)
[2010-03-31 17:55:02] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 1024: LOOKUP() / => -1 (Transport endpoint is not connected)
[2010-03-31 17:55:02] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 1025: LOOKUP() / => -1 (Transport endpoint is not connected)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup1-sda1: connection to 209.163.224.123:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup1-sda1: connection to 209.163.224.123:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup2-sda1: connection to 209.163.224.124:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup2-sda1: connection to 209.163.224.124:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup3-sda1: connection to 209.163.224.125:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup3-sda1: connection to 209.163.224.125:6996 failed (No route to host)
[2010-03-31 17:55:05] E [socket.c:762:socket_connect_finish] backup4-sda1: connection to 209.163.224.126:6996 failed (No route to host)
[2010-03-31 17:55:08] W [fuse-bridge.c:2352:fuse_statfs_cbk] glusterfs-fuse: 1026: ERR => -1 (Transport endpoint is not connected)
[2010-03-31 17:55:08] E [socket.c:762:socket_connect_finish] backup4-sda1: connection to 209.163.224.126:6996 failed (No route to host)
Disregard the log file; the core was not produced during the time frame that the log file covers.
Backtrace:

#0  0xb74f07e0 in dht_first_up_subvol (this=0x9f7f288) at dht-helper.c:200
#1  0xb74f0855 in dht_subvol_get_hashed (this=0x9f7f288, loc=0xb6c9e314) at dht-helper.c:219
#2  0xb7503f6b in dht_lookup (frame=0xb6301318, this=0x9f7f288, loc=0xb6c9e314, xattr_req=0x9f80f28) at dht-common.c:814
#3  0xb7f2b585 in default_lookup (frame=0xb63012e8, this=0x9f7f898, loc=0xb6c9e314, xattr_req=0x9f80f28) at defaults.c:63
#4  0xb74dc345 in ioc_lookup (frame=0xb63012a0, this=0x9f7fe58, loc=0xb6c9e314, xattr_req=0x9f80f28) at io-cache.c:226
#5  0xb74d75f7 in qr_lookup (frame=0xb6301220, this=0x9f80418, loc=0xb6c9e314, xattr_req=0x9f80f28) at quick-read.c:315
#6  0xb74ce401 in sp_lookup (frame=0xb6300f58, this=0x9f809f8, loc=0xb6c9e314, xattr_req=0x9f80f28) at stat-prefetch.c:1087
#7  0xb74add78 in fuse_first_lookup (this=0x9f771d0) at fuse-bridge.c:3027
#8  0xb74ae19b in fuse_thread_proc (data=0x9f771d0) at fuse-bridge.c:3123
#9  0xb7ef9595 in start_thread () from /lib/i686/cmov/libpthread.so.0
#10 0xb7e7a18e in clone () from /lib/i686/cmov/libc.so.6

#0  0xb74f07e0 in dht_first_up_subvol (this=0x9f7f288) at dht-helper.c:200
200             if (conf->subvolume_status[i]) {
(gdb) p *conf
$1 = {subvolume_lock = 0, subvolume_cnt = 8, subvolumes = 0xeeeeeeee,
  subvolume_status = 0xeeeeeeee <Address 0xeeeeeeee out of bounds>,
  file_layouts = 0xeeeeeeee, dir_layouts = 0x0, default_dir_layout = 0x0,
  search_unhashed = 2, gen = 10, du_stats = 0x9fb8ac8, min_free_disk = 10,
  disk_unit = 112 'p', refresh_interval = 0, unhashed_sticky_bit = _gf_false,
  last_stat_fetch = {tv_sec = 0, tv_usec = 0}, layout_lock = 1, private = 0x1c0}
This is a race at the time of process termination: 'fini()' was called, and during the same window a lookup() call was still in flight. It will be fixed in 3.1, where we need a proper 'fini()' for dynamic volumes.
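To make the failure mode concrete, here is a minimal C sketch (hypothetical names, not GlusterFS code) of the pattern the backtrace points to: the fuse thread iterates conf->subvolume_status while the terminating process runs fini() and frees the same structure underneath it. Under a fill-on-free allocator, the reader would then observe poison values much like the 0xeeeeeeee contents in the gdb output above.

#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int   subvolume_cnt;
    char *subvolume_status;   /* freed by fini() below */
} dht_conf_t;

static dht_conf_t *conf;      /* stands in for this->private */

/* Runs in the fuse thread (frames #7/#8 in the backtrace). */
static void *lookup_thread(void *arg)
{
    (void)arg;
    /* Mirrors the loop at dht-helper.c:200; reads freed memory if
     * fini() has already run. */
    for (int i = 0; i < conf->subvolume_cnt; i++)
        if (conf->subvolume_status[i])
            break;
    return NULL;
}

/* Runs at process termination, concurrently with the lookup. */
static void fini(void)
{
    free(conf->subvolume_status);
    free(conf);               /* 'conf' is now dangling in lookup_thread() */
}

int main(void)
{
    conf = calloc(1, sizeof(*conf));
    conf->subvolume_cnt = 8;
    conf->subvolume_status = calloc(8, 1);

    pthread_t t;
    pthread_create(&t, NULL, lookup_thread, NULL);
    fini();                   /* race: nothing orders this after the lookup */
    pthread_join(t, NULL);
    return 0;
}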
In 3.1, fini() will be called only on detached old graphs, so the race condition between lookup() and fini() is eliminated.
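As a rough sketch of why that removes the race (illustrative only; graph_t, graph_get/graph_put, and the condition-variable drain are assumptions, not the actual 3.1 code): each request pins the graph it runs on, a graph switch first detaches the old graph so no new request can reach it, and fini() runs only after the last pinned request completes.

#include <pthread.h>

typedef struct graph {
    int inflight;                      /* requests currently using this graph */
    /* ... translator tree, private data ... */
} graph_t;

static pthread_mutex_t graph_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  graph_cond = PTHREAD_COND_INITIALIZER;
static graph_t *active;               /* requests are dispatched only here */

/* Called at the start of every lookup(): pin the current graph. */
static graph_t *graph_get(void)
{
    pthread_mutex_lock(&graph_lock);
    graph_t *g = active;
    g->inflight++;
    pthread_mutex_unlock(&graph_lock);
    return g;
}

/* Called when the lookup completes: unpin and wake any waiter. */
static void graph_put(graph_t *g)
{
    pthread_mutex_lock(&graph_lock);
    if (--g->inflight == 0)
        pthread_cond_broadcast(&graph_cond);
    pthread_mutex_unlock(&graph_lock);
}

/* Graph switch: detach the old graph, drain it, then tear it down. */
static void swap_and_fini(graph_t *old, graph_t *fresh)
{
    pthread_mutex_lock(&graph_lock);
    active = fresh;                    /* detach: new requests go to 'fresh' */
    while (old->inflight > 0)          /* drain requests pinned to 'old' */
        pthread_cond_wait(&graph_cond, &graph_lock);
    pthread_mutex_unlock(&graph_lock);
    /* fini(old) is safe here: 'old' is unreachable and fully drained,
     * so no lookup() can ever touch its freed private data. */
}

The essential ordering is detach-then-drain-then-fini: once the old graph is unreachable and its in-flight count has reached zero, freeing its private data cannot race with a concurrent lookup().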