After the 3.0.3 upgrade on Solaris 10, load is spiking on the gateway node. I'm looking at the system now and can't figure out what's causing it, and I can't kill off the large number of active smbd processes. Load averages are in the 50s and climbing. I'm not seeing anything too out of the ordinary in the store and afr logs (the last few lines of today's logs follow).
store.log:
[2010-03-22 00:16:47] E [client-protocol.c:313:call_bail] client1: bailing out frame FSTAT(25) frame sent = 2010-03-21 23:46:46. frame-timeout = 1800 [2010-03-22 00:16:47] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 32738115: FSTAT() /clients/rics/Retrospect Backup Sets/Documents and Settings/All Users/Application Data/Retrospect/RtrSec.dir/DSM (Dell R200 - RICS04147).dat => -1 (Transport endpoint is not connected) [2010-03-22 00:16:47] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=1988197 type=4 op=12 [2010-03-22 00:46:49] E [client-protocol.c:313:call_bail] client1: bailing out frame FLUSH(14) frame sent = 2010-03-22 00:16:47. 
frame-timeout = 1800 [2010-03-22 00:46:49] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 35853482: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 00:46:49] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2002133 type=4 op=14 [2010-03-22 00:46:49] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2000180 type=4 op=25 [2010-03-22 08:11:17] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00008678/00016095/00053729/02644290) inode (ptr=0x10754020, ino=292648296, gen=1395287) found conflict (ptr=0x10755c10, ino=292648296, gen=1395287) [2010-03-22 09:09:22] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00008940/00017034/00055322/pt3011833.tmp) inode (ptr=0x2aaab5576b20, ino=292913364, gen=1543994) found conflict (ptr=0x11d2f840, ino=292913364, gen=1543994) [2010-03-22 09:40:14] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00008984/00017275/00055872/02737241) inode (ptr=0x12d3fff0, ino=293028444, gen=1580767) found conflict (ptr=0x12d45760, ino=293028444, gen=1580767) [2010-03-22 09:54:41] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/meetings_ppt/pt26152855.tmp) inode (ptr=0x2aaac1061660, ino=213510982, gen=22259) found conflict (ptr=0x12ae8ff0, ino=213510982, gen=22259) [2010-03-22 10:22:21] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/software/pt31170326.tmp) inode (ptr=0x2aaab0b8c600, ino=117805818, gen=22159) found conflict (ptr=0x2aaac0efd2b0, ino=117805818, gen=22159) [2010-03-22 10:23:35] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 09:53:31. 
frame-timeout = 1800 [2010-03-22 10:23:35] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2169013 type=4 op=12 [2010-03-22 10:28:42] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00009255/00018372/00057160/02788315) inode (ptr=0x2aaab0e09eb0, ino=293903753, gen=1507760) found conflict (ptr=0x13a18870, ino=293903753, gen=1507760) [2010-03-22 10:29:05] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 09:58:57. frame-timeout = 1800 [2010-03-22 10:29:05] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 09:58:57. frame-timeout = 1800 [2010-03-22 10:29:05] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 09:58:57. frame-timeout = 1800 [2010-03-22 10:29:05] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2211024 type=4 op=12 [2010-03-22 10:40:26] E [client-protocol.c:313:call_bail] client1: bailing out frame WRITE(12) frame sent = 2010-03-22 10:10:22. frame-timeout = 1800 [2010-03-22 10:40:26] E [client-protocol.c:313:call_bail] client1: bailing out frame WRITE(12) frame sent = 2010-03-22 10:10:22. frame-timeout = 1800 [2010-03-22 10:40:26] E [client-protocol.c:313:call_bail] client1: bailing out frame WRITE(12) frame sent = 2010-03-22 10:10:22. frame-timeout = 1800 [2010-03-22 10:40:26] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2382976 type=4 op=12 [2010-03-22 10:43:06] E [client-protocol.c:313:call_bail] client1: bailing out frame WRITE(12) frame sent = 2010-03-22 10:12:57. frame-timeout = 1800 [2010-03-22 10:43:06] E [client-protocol.c:313:call_bail] client1: bailing out frame WRITE(12) frame sent = 2010-03-22 10:12:57. 
frame-timeout = 1800 [2010-03-22 10:43:06] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2389320 type=4 op=12 [2010-03-22 10:53:37] E [client-protocol.c:313:call_bail] client4: bailing out frame FSTAT(25) frame sent = 2010-03-22 10:23:35. frame-timeout = 1800 [2010-03-22 10:53:37] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 43988459: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6020/20050822/MG802/raw/00000014 => -1 (Transport endpoint is not connected) [2010-03-22 10:53:37] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2349956 type=4 op=25 [2010-03-22 10:53:37] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 45738825: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 10:59:07] E [client-protocol.c:313:call_bail] client4: bailing out frame FSTAT(25) frame sent = 2010-03-22 10:29:05. frame-timeout = 1800 [2010-03-22 10:59:07] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 44168088: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6021/20050822/MG808/gainmap.808 => -1 (Transport endpoint is not connected) [2010-03-22 10:59:07] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2211025 type=4 op=12 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. 
frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:36:25. frame-timeout = 1800 [2010-03-22 11:06:28] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2422749 type=4 op=12 [2010-03-22 11:10:28] E [client-protocol.c:313:call_bail] client1: bailing out frame FSTAT(25) frame sent = 2010-03-22 10:40:26. frame-timeout = 1800 [2010-03-22 11:10:28] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 44453483: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6028/20050826/MG882/6028_882L_thick68_post_PE.smv => -1 (Transport endpoint is not connected) [2010-03-22 11:10:28] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2382977 type=4 op=12 [2010-03-22 11:13:08] E [client-protocol.c:313:call_bail] client1: bailing out frame FSTAT(25) frame sent = 2010-03-22 10:43:06. 
frame-timeout = 1800 [2010-03-22 11:13:08] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 44501364: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6028/20050826/MG884/6028_884R_thick65_post_PE.smv => -1 (Transport endpoint is not connected) [2010-03-22 11:13:08] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2389321 type=4 op=12 [2010-03-22 11:22:59] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:52:55. frame-timeout = 1800 [2010-03-22 11:22:59] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 10:52:55. frame-timeout = 1800 [2010-03-22 11:22:59] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2531612 type=4 op=12 [2010-03-22 11:25:01] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/admin/resume/pt14231053.tmp) inode (ptr=0x2aaabcb471d0, ino=387320621, gen=54052) found conflict (ptr=0x151481e0, ino=387320621, gen=54052) [2010-03-22 11:28:18] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/mip/meetings/documents/tmp/pt31161636.tmp) inode (ptr=0x2aaab25dc4a0, ino=117767577, gen=20226) found conflict (ptr=0x2aaab670f120, ino=117767577, gen=20226) [2010-03-22 11:28:18] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/mip/meetings/documents/tmp/pt26155428.tmp) inode (ptr=0x2aaab25dbb20, ino=213511374, gen=24332) found conflict (ptr=0x12e28eb0, ino=213511374, gen=24332) [2010-03-22 11:28:36] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/data/mip/protocols/pt14232326.tmp) inode (ptr=0x2aaab25dfd70, ino=387321020, gen=20604) found conflict (ptr=0x2aaac0c85a80, ino=387321020, gen=20604) [2010-03-22 11:29:09] E [client-protocol.c:313:call_bail] client4: bailing out frame 
FLUSH(14) frame sent = 2010-03-22 10:59:07. frame-timeout = 1800 [2010-03-22 11:29:09] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 45890463: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 11:29:09] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2211026 type=4 op=12 [2010-03-22 11:32:45] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00009764/00018918/00057738/02921306) inode (ptr=0x2aaabcd5b0e0, ino=323712147, gen=1565639) found conflict (ptr=0x2aaabcd59780, ino=323712147, gen=1565639) [2010-03-22 11:36:29] E [client-protocol.c:313:call_bail] client4: bailing out frame FSTAT(25) frame sent = 2010-03-22 11:06:28. frame-timeout = 1800 [2010-03-22 11:36:29] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 45157650: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6046/20050907/MG1014/raw/00000005 => -1 (Transport endpoint is not connected) [2010-03-22 11:36:29] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2422750 type=4 op=12 [2010-03-22 11:40:30] E [client-protocol.c:313:call_bail] client1: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:10:28. frame-timeout = 1800 [2010-03-22 11:40:30] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 46215664: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 11:40:30] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2382978 type=4 op=12 [2010-03-22 11:43:10] E [client-protocol.c:313:call_bail] client1: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:13:08. 
frame-timeout = 1800 [2010-03-22 11:43:10] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 46304370: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 11:43:10] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2806100 type=4 op=14 [2010-03-22 11:43:10] W [client-protocol.c:6518:protocol_client_interpret] client1: no frame for callid=2593089 type=4 op=25 [2010-03-22 11:53:00] E [client-protocol.c:313:call_bail] client4: bailing out frame FSTAT(25) frame sent = 2010-03-22 11:22:59. frame-timeout = 1800 [2010-03-22 11:53:00] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 45728262: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6055/20050913/MG1076/raw/00000012 => -1 (Transport endpoint is not connected) [2010-03-22 11:53:00] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2531613 type=4 op=12 [2010-03-22 11:59:11] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:29:09. frame-timeout = 1800 [2010-03-22 11:59:11] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 46867669: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 12:06:31] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:36:29. frame-timeout = 1800 [2010-03-22 12:06:31] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 47136370: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 12:06:31] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2422751 type=4 op=12 [2010-03-22 12:10:31] E [client-protocol.c:313:call_bail] client1: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:40:30. 
frame-timeout = 1800 [2010-03-22 12:10:31] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 47270035: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 12:23:02] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 11:53:00. frame-timeout = 1800 [2010-03-22 12:23:02] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 47670050: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 12:23:02] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=3048245 type=4 op=14 [2010-03-22 12:23:02] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=2768095 type=4 op=25 [2010-03-22 12:32:59] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00010165/00019453/00058277/03020555) inode (ptr=0x167aaf60, ino=113143383, gen=1609013) found conflict (ptr=0x2aaac2498d40, ino=113143383, gen=1609013) [2010-03-22 12:36:33] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 12:06:31. frame-timeout = 1800 [2010-03-22 12:36:33] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 48057581: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 12:51:14] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 12:21:05. frame-timeout = 1800 [2010-03-22 12:51:14] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 12:21:05. frame-timeout = 1800 [2010-03-22 12:51:14] E [client-protocol.c:313:call_bail] client4: bailing out frame WRITE(12) frame sent = 2010-03-22 12:21:05. 
frame-timeout = 1800 [2010-03-22 12:51:14] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=3235922 type=4 op=12 [2010-03-22 13:11:12] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00010398/00019722/00058546/pt4060927.tmp) inode (ptr=0x2aaabf924020, ino=358327488, gen=1744190) found conflict (ptr=0x2aaac3b86870, ino=358327488, gen=1744190) [2010-03-22 13:19:03] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00010434/00019775/00058599/03185854) inode (ptr=0x2aaac401eea0, ino=302688577, gen=1798592) found conflict (ptr=0x181f23f0, ino=302688577, gen=1798592) [2010-03-22 13:21:16] E [client-protocol.c:313:call_bail] client4: bailing out frame FSTAT(25) frame sent = 2010-03-22 12:51:14. frame-timeout = 1800 [2010-03-22 13:21:16] W [fuse-bridge.c:722:fuse_attr_cbk] glusterfs-fuse: 48509677: FSTAT() /clients/dbtarch/Raid3/TOMO/TOMOAllCase5471-now/5471-/ClinicalCases/Screening/6131/20051101/MG87/RAW/00000007 => -1 (Transport endpoint is not connected) [2010-03-22 13:21:16] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=3235923 type=4 op=12 [2010-03-22 13:51:18] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 13:21:16. 
frame-timeout = 1800 [2010-03-22 13:51:18] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 50768180: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 13:51:18] W [client-protocol.c:6518:protocol_client_interpret] client4: no frame for callid=3235924 type=4 op=12 [2010-03-22 14:21:04] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00010945/00020409/00059233/03409762) inode (ptr=0x2aaac650e170, ino=472637305, gen=1728642) found conflict (ptr=0x2aaac650ad40, ino=472637305, gen=1728642) [2010-03-22 14:21:20] E [client-protocol.c:313:call_bail] client4: bailing out frame FLUSH(14) frame sent = 2010-03-22 13:51:18. frame-timeout = 1800 [2010-03-22 14:21:20] W [fuse-bridge.c:1174:fuse_err_cbk] glusterfs-fuse: 51845125: FLUSH() ERR => -1 (Transport endpoint is not connected) [2010-03-22 14:45:12] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011114/00020621/00059445/03504702) inode (ptr=0x2aaacb314220, ino=318467312, gen=1946787) found conflict (ptr=0x2aaacb311000, ino=318467312, gen=1946787) [2010-03-22 14:54:28] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011209/00020717/00059541/03536961) inode (ptr=0x2aaac79fa3a0, ino=124916508, gen=1826864) found conflict (ptr=0x1bb0a240, ino=124916508, gen=1826864) [2010-03-22 14:56:26] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011233/00020743/00059567/03543956) inode (ptr=0x1bc27220, ino=375734059, gen=1920868) found conflict (ptr=0x1bc24da0, ino=375734059, gen=1920868) [2010-03-22 15:00:47] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011307/00020823/00059647/03557874) inode (ptr=0x2aaacbc03530, ino=318764203, gen=1971935) found conflict (ptr=0x1be56dc0, ino=318764203, gen=1971935) 
[2010-03-22 15:04:12] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011330/00020846/00059670/pt4160745.tmp) inode (ptr=0x2aaac7f3fe40, ino=319145682, gen=1977151) found conflict (ptr=0x1c044040, ino=319145682, gen=1977151) [2010-03-22 15:13:29] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011468/00020989/00059813/pt4164811.tmp) inode (ptr=0x1c54a1c0, ino=319262211, gen=1991838) found conflict (ptr=0x2aaad0320ec0, ino=319262211, gen=1991838) [2010-03-22 15:13:37] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011475/00020996/00059820/03600760) inode (ptr=0x1c55f5c0, ino=319262939, gen=1992006) found conflict (ptr=0x1c55ea40, ino=319262939, gen=1992006) [2010-03-22 15:15:05] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011494/00021015/00059839/03605937) inode (ptr=0x1c637860, ino=476342748, gen=1809873) found conflict (ptr=0x1c6373d0, ino=476342748, gen=1809873) [2010-03-22 15:41:31] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011760/00021364/00061224/03679243) inode (ptr=0x2aaad10bdbc0, ino=460370836, gen=2028853) found conflict (ptr=0x2aaad10bd580, ino=460370836, gen=2028853) [2010-03-22 15:44:06] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011791/00021395/00061295/pt4183430.tmp) inode (ptr=0x1d365860, ino=376502848, gen=1987327) found conflict (ptr=0x2aaad120d340, ino=376502848, gen=1987327) [2010-03-22 15:44:34] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011791/00021395/00064781/pt4183655.tmp) inode (ptr=0x1d3c0f40, ino=376508861, gen=1988375) found conflict (ptr=0x2aaad1276370, ino=376508861, 
gen=1988375) [2010-03-22 15:47:15] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011853/00021457/00061472/03693248) inode (ptr=0x2aaacd4a7030, ino=477389360, gen=1847729) found conflict (ptr=0x2aaacd4a6ef0, ino=477389360, gen=1847729) [2010-03-22 15:51:31] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00011919/00021531/00061668/pt4185729.tmp) inode (ptr=0x2aaacd63fb30, ino=319539531, gen=1900719) found conflict (ptr=0x2aaacd6464e0, ino=319539531, gen=1900719) [2010-03-22 16:21:50] W [fuse-bridge.c:491:fuse_entry_cbk] glusterfs-fuse: LOOKUP(/clients/csb-archive/NAS-backup/mnt/scippy_images/00012247/00021997/00063410/pt4230200.tmp) inode (ptr=0x2aaad218e970, ino=468905214, gen=2030086) found conflict (ptr=0x1e34a570, ino=468905214, gen=2030086) [root@bricklayer01 glusterfs]#
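A rough way to see which clients and operations are timing out in a log like the one above is to tally the `call_bail` entries. The following is only a sketch: the regular expression is written against the entry format visible in this paste, and `summarize_bailouts` and `sample` are hypothetical names, not part of glusterfs.

```python
import re
from collections import Counter

# Written against entries such as:
# [2010-03-22 00:16:47] E [client-protocol.c:313:call_bail] client1: bailing out frame FSTAT(25) ...
# \s+ is used between tokens because the pasted log wraps in the middle of entries.
BAIL_RE = re.compile(
    r"E \[client-protocol\.c:\d+:call_bail\]\s+(?P<client>\S+): "
    r"bailing out frame\s+(?P<op>[A-Z]+)\(\d+\)"
)


def summarize_bailouts(log_text):
    """Count call_bail entries per (client, operation) pair."""
    counts = Counter()
    for match in BAIL_RE.finditer(log_text):
        counts[(match.group("client"), match.group("op"))] += 1
    return counts


# Two entries copied from the log above, used as sample input.
sample = (
    "[2010-03-22 00:16:47] E [client-protocol.c:313:call_bail] client1: "
    "bailing out frame FSTAT(25) frame sent = 2010-03-21 23:46:46. frame-timeout = 1800 "
    "[2010-03-22 10:23:35] E [client-protocol.c:313:call_bail] client4: "
    "bailing out frame WRITE(12) frame sent = 2010-03-22 09:53:31. frame-timeout = 1800 "
)

for (client, op), count in sorted(summarize_bailouts(sample).items()):
    print(client, op, count)
```

Run against the full store.log, this would show at a glance whether the bail-outs are confined to a few clients (only client1 and client4 appear in the excerpt above) and which operations are affected.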
Created attachment 166 [details] This program will illustrate the bug if executed on RH6.x See comment from eatdirt.
Hi, same error here. By the way, I reported this bug for version 3.0.0, but you closed it as fixed; as I said at the time, it was not. This is a major problem. Files disappear and reappear randomly, and this breaks many scripts. Here is the kind of error you get:
mv: cannot move `xch50m245' to `xch50m245_3': No such file or directory
Then you do "ls" and the file is there again. Please fix this bug; it makes glusterfs unusable. I am now considering giving up on gluster. Cheers, Chris.
Chris,
It would help us debug this problem if you could share your scripts. We need to understand the sequence of system calls being triggered. Also, can you share your server log files?
Thanks,
Avati
> Hi,
> same error here. By the way, I reported this bug for version 3.0.0, but you
> closed it as fixed; as I said at the time, it was not.
>
> This is a major problem. Files disappear and reappear randomly, and this breaks
> many scripts. Here is the kind of error you get:
>
> mv: cannot move `xch50m245' to `xch50m245_3': No such file or directory
>
> Then you do "ls" and the file is there again.
>
> Please fix this bug; it makes glusterfs unusable. I am now considering giving
> up on gluster.
>
> Cheers,
> Chris.
Created attachment 167 [details] Sorry about that. Here they are. The script strgs_stop.bash simply looks for some files and renames them. This is where I get the random "file not found" errors. Right after the script fails, if I look for the files they are there, and I can run the mv command by hand and it works fine. I even tried adding a sleep 0.1 to the script, but the "not found" errors still show up. In the next attachment I put the server log file of one node. When this error occurs, the fuse-lookup error shows up in the log, precisely on the nodes where these files are located. Finally, it may be worth mentioning that this script is run several times. I use it to rename files from "outputfilename" to "outputfilename_1", "outputfilename_2", etc. after each run of some codes. So if the filesystem does not record that outputfilename has disappeared, I imagine that something nasty happens the next time my codes create outputfilename, especially if I then move outputfilename again to outputfilename_nextindex.
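For anyone trying to reproduce this without the attachment, the rename pattern described above can be approximated as follows. This is a minimal sketch, not the attached strgs_stop.bash: the function name `rename_cycle` and its parameters are hypothetical, and the directory passed in is assumed to be on the glusterfs mount when testing for real (the demo at the bottom uses a local temporary directory).

```python
import os
import tempfile


def rename_cycle(directory, base="outputfilename", rounds=100):
    """Create `base`, rename it to base_1, base_2, ... and record failed renames.

    Returns a list of (round, source_still_visible) tuples, one for each
    rename that failed with "No such file or directory".
    """
    failures = []
    src = os.path.join(directory, base)
    for i in range(1, rounds + 1):
        # Stand-in for one run of the codes that write the output file.
        with open(src, "w") as handle:
            handle.write("run %d\n" % i)
        dst = "%s_%d" % (src, i)
        try:
            os.rename(src, dst)
        except FileNotFoundError:
            # The reported symptom: the rename fails, yet the file is listed afterwards.
            failures.append((i, os.path.exists(src)))
    return failures


# Demo on a local scratch directory; pass a directory on the mount instead to test it.
with tempfile.TemporaryDirectory() as scratch:
    print(rename_cycle(scratch, rounds=5))
```

On a healthy filesystem the returned list should be empty. An entry whose second field is True would match the behaviour reported here, where mv fails but the file is visible immediately afterwards.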
Created attachment 168 [details] OK, here is the source then. This is the server log file for the node "mars", client-05 of the nufa mode.
We suspect write-behind is missing a frame in the flush cbk. We will have a fix for this soon.
Du, did we submit a patch for this? Let's make sure it goes into mainline.
Patch 3453 (http://patches.gluster.com/patch/3453/) contains some fixes to wb_flush. They are equivalent to the ones in the patch Avati sent for release-3.0 (http://patches.gluster.com/patch/3522/). Patch 3522 deals with duplicate flushes sent to the server. Since the whole of wb_flush has been rewritten, this bug might already be fixed. We can only be sure of that if we reproduce the bug and rerun the tests with patch 3453.