Created attachment 913802 [details]
e2image of /home/metavera tdcslb1

Description of problem:

We have two Linux test VMs. Each has its own /home/metavera file system, and one directory in that file system has gluster replication on top of it (df output from each node):

# df -h /home/metavera/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootvol  2.5G  2.3G  109M  96% /

# df -h /home/metavera/
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-rootvol  2.5G  2.0G  322M  87% /

# gluster volume info

Volume Name: datastore
Type: Replicate
Volume ID: c73bd310-8be7-4021-ade5-a5cfb1e91fac
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: tdcslb1:/home/metavera/enterprisecarshare.com/test
Brick2: tdcslb2:/home/metavera/enterprisecarshare.com/test

On each of these systems the gluster volume "datastore" is mounted as /mnt/glusterfs. On tdcslb1 the following iozone job was running:

[root@tdcslb1 current]# ./iozone -Ra -g 2G -i0 -i 1 -i 2 -f /mnt/glusterfs/iozone_test
        Iozone: Performance Test of File I/O
                Version $Revision: 3.424 $
                Compiled for 64 bit mode.
                Build: linux

        Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                      Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                      Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                      Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
                      Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
                      Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
                      Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
                      Vangel Bojaxhi, Ben England, Vikentsi Lapa.

        Run began: Mon Jun 30 13:44:16 2014

        Excel chart generation enabled
        Auto Mode
        Using maximum file size of 2097152 kilobytes.
        Command line used: ./iozone -Ra -g 2G -i0 -i 1 -i 2 -f /mnt/glusterfs/iozone_test
        Output is in kBytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 kBytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                        random    random     bkwd    record    stride
      kB  reclen    write  rewrite     read   reread      read     write     read   rewrite      read   fwrite frewrite    fread  freread
      64       4    35512    36507    92634  2006158  1105556    16252
      64       8    26325    23364    78817  2923952  2052169    24586
...etc...
  524288    2048   108769   115576   224147  1699914  1705063   121318
  524288    4096   110032   111976   213880  1440644  1783924   124043
  524288    8192   <------------------------------------------------- reboot
                   103925   120128   231561  1352328  1454694   122545
  524288   16384   117790    16484   193729  1403840  1478762    74662
 1048576      64    99938    73909   135913    67536    34160    61833
 1048576     128   104751   106741
...etc...

At the start of the test, gluster was replicating data as expected to tdcslb2. At the point marked "<--- reboot" above, tdcslb2 was rebooted. When it came back up, glusterfs was started and a self heal was started. It was at this point that we noticed ext4 errors on both nodes, including "JBD: Spotted dirty metadata buffer":

Jun 30 13:52:40 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_add_entry: bad entry in directory #18890585: directory entry across blocks - block=18982400, offset=0(0), inode=4109694196, rec_len=62708, name_len=244
Jun 30 13:53:15 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 449: 2015 blocks in bitmap, 32736 in gd
Jun 30 13:53:15 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 450: 0 blocks in bitmap, 32768 in gd
Jun 30 13:53:32 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 641: 31341 blocks in bitmap, 31346 in gd
Jun 30 13:53:32 tdcslb2 kernel: JBD: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Jun 30 13:54:45 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 579: 24482 blocks in bitmap, 24481 in gd
Jun 30 13:54:45 tdcslb2 kernel: JBD: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

Shortly after this, iozone failed with a data mismatch on tdcslb1: it expected one data pattern but received another. Oddly, when the filesystem was cleaned, no errors were reported on the iozone file "iozone_test"; most of the errors appeared to be on directories and files that were already there and were not being accessed/used by this test.

Version-Release number of selected component (if applicable):

glusterfs-server-3.5.0-2.el6.x86_64
glusterfs-fuse-3.5.0-2.el6.x86_64
RHEL 6.5, kernel 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Dec 13 06:58:20 EST 2013 x86_64

How reproducible:

This happened on our first attempt to bring one of the nodes that had replicated data down hard and bring it back up. This suggests it is reproducible, but we have not yet tried again.

Steps to Reproduce:
1. Set up two nodes, each with its own file system. Replicate one directory of this file system between the nodes.
2. Start the iozone test on one node.
3. After the test has run for a while (it was at about 1GB written when we tried), reboot the other node.
4. After the node that was brought down hard is back up and gluster is running on it again, start a self heal.

Actual results:

iozone expected one data pattern but received another.

File system errors:

Jun 30 13:53:15 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 450: 0 blocks in bitmap, 32768 in gd
Jun 30 13:53:32 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 641: 31341 blocks in bitmap, 31346 in gd
Jun 30 13:53:32 tdcslb2 kernel: JBD: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Jun 30 13:54:45 tdcslb2 kernel: EXT4-fs error (device dm-3): ext4_mb_generate_buddy: EXT4-fs: group 579: 24482 blocks in bitmap, 24481 in gd
Jun 30 13:54:45 tdcslb2 kernel: JBD: Spotted dirty metadata buffer (dev = dm-3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
Expected results:

The iozone test finishes, and the data is replicated successfully to the other node.

Additional info:
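For reference, the environment and reproduction can be sketched as the shell sequence below. This is a reconstruction from the volume info and iozone output above, not the exact commands we ran: the sysrq trigger is one way to approximate a hard reboot, and "force" may be needed on the volume create since the bricks sit on the root filesystem.

```shell
# Create and start the 1 x 2 replica volume (run from tdcslb1 after peering)
gluster peer probe tdcslb2
gluster volume create datastore replica 2 \
    tdcslb1:/home/metavera/enterprisecarshare.com/test \
    tdcslb2:/home/metavera/enterprisecarshare.com/test
gluster volume start datastore

# Mount the volume via the FUSE client on each node
mkdir -p /mnt/glusterfs
mount -t glusterfs localhost:/datastore /mnt/glusterfs

# Step 2: run the iozone workload on tdcslb1 against the replicated mount
./iozone -Ra -g 2G -i0 -i 1 -i 2 -f /mnt/glusterfs/iozone_test &

# Step 3: after ~1GB has been written, hard-reboot the other node
# (immediate reboot, no clean shutdown)
ssh tdcslb2 'echo b > /proc/sysrq-trigger'

# Step 4: once tdcslb2 is back up and glusterd is running, trigger the self heal
gluster volume heal datastore full
```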
Created attachment 913803 [details]
e2image tdcslb2

Created attachment 913804 [details]
gluster log files tdcslb1

Created attachment 913805 [details]
gluster log files tdcslb2

Created attachment 913806 [details]
sosreport tdcslb1

Created attachment 913807 [details]
sosreport tdcslb2

Created attachment 913808 [details]
clean tdcslb1

Created attachment 913809 [details]
clean tdcslb2
I have forced crash dumps available; if they are useful, I can figure out a way to link them to the case (they are too big to attach).
This looks like an upstream bug and should be moved to the GlusterFS product.
Changing product/version to upstream as the issue is reported on glusterfs 3.5.0. Please correct if I am mistaken.
John, is there any way you could test with 3.5.1 instead of 3.5.0? Just asking, as we fixed a bunch of bugs in 3.5.1. :)
Sure. I guess that must have been released after my initial install. I see the software; I will go get it installed today/tomorrow and re-run.
Okay, I am now running:

[root@tdcslb2 reserve]# rpm -qa | grep -i gluster
glusterfs-server-3.5.1-1.el6.x86_64
glusterfs-libs-3.5.1-1.el6.x86_64
glusterfs-fuse-3.5.1-1.el6.x86_64
glusterfs-api-3.5.1-1.el6.x86_64
glusterfs-3.5.1-1.el6.x86_64
glusterfs-cli-3.5.1-1.el6.x86_64

I ran through the test and was able to bring down each node hard; self heal ran automatically (so when I ran it by hand it had already run) and there were no issues. I will repeat several more times today and tomorrow and close this if no issues are found.
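As a note on verification: with the 3.5 CLI, a clean heal after each crash/restart cycle can be confirmed along these lines. This is a sketch of typical checks, not a transcript; the md5sum comparison of the brick copies is an extra sanity check, with the path taken from the brick layout above.

```shell
# Pending heal entries should drain to zero after the rebooted node returns
gluster volume heal datastore info
gluster volume heal datastore info split-brain

# Both bricks should show Online "Y"
gluster volume status datastore

# On each node, compare the brick's copy of the test file directly
md5sum /home/metavera/enterprisecarshare.com/test/iozone_test
```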
John,

Could you please update the bug with your findings?

Thanks,
Pranith
This bug can be closed. I have not been able to reproduce the issue since I updated gluster, and we have now been running successfully for 2 months.