Description of problem: a single mkdir command in a Gluster mountpoint generates all the below log messages, just think of what untar would do! This should be assigned to the disperse translator but I can't find it in list. Version-Release number of selected component (if applicable): RHS 3.1 - glusterfs-3.7dev-0.1017.git7fb85e3.el6.x86_64 RHEL6.6 - kernel-2.6.32-504.el6.x86_64 How reproducible: every time Steps to Reproduce: 1. create distributed-disperse volume shown below 2. mount it and do "tail -f /var/log/glusterfs/your-mountpoint.log 3. mkdir /your/mountpoint/d Actual results: See log messages below, and imagine untarring a large tarball. excessive logging slows down Gluster and can fill system disk leading to damage to volume Why is default log level INFO instead of WARNING? ec translator: - Why do we get "MIsmatching xdata in answers of 'LOOKUP'? - Why do we get operation failed on some subvolumes? Why isn't this an ERROR? - Why do we get Mismatching dictionary in answers of 'GF_FOP_XATTROP'? why isn't this a WARNING? what is log level 'N' ? dht translator: - why does it have to tell us directory layout? If anyone is interested they can read it from the xattr. Expected results: NOTHING! Gluster should not be normally be logging events that are commonplace and do not represent a significant failure or state change, unless the user has requested it to do so by raising the log level. Additional info: log messages generated from 1 mkdir command are: [2015-04-27 19:45:14.177042] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-0: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.177282] W [ec-common.c:162:ec_check_status] 0-ec42-disperse-0: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=2F, bad=10) [2015-04-27 19:45:14.177324] I [ec-heal.c:495:ec_heal_init] 0-ec: Healing '/', gfid 00000000-0000-0000-0000-000000000001 [2015-04-27 19:45:14.177281] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-1: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.177373] W [ec-common.c:162:ec_check_status] 0-ec42-disperse-1: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=1F, bad=20) [2015-04-27 19:45:14.177406] I [ec-heal.c:495:ec_heal_init] 0-ec: Healing '/', gfid 00000000-0000-0000-0000-000000000001 [2015-04-27 19:45:14.178187] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-0: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.178316] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-1: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.178657] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-11: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.178745] W [ec-common.c:162:ec_check_status] 0-ec42-disperse-11: Operation failed on some subvolumes (up=3F, mask=3F, remaining=0, good=2F, bad=10) [2015-04-27 19:45:14.178787] I [ec-heal.c:495:ec_heal_init] 0-ec: Healing '/', gfid 00000000-0000-0000-0000-000000000001 [2015-04-27 19:45:14.179487] W [ec-combine.c:867:ec_combine_check] 0-ec42-disperse-11: Mismatching xdata in answers of 'LOOKUP' [2015-04-27 19:45:14.234873] I [dht-selfheal.c:1587:dht_selfheal_layout_new_directory] 0-ec42-dht: chunk size = 0xffffffff / 43428960 = 0x62 [2015-04-27 19:45:14.234906] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-9 [2015-04-27 19:45:14.234921] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-0 [2015-04-27 19:45:14.234934] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-1 [2015-04-27 19:45:14.234948] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-10 [2015-04-27 19:45:14.234963] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-11 [2015-04-27 19:45:14.234970] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-2 [2015-04-27 19:45:14.234978] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-3 [2015-04-27 19:45:14.234986] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-4 [2015-04-27 19:45:14.234993] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-5 [2015-04-27 19:45:14.235002] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-6 [2015-04-27 19:45:14.235009] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-7 [2015-04-27 19:45:14.235016] I [dht-selfheal.c:1626:dht_selfheal_layout_new_directory] 0-ec42-dht: assigning range size 0x1523d510 to ec42-disperse-8 [2015-04-27 19:45:15.718230] N [ec-generic.c:1303:ec_combine_xattrop] 0-ec42-disperse-0: Mismatching dictionary in answers of 'GF_FOP_XATTROP' [2015-04-27 19:45:15.718811] N [ec-generic.c:1303:ec_combine_xattrop] 0-ec42-disperse-1: Mismatching dictionary in answers of 'GF_FOP_XATTROP' [2015-04-27 19:45:15.718846] N [ec-generic.c:1303:ec_combine_xattrop] 0-ec42-disperse-11: Mismatching dictionary in answers of 'GF_FOP_XATTROP' [2015-04-27 19:45:14.241533] I [MSGID: 109036] [dht-common.c:6476:dht_log_new_layout_for_dir_selfheal] 0-ec42-dht: Setting layout of /ben with [Subvol_name: ec42-disperse-0, Err: -1 , Start: 354669840 , Stop: 709339679 ], [Subvol_name: ec42-disperse-1, Err: -1 , Start: 709339680 , Stop: 1064009519 ], [Subvol_name: ec42-disperse-10, Err: -1 , Start: 1064009520 , Stop: 1418679359 ], [Subvol_name: ec42-disperse-11, Err: -1 , Start: 1418679360 , Stop: 1773349199 ], [Subvol_name: ec42-disperse-2, Err: -1 , Start: 1773349200 , Stop: 2128019039 ], [Subvol_name: ec42-disperse-3, Err: -1 , Start: 2128019040 , Stop: 2482688879 ], [Subvol_name: ec42-disperse-4, Err: -1 , Start: 2482688880 , Stop: 2837358719 ], [Subvol_name: ec42-disperse-5, Err: -1 , Start: 2837358720 , Stop: 3192028559 ], [Subvol_name: ec42-disperse-6, Err: -1 , Start: 3192028560 , Stop: 3546698399 ], [Subvol_name: ec42-disperse-7, Err: -1 , Start: 3546698400 , Stop: 3901368239 ], [Subvol_name: ec42-disperse-8, Err: -1 , Start: 3901368240 , Stop: 4294967295 ], [Subvol_name: ec42-disperse-9, Err: -1 , Start: 0 , Stop: 354669839 ], The mountpoint process is: root 28619 0.0 0.3 1155156 193956 ? Ssl Apr26 0:09 /usr/sbin/glusterfs --volfile-server=gprfs022-10ge --volfile-id=/ec42 /mnt/ec42 The volume is: [root@gprfs024 ~]# gluster v i Volume Name: ec42 Type: Distributed-Disperse Volume ID: 61ffe960-77b7-4f4d-9988-2fc3ca65b561 Status: Started Number of Bricks: 12 x (4 + 2) = 72 Transport-type: tcp Bricks: Brick1: gprfs022-10ge:/bricks/b01/g Brick2: gprfs024-10ge:/bricks/b01/g Brick3: gprfs037-10ge:/bricks/b01/g Brick4: gprfs038-10ge:/bricks/b01/g Brick5: gprfs039-10ge:/bricks/b01/g Brick6: gprfs040-10ge:/bricks/b01/g Brick7: gprfs022-10ge:/bricks/b02/g Brick8: gprfs024-10ge:/bricks/b02/g Brick9: gprfs037-10ge:/bricks/b02/g Brick10: gprfs038-10ge:/bricks/b02/g Brick11: gprfs039-10ge:/bricks/b02/g Brick12: gprfs040-10ge:/bricks/b02/g Brick13: gprfs022-10ge:/bricks/b03/g Brick14: gprfs024-10ge:/bricks/b03/g Brick15: gprfs037-10ge:/bricks/b03/g Brick16: gprfs038-10ge:/bricks/b03/g Brick17: gprfs039-10ge:/bricks/b03/g Brick18: gprfs040-10ge:/bricks/b03/g Brick19: gprfs022-10ge:/bricks/b04/g Brick20: gprfs024-10ge:/bricks/b04/g Brick21: gprfs037-10ge:/bricks/b04/g Brick22: gprfs038-10ge:/bricks/b04/g Brick23: gprfs039-10ge:/bricks/b04/g Brick24: gprfs040-10ge:/bricks/b04/g Brick25: gprfs022-10ge:/bricks/b05/g Brick26: gprfs024-10ge:/bricks/b05/g Brick27: gprfs037-10ge:/bricks/b05/g Brick28: gprfs038-10ge:/bricks/b05/g Brick29: gprfs039-10ge:/bricks/b05/g Brick30: gprfs040-10ge:/bricks/b05/g Brick31: gprfs022-10ge:/bricks/b06/g Brick32: gprfs024-10ge:/bricks/b06/g Brick33: gprfs037-10ge:/bricks/b06/g Brick34: gprfs038-10ge:/bricks/b06/g Brick35: gprfs039-10ge:/bricks/b06/g Brick36: gprfs040-10ge:/bricks/b06/g Brick37: gprfs022-10ge:/bricks/b07/g Brick38: gprfs024-10ge:/bricks/b07/g Brick39: gprfs037-10ge:/bricks/b07/g Brick40: gprfs038-10ge:/bricks/b07/g Brick41: gprfs039-10ge:/bricks/b07/g Brick42: gprfs040-10ge:/bricks/b07/g Brick43: gprfs022-10ge:/bricks/b08/g Brick44: gprfs024-10ge:/bricks/b08/g Brick45: gprfs037-10ge:/bricks/b08/g Brick46: gprfs038-10ge:/bricks/b08/g Brick47: gprfs039-10ge:/bricks/b08/g Brick48: gprfs040-10ge:/bricks/b08/g Brick49: gprfs022-10ge:/bricks/b09/g Brick50: gprfs024-10ge:/bricks/b09/g Brick51: gprfs037-10ge:/bricks/b09/g Brick52: gprfs038-10ge:/bricks/b09/g Brick53: gprfs039-10ge:/bricks/b09/g Brick54: gprfs040-10ge:/bricks/b09/g Brick55: gprfs022-10ge:/bricks/b10/g Brick56: gprfs024-10ge:/bricks/b10/g Brick57: gprfs037-10ge:/bricks/b10/g Brick58: gprfs038-10ge:/bricks/b10/g Brick59: gprfs039-10ge:/bricks/b10/g Brick60: gprfs040-10ge:/bricks/b10/g Brick61: gprfs022-10ge:/bricks/b11/g Brick62: gprfs024-10ge:/bricks/b11/g Brick63: gprfs037-10ge:/bricks/b11/g Brick64: gprfs038-10ge:/bricks/b11/g Brick65: gprfs039-10ge:/bricks/b11/g Brick66: gprfs040-10ge:/bricks/b11/g Brick67: gprfs022-10ge:/bricks/b12/g Brick68: gprfs024-10ge:/bricks/b12/g Brick69: gprfs037-10ge:/bricks/b12/g Brick70: gprfs038-10ge:/bricks/b12/g Brick71: gprfs039-10ge:/bricks/b12/g Brick72: gprfs040-10ge:/bricks/b12/g Options Reconfigured: cluster.lookup-unhashed: off client.event-threads: 8 diagnostics.count-fop-hits: on diagnostics.latency-measurement: on
All ec logs in the common code path are not appearing. Raising a new bug for dht.
verified this on glusterfs-3.7.1-13 build and its now showing only 1 message for 1 mkdir. Moving this to verified. [2015-08-31 06:16:41.241515] I [MSGID: 109036] [dht-common.c:7106:dht_log_new_layout_for_dir_selfheal] 13-vol1-dht: Setting layout of /renames with [Subvol_name: vol1-disperse-0, Err: -1 , Start: 2147474804 , Stop: 4294967295 , Hash: 1 ], [Subvol_n
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1845.html