The following scenario was performed to verify the logging improvement. 1. Create a dispersed volume 2. Mount to 2 or more clients 3. Perfom IO on the clients 4. Kill few bricks 5. Bring the bricks up 6. Verify the logs [root@dhcp43-237 ~]# gluster v status Status of volume: testvol_dispersed Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 10.70.43.188:/gluster/bricks/brick1/t estvol_dispersed_brick0 49152 0 Y 107025 Brick 10.70.43.5:/gluster/bricks/brick1/tes tvol_dispersed_brick1 49152 0 Y 67553 Brick 10.70.43.245:/gluster/bricks/brick1/t estvol_dispersed_brick2 49152 0 Y 67689 Brick 10.70.41.159:/gluster/bricks/brick1/t estvol_dispersed_brick3 49152 0 Y 68209 Brick 10.70.43.237:/gluster/bricks/brick1/t estvol_dispersed_brick4 49152 0 Y 67796 Brick 10.70.43.224:/gluster/bricks/brick1/t estvol_dispersed_brick5 49152 0 Y 67553 Self-heal Daemon on localhost N/A N/A Y 67813 Self-heal Daemon on 10.70.43.245 N/A N/A Y 67706 Self-heal Daemon on 10.70.43.224 N/A N/A Y 67571 Self-heal Daemon on 10.70.41.159 N/A N/A Y 68226 Self-heal Daemon on 10.70.43.5 N/A N/A Y 67570 Self-heal Daemon on dhcp43-188.lab.eng.blr. redhat.com N/A N/A Y 107042 Task Status of Volume testvol_dispersed ------------------------------------------------------------------------------ There are no active volume tasks Performed IO on clients, brought bricks down on couple of nodes and then up. ---------- OBSERVATIONS: The logs, for e.g 'remote option failed' log in the previous versions, are now providing additional information, thus improving the logging. Below is one eg. ``` [2020-11-03 05:35:32.171859] W [MSGID: 114031] [client-rpc-fops_v2.c:2635:client4_0_lookup_cbk] 0-testvol_dispersed-client-2: remote operation failed. Path: /user1/testfile1.txt (00000000-0000-0000-0000-000000000000) [Transport endpoint is not connected] [2020-11-03 05:37:34.346233] W [MSGID: 114031] [client-rpc-fops_v2.c:2116:client4_0_create_cbk] 3-testvol_dispersed-client-2: remote operation failed. Path: /file114 [Input/output error] ``` Now additional information about which operation and why it is failed can be identified from the logs. Similarly, ``` [2020-11-03 05:37:34.737417] W [MSGID: 122053] [ec-common.c:329:ec_check_status] 3-testvol_dispersed-disperse-0: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111111, remaining=000000, good=111011, bad=000100, FOP : 'CREATE' failed on '/file116' with gfid 00000000-0000-0000-0000-000000000000) [2020-11-03 05:37:34.741320] W [MSGID: 122053] [ec-common.c:329:ec_check_status] 3-testvol_dispersed-disperse-0: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111011, remaining=000000, good=111011, bad=000100, FOP : 'FLUSH' failed on gfid 09c1992b-3190-45b0-9ab8-b39e27e12e32) [2020-11-03 05:37:34.761148] W [MSGID: 122053] [ec-common.c:329:ec_check_status] 3-testvol_dispersed-disperse-0: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111011, remaining=000000, good=111011, bad=000100, FOP : 'WRITE' failed on gfid 09c1992b-3190-45b0-9ab8-b39e27e12e32) [2020-11-03 05:38:00.132336] W [MSGID: 122053] [ec-common.c:329:ec_check_status] 3-testvol_dispersed-disperse-0: Operation failed on 1 of 6 subvolumes.(up=101111, mask=101111, remaining=000000, good=101011, bad=000100, FOP : 'LOOKUP' failed on '/user38' with gfid 585b50e9-303c-47f4-a67c-75db7fa73878) 2020-11-03 05:38:02.168378] W [MSGID: 122033] [ec-common.c:1914:ec_locked] 3-testvol_dispersed-disperse-0: Failed to complete preop lock [Stale file handle] ``` The above observations are verified in the following glusterfs version : ``` glusterfs-6.0-46.el8rhgs.x86_64 ``` ------------------------------------ Marking it as Verifed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (glusterfs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5603