Bug 1413005 - [Remove-brick] Lookup failed errors are seen in rebalance logs during rm -rf
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: distribute
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Nithya Balachandran
QA Contact: Prasad Desala
URL:
Whiteboard: rebase
Depends On:
Blocks: 1503134
Reported: 2017-01-13 11:49 UTC by Prasad Desala
Modified: 2018-09-04 06:32 UTC

Fixed In Version: glusterfs-3.12.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-09-04 06:29:55 UTC


Links
System                  ID              Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2018:2607  None      None    None     2018-09-04 06:32:05 UTC

Description Prasad Desala 2017-01-13 11:49:22 UTC
Description of problem:
=======================
While remove-brick was in progress, I started removing the entire dataset on the mount point using rm -rf from multiple terminals. The rebalance logs filled up with many "lookup failed" error messages.

When these errors are logged, the message contains only the file name and the text "lookup failed". The log entry should carry additional information that makes the cause of the lookup failure easy to find.

[2017-01-13 09:33:07.090970] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: vmlinux.lds.S lookup failed
[2017-01-13 09:33:08.568525] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: __ashrdi3.S lookup failed
[2017-01-13 09:33:08.571814] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: __ashldi3.S lookup failed
[2017-01-13 09:33:08.586351] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: ashrdi3.c lookup failed
[2017-01-13 09:33:08.590265] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: __lshrdi3.S lookup failed
[2017-01-13 09:33:08.599489] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: checksum.c lookup failed
[2017-01-13 09:33:08.601561] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: __ucmpdi2.S lookup failed
[2017-01-13 09:33:08.612718] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: delay.c lookup failed
[2017-01-13 09:33:08.614396] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: bitops.c lookup failed
[2017-01-13 09:33:08.618801] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: internal.h lookup failed
[2017-01-13 09:33:08.620468] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: do_csum.S lookup failed
[2017-01-13 09:33:08.624202] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: lshrdi3.c lookup failed
[2017-01-13 09:33:08.626305] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: memcpy.S lookup failed
[2017-01-13 09:33:08.631211] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: memset.S lookup failed
[2017-01-13 09:33:08.636532] E [MSGID: 109023] [dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: Migrate file failed: usercopy.c lookup failed
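
To gauge how noisy these messages are, the log excerpt above can be summarized with a short script. The following is a minimal sketch, assuming every relevant line follows the exact layout quoted above (timestamp, E, MSGID 109023, source location, translator name, message). The function names and the regular expression are illustrative and are not part of GlusterFS.

```python
import re
from collections import Counter

# Matches the MSGID 109023 "lookup failed" lines in the layout quoted above.
# Assumption: the layout is stable; other rebalance messages are ignored.
LOOKUP_FAILED = re.compile(
    r"^\[(?P<ts>[^\]]+)\] E \[MSGID: 109023\] "
    r"\[(?P<src>[^\]]+)\] (?P<xlator>\S+): "
    r"Migrate file failed: (?P<name>.+?) lookup failed"
)


def parse_lookup_failures(lines):
    """Yield (timestamp, file name) for each 'lookup failed' line."""
    for line in lines:
        match = LOOKUP_FAILED.match(line.strip())
        if match:
            yield match.group("ts"), match.group("name")


def failures_per_second(lines):
    """Count failures per wall-clock second (timestamp cut at the seconds)."""
    return Counter(ts[:19] for ts, _ in parse_lookup_failures(lines))


# Three lines copied from the excerpt above.
SAMPLE = [
    "[2017-01-13 09:33:07.090970] E [MSGID: 109023] "
    "[dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: "
    "Migrate file failed: vmlinux.lds.S lookup failed",
    "[2017-01-13 09:33:08.568525] E [MSGID: 109023] "
    "[dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: "
    "Migrate file failed: __ashrdi3.S lookup failed",
    "[2017-01-13 09:33:08.571814] E [MSGID: 109023] "
    "[dht-rebalance.c:2200:gf_defrag_migrate_single_file] 0-distrep-dht: "
    "Migrate file failed: __ashldi3.S lookup failed",
]

if __name__ == "__main__":
    for ts, name in parse_lookup_failures(SAMPLE):
        print(ts, name)
    print(failures_per_second(SAMPLE))
```

Note that the parsed lines carry only a bare file name, with no path and no error reason, which is the gap this report describes.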


Version-Release number of selected component (if applicable):
3.8.4-11.el7rhgs.x86_64

How reproducible:
always

Steps to Reproduce:
==================
1) Create a distributed-replicate volume and start it.
2) FUSE mount the volume.
3) Under the mount point, create two subdirectories, say /mnt/terminal{1..2}.
4) Start a Linux kernel untar in both subdirectories, /mnt/terminal1 and /mnt/terminal2.
5) Wait a few minutes and, while the untar is in progress, add a couple of bricks to the volume.
6) Immediately remove the bricks added in step 5. // this starts the rebalance
7) Wait a few minutes and, while the untar is still in progress, run rm -rf * from each terminal directory.

Check the rebalance logs.
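
The steps above can be sketched as a dry-run command plan. This is only an outline under stated assumptions: the host names, brick paths, replica count, and kernel tarball path are hypothetical placeholders that do not come from this report, and the script prints the commands rather than running them. Steps 4 and 7 are meant to run concurrently from separate terminals.

```python
def build_repro_plan(volume, mount, initial_bricks, extra_bricks,
                     tarball, replica=2):
    """Return (step, command) pairs mirroring the reproduction steps.

    All arguments are placeholders chosen by the caller; nothing is executed.
    """
    server = initial_bricks[0].split(":")[0]
    dirs = [f"{mount}/terminal{i}" for i in (1, 2)]
    added = " ".join(extra_bricks)
    plan = [
        (1, f"gluster volume create {volume} replica {replica} "
            + " ".join(initial_bricks)),
        (1, f"gluster volume start {volume}"),
        (2, f"mount -t glusterfs {server}:/{volume} {mount}"),
        (3, "mkdir -p " + " ".join(dirs)),
    ]
    # Step 4: one untar per directory, each from its own terminal.
    plan += [(4, f"tar -xf {tarball} -C {d}") for d in dirs]
    # Steps 5 and 6: add bricks, then start removing them right away.
    plan.append((5, f"gluster volume add-brick {volume} {added}"))
    plan.append((6, f"gluster volume remove-brick {volume} {added} start"))
    # Step 7: delete everything while the untar and rebalance are running.
    plan += [(7, f"rm -rf {d}/*") for d in dirs]
    return plan


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    steps = build_repro_plan(
        volume="distrep",
        mount="/mnt",
        initial_bricks=["host1:/bricks/b1", "host2:/bricks/b2",
                        "host1:/bricks/b3", "host2:/bricks/b4"],
        extra_bricks=["host1:/bricks/b5", "host2:/bricks/b6"],
        tarball="/root/linux.tar.xz",
    )
    for step, command in steps:
        print(f"step {step}: {command}")
```

Printing the plan instead of executing it keeps the sketch safe to run anywhere; on a real test cluster the commands would be issued by hand with the timing described in the steps.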

Actual results:
===============
Lookup failed errors are seen in rebalance logs during rm -rf

Expected results:
=================
There should not be any lookup failed errors in rebalance logs. 

Additional info:
================
These lookup failures do not impact the remove-brick rebalance. On all nodes, the remove-brick rebalance completed successfully.

Comment 3 Ambarish 2017-01-13 12:01:13 UTC
I hit this on add-brick + rm on the Scale setup as well.

Comment 11 Prasad Desala 2018-04-17 12:19:13 UTC
Verified this BZ on glusterfs version 3.12.2-7.el7rhgs.x86_64.

Lookup failed errors are now logged together with the error message.

[MSGID: 109023] [dht-rebalance.c:2618:gf_defrag_migrate_single_file] 0-distrepx3-dht: Migrate file failed: /linux-4.9.27/Documentation/devicetree/bindings/phy/keystone-usb-phy.txt lookup failed [No such file or directory]
[MSGID: 109023] [dht-rebalance.c:2618:gf_defrag_migrate_single_file] 0-distrepx3-dht: Migrate file failed: /a84-40 lookup failed [No such file or directory]
[MSGID: 109023] [dht-rebalance.c:2618:gf_defrag_migrate_single_file] 0-distrepx3-dht: Migrate file failed: /a72-40 lookup failed [No such file or directory]
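
With the reason now appended in square brackets, the failures can be grouped by cause. Below is a minimal sketch, assuming the reason always appears as a trailing bracketed string as in the three lines above; the function name and the "unknown" bucket for lines without a reason are illustrative choices, not GlusterFS behavior.

```python
import re
from collections import defaultdict

# Captures the path and the optional trailing "[reason]" of a
# "Migrate file failed: ... lookup failed" message.
REASONED = re.compile(
    r"Migrate file failed: (?P<path>.+?) lookup failed"
    r"(?: \[(?P<reason>[^\]]+)\])?\s*$"
)


def group_by_reason(lines):
    """Map each failure reason to the list of paths that reported it."""
    groups = defaultdict(list)
    for line in lines:
        match = REASONED.search(line)
        if match:
            reason = match.group("reason") or "unknown"
            groups[reason].append(match.group("path"))
    return dict(groups)


# Two lines copied from the comment above.
SAMPLE = [
    "[MSGID: 109023] [dht-rebalance.c:2618:gf_defrag_migrate_single_file] "
    "0-distrepx3-dht: Migrate file failed: /a84-40 lookup failed "
    "[No such file or directory]",
    "[MSGID: 109023] [dht-rebalance.c:2618:gf_defrag_migrate_single_file] "
    "0-distrepx3-dht: Migrate file failed: /a72-40 lookup failed "
    "[No such file or directory]",
]

if __name__ == "__main__":
    for reason, paths in group_by_reason(SAMPLE).items():
        print(f"{reason}: {len(paths)} file(s)")
```

In this scenario a "No such file or directory" reason is plausibly the concurrent rm -rf deleting a file before it could be migrated, though that reading is an inference from the test steps rather than something the log states.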

Moving this BZ to Verified.

Comment 12 errata-xmlrpc 2018-09-04 06:29:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

