Bug 1296048 - Attach tier + nfs : Creates fail with invalid argument errors
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: tier
Target Release: RHGS 3.1.2
Assigned To: Bug Updates Notification Mailing List
Keywords: ZStream
Depends On:
Blocks: 1297311 1306131
Reported: 2016-01-06 04:51 EST by Bhaskarakiran
Modified: 2016-11-23 18:12 EST

See Also:
Fixed In Version: glusterfs-3.7.5-16
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1297311
Last Closed: 2016-03-01 01:06:47 EST
Type: Bug

Description Bhaskarakiran 2016-01-06 04:51:29 EST
Description of problem:

Created a 2x(4+2) EC volume and NFS-mounted it on the client with quota and uss enabled. Started IO: linux untar, mkdirs, and dd (1000s in parallel). Tried attaching the tier (2x2 dist-rep) and saw invalid argument errors continuously. The same errors were seen during detach-tier, but there IO resumed after some time. In this case, IO fails completely with the error messages.

If quota and uss are turned off, the errors below are seen for some time and then the IO resumes.

tar: linux-4.1.1/Documentation/devicetree/bindings/input/touchscreen/zforce_ts.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/tps65218-pwrbutton.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/twl4030-keypad.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/abilis,tb10x-ictl.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/allwinner,sun4i-ic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/allwinner,sun67i-sc-nmi.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm3380-l2-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7038-l1-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,l2-intc.txt: Cannot open: File exists

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a disperse volume 2x(4+2)
2. NFS mount on the client.
3. Start IO (linux untar - 2 instances, mkdir (1000 in parallel), dd (1000 in parallel))
4. Attach tier (2x2 dist-rep) volume
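
For reference, the steps above could be scripted roughly as follows. This is only a sketch: the volume name, server names, brick paths and mount point are hypothetical, and the command syntax is what the glusterfs 3.7 series CLI is assumed to accept (newer releases use `gluster volume tier <VOLNAME> attach` instead). The script only prints the command lines, it does not run them.

```python
import shlex

# Hypothetical names: adjust to the test bed.
VOLNAME = "ecvol"
SERVERS = ["server1", "server2", "server3", "server4", "server5", "server6"]


def bricks(prefix, count):
    """Return `count` brick paths spread round-robin over SERVERS."""
    return [
        f"{SERVERS[i % len(SERVERS)]}:/bricks/{prefix}{i}"
        for i in range(count)
    ]


def repro_commands():
    """Build the command lines for steps 1, 2 and 4 (nothing is executed)."""
    cold = bricks("cold", 12)  # 2 x (4+2): two disperse sets of six bricks
    hot = bricks("hot", 4)     # 2 x 2: two replica pairs
    return [
        # Step 1: create and start the disperse volume, enable quota and uss.
        ["gluster", "volume", "create", VOLNAME,
         "disperse-data", "4", "redundancy", "2", *cold],
        ["gluster", "volume", "start", VOLNAME],
        ["gluster", "volume", "quota", VOLNAME, "enable"],
        ["gluster", "volume", "set", VOLNAME, "features.uss", "enable"],
        # Step 2: NFS mount, run on the client.
        ["mount", "-t", "nfs", "-o", "vers=3",
         f"{SERVERS[0]}:/{VOLNAME}", "/mnt/nfs"],
        # Step 4: attach the hot tier while the IO from step 3 is running.
        ["gluster", "volume", "attach-tier", VOLNAME, "replica", "2", *hot],
    ]


if __name__ == "__main__":
    for cmd in repro_commands():
        print(shlex.join(cmd))
```

Step 3 (the untar, mkdir and dd load) is left out on purpose; it has to be started on the client between the mount and the attach-tier command.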

Actual results:
Invalid argument errors

Expected results:
No errors to be seen and IO should be smooth.

Additional info:
sosreports in rhsqe.
Comment 3 Mohammed Rafi KC 2016-01-11 06:14:02 EST

After add-brick, the NFS server is restarted to load the new graph, i.e. the NFS server's inode table is fresh after the restart. So, as part of a fop, the resolver sends a lookup on an entry if its inode has not been looked up before. During the lookup, if DHT finds that the entry needs healing (when directories are not present on all of the subvolumes), it initiates a heal to create the directories on all of the subvolumes. As part of the heal, DHT does a series of named lookups on all the parents starting from root if their inodes are not present, and on a successful lookup it also links the inode into the inode table. These lookups are initiated from DHT, so the inode ctx is created only for the xlators beneath DHT.

Since the inode is already linked, the resolver will not do a lookup for the next fop. So the xlators above DHT will not have an inode ctx.

In this case, svc_access was complaining about the missing inode_ctx.
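
The sequence described above can be illustrated with a toy model. This is not glusterfs code: the class names, the xlator list and the dictionary-based ctx are all assumptions made for illustration. It only shows how an inode linked by a lookup that starts at DHT ends up without a ctx for the xlators above DHT.

```python
import errno


class Inode:
    def __init__(self, name):
        self.name = name
        self.ctx = {}  # xlator name -> per-xlator inode ctx


class Stack:
    """Toy xlator stack, top to bottom. The names are illustrative only."""
    ORDER = ["nfs", "svc", "dht", "ec"]

    def __init__(self):
        self.table = {}  # linked inodes, keyed by path

    def lookup(self, path, start="nfs"):
        """Named lookup entering the stack at `start`; links the inode."""
        inode = self.table.get(path) or Inode(path)
        for xl in self.ORDER[self.ORDER.index(start):]:
            inode.ctx[xl] = {"seen": True}  # only xlators at or below `start`
        self.table[path] = inode
        return inode

    def dht_heal(self, path):
        """DHT heal: named lookups on every parent from root, issued by DHT."""
        parts = path.strip("/").split("/")
        for i in range(1, len(parts) + 1):
            self.lookup("/" + "/".join(parts[:i]), start="dht")

    def resolve(self, path):
        """Resolver: trusts any inode that is already in the table."""
        return self.table.get(path) or self.lookup(path)

    def access(self, path):
        """Access fop: svc needs its own ctx, otherwise it returns EINVAL."""
        inode = self.resolve(path)
        return 0 if "svc" in inode.ctx else -errno.EINVAL


stack = Stack()          # fresh inode table after the NFS server restart
stack.dht_heal("/a/b")   # heal links /a and /a/b from DHT
stack.lookup("/a/b/c")   # ordinary lookup from the top of the stack

print(stack.access("/a/b/c") == 0)             # True: svc has a ctx
print(stack.access("/a/b") == -errno.EINVAL)   # True: svc ctx is missing
```

In the model, `/a/b` is found in the inode table, so the resolver skips the lookup and the svc layer never gets a chance to create its ctx.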

Possible solutions:

1) Move the DHT healing code to the interface layer. If healing is required, DHT should inform the interface layer and give it a path to heal, so that each interface layer (fuse, nfs, gfapi) does the healing itself.

2) Do not link the inode from any xlator other than the master xlators, i.e. do not link from DHT. This would cause a huge performance degradation in the healing code path, and we might need some hack to heal without a linked inode.

3) Currently, resolving an entry succeeds if there is an inode in the inode table. Make an extra check for whether the inode_ctx is present; if the inode_ctx is not present for a linked inode, the resolver should treat the inode as invalid and do a lookup with the same inode.
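
A rough sketch of option 3, again as a toy model and not the actual patch: the function names, the `required` parameter and the dictionary-based ctx are assumptions. The resolver accepts a linked inode only if the ctx it needs is present; otherwise it does a full lookup that reuses the same inode.

```python
class Inode:
    def __init__(self, path):
        self.path = path
        self.ctx = {}  # xlator name -> inode ctx


def full_lookup(table, path, xlators):
    """Lookup from the top of the stack: every xlator sets its ctx."""
    inode = table.setdefault(path, Inode(path))  # reuse an already linked inode
    for xl in xlators:
        inode.ctx.setdefault(xl, {})
    return inode


def resolve(table, path, xlators, required):
    """Resolve `path` and report whether a lookup had to be sent.

    A linked inode that lacks a ctx for any xlator in `required` is
    treated as invalid and looked up again with the same inode.
    """
    inode = table.get(path)
    if inode is not None and all(xl in inode.ctx for xl in required):
        return inode, False  # fast path: no lookup needed
    return full_lookup(table, path, xlators), True


# An inode linked by a DHT-initiated lookup: ctx only for dht and below.
table = {"/a": Inode("/a")}
table["/a"].ctx = {"dht": {}, "ec": {}}
linked = table["/a"]

inode, did_lookup = resolve(table, "/a", ["nfs", "svc", "dht", "ec"], ["svc"])
print(did_lookup, inode is linked, "svc" in inode.ctx)
```

The extra check costs one ctx test per resolve on the fast path, and the full lookup is only paid once per inode that was linked from below.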
Comment 6 Mohammed Rafi KC 2016-01-12 03:57:32 EST
One more patch is required to fix this problem completely.
Comment 8 Bhaskarakiran 2016-01-27 04:18:20 EST
Verified this on 3.7.5-17 and didn't hit the issue. Marking this as verified.
Comment 9 nchilaka 2016-01-29 07:12:25 EST
However, there will be a pause of the IOs for some time (maybe about 4-5 min) while running IOs with attach tier on NFS.
Comment 11 errata-xmlrpc 2016-03-01 01:06:47 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.
