Red Hat Bugzilla – Bug 1296048
Attach tier + nfs : Creates fail with invalid argument errors
Last modified: 2016-11-23 18:12:16 EST
Description of problem:
Created a 2x(4+2) EC volume and NFS-mounted it on the client with quota and USS enabled. Started IO (linux untar, mkdirs, and dd, thousands in parallel). Attaching the tier (2x2 dist-rep) produced continuous invalid argument errors. The same errors were seen during detach-tier, but there IO resumed after some time. In this case, all IO fails with the error messages.
If quota and USS are turned off, the errors below are seen for some time and then IO resumes.
tar: linux-4.1.1/Documentation/devicetree/bindings/input/touchscreen/zforce_ts.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/tps65218-pwrbutton.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/twl4030-keypad.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/input/twl4030-pwrbutton.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/abilis,tb10x-ictl.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/allwinner,sun4i-ic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/allwinner,sun67i-sc-nmi.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/atmel,aic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2835-armctrl-ic.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm3380-l2-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7038-l1-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm7120-l2-intc.txt: Cannot open: File exists
tar: linux-4.1.1/Documentation/devicetree/bindings/interrupt-controller/brcm,l2-intc.txt: Cannot open: File exists
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a disperse volume 2x(4+2)
2. NFS mount on the client.
3. Start IO (linux untar - 2 instances, mkdir (1000 in parallel), dd (1000 in parallel))
4. Attach tier (2x2 dist-rep) volume
Actual results:
Invalid argument errors
Expected results:
No errors should be seen and IO should be smooth.
Additional info:
sosreports in rhsqe.
After add-brick, the NFS server is restarted to load the new graph, so its inode table starts out fresh. As part of a fop, the resolver therefore sends a lookup on an entry if its inode has not been looked up before. If DHT finds during that lookup that the entry needs healing (the directory is not present on all of the subvolumes), it initiates a heal to create the directory on every subvolume. As part of the heal, DHT does a series of named lookups on all the parents, starting from root, for any inodes that are not present, and on a successful lookup it also links the inode into the inode table. Because these lookups are initiated from DHT, the inode ctx is created only for the xlators beneath DHT.
Since the inode is already linked, the resolver will not do a lookup for the next fop, so the xlators above DHT never get an inode ctx.
In this case, svc_access was complaining about the missing inode_ctx.
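To make the sequence above concrete, here is a toy model of it in Python. It is not GlusterFS code: the stack order, the `Inode` class, and the `lookup` helper are illustrative assumptions, and the only point is that a lookup wound from DHT leaves the inode linked but without ctx for anything above DHT.

```python
# Toy model only; names and stack order are assumptions, not GlusterFS internals.
STACK = ["svc", "dht", "subvolumes"]  # top to bottom


class Inode:
    def __init__(self, name):
        self.name = name
        self.ctx = {}        # per-xlator inode ctx
        self.linked = False  # True once the inode is in the inode table


def lookup(inode, start_from):
    """Wind a lookup downwards from `start_from`; each xlator passed sets its ctx."""
    for xl in STACK[STACK.index(start_from):]:
        inode.ctx[xl] = True
    inode.linked = True  # a successful lookup links the inode


def xlators_missing_ctx(inode):
    return [xl for xl in STACK if xl not in inode.ctx]


healed = Inode("parent-dir")
lookup(healed, start_from="dht")   # named lookup issued by the DHT heal

normal = Inode("other-dir")
lookup(normal, start_from="svc")   # lookup issued from the top by the resolver

print(healed.linked, xlators_missing_ctx(healed))  # True ['svc']
print(normal.linked, xlators_missing_ctx(normal))  # True []
```

Both inodes end up linked, so a resolver that only checks the inode table treats them the same, even though the first one has no ctx for the xlator above DHT.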
1) Move the DHT healing code to the interface layer: if healing is required, DHT should tell the interface layer and give it a path to heal. Each interface layer (fuse, nfs, gfapi) would then do the healing itself.
2) Do not link the inode from any xlator other than the master xlators, i.e. do not link from DHT. This would cause a huge performance degradation in the healing code path, and we might need some hack to heal without a linked inode.
3) Currently, resolving an entry succeeds if there is an inode in the inode table. Add an extra check for whether the inode_ctx is present: if the inode_ctx is missing for a linked inode, the resolver should treat the inode as invalid and do a lookup with the same inode.
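The third option can be sketched roughly as follows. This is a toy Python model, not the actual resolver: the `resolve` function, the `check_ctx` flag, and the dictionary-based inode table are hypothetical, and which xlators' ctx the real check would inspect is an assumption here.

```python
# Toy sketch of option 3; all names are hypothetical, not GlusterFS APIs.
STACK = ["svc", "dht"]  # illustrative xlator stack, top to bottom


def resolve(inode_table, name, check_ctx):
    """Return 'cached' if the resolver would reuse the inode, else 'lookup'."""
    inode = inode_table.get(name)
    if inode is None:
        return "lookup"  # current behaviour: not in the table, so look it up
    if check_ctx and any(xl not in inode["ctx"] for xl in STACK):
        return "lookup"  # proposed: linked but ctx missing, treat as invalid
    return "cached"


# An inode linked by the DHT heal: present in the table, but no ctx for svc.
table = {"dir": {"ctx": {"dht"}}}

print(resolve(table, "dir", check_ctx=False))  # cached
print(resolve(table, "dir", check_ctx=True))   # lookup
```

With the extra check, the entry goes through a full lookup from the top again, which gives every xlator in the path a chance to create its ctx.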
One more patch is required to fix this problem completely.
Verified this on 3.7.5-17 and didn't hit the issue. Marking this as verified.
However, IO pauses for some time (maybe about 4-5 min) while running IO with attach tier on NFS.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.