Hello. I found an error.

On the client side:

  E [client3_1-fops.c:1898:client3_1_lookup_cbk] 0-: error
  E [client3_1-fops.c:1898:client3_1_lookup_cbk] 0-: error
  W [fuse-bridge.c:184:fuse_entry_cbk] 0-glusterfs-fuse: 17: LOOKUP() /file => -1 (Invalid argument)

On the server side:

  E [server.c:67:gfs_serialize_reply] 0-: Failed to encode message

This is the result of 'ls':

[root@client01 ~]# ls -la /test
total 92
drwxr-xr-x 6 root root  4096 Mar 30 13:46 .
drwxr-xr-x 4 root root  4096 Mar 28 10:56 ..
?--------- ? ?    ?         ?            ? file   <- this
drwx------ 2 root root 16384 Mar 30 09:51 lost+found
-rw-r--r-- 1 root root    10 Mar 30 13:46 test
-rw-r--r-- 1 root root 12502 Mar 29 20:01 log

This is the result of 'gluster volume info all':

[root@server01 01]# gluster volume info all

Volume Name: test
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: server01:/brick/01
Brick2: server02:/brick/01
Options Reconfigured:
network.ping-timeout: 5

This is the result of 'getfattr':

[root@server01 01]# getfattr -m . -d -e hex ./file
# file: file
trusted.afr.brick1=0x000000000000000000000000
trusted.afr.brick2=0x000000000000000000000000
trusted.gfid=0x7c6cfbbb220940b096b4cd1710d015ea

[root@server02 01]# getfattr -m . -d -e hex ./file
# file: file
trusted.afr.brick1=0x000000000000000000000000
trusted.afr.brick2=0x000000000000000000000000
trusted.gfid=0x7c6cfbbb220940b096b4cd1710d015ea

This is other information:

OS: CentOS 5.5 (x86_64)
FileSystem: ext3
Kernel Version: 2.6.18-194.3.1.el5xen
GlusterFS Version: 3.1.3

Please tell me how to repair this file.
Hi,

Please issue this command on the client machines:

  echo 3 > /proc/sys/vm/drop_caches

and issue this command on the mount point:

  find . | xargs stat

If the issue still persists, please try disabling and re-enabling the stat-prefetch xlator using the following commands:

  gluster volume set test stat-prefetch off
  gluster volume set test stat-prefetch on

Please let us know if this fixes the issue at hand. Could you also provide more information about what operations/steps led to this issue?
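The two client-side steps above can be put together as one script. This is only a sketch: DEMO below is a throwaway stand-in directory so the sketch runs anywhere; on the real client, substitute the actual mount point (/test in this report).

```shell
# Sketch of the suggested client-side sequence. DEMO is a stand-in
# directory; replace it with the real glusterfs mount point.
DEMO=$(mktemp -d)
touch "$DEMO/file" "$DEMO/log"

# 1. Flush the kernel dentry/inode/page caches (needs root, hence the guard):
if [ -w /proc/sys/vm/drop_caches ]; then
    echo 3 > /proc/sys/vm/drop_caches
fi

# 2. Re-stat every entry so the client issues fresh LOOKUPs to the bricks:
restat_count=$(find "$DEMO" | xargs stat --format=%n 2>/dev/null | wc -l)
echo "re-stat touched $restat_count entries"
```

Dropping the caches first matters: without it, the kernel can keep serving the stale negative/broken dentries and the re-stat never reaches the bricks.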
Closing the bug as there has been no response to the last post.
I'm sorry I couldn't reply sooner; my glusterfs environment was not available. I tried your advice, but it did not fix the problem.

This is the client log:

client:/a/brick01/xxxxxx/item# ls -al ./739/10498739/0624/
ls: cannot access ./739/10498739/0624/img936012067904.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img674019172034.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img936012067217.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img6740191723305.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img9360120674726.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img9360120675818.jpg: No such file or directory
ls: cannot access ./739/10498739/0624/img6740191728085.jpg: No such file or directory
total 204
drwxr-sr-x 2 20102 group  4096 Jun 24 21:39 .
drwxr-sr-x 4 20102 group  4096 Jun 24 19:41 ..
?????????? ? ?     ?          ?            ? img674019172034.jpg
?????????? ? ?     ?          ?            ? img6740191723305.jpg
-rw-r--r-- 1 20102 group 31233 Jun 24 21:39 img6740191728059.jpg
?????????? ? ?     ?          ?            ? img6740191728085.jpg
-rw-r--r-- 1 20102 group 93827 Jun 24 21:39 img6740191728978.jpg
?????????? ? ?     ?          ?            ? img936012067217.jpg
?????????? ? ?     ?          ?            ? img9360120674726.jpg
?????????? ? ?     ?          ?            ? img9360120675818.jpg
-rw-r--r-- 1 20102 group 48420 Jun 24 19:41 img9360120677749.jpg
?????????? ? ?     ?          ?            ? img936012067904.jpg

These are the client settings:
client:/a/brick01/xxxxxx/item# cat /proc/sys/vm/drop_caches
3
client:/a/brick01/xxxxxx/item# cat /usr/local/glusterfs-3.2.0/etc/glusterfs/xxxxxx.vol
#------------------------------------------------------------------------
#
# brick settings
#
#------------------------------------------------------------------------
volume disk1
  type protocol/client
  option transport-type tcp/client
  option remote-host gluster-server01
  option ping-timeout 5
  option remote-subvolume /brick01
end-volume

volume disk2
  type protocol/client
  option transport-type tcp/client
  option remote-host gluster-server02
  option ping-timeout 5
  option remote-subvolume /brick01
end-volume

#------------------------------------------------------------------------
#
# replicate settings
#
#------------------------------------------------------------------------
volume replicate1
  type cluster/replicate
  subvolumes disk1 disk2
end-volume

#------------------------------------------------------------------------
#
# performance settings
#
#------------------------------------------------------------------------
volume cache
  type performance/io-cache
  option cache-size 256MB
  subvolumes replicate1
end-volume

volume writeback
  type performance/write-behind
  option cache-size 128MB
  subvolumes cache
end-volume

volume quickread
  type performance/quick-read
  option cache-timeout 1
  option max-file-size 512KB
  subvolumes writeback
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 16
  subvolumes quickread
end-volume

This is the server log:

[root@gluster-server01 item]# getfattr -m . -d -e hex ./739/10498739/0624/img6740191728085.jpg
# file: 739/10498739/0624/img6740191728085.jpg
trusted.gfid=0xc7daf1f183d946dcaca210b30dc26d1e

[root@gluster-server02 item]# getfattr -m . -d -e hex ./739/10498739/0624/img6740191728085.jpg
getfattr: ./739/10498739/0624/img6740191728085.jpg: No such file or directory

These are the server settings:
[root@gluster-server01 item]# gluster volume info

Volume Name: xxxxxx
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gluster-server01:/brick01
Brick2: gluster-server02:/brick01
Options Reconfigured:
performance.stat-prefetch: off
diagnostics.dump-fd-stats: off
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
performance.io-thread-count: 64
performance.cache-size: 6GB
network.ping-timeout: 5

What should I do? Please give me some advice.

My environment is as follows:

             +-->[gluster-server01]<--+
  [server]---+                        +---[client]
             +---[gluster-server02]<--+

* Write path
  [server] -> [gluster-server01]

* Read path
  [client] -> [gluster-server01]
  [client] -> [gluster-server02]
I couldn't read files of 128KB or larger from the client.

client:~# ls -lh /a/brick01/xxxxxx/nsys/
ls: cannot access /a/brick01/xxxxxx/data.129: Invalid argument
ls: cannot access /a/brick01/xxxxxx/data.128: Invalid argument
ls: cannot access /a/brick01/xxxxxx/data.130: Invalid argument
-rw-r--r-- 1 root root 121K Jun 30 12:50 data.121
-rw-r--r-- 1 root root 122K Jun 30 12:52 data.122
-rw-r--r-- 1 root root 123K Jun 30 12:50 data.123
-rw-r--r-- 1 root root 124K Jun 30 12:50 data.124
-rw-r--r-- 1 root root 125K Jun 30 12:50 data.125
-rw-r--r-- 1 root root 126K Jun 30 12:50 data.126
-rw-r--r-- 1 root root 127K Jun 30 12:52 data.127
?????????? ? ?    ?    ?    ?            ? data.128
?????????? ? ?    ?    ?    ?            ? data.129
?????????? ? ?    ?    ?    ?            ? data.130
> My environment is as follows:
>
>              +-->[gluster-server01]<--+
>   [server]---+                        +---[client]
>              +---[gluster-server02]<--+
>
> * Write path
>   [server] -> [gluster-server01]
>
> * Read path
>   [client] -> [gluster-server01]
>   [client] -> [gluster-server02]

What is the 'server' here? Is it accessing the storage via a glusterfs mount? GlusterFS doesn't support writing directly to the backend.
I understand what you mean, but those problems also happened when I wrote via fuse.

This is a write log from client to server:

client:~# mount | grep brick
/usr/local/glusterfs-3.2.0/etc/glusterfs/xxxxxx.vol on /a/brick01/xxxxxx type fuse.glusterfs (rw,allow_other,default_permissions,max_read=131072)
client:~# cd /a/brick01/xxxxxx/yyyy; pwd
/a/brick01/xxxxxx/yyyy
client:/a/brick01/xxxxxx/yyyy# for x in 121 122 123 124 125 126 127 128 129 130; do dd if=/dev/zero of=./data.${x} bs=1024 count=${x}; done
121+0 records in
121+0 records out
123904 bytes (124 kB) copied, 0.00740512 s, 16.7 MB/s
122+0 records in
122+0 records out
124928 bytes (125 kB) copied, 0.00693275 s, 18.0 MB/s
123+0 records in
123+0 records out
125952 bytes (126 kB) copied, 0.00667576 s, 18.9 MB/s
124+0 records in
124+0 records out
126976 bytes (127 kB) copied, 0.00650061 s, 19.5 MB/s
125+0 records in
125+0 records out
128000 bytes (128 kB) copied, 0.00780486 s, 16.4 MB/s
126+0 records in
126+0 records out
129024 bytes (129 kB) copied, 0.00731015 s, 17.6 MB/s
127+0 records in
127+0 records out
130048 bytes (130 kB) copied, 0.00669811 s, 19.4 MB/s
128+0 records in
128+0 records out
131072 bytes (131 kB) copied, 0.00661765 s, 19.8 MB/s
129+0 records in
129+0 records out
132096 bytes (132 kB) copied, 0.00749507 s, 17.6 MB/s
130+0 records in
130+0 records out
133120 bytes (133 kB) copied, 0.00710567 s, 18.7 MB/s
client:/a/brick01/xxxxxx/yyyy# ls -lh data.1*
ls: cannot access data.128: Invalid argument <-- broken
ls: cannot access data.129: Invalid argument <-- broken
ls: cannot access data.130: Invalid argument <-- broken
-rw-r--r-- 1 root root 121K Jun 30 14:40 data.121
-rw-r--r-- 1 root root 122K Jun 30 14:40 data.122
-rw-r--r-- 1 root root 123K Jun 30 14:40 data.123
-rw-r--r-- 1 root root 124K Jun 30 14:40 data.124
-rw-r--r-- 1 root root 125K Jun 30 14:40 data.125
-rw-r--r-- 1 root root 126K Jun 30 14:40 data.126
-rw-r--r-- 1 root root 127K Jun 30 14:40 data.127
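For comparison, the same write loop can be run against a plain local directory (DIR below is a temp directory, not a gluster mount). All ten files list normally there, and data.128 is exactly 131072 bytes, the same value as max_read=131072 in the fuse mount options above, which isolates the failure to the gluster stack rather than the write pattern itself.

```shell
# Same write pattern as the reproduction, but against a local directory.
DIR=$(mktemp -d)
for x in 121 122 123 124 125 126 127 128 129 130; do
    dd if=/dev/zero of="$DIR/data.$x" bs=1024 count="$x" 2>/dev/null
done
nfiles=$(ls "$DIR" | wc -l)                    # expect all ten files present
size128=$(stat --format=%s "$DIR/data.128")    # expect 128 * 1024 = 131072
echo "$nfiles files written, data.128 is $size128 bytes"
```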
Hi Yasuya,

We do not support writing to the backend directly (bypassing the client); this creates inconsistencies. As your log shows, one of the backends has the data while the other does not:

[root@gluster-server01 item]# getfattr -m . -d -e hex ./739/10498739/0624/img6740191728085.jpg
# file: 739/10498739/0624/img6740191728085.jpg
trusted.gfid=0xc7daf1f183d946dcaca210b30dc26d1e   <<-- present, because you wrote to the backend server01 directly

[root@gluster-server02 item]# getfattr -m . -d -e hex ./739/10498739/0624/img6740191728085.jpg
getfattr: ./739/10498739/0624/img6740191728085.jpg: No such file or directory

The client protocol (replicate) takes care of mirroring the data onto both servers. Please restart all your tests and do not write to the backend directly.

One way to recover the data is to write it (copy the file) to the mount point, and after the copy completes, delete the older copy on the backend to which you had written directly.

Please let me know if I can go ahead and close this bug.
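The recovery procedure can be sketched as follows. This is a simulation only: MOUNT and BRICK are temp-directory stand-ins for the glusterfs mount point and the backend brick, and the file name is taken from the log above.

```shell
# Stand-ins: in a real recovery, MOUNT is the glusterfs mount point and
# BRICK is the backend directory that was written to directly.
MOUNT=$(mktemp -d)
BRICK=$(mktemp -d)

echo "stale" > "$BRICK/img6740191728085.jpg"   # copy written directly to the backend

# Step 1: write the file again THROUGH the mount point, so the replicate
# translator mirrors it to both bricks:
echo "good" > "$MOUNT/img6740191728085.jpg"

# Step 2: only after that write completes, delete the stale backend copy:
rm "$BRICK/img6740191728085.jpg"

echo "recovered: $(cat "$MOUNT/img6740191728085.jpg")"
```

The ordering matters: the stale backend copy must be removed only after the new copy has been fully written through the mount, otherwise the file can be lost entirely.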
Closing the bug as I have not heard back from the originator of the bug. The issue was writing to one of the replica backend directly, which is not supported.
I tried the recommended configuration, but the same problem persisted. Finally I resolved it by changing the settings as follows:

#------------------------------------------------------------------------
#
# performance settings
#
#------------------------------------------------------------------------
volume cache
  type performance/io-cache
  option cache-size 256MB
  subvolumes replicate1
end-volume

volume writeback
  type performance/write-behind
  option cache-size 128MB
  subvolumes cache
end-volume

#volume quickread
#  type performance/quick-read
#  option cache-timeout 1
#  option max-file-size 512KB
#  subvolumes writeback
#end-volume

volume iothreads
  type performance/io-threads
  option thread-count 16
  subvolumes writeback
end-volume

With quickread enabled, glusterfs could not write files over 128KB. Thank you for helping us.
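For reference, on a glusterd-managed volume (as in the earlier 'gluster volume info' output) the equivalent of commenting out the quickread volume would be a volume-set change rather than a volfile edit. This is an assumption: verify that your GlusterFS release accepts the performance.quick-read option key before relying on it.

```shell
# Hypothetical managed-volume equivalent of removing the quick-read xlator;
# check 'gluster volume set help' in your release for the exact key.
gluster volume set xxxxxx performance.quick-read off
```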