Bug 763778 (GLUSTER-2046)

Summary: tar xf operates differently on gluster than on other filesystems
Product: [Community] GlusterFS Reporter: lana.deere
Component: rdma    Assignee: Raghavendra G <raghavendra>
Status: CLOSED DUPLICATE QA Contact:
Severity: medium Docs Contact:
Priority: low    
Version: 3.1.0    CC: aavati, gluster-bugs, vijay
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTNR Mount Type: fuse
Documentation: DNR CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
tarfile containing two .a files which can't be extracted (flags: none)
client TRACE from around a failed tar xf (flags: none)
server TRACE from around the time of the failed tar xf (flags: none)

Description lana.deere 2010-10-29 23:08:27 UTC
The attached tarfile cannot be extracted onto my gluster filesystem (3.1, distribute, rdma, native client/fuse, CentOS 5.4/5.5).

$ tar xzf /tmp/tcl-issue.tar.gz 
tar: gcc-4.3.4-opt-64/third_party/lib/libtcl8.5.a: Cannot write: Permission denied
tar: gcc-4.3.4-opt-64/third_party/lib/libtcl8.5.a: Cannot close: Permission denied
tar: Error exit delayed from previous errors

It will extract without problem on a regular filesystem, such as my /tmp.

The workaround is to extract it onto a regular filesystem, chmod -R u+w the hierarchy, then cp -r the hierarchy onto the gluster filesystem.
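
Roughly (the gluster mount path below is a placeholder for my actual mount point):

    # extract on a regular local filesystem first
    cd /tmp && tar xzf /tmp/tcl-issue.tar.gz
    # restore user write permission on the extracted hierarchy
    chmod -R u+w gcc-4.3.4-opt-64
    # copy the hierarchy onto the gluster filesystem (path is a placeholder)
    cp -r gcc-4.3.4-opt-64 /path/to/gluster/mount/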

Comment 1 Raghavendra G 2010-11-10 02:47:45 UTC
Hi Lana,

Did you try to extract as root or as some other user? If you are trying as a non-root user, do the back-end directories have adequate permissions for tar extraction?
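
For example, you could check the extracted path directly on one of the back-end export directories (a sketch only; the export path is a placeholder for your actual brick directory):

    ls -ld /path/to/export/gcc-4.3.4-opt-64/third_party/lib
    ls -l  /path/to/export/gcc-4.3.4-opt-64/third_party/lib/libtcl8.5.a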

regards,
Raghavendra

Comment 2 lana.deere 2010-11-10 04:25:07 UTC
I have run into this problem both as root and as a regular user.  Either way, I had permission to write to the directory where I was trying to extract.  The permissions on the files in the extracted hierarchy can be seen inside the tarfile.  Some of them are supposed to end up without write permission once they are extracted.

Comment 3 Raghavendra G 2010-11-10 07:10:15 UTC
Hi Lana,

We are not able to reproduce the issue on our test setups; tar extraction on a distribute setup with rdma as the transport works fine on our machines. Can you run glusterfs and glusterfsd at log level 'TRACE' and provide us the log files?

regards,
Raghavendra.

Comment 4 lana.deere 2010-11-11 22:25:23 UTC
I can try that early next week.  How do I set the loglevel?

Comment 5 Raghavendra G 2010-11-12 01:06:32 UTC
For the client, you have to pass the '-L TRACE' option when starting it.
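
For example (a sketch; the server name, volume name, and mount point are placeholders):

    # start the native client directly at TRACE log level
    glusterfs -L TRACE --volfile-server=<server> --volfile-id=<volname> /mnt/<volname>
    # or, equivalently, pass it as a mount option
    mount -t glusterfs -o log-level=TRACE <server>:/<volname> /mnt/<volname>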

For the servers, you have to do it through the gluster CLI. For example, if your volume name is dist, you can change the log level as shown below:

root@booradley:/home/raghu# gluster
gluster> volume info dist

Volume Name: dist
Type: Distribute
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.1.134:/home/export/1
Brick2: 192.168.1.134:/home/export/2
Brick3: 192.168.1.134:/home/export/3
Brick4: 192.168.1.134:/home/export/4
Options Reconfigured:
diagnostics.brick-log-level: INFO
gluster> volume set dist brick-log-level TRACE
Set volume successful
gluster> volume info dist

Volume Name: dist
Type: Distribute
Status: Started
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: 192.168.1.134:/home/export/1
Brick2: 192.168.1.134:/home/export/2
Brick3: 192.168.1.134:/home/export/3
Brick4: 192.168.1.134:/home/export/4
Options Reconfigured:
diagnostics.brick-log-level: TRACE

regards,
Raghavendra

Comment 6 lana.deere 2010-11-15 16:50:05 UTC
I tried to follow the directions; hopefully I didn't mess up. On the client side, I added log-level=TRACE to the mount options. On the server side I did
    gluster volume set RaidData diagnostics.brick-log-level TRACE
Then I tried one of the tarfiles which hits the error.

In a minute I'll attach files which I hope are the correct log files from the client and the server. I edited the client log down to just the time around the tar, and rotated the server log first, so they wouldn't be gigantic.

How do I turn off the trace on the server?  I'm not sure what level it was originally, as it didn't show up in the volume info at all.

Comment 7 lana.deere 2010-11-15 16:51:45 UTC
Created attachment 380

Comment 8 lana.deere 2010-11-15 16:52:20 UTC
Created attachment 381

Comment 9 Raghavendra G 2010-11-16 04:05:17 UTC
Thanks for the logs. The original log levels were INFO. You can set the log level back to INFO the same way you set it to TRACE.
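
For example, reusing the command from your earlier comment (RaidData being the volume name you used there):

    gluster volume set RaidData diagnostics.brick-log-level INFO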

regards,
Raghavendra.

Comment 10 Raghavendra G 2010-11-25 03:27:13 UTC
Hi Lana,

Part of the server logs seems to be missing. The writes to the file (which errored out) happened at around 2010-11-15 14:40:05.787713 on the client side, but the server log ends at 2010-11-15 11:41:02.137410. I am particularly interested in the trace of the rpc call with xid 0x9bd.

A couple of other things:
* have you tried the same tests with tcp as the transport? If so, what were the results?
* can you also load the trace translator on top of posix? You have to do this manually by editing the volume spec files found in /etc/glusterd/vols/<volname>/. An example illustrating how to add the trace translator is given below:

volume brick
  type storage/posix
  option directory /export
end-volume

volume trace
  type debug/trace
  subvolumes brick
end-volume

volume locks
  type features/locks
  subvolumes trace
end-volume


volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.locks.allow *
  subvolumes locks
end-volume

regards,
Raghavendra.

Comment 11 lana.deere 2010-11-28 12:25:40 UTC
# Part of the server logs seems to be missing. The writes to the file (which
# errored out) happened at around 2010-11-15 14:40:05.787713 on the client
# side, but the server log ends at 2010-11-15 11:41:02.137410. I am
# particularly interested in the trace of the rpc call with xid 0x9bd.

Are the timestamps in the logs in local time or in UTC? I ask because I took a look at my configuration and it turns out that the server is set to Pacific time while the client is set to Eastern, which explains the apparent 3-hour difference. I'll have to fix that at some point; they should be in the same timezone. Sorry for any confusion this has caused.

Can you match the write at client time 14:40 with a server item at 11:40 or so?
If I remember correctly, the client and server were both idle (except for standard system stuff), then I tried the tar, then they were both idle again.  So perhaps it is possible to match the items despite the timezone difference?

Otherwise I may be able to retry the experiment after getting the time zone issue fixed.

Comment 12 lana.deere 2010-11-28 12:55:32 UTC
My apologies again, but NTP wasn't running on the storage nodes either, so there is also some clock skew between the client and server, even after compensating for the timezone difference. Let me know if you want me to try the trace experiment again.

Comment 13 Raghavendra G 2010-11-29 06:34:05 UTC
Time is not an issue. I just wanted to confirm that the entire logs have been attached. If you've attached the entire log files, then there seems to be some discrepancy, since one of the requests has not been logged in the server log files.

Irrespective of what I've written above, it would be helpful if you could answer the other two questions I asked in my earlier comment. I am pasting those two questions below:

* have you tried the same tests with tcp as the transport? If so, what were the results?
* can you also load the trace translator on top of posix? You have to do this manually by editing the volume spec files found in /etc/glusterd/vols/<volname>/. An example illustrating how to add the trace translator is given below:

volume brick
  type storage/posix
  option directory /export
end-volume

volume trace
  type debug/trace
  subvolumes brick
end-volume

volume locks
  type features/locks
  subvolumes trace
end-volume


volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.locks.allow *
  subvolumes locks
end-volume

regards,
Raghavendra.

Comment 14 lana.deere 2010-11-29 06:55:51 UTC
I believe the whole logs were attached.

I did not try with tcp. I could try that if it is not hard; do I just mount the volume with the transport=tcp option, or are there other things I would have to configure?

Are the trace translator volume spec files on the client side or the server side?  I should be able to give it a try, I think.

Comment 15 lana.deere 2010-11-30 09:51:23 UTC
An additional data point: if I try to extract this tarfile as a regular user, I reliably get the error as mentioned in the original report.  If I extract this tarfile as root, it works fine.

Comment 16 Raghavendra G 2010-11-30 09:57:45 UTC
You can either change the transport-type option to tcp or create a new volume (using the gluster CLI) specifying 'transport tcp' (along with other parameters) on the command line.
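
For example (a rough sketch only; the volume name and brick paths are placeholders, following the layout of the 'dist' example above):

    gluster volume create dist-tcp transport tcp 192.168.1.134:/home/export/1 192.168.1.134:/home/export/2
    gluster volume start dist-tcp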

The trace translator is needed on the server side.

regards,
Raghavendra

Comment 17 Anand Avati 2010-11-30 10:16:03 UTC
The symptoms of this bug match those of bug 763790 very well, which is fixed in 3.1.1. Please try 3.1.1 and confirm whether it fixes your problem. We would like to close this bug.

Avati

Comment 18 lana.deere 2010-11-30 10:20:52 UTC
I will upgrade to 3.1.1 in the next day or two and then will let you know if the problem is still there.

Comment 19 lana.deere 2010-11-30 14:41:47 UTC
The upgrade to 3.1.1 seems to have resolved this issue.  Thanks!

Comment 20 Raghavendra G 2010-12-13 00:25:16 UTC

*** This bug has been marked as a duplicate of bug 2058 ***