998963 – CTDB:iozone gives read block error when run on smb mount on windows


Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	samba
Sub Component:
Version:	2.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ira Cooper
QA Contact:	surabhi
Docs Contact:
URL:
Whiteboard:	ctdb
Depends On:
Blocks:	956495

Reported:	2013-08-20 12:42 UTC by surabhi
Modified:	2015-12-03 17:14 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-12-03 17:14:17 UTC
Embargoed:
Dependent Products:

Attachments
Perl script used to generate I/O (2.26 KB, application/x-perl) 2013-08-27 13:06 UTC, Lalatendu Mohanty

Description surabhi 2013-08-20 12:42:49 UTC

Description of problem:

In a CTDB setup, a 6x2 distributed-replicated volume was mounted via SMB on a Windows client, using the virtual IP assigned to one of the physical nodes where the volume resides, and an iozone test was run from the mount.

When the node corresponding to the virtual IP used for the mount was rebooted, the iozone test failed with a read block error.

The observation was that when the node went down, another node took over and iozone hung for some time; as soon as the original node came back up, the test failed.


Version-Release number of selected component (if applicable):
samba-glusterfs-3.6.9-159.1.el6rhs.x86_64
glusterfs-3.4.0.19rhs-2.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a 6x2 distributed-replicated volume.
2. Create a CTDB setup.
3. Mount the Samba share via SMB on Windows using the virtual IP.
4. Run iozone from the mount point.
5. Bring down the node corresponding to the virtual IP.



Actual results:
Iozone failed with read block error.
Error reading block 145 fee00000
read: No such file or directory

Expected results:
iozone should run successfully.

Additional info:

I first mounted the share using the virtual IP of the node where the ctdb volume bricks were present, and the test failed. I then tried another VIP, on a node that is not part of the ctdb volume, and it still fails.

Comment 3 Christopher R. Hertel 2013-08-20 22:42:13 UTC

Please post the Windows version and the protocol version negotiated on the wire.

Comment 4 Christopher R. Hertel 2013-08-21 02:28:58 UTC

In a simple test:
Windows 7 client
Samba 3.6.3 server (what I have at home)
EXT4 File system
SMB2 protocol negotiated

I ran the following sequence:

# service smbd stop; sleep 2; service smbd start

That was sufficient to produce an error very similar to the one given above, even though the Windows client very quickly reconnected to the server (once it was running again).

The underlying problem is that once the TCP connection is broken, both the client and the server lose a lot of state information. In particular, they lose things like file handles. In order to restart the connection, the client has to re-establish the TCP connection, reopen all files, and re-apply for all locks before any other client takes those locks away. Since, on the client side, the application holds the file handles, it becomes the application's job to re-establish the connection. Few applications actually do this, and some will crash so badly that they bring down the whole Windows system. I've seen this happen (Blue Screen).
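
As a rough illustration of what "the application re-establishes the connection" means in practice, here is a minimal Python sketch of a reader that drops its handle on an I/O error, reopens the file, and retries the read at the same offset. The class name ReopeningReader and its parameters are hypothetical, invented for this example. It does not re-acquire locks, and it is not how iozone behaves, which is the point: iozone simply reports the failed read.

```python
import time


class ReopeningReader:
    """Hypothetical helper: reopen the file and retry after an OSError."""

    def __init__(self, path, opener=open, max_attempts=3, retry_delay=0.0):
        self._path = path
        self._opener = opener          # injectable so failures can be simulated
        self._max_attempts = max_attempts
        self._retry_delay = retry_delay
        self._handle = None

    def _drop_handle(self):
        # The old handle is useless once the connection behind it is gone.
        if self._handle is not None:
            try:
                self._handle.close()
            except OSError:
                pass
            self._handle = None

    def read_at(self, offset, size):
        last_error = None
        for _ in range(self._max_attempts):
            try:
                if self._handle is None:
                    self._handle = self._opener(self._path, "rb")
                self._handle.seek(offset)
                return self._handle.read(size)
            except OSError as error:
                # Assumption: any OSError is treated as a lost connection.
                last_error = error
                self._drop_handle()
                time.sleep(self._retry_delay)
        raise last_error

    def close(self):
        self._drop_handle()
```

An application written this way can ride out a short outage as long as the share comes back within the retry window; anything that depends on locks or on unflushed writes would still need extra handling.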

There are new features of SMB2 and SMB3 that allow the client and server to maintain more state information so that connection state can be reestablished. These features (Durable handles, Resilient handles, and Persistent handles) are not available in Samba 3.6.x. They are being developed for Samba 4.1.x and above.

So, the behavior listed in this ticket is the expected behavior, even in a clustered environment. Samba 3.6.x with CTDB provides scale-out cluster support as well as coarse-grained failover. By coarse-grained, I mean what you see here. The behavior is the same as it would be if a single-node server crashed hard and came back up again quickly.

Request that QE close this ticket with appropriate settings.

Comment 5 surabhi 2013-08-21 08:51:37 UTC

Tried the test on a Windows 7 client
with max protocol = SMB2
Samba version samba-3.6.9-159.1.el6rhs.x86_64
XFS + Samba
Tried it on a single node.
With an smb stop and start, the test fails.
Moving the keyword blocker.

Comment 6 Sayan Saha 2013-08-21 13:43:07 UTC

Is there a test application that we can use to see the behavior where the client properly re-establishes the connection after a reboot while using CTDB? I do agree that iozone does not do all these things, so the error is probably inevitable. We should also at least run the same test on a single node with Samba + XFS only and see what happens when the server is rebooted while live I/O is going on. OK to not have this as a blocker.

Comment 7 surabhi 2013-08-21 14:02:20 UTC

To confirm live failover with CTDB, I tried the tests below; in both of them the I/O stopped.

1. In the CTDB setup, I mounted the volume on the Windows client using the virtual IP of node1 and started the iozone test. When I brought down the network of node1 (ip link set dev eth0 down), the iozone I/O failed.

2. In the second test, I shut down node1 while I/O was running, and iozone failed after the shutdown.

I also tried iozone I/O on a single-node XFS + Samba setup. I rebooted the Samba server and, as expected, the I/O failed.
Because I/O fails when a single-node Samba server becomes unavailable, we run CTDB (i.e. a clustered implementation of Samba servers) for node failover and IP takeover.

We will try a simple application other than iozone to see whether it works with CTDB, and will update.

Comment 8 Christopher R. Hertel 2013-08-21 17:30:42 UTC

See comment #4.
We can bring down smbd, wait the 2 seconds for all smbd daemons to stop, and then restart smbd.

When I did this against Samba+EXT4, iozone failed within the 2 second delay period, but by the time I had typed 'dir<enter>', the share was available again and the temporary iozone.tmp file was available.

A similar test should work against a clustered configuration, as long as we allow enough time (perhaps 4 seconds instead of 2) for the failover to occur.

1) Start the iozone test on the Windows client.
2) Shut down smbd on the connected server node.
3) Allow iozone to fail.
4) Use the dir command to see if the share is available.
5) Once the share is available again, delete iozone.tmp and restart iozone.
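
Step 4 above can be automated instead of typing dir by hand. The following is a minimal Python sketch, assuming the share is mounted at a local path on the client; the function name wait_for_share and the example drive letter are hypothetical and not taken from this report.

```python
import os
import time


def wait_for_share(path, timeout=30.0, interval=0.5):
    """Poll `path` until a directory listing succeeds or `timeout` expires.

    Returns True once the listing works, False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.listdir(path)   # same idea as running `dir` on the share
            return True
        except OSError:
            pass               # share not reachable yet, keep polling
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage on the Windows client (Z: is an assumed drive letter):
#   if wait_for_share("Z:\\"):
#       os.remove("Z:\\iozone.tmp")   # step 5, then restart iozone
```

Recording how long wait_for_share takes to return also gives a rough measure of how long the failover window is.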

Comment 9 surabhi 2013-08-22 14:57:42 UTC

Here are the scenarios we tried:

I ran the iozone test in the same environment over an NFS mount and it worked fine.

I executed iozone on the NFS mount using the virtual IP and powered off the node; iozone continued and node takeover happened. After powering the node back on, iozone still continued.

These are some tests I executed to check whether CTDB failover and failback work for a simple application (i.e. an application other than iozone).

The application is a simple Perl script which creates files and folders on the SMB mount point.

1. Started the script on the mount point. Powered off the node whose virtual IP was used to mount the share; the script kept running and creating files. It did not report any error, which suggests that the file handle migrated to the other node and that CTDB failover works with a simple application.

2. In the second scenario, started the same script and rebooted the node; this time the script kept running at first. But once the node came back to healthy (as shown by "ctdb status"), the script threw the error

print() on closed filehandle FH at Win_CreateDirTreeNFiles.pl line 73.  <-------with reboot

So it looks like failover is happening but failback is not happening properly.
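
To pin down whether errors occur at failover or at failback, it may help to timestamp every failed operation so the failures can be lined up against "ctdb status" transitions. Below is a minimal Python sketch of such a workload. It is not the attached Perl script; run_workload and its parameters are hypothetical names chosen for this example.

```python
import os
import time


def run_workload(root, count, pause=0.0, log=print):
    """Create `count` small files under `root`, logging each failure with a timestamp.

    Returns a list of (index, start_time, error) tuples for the failed creates.
    """
    failures = []
    for index in range(count):
        path = os.path.join(root, "file_%06d.txt" % index)
        started = time.time()
        try:
            # Check the result of every open/write instead of assuming success.
            with open(path, "w") as handle:
                handle.write("x" * 1024)
        except OSError as error:
            failures.append((index, started, error))
            stamp = time.strftime("%H:%M:%S", time.localtime(started))
            log("%s create %d failed: %s" % (stamp, index, error))
        time.sleep(pause)
    return failures
```

Because each file is opened and closed independently, a run with no logged failures during failover but a burst of failures at failback would point at the takeover of the virtual IP rather than at a stale long-lived handle.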

Comment 10 Christopher R. Hertel 2013-08-22 19:46:47 UTC

Please indicate the platform on which the Perl Scripts are running.  A packet capture would also help.

Also note that comparison between NFS and SMB is not particularly useful in this case, since NFSv3 is a stateless protocol and failover *should* work as described above.  SMB is stateful, and what we are particularly losing in a failover or failback situation is state.

...and a CTDB failback is basically the same as a failover.
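
The stateless versus stateful distinction can be illustrated with a toy model. This is a deliberately simplified Python sketch, not real NFS or SMB: the class names are hypothetical, and the "stateless" server identifies a file by name, whereas real NFSv3 uses opaque file handles that any server exporting the same file system can resolve.

```python
class StatefulServer:
    """SMB-like toy: open handles exist only in this node's memory."""

    def __init__(self, files):
        self.files = files      # backing store shared by all nodes
        self.handles = {}       # per-node state, lost on failover
        self.next_id = 1

    def open(self, name):
        handle_id = self.next_id
        self.next_id += 1
        self.handles[handle_id] = name
        return handle_id

    def read(self, handle_id, offset, size):
        name = self.handles[handle_id]   # KeyError if this node never saw the open
        return self.files[name][offset:offset + size]


class StatelessServer:
    """NFSv3-like toy: every request carries all it needs."""

    def __init__(self, files):
        self.files = files

    def read(self, name, offset, size):
        return self.files[name][offset:offset + size]


shared = {"iozone.tmp": b"abcdefgh"}

# Stateless: the client can send the same request to whichever node answers.
assert StatelessServer(shared).read("iozone.tmp", 2, 3) == b"cde"

# Stateful: a handle opened on node1 means nothing to node2 after takeover.
node1, node2 = StatefulServer(shared), StatefulServer(shared)
handle = node1.open("iozone.tmp")
try:
    node2.read(handle, 2, 3)
    survived = True
except KeyError:
    survived = False
assert survived is False
```

The same reasoning applies in both directions, which is why a failback loses state just as a failover does.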

Comment 11 Lalatendu Mohanty 2013-08-27 13:05:15 UTC

Chris,

The script was running on a Windows 7 client machine. We will provide a packet trace once the setup is free. I will attach the script to this bug in case you want to take a look.

In the above comment, the NFS test and the Perl script on the Win7 client are different issues. I agree that NFSv3 is a stateless protocol and that we should not compare it with SMB.

As you said "CTDB failback is basically the same as a failover", I am puzzled why the script does not fail when we take down the node but fails when the node comes back.

Comment 12 Lalatendu Mohanty 2013-08-27 13:06:48 UTC

Created attachment 790970
Perl script used to generate I/O


Attached the Perl script which can be used to create IO.

Comment 13 Christopher R. Hertel 2013-10-02 20:38:28 UTC

Since Durable and Persistent handle support is not available in Samba 3.6, and since SMB is a stateful protocol, the expected behavior is that a connection loss or loss of the server will cause an I/O failure even if the client reconnects to another server in the cluster.

There are other BZs open against banning and failover issues in CTDB.

The original bug reported in this BZ is actually 'expected behavior'.

Comment 16 Vivek Agarwal 2015-12-03 17:14:17 UTC

Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.
