Bug 764495 (GLUSTER-2763)

Summary: mysql regression suite fails with gluster 3.1.3
Product: [Community] GlusterFS
Reporter: jggally
Component: posix
Assignee: Amar Tumballi <amarts>
Status: CLOSED DEFERRED
Severity: medium
Priority: medium
Version: 3.1.3
CC: amarts, gluster-bugs, saurabh, vijay, vraman
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2012-09-18 10:32:43 UTC
Mount Type: fuse

Description jggally 2011-04-14 22:24:35 UTC
This is easy to reproduce, and appears to be 100% reproducible: the regression suite appears to never fully succeed when the mysql data location is in a gluster subdirectory, but if the location is moved to a non-gluster ext3 subdirectory it appears to always succeed.

With gluster, some test cases fail every time; others fail intermittently. Interestingly, one of the test cases that fails every time, innodb.test, does not fail if the test case is run under strace, which suggests a timing/race condition to me (running mysql test cases with strace is slightly tricky: let me know if you want the hack work-around for this).

After installing mysql, mysql-server, and mysql-test, the regression suite can be started as root from /usr/share/mysql-test as follows:

sudo -u mysql ./mysql-test-run   --vardir=<any global glusterfs subdirectory that has read/write/execute and create/remove permissions >

Configuration info:
-	Linux kernel 3.2.32
-	glusterfs-core.x86_64 3.1.3-1
-	glusterfs-fuse.x86_64 3.1.3-1
-	mysql.x86_64 5.0.77-4.el5_5.5
-	mysql-server.x86_64 5.0.77-4.el5_5.5
-	mysql-test.x86_64 5.0.77-4.el5_5.5

Single node gluster.

[root@sweng65 mysql-test]#  cat /etc/glusterd/vols/sme_global/bricks/192.160.100.65\:-logging-sme-gfs-backing
hostname=192.160.100.65
path=/logging/sme-gfs-backing # note: this is an ext3 fs

[root@sweng65 mysql-test]# cat /etc/glusterd/vols/sme_global/sme_global-fuse.vol
volume sme_global-client-0
    type protocol/client
    option remote-host 192.160.100.65
    option remote-subvolume /logging/sme-gfs-backing
    option transport-type tcp
end-volume

volume sme_global-write-behind
    type performance/write-behind
    subvolumes sme_global-client-0
end-volume

volume sme_global-read-ahead
    type performance/read-ahead
    subvolumes sme_global-write-behind
end-volume

volume sme_global-io-cache
    type performance/io-cache
    subvolumes sme_global-read-ahead
end-volume

volume sme_global-quick-read
    type performance/quick-read
    subvolumes sme_global-io-cache
end-volume

volume sme_global-stat-prefetch
    type performance/stat-prefetch
    subvolumes sme_global-quick-read
end-volume

volume sme_global
    type debug/io-stats
    subvolumes sme_global-stat-prefetch
end-volume

[root@sweng65 mysql-test]# cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /etc/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
end-volume
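
For readers unfamiliar with the volfile layout, the client-side translator stack in sme_global-fuse.vol above can be listed from top to bottom with a small parser. This is only a reading aid, assuming the simple "volume / type / option / subvolumes / end-volume" layout shown in the listing; the function names are made up for this sketch and are not part of glusterfs.

```python
# Sketch: list the translator stack described by a glusterfs client volfile.
# parse_volfile and translator_chain are hypothetical helper names.

VOLFILE = """\
volume sme_global-client-0
    type protocol/client
    option remote-host 192.160.100.65
    option remote-subvolume /logging/sme-gfs-backing
    option transport-type tcp
end-volume
volume sme_global-write-behind
    type performance/write-behind
    subvolumes sme_global-client-0
end-volume
volume sme_global-read-ahead
    type performance/read-ahead
    subvolumes sme_global-write-behind
end-volume
volume sme_global-io-cache
    type performance/io-cache
    subvolumes sme_global-read-ahead
end-volume
volume sme_global-quick-read
    type performance/quick-read
    subvolumes sme_global-io-cache
end-volume
volume sme_global-stat-prefetch
    type performance/stat-prefetch
    subvolumes sme_global-quick-read
end-volume
volume sme_global
    type debug/io-stats
    subvolumes sme_global-stat-prefetch
end-volume
"""


def parse_volfile(text):
    """Return {volume name: {"type", "subvolumes", "options"}}."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        key = words[0]
        if key == "volume":
            current = {"type": None, "subvolumes": [], "options": {}}
            volumes[words[1]] = current
        elif key == "end-volume":
            current = None
        elif current is not None:
            if key == "type":
                current["type"] = words[1]
            elif key == "subvolumes":
                current["subvolumes"] = words[1:]
            elif key == "option":
                current["options"][words[1]] = " ".join(words[2:])
    return volumes


def translator_chain(volumes):
    """Follow the first subvolume from the top volume down to the leaf."""
    children = {s for v in volumes.values() for s in v["subvolumes"]}
    name = next(n for n in volumes if n not in children)
    chain = []
    while name is not None:
        chain.append(volumes[name]["type"])
        subs = volumes[name]["subvolumes"]
        name = subs[0] if subs else None
    return chain


if __name__ == "__main__":
    print(" -> ".join(translator_chain(parse_volfile(VOLFILE))))
```

The chain makes it easy to see that every request from the fuse mount passes through five performance translators before it reaches protocol/client.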

Comment 1 Saurabh 2011-05-09 02:01:26 UTC
We have reproduced this bug using mysql on a dist-rep volume:


Volume Name: drep
Type: Distributed-Replicate
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: 10.1.12.134:/mnt/drep
Brick2: 10.1.12.135:/mnt/drep
Brick3: 10.1.12.134:/mnt/dddrep
Brick4: 10.1.12.135:/mnt/dddrep
Brick5: 10.1.12.134:/mnt/ddddrep
Brick6: 10.1.12.135:/mnt/ddddrep
Options Reconfigured:
features.quota: off


We reproduced the bug on 3.2.0.

Comment 2 Amar Tumballi 2011-06-27 09:28:13 UTC
Venky/Saurabh,

Any updates on this? Let's take this to completion.

Comment 3 Venky Shankar 2011-07-14 03:04:33 UTC
(In reply to comment #2)
> Venky/Saurabh,
> 
> Any updates on this? Let's take this to completion.

Amar,

Since I was tied up with HDFS work, I didn't get time to look into this. I am relatively free of HDFS now, so I will start looking into this.

Thanks,
-Venky

Comment 4 Venky Shankar 2011-07-19 06:28:01 UTC
OK, I tried to dig into this further. Here is what I get:

Setup:

MySQL 5.0 (compiled from source) - includes server, client, mysql-test
glusterfs - 3.1.3

When running the innodb-related test, I got a bunch of "duplicate entry" errors from the test (see below). This does not seem to be related to any kind of locking issue.

---------------------------------------------------------------

drop table t1;
create table t1 (n int not null primary key) engine=innodb;
set autocommit=0;
insert into t1 values (4);
rollback;
select n, "after rollback" from t1;
n       after rollback
insert into t1 values (4);
commit;
select n, "after commit" from t1;
n       after commit
4       after commit
commit;
insert into t1 values (5);
insert into t1 values (4);
ERROR 23000: Duplicate entry '4' for key 1
commit;
select n, "after commit" from t1;
n       after commit
4       after commit
5       after commit
------------------------------------------------------------------
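
As a point of comparison, the same statement sequence can be replayed against a local engine to see what each step returns when storage is not in question. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in (an assumption of this example; it is not InnoDB, and it says nothing about how InnoDB behaves on glusterfs). The function name replay is hypothetical.

```python
import sqlite3


def replay():
    """Replay the rollback/commit sequence and record what each step shows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t1 (n int not null primary key)")
    conn.commit()
    log = []

    def rows():
        return [r[0] for r in conn.execute("select n from t1 order by n")]

    # insert + rollback: the row must not survive
    conn.execute("insert into t1 values (4)")
    conn.rollback()
    log.append(("after rollback", rows()))

    # insert + commit: the row must survive
    conn.execute("insert into t1 values (4)")
    conn.commit()
    log.append(("after commit", rows()))

    # 5 is new, 4 is already committed, so the second insert is rejected
    conn.execute("insert into t1 values (5)")
    try:
        conn.execute("insert into t1 values (4)")
    except sqlite3.IntegrityError:
        log.append(("duplicate entry", 4))
    conn.commit()
    log.append(("after commit", rows()))

    conn.close()
    return log


if __name__ == "__main__":
    for step in replay():
        print(step)
```

Comparing innodb.reject against the stock result file for the test, statement by statement, is what identifies the first step where the glusterfs-backed run actually diverges.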

jggally,

Did you get the same error as shown above in innodb.reject (this file would be in the --vardir path you provided to the test), and no locking issues, when testing with a single mysql instance? (For multiple mysql instances you would need to use mandatory locking in glusterfs, i.e. use external locking.)

While the tests are clean on a normal ext3/4-backed FS, I get a bunch of "duplicate entry" errors when backed by glusterfs.

Please let me know so I can debug further and concentrate on this particular failure case.

Thanks,
-Venky

Comment 5 jggally 2011-07-19 14:13:43 UTC
Yes, I got the same error at the same spot. I have not independently verified that the mysql regression suite is not running multiple copies of mysql in parallel.

BTW, we have reason to believe that some of the mysql regression test suite errors are the result of the mysql server creating temp files for tables that are being altered, then renaming the temp file over the top of the existing table file to replace it with the altered table info when the DB changes are committed (see gluster bug #762766 - "rename() is not atomic").
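
The temp-file-then-rename pattern described above can be probed outside of mysql. The following is a minimal sketch, not the mysql code path: one thread keeps opening a target file while the main thread repeatedly writes a temp file and renames it over the target. With an atomic rename() the reader should always find either the old or the new file, so any "file not found" count above zero points at a window where the name does not resolve. The function name probe_rename and the file names are made up for this example.

```python
import os
import tempfile
import threading


def probe_rename(directory, rounds=200):
    """Count reader open() failures while the target is replaced via rename()."""
    target = os.path.join(directory, "table.dat")
    temp = os.path.join(directory, "table.tmp")
    with open(target, "w") as f:
        f.write("v0\n")

    misses = 0
    stop = threading.Event()

    def reader():
        nonlocal misses
        while not stop.is_set():
            try:
                with open(target) as f:
                    f.read()
            except FileNotFoundError:
                # the name briefly did not resolve to any file
                misses += 1

    thread = threading.Thread(target=reader)
    thread.start()
    try:
        for i in range(rounds):
            with open(temp, "w") as f:
                f.write(f"v{i + 1}\n")
            # on POSIX systems this maps to rename(2)
            os.rename(temp, target)
    finally:
        stop.set()
        thread.join()
    return misses


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print("misses on local temp dir:", probe_rename(d))
```

To compare filesystems, pass a directory on the glusterfs mount and one on the ext3 backing store in place of the temporary directory. A non-zero count on gluster would be consistent with the rename theory, but would not by itself show that the mysql failures have that cause.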

An excerpt from my innodb.reject is below (note that there are 69 other errors in my innodb.reject file; see further below for a summary):


drop table t1;
create table t1 (n int not null primary key) engine=innodb;
set autocommit=0;
insert into t1 values (4);
rollback;
select n, "after rollback" from t1;
n       after rollback
insert into t1 values (4);
commit;
select n, "after commit" from t1;
n       after commit
4       after commit
commit;
insert into t1 values (5);
insert into t1 values (4);
ERROR 23000: Duplicate entry '4' for key 1
commit;
select n, "after commit" from t1;
n       after commit
4       after commit
5       after commit



ERROR 42000: SAVEPOINT savept3 does not exist
ERROR 42000: SAVEPOINT my_savepoint does not exist
ERROR 42000: SAVEPOINT savept2 does not exist
ERROR 23000: Duplicate entry 'pippo' for key 1
ERROR 23000: Duplicate entry 'test2' for key 2
ERROR 23000: Duplicate entry '1' for key 1
ERROR 23000: Duplicate entry 'test2' for key 2
ERROR 23000: Duplicate entry '1-1' for key 1
ERROR 23000: Duplicate entry '1-1' for key 1
ERROR 42000: Unknown database 'mysqltest'
ERROR 23000: Duplicate entry '3' for key 1

ERROR 23000: Duplicate entry '4' for key 1
ERROR 23000: Duplicate entry '1' for key 1
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 42S22: Unknown column 't1.id' in 'where clause'
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 42000: Incorrect foreign key definition for 't1_id_fk': Key reference and table reference don't match
ERROR 42S21: Duplicate column name 'c'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR 42S21: Duplicate column name 'c1'
ERROR HY000: The used table type doesn't support FULLTEXT indexes
ERROR 23000: Duplicate entry '{ ' for key 1
ERROR 23000: Duplicate entry 'a' for key 1
ERROR 23000: Duplicate entry 'a ' for key 1
ERROR 23000: Duplicate entry 'a     ' for key 1
ERROR 23000: Duplicate entry 'a         ' for key 1
ERROR 23000: Duplicate entry 'a ' for key 1
ERROR 23000: Duplicate entry '1' for key 2
ERROR 23000: Duplicate entry '2' for key 1
ERROR 42000: Specified key was too long; max key length is 767 bytes
ERROR 42000: Specified key was too long; max key length is 767 bytes
ERROR 42000: Specified key was too long; max key length is 767 bytes
ERROR 42000: Specified key was too long; max key length is 767 bytes
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR HY000: Can't create table './test/t1.frm' (errno: 150)
ERROR HY000: Can't create table './test/t2.frm' (errno: 150)
ERROR HY000: Error on rename of './test/t3' to './test/t1' (errno: 150)
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 42000: Specified key was too long; max key length is 3072 bytes
ERROR 23000: Duplicate entry 'A' for key 1
ERROR 23000: Duplicate entry 'A ' for key 1
ERROR 23000: Duplicate entry 'A' for key 1
ERROR 23000: Cannot add or update a child row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR 23000: Cannot delete or update a parent row: a foreign key constraint fails ()
ERROR HY000: Can't create table './test/t1.frm' (errno: -1)
ERROR HY000: The used table type doesn't support SPATIAL indexes
ERROR HY000: Error on rename of '#sql-temporary' to './test/t2' (errno: 150)
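
A summary like the one above is easier to compare between runs when it is tallied by SQLSTATE code. The sketch below is one possible way to do that, assuming the "ERROR <code>: <message>" line shape seen in the listing; the function name tally_errors is hypothetical and the sample lines are copied from the summary above.

```python
import re
from collections import Counter

# Matches lines such as: ERROR 23000: Duplicate entry '4' for key 1
ERROR_LINE = re.compile(r"^ERROR (\S+): (.*)$")


def tally_errors(lines):
    """Count ERROR lines per SQLSTATE code, ignoring everything else."""
    by_state = Counter()
    for line in lines:
        match = ERROR_LINE.match(line.strip())
        if match:
            by_state[match.group(1)] += 1
    return by_state


SAMPLE = """\
ERROR 42000: SAVEPOINT savept3 does not exist
ERROR 23000: Duplicate entry 'pippo' for key 1
ERROR 23000: Duplicate entry 'test2' for key 2
select n, "after commit" from t1;
ERROR HY000: Can't create table './test/t1.frm' (errno: 150)
"""

if __name__ == "__main__":
    for state, count in sorted(tally_errors(SAMPLE.splitlines()).items()):
        print(state, count)
```

Running the same tally over innodb.reject from a glusterfs-backed run and from an ext3-backed run gives a quick first view of which error classes differ.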

Comment 6 Amar Tumballi 2012-09-04 09:28:52 UTC
More than 18 months since this was reported. Need confirmation from the QE team before proceeding with work on this. If mysql works on top of GlusterFS, we can close this as WORKSFORME; if it doesn't, we will keep it open for community contribution (internship project?), as hosting a DB on top of GlusterFS is not a 'supported' use case for now.

Comment 7 Amar Tumballi 2012-09-18 10:32:43 UTC
We are not working on database hosting on glusterfs right now. Will reopen once we start dealing with databases.