Description of problem:
Disk I/O errors in some WAL tests of the sqlite build for ppc64le. The problem also occurs with ppc64.

Version-Release number of selected component (if applicable):
sqlite-3.8.4.2-2.fc21.src.rpm

How reproducible:
Build with make check active.

Actual results:
"Error: disk I/O error" message on some tests.

Additional info:
While building and running make check on the ppc64le arch, I got disk I/O errors on the WAL (Write-Ahead Logging) tests. I reproduced the problem with the sqlite3 command-line tool. The disk I/O error seems to occur when the -shm file associated with the database file is extended up to offset 65535 (a size of 65536 bytes). Here is the corresponding log:

sqlite> .log stdout
sqlite> vacuum;
(5386) os_unix.c:28099: (22) mmap(/tmp/mydb-shm) -
(5386) statement aborts at 5: [ATTACH '' AS vacuum_db;] disk I/O error
(10) statement aborts at 2: [vacuum;] disk I/O error
Error: disk I/O error
sqlite>

The strace tool was used to capture more information:

$ strace sqlite3 mydb 'VACUUM;' 2>&1 | tee mydb-vacuum.strace
...
lseek(5, 53247, SEEK_SET) = 53247
write(5, "\0", 1) = 1
lseek(5, 57343, SEEK_SET) = 57343
write(5, "\0", 1) = 1
lseek(5, 61439, SEEK_SET) = 61439
write(5, "\0", 1) = 1
lseek(5, 65535, SEEK_SET) = 65535
write(5, "\0", 1) = 1
mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0x8000) = -1 EINVAL (Invalid argument)
fcntl(5, F_SETLK, {type=F_UNLCK, whence=SEEK_SET, start=121, len=7}) = 0
fcntl(5, F_SETLK, {type=F_UNLCK, whence=SEEK_SET, start=120, len=1}) = 0
write(2, "Error: disk I/O error\n", 22Error: disk I/O error
) = 22
exit_group(10) = ?
+++ exited with 10 +++

Looking at the os_unix.c file, in the unixShmMap() function:

...
    pShmNode->apRegion = apNew;
    while(pShmNode->nRegion<=iRegion){
      void *pMem;
      if( pShmNode->h>=0 ){
        pMem = osMmap(0, szRegion,
            pShmNode->isReadonly ? PROT_READ : PROT_READ|PROT_WRITE,
            MAP_SHARED, pShmNode->h, szRegion*(i64)pShmNode->nRegion
        );
        if( pMem==MAP_FAILED ){
4410      rc = unixLogError(SQLITE_IOERR_SHMMAP, "mmap", pShmNode->zFilename);
          goto shmpage_out;
        }
      }else{
        pMem = sqlite3_malloc(szRegion);
        if( pMem==0 ){
          rc = SQLITE_NOMEM;
          goto shmpage_out;
        }
        memset(pMem, 0, szRegion);
      }
      pShmNode->apRegion[pShmNode->nRegion] = pMem;
      pShmNode->nRegion++;
    }
  }
...

At line 4410, the error message regarding mmap is logged because the call to osMmap() failed. The value of the SQLITE_IOERR_SHMMAP symbol (defined in sqlite3.h) is 5386, as shown in the sqlite log, and errno is EINVAL (22).

osMmap() is defined in the same os_unix.c file as:

/*
** Many system calls are accessed through pointer-to-functions so that
** they may be overridden at runtime to facilitate fault injection during
** testing and sandboxing. The following array holds the names and pointers
** to all overrideable system calls.
*/
static struct unix_syscall {
  const char *zName;            /* Name of the system call */
  sqlite3_syscall_ptr pCurrent; /* Current value of the system call */
  sqlite3_syscall_ptr pDefault; /* Default value */
} aSyscall[] = {
...
#if !defined(SQLITE_OMIT_WAL) || SQLITE_MAX_MMAP_SIZE>0
  { "mmap", (sqlite3_syscall_ptr)mmap, 0 },
#define osMmap ((void*(*)(void*,size_t,int,int,int,off_t))aSyscall[21].pCurrent)
...
}; /* End of the overrideable system calls */

Failing call:

mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_SHARED, 5, 0x8000) = -1 EINVAL (Invalid argument)

According to the mmap() man page, errno can be set to EINVAL in the following three cases:

...
EINVAL We don't like addr, length, or offset (e.g., they are too large, or not aligned on a page boundary).
EINVAL (since Linux 2.6.12) length was 0.
EINVAL flags contained neither MAP_PRIVATE or MAP_SHARED, or contained both of these values.
...

The first case is probably the relevant one: in the failing call the length is 32768, not 0, and the flags contain only MAP_SHARED, which rules out the other two.
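To make the "not aligned on a page boundary" case concrete, here is a minimal sketch in C. It assumes that the ppc64/ppc64le kernel used for the build runs with 64 KiB pages; the report does not state the page size, so treat that value as an assumption to verify on the builder. The helper names shm_region_offset() and offset_is_page_aligned() are hypothetical and are not part of SQLite; only the offset formula szRegion*(i64)nRegion and the values 32768 and 0x8000 come from the excerpt and the strace output above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the file offset that unixShmMap() passes to mmap()
** for region number iRegion, following szRegion*(i64)pShmNode->nRegion
** from the excerpt above. */
int64_t shm_region_offset(int64_t szRegion, int iRegion){
  return szRegion * (int64_t)iRegion;
}

/* Hypothetical helper: returns 1 if offset is a multiple of page_size,
** 0 otherwise. mmap() fails with EINVAL when the file offset is not a
** multiple of the system page size. */
int offset_is_page_aligned(int64_t offset, int64_t page_size){
  assert( page_size>0 );
  return (offset % page_size)==0;
}
```

With szRegion = 32768, region 1 starts at offset 0x8000, which is a multiple of 4096 but not of 65536. Under the 64 KiB page assumption, offset_is_page_aligned(shm_region_offset(32768, 1), 65536) returns 0, which would match the EINVAL seen in the strace output, while the same call with a page size of 4096 returns 1. The actual page size can be read on the affected machine with sysconf(_SC_PAGESIZE) or with "getconf PAGESIZE".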
Note that this problem does not seem to be specific to ppc64le, since it also occurs with a ppc64 build; see: http://ppc.koji.fedoraproject.org/kojifiles/packages/sqlite/3.8.4/1.fc21/data/logs/ppc64/build.log
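As a side note on the log in the description, the two codes (5386) and (10) are the same error seen at two levels: SQLite extended result codes keep the primary result code in their low 8 bits. The sketch below only illustrates that arithmetic; the function names primary_result_code() and extended_subcode() are hypothetical and are not SQLite APIs.

```c
#include <assert.h>

/* Hypothetical helper: primary result code, held in the low 8 bits of an
** SQLite extended result code. */
int primary_result_code(int extended){
  assert( extended>=0 );
  return extended & 0xff;
}

/* Hypothetical helper: the sub-code stored above the low 8 bits. */
int extended_subcode(int extended){
  assert( extended>=0 );
  return extended >> 8;
}
```

For 5386 this gives a primary code of 10 (SQLITE_IOERR) and a sub-code of 21, consistent with the "(10) statement aborts" line and with the exit status 10 in the strace output.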
Created attachment 889744
sqlite.spec.ppc64le.ignore.test.error.patch
I suggest modifying the spec file as per the above sqlite.spec.ppc64le.ignore.test.error.patch so that ppc64le gets the same bypass already applied to other architectures, as tracked by bug 1041279.
(In reply to Michel Normand from comment #2)
> I suggest to modify the spec file as per above
> sqlite.spec.ppc64le.ignore.test.error.patch to have same bypass for ppc64le
> as already done for other archi as tracked by bug 1041279.

The latest available sqlite-3.8.4.3-3.fc21 still fails on the ppc64le arch and still needs the spec file update suggested in the previous comment.
===
$ git diff
diff --git a/sqlite.spec b/sqlite.spec
index df042ea..140a350 100644
--- a/sqlite.spec
+++ b/sqlite.spec
@@ -164,7 +164,7 @@ rm -f $RPM_BUILD_ROOT/%{_libdir}/*.{la,a}
 # XXX shell tests are broken due to loading system libsqlite3, work around...
 export LD_LIBRARY_PATH=`pwd`/.libs
 export MALLOC_CHECK_=3
-%ifarch s390 s390x ppc ppc64 %{sparc}
+%ifarch s390 s390x ppc %{power64} %{sparc}
 make test || :
 %else
 make test
===
With sqlite-3.8.6-2, the latest available version, sqlite builds and passes its tests on the ppc64le arch. On the ppc64 arch it has 7 errors, but these are bypassed by the current spec. So we should be able to close this bug.
http://ppc.koji.fedoraproject.org/koji/buildinfo?buildID=259821
===
ppc64
7 errors out of 212993 tests
Failures on these tests: fts3conf-3.1 fts3conf-3.2 fts3conf-3.3 fts3conf-3.4 fts3conf-3.5 fts3conf-3.6 fts3conf-3.8
===
ppc64le
0 errors out of 212994 tests
===
After checking the logs, I agree that this bug is probably fixed, so I am closing it. If the problem persists, feel free to reopen.