Bug 363911
Summary: | oracle apps receive EAGAIN error when attempting to use async io | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 4 | Reporter: | Chuck Mead <csm> |
Component: | libaio | Assignee: | Jeff Moyer <jmoyer> |
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | |
Severity: | urgent | Docs Contact: | |
Priority: | low | ||
Version: | 4.5 | CC: | evuraan, jwest, paul.hood |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2008-03-13 14:23:04 UTC | Type: | --- |
Description
Chuck Mead
2007-11-02 14:46:04 UTC
Sadly, aio_waitn() does not seem like a Linux call..? That seems like a generic error message for Oracle's convenience.

(In reply to comment #0)
> Description of problem: ORA-27083: waiting for async I/Os failed
> Linux-x86_64 Error: 7: Argument list too long
> Wed Oct 17 16:37:03 2007
> DBW0: terminating instance due to error 27083
> oerr ORA 27083
> 27083, 00000, "waiting for async I/Os failed"
> // *Cause: The aio_waitn() library call returned an error.
> // *Action: Check errno.

As Anu mentions in comment #1, aio_waitn is not a libaio library call, nor is it a libc call. I looked through libaio-oracle, and it does not provide this abstraction, either. My guess is there is a portability layer within Oracle that implements this for Linux, but that's just a guess.

So, could you please provide some data that points to a problem within either libaio or the kernel's aio infrastructure?

Jeff,
Thanks for responding. We are pushing Oracle for a response here. Hopefully it will come soon.

---
the e299_dbw0_5701.trc shows:
*** 2007-10-25 13:42:43.109
ksedmp: internal or fatal error
ORA-27083: waiting for async I/Os failed
Linux-x86_64 Error: 7: Argument list too long

the error: 7 should map to
#define E2BIG 7 /* Arg list too long */

I expected to find this in the dbwr traces at the OS level, but I do not find E2BIG. The dbwr strace does show EAGAIN on semtimedop numerous times:

3995 semtimedop(851971, 0x7fbfffdea0, 1, {1, 730000000}) = -1 EAGAIN (Resource temporarily unavailable)

There was concern that this (EAGAIN) is a problem, possibly exhausting something on the OS side, and that it eventually results in the ORA-27083 ... Linux-x86_64 Error: 7: Argument list too long.

Frustratingly, when trying to trace the problem further (i.e. errorstack in Oracle) to get more information on the ORA-27083, the customer indicates the problem does not occur. The same is true when trying to use strace on sqlplus.
We would like to get more information about the nature of the Linux-x86_64 Error: 7: Argument list too long.

(In reply to comment #4)
> ---
> the e299_dbw0_5701.trc shows:
> *** 2007-10-25 13:42:43.109
> ksedmp: internal or fatal error
> ORA-27083: waiting for async I/Os failed
> Linux-x86_64 Error: 7: Argument list too long
>
> we would like to get more information about the nature of the Linux-x86_64
> Error: 7: Argument list too long..

See the man pages for semop and semtimedop:

E2BIG The argument nsops is greater than SEMOPM, the maximum number of operations allowed per system call.

The semval, sempid, semzcnt, and semncnt values for a semaphore can all be retrieved using appropriate semctl(2) calls. The following limits on semaphore set resources affect the semop() call:

SEMOPM Maximum number of operations allowed for one semop() call (32) (on Linux, this limit can be read and modified via the third field of /proc/sys/kernel/sem).

I'm closing this as NOTABUG, as it seems clear to me that this is not a libaio problem.

Cheers,
Jeff

Re-opening this bug so as to determine why the EAGAIN error is occurring. If this is not a libaio issue, then what is it?

1. What does Error: 7: Argument list too long mean?
2. Why do we get the message "waiting for async I/Os failed"

--jwest

I am sending a sysreport for one of the two hosts we're concerned with directly to Jeremy. I cannot attach it as bugzilla says it is too big.

(In reply to comment #6)
> Re-opening this bug so as to determine why the EAGAIN error is occurring. If
> this is not a libaio issue, then what is it?
>
> 1. What does Error: 7: Argument list too long mean?

I already answered this in comment #5.

> 2. Why do we get the message "waiting for async I/Os failed"

I cannot answer this as I don't have the source code for the software that prints this error message.
Feel free to leave the bug open to track this, but I cannot be of further assistance until you can show me that libaio or the kernel aio subsystem is responsible for the errors.

Chuck,
If you can tell me where to pull that sysreport from, I can grab it.

Jeff,
I appreciate your patience and assistance.
--jwest

Doggone it... I sent it yesterday by email direct from my Bloomberg account but it appears I transposed the email address (dyslexic's UNTIE!). In any event it's now been sent to you from my on site email account.

Please keep this bug in NEEDINFO state until you can provide information implicating libaio or the kernel's AIO subsystem. Thanks.

Is there any progress on this issue?

Jeff,
Can we leave this open a little longer? The customer in this situation has been given an aio stress test that runs outside of the oracle processes. We're waiting for those results.
Thanks
Jeremy West

I'm not convinced that aio-stress will reproduce the problem the customer is experiencing. In my opinion, it would be a better use of time to get more debugging output from the application that produces the problem.
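
For anyone retracing the errno discussion in this thread, here is a minimal sketch in Python (not something any commenter posted or ran) that decodes an errno value and reads SEMOPM from the third field of /proc/sys/kernel/sem, as the quoted man page describes. The names `describe_errno`, `parse_sem_limits`, `would_exceed_semopm` and `SemLimits` are hypothetical helpers invented for this illustration. The field order SEMMSL, SEMMNS, SEMOPM, SEMMNI is taken from the Linux proc(5) documentation, and the errno numbering assumes Linux.

```python
import errno
import os
from typing import NamedTuple


class SemLimits(NamedTuple):
    """The four fields of /proc/sys/kernel/sem, in file order (hypothetical helper type)."""
    semmsl: int  # maximum semaphores per set
    semmns: int  # maximum semaphores system-wide
    semopm: int  # maximum operations per semop() call (the limit behind E2BIG)
    semmni: int  # maximum number of semaphore sets


def describe_errno(code: int) -> str:
    """Return the symbolic name and message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


def parse_sem_limits(text: str) -> SemLimits:
    """Parse the whitespace-separated contents of /proc/sys/kernel/sem."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return SemLimits(*(int(field) for field in fields))


def would_exceed_semopm(nsops: int, limits: SemLimits) -> bool:
    """True if a semop()/semtimedop() call with nsops operations would fail with E2BIG."""
    return nsops > limits.semopm


if __name__ == "__main__":
    # Error 7 from the Oracle trace, and the EAGAIN seen in the strace output.
    print(describe_errno(7))
    print(describe_errno(errno.EAGAIN))
    try:
        with open("/proc/sys/kernel/sem") as fh:
            limits = parse_sem_limits(fh.read())
    except OSError as exc:
        # Not every system (or sandbox) exposes this file.
        print(f"could not read /proc/sys/kernel/sem: {exc}")
    else:
        print(f"SEMOPM = {limits.semopm}")
```

In the one strace line quoted in comment #4, the third argument to semtimedop (nsops) is 1, which is well under the default SEMOPM of 32, so that particular call could not have produced E2BIG. Per semop(2), semtimedop fails with EAGAIN when its timeout expires, which would be consistent with the {1, 730000000} timeout in that line, though a single trace line does not settle what the application was doing.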