Bug 363911 - Oracle apps receive EAGAIN error when attempting to use async I/O
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: libaio
Version: 4.5
Hardware: x86_64 Linux
Priority: low
Severity: urgent
Assigned To: Jeffrey Moyer
Keywords: Reopened
Reported: 2007-11-02 10:46 EDT by Chuck Mead
Modified: 2008-03-13 10:23 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-03-13 10:23:04 EDT
Attachments: None
Description Chuck Mead 2007-11-02 10:46:04 EDT
Description of problem: ORA-27083: waiting for async I/Os failed
Linux-x86_64 Error: 7: Argument list too long
Wed Oct 17 16:37:03 2007
DBW0: terminating instance due to error 27083
oerr ORA 27083
27083, 00000, "waiting for async I/Os failed"
// *Cause:  The aio_waitn() library call returned an error.
// *Action: Check errno.



Version-Release number of selected component (if applicable):
libaio-0.3.105-2.x86_64


How reproducible:
Run Oracle using AIO.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info: Oracle asked us to open this bug report so they would have an
entry point into Red Hat for discussion about this error.
Comment 1 Anu Matthew 2007-11-02 10:49:27 EDT
Sadly, aio_waitn() does not seem to be a Linux call.

That looks like a generic error message provided for Oracle's convenience.
Comment 2 Jeffrey Moyer 2007-11-02 12:32:13 EDT
(In reply to comment #0)
> Description of problem: ORA-27083: waiting for async I/Os failed
> Linux-x86_64 Error: 7: Argument list too long
> Wed Oct 17 16:37:03 2007
> DBW0: terminating instance due to error 27083
> oerr ORA 27083
> 27083, 00000, "waiting for async I/Os failed"
> // *Cause:  The aio_waitn() library call returned an error.
> // *Action: Check errno.

As Anu mentions in comment #1, aio_waitn is not a libaio library call, nor is it
a libc call.  I looked through libaio-oracle, and it does not provide this
abstraction, either.  My guess is there is a portability layer within Oracle
that implements this for Linux, but that's just a guess.

So, could you please provide some data that points to a problem within either
libaio or the kernel's aio infrastructure?
Comment 3 Chuck Mead 2007-11-02 13:29:25 EDT
Jeff,
     Thanks for responding. We are pushing Oracle for a response here. Hopefully
it will come soon.
Comment 4 Paul Hood 2007-11-06 14:58:13 EST
---
the e299_dbw0_5701.trc shows:
*** 2007-10-25 13:42:43.109
ksedmp: internal or fatal error
ORA-27083: waiting for async I/Os failed
Linux-x86_64 Error: 7: Argument list too long

the error: 7 should map to 
#define E2BIG            7      /* Arg list too long */
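
The mapping from the numeric error to its symbolic name can be checked on the affected host without digging through the kernel headers. A minimal sketch, using only Python's standard `errno` and `os` modules (the helper name `describe_errno` is made up for illustration, and the message text comes from the local C library, so it may differ between systems):

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (code): message' for a numeric errno value."""
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror() returns the C library's message for the code.
    return f"{name} ({code}): {os.strerror(code)}"


# Error 7 is the value reported in the Oracle trace file.
print(describe_errno(7))
```

On Linux this should report E2BIG for error 7, which matches the `#define` above.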

I expected to find this in the dbwr traces at the os level, but I do not find
E2BIG.  

the dbwr strace does show EAGAIN on semtimedop numerous times:

3995  semtimedop(851971, 0x7fbfffdea0, 1, {1, 730000000}) = -1 EAGAIN (Resource temporarily unavailable)

There was concern that this (EAGAIN) is a problem, possibly exhausting
something on the OS side, and that it eventually
results in the ORA-27083 ...Linux-x86_64 Error: 7: Argument list too long.
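
To check whether E2BIG ever shows up in a capture like the one quoted above, the failed calls can be tallied per syscall and errno. The following is a rough sketch, not a full strace parser: the regular expression and the helper name `count_failures` are assumptions for illustration, and the second sample line is hypothetical. Note that, according to the semop(2) man page, semtimedop() returns EAGAIN when its time limit expires, so a large EAGAIN count on its own may simply reflect timed waits.

```python
import re
from collections import Counter

# Matches lines such as:
#   3995  semtimedop(...) = -1 EAGAIN (Resource temporarily unavailable)
# The leading pid column is optional (it appears with strace -f).
FAILED_CALL = re.compile(
    r"^\s*(?:\d+\s+)?(?P<call>\w+)\(.*\)\s+=\s+-1\s+(?P<err>E[A-Z0-9]+)\b"
)


def count_failures(lines):
    """Count failed syscalls, keyed by (syscall name, errno name)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CALL.match(line)
        if match:
            counts[(match.group("call"), match.group("err"))] += 1
    return counts


sample = [
    # Line taken from the dbwr strace quoted in this comment.
    "3995  semtimedop(851971, 0x7fbfffdea0, 1, {1, 730000000}) = -1 EAGAIN "
    "(Resource temporarily unavailable)",
    # Hypothetical successful call, included only to show it is ignored.
    "3995  semtimedop(851971, 0x7fbfffdea0, 1, {1, 730000000}) = 0",
]

counts = count_failures(sample)
for (call, err), n in sorted(counts.items()):
    print(f"{call} {err} {n}")
```

Feeding the full strace file to `count_failures` (for example `count_failures(open(path))`, where `path` is wherever the capture was saved) would show at a glance whether any call returned E2BIG.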

Frustratingly when trying to further trace the problem (ie errorstack in oracle)
to get more information on the ORA-27083, the customer indicates the problem
does not occur.  The same is true when trying to use strace on sqlplus.

we would like to get more information about the nature of the Linux-x86_64
Error: 7: Argument list too long..
Comment 5 Jeffrey Moyer 2007-11-06 15:44:18 EST
(In reply to comment #4)
> ---
> the e299_dbw0_5701.trc shows:
> *** 2007-10-25 13:42:43.109
> ksedmp: internal or fatal error
> ORA-27083: waiting for async I/Os failed
> Linux-x86_64 Error: 7: Argument list too long
> 

> we would like to get more information about the nature of the Linux-x86_64
> Error: 7: Argument list too long..

See the man pages for semop and semtimedop:

       E2BIG  The argument nsops is greater than SEMOPM, the maximum number of
              operations allowed per system call.

       The  semval, sempid, semzcnt, and semncnt values for a semaphore can all
       be retrieved using appropriate semctl(2) calls.

       The following limits on semaphore  set  resources  affect  the  semop()
       call:

       SEMOPM Maximum  number  of operations allowed for one semop() call (32)
              (on Linux, this limit can be read and  modified  via  the  third
              field of /proc/sys/kernel/sem).
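
The limit mentioned in the man page excerpt can be read directly on the affected host. A minimal sketch, assuming the usual four-field layout of /proc/sys/kernel/sem (SEMMSL, SEMMNS, SEMOPM, SEMMNI); the names `SemLimits`, `parse_sem_limits` and `read_sem_limits` are made up for illustration:

```python
from pathlib import Path
from typing import NamedTuple


class SemLimits(NamedTuple):
    semmsl: int  # max semaphores per set
    semmns: int  # max semaphores system-wide
    semopm: int  # max operations per semop()/semtimedop() call
    semmni: int  # max number of semaphore sets


def parse_sem_limits(text: str) -> SemLimits:
    """Parse the whitespace-separated contents of /proc/sys/kernel/sem."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return SemLimits(*(int(field) for field in fields))


def read_sem_limits(path: str = "/proc/sys/kernel/sem") -> SemLimits:
    """Read and parse the live limits from procfs (Linux only)."""
    return parse_sem_limits(Path(path).read_text())


if __name__ == "__main__":
    if Path("/proc/sys/kernel/sem").exists():
        print("SEMOPM =", read_sem_limits().semopm)
```

Comparing the reported SEMOPM with the nsops argument that the application passes to semop() or semtimedop() would show whether the E2BIG condition described above can actually be hit.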

I'm closing this as NOTABUG, as it seems clear to me that this is not a libaio
problem.

Cheers,

Jeff
Comment 6 Jeremy West 2007-11-08 11:29:45 EST
Re-opening this bug so as to determine why the EAGAIN error is occurring.  If
this is not a libaio issue, then what is it?  

1.  What does Error: 7: Argument list too long mean?
2.  Why do we get the message "waiting for async I/Os failed"?

--jwest
Comment 7 Chuck Mead 2007-11-08 13:11:46 EST
I am sending a sysreport for one of the two hosts we're concerned with directly
to Jeremy. I cannot attach it as bugzilla says it is too big.
Comment 8 Jeffrey Moyer 2007-11-08 13:16:24 EST
(In reply to comment #6)
> Re-opening this bug so as to determine why the EAGAIN error is occurring.  If
> this is not a libaio issue, then what is it?  
> 
> 1.  What does Error: 7: Argument list too long mean?

I already answered this in comment #5.

> 2.  Why do we get the message "waiting for async I/Os failed"?

I cannot answer this as I don't have the source code for the software that
prints this error message.

Feel free to leave the bug open to track this, but I cannot be of further
assistance until you can show me that the libaio or the kernel aio subsystem is
responsible for the errors.
Comment 9 Jeremy West 2007-11-09 11:28:09 EST
Chuck,

If you can tell me where to pull that sysreport from, I can grab it.  Jeff, I
appreciate your patience and assistance.

--jwest
Comment 10 Chuck Mead 2007-11-09 11:34:53 EST
Doggone it... I sent it yesterday by email directly from my Bloomberg account, but it
appears I transposed the email address (dyslexics UNTIE!). In any event, it has
now been sent to you from my on-site email account.
Comment 11 Jeffrey Moyer 2007-11-26 15:16:29 EST
Please keep this bug in NEEDINFO state until you can provide information
implicating libaio or the kernel's AIO subsystem.

Thanks.
Comment 12 Jeffrey Moyer 2008-01-02 11:22:37 EST
Is there any progress on this issue?
Comment 14 Jeremy West 2008-01-02 13:09:36 EST
Jeff,

Can we leave this open a little longer?  The customer in this situation has been
given an aio stress test that runs outside of the oracle processes.  We're
waiting for those results.

Thanks
Jeremy West
Comment 15 Jeffrey Moyer 2008-01-02 14:40:13 EST
I'm not convinced that aio-stress will reproduce the problem the customer is
experiencing.  In my opinion, it would be a better use of time to get more
debugging output from the application that produces the problem.
