Red Hat Bugzilla – Full Text Bug Listing
|Summary:||libvirtd terminating due to an lib augeas exit() call|
|Product:||[Fedora] Fedora||Reporter:||Stefan Berger <stefanb>|
|Component:||augeas||Assignee:||David Lutterkort <lutter>|
|Status:||CLOSED WONTFIX||QA Contact:||Fedora Extras Quality Assurance <extras-qa>|
|Version:||13||CC:||apevec, hbrock, lutter, mbooth, veillard|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2011-06-28 07:36:17 EDT||Type:||---|
|oVirt Team:||---||RHEL 7.3 requirements from Atomic Host:|
Description Stefan Berger 2010-10-14 11:16:41 EDT
Description of problem:
libvirtd suddenly terminates with the error message

  input in flex scanner failed

The stack trace shown below leads to libaugeas.

Version-Release number of selected component (if applicable):
# rpm -q -a | grep augeas
augeas-devel-0.7.3-1.fc13.x86_64
augeas-debuginfo-0.7.3-1.fc13.x86_64
augeas-libs-0.7.3-1.fc13.x86_64

How reproducible:
Not sure what exactly is triggering this. I am running several scripts (performing loops) against libvirtd that at some point cause this bug (non-deterministically?).

Actual results:
Thread 6 (Thread 0x7fffeffff710 (LWP 19852)):
#0  0x00000037dae35ef0 in exit () from /lib64/libc.so.6
#1  0x00000037dfa2882e in yy_fatal_error (msg=<value optimized out>, yyscanner=<value optimized out>) at lexer.c:1961
#2  0x00000037dfa2a730 in yy_get_next_buffer (yylval_param=<value optimized out>, yylloc_param=<value optimized out>, yyscanner=0x7fffd80eca40) at lexer.c:1371
#3  augl_lex (yylval_param=<value optimized out>, yylloc_param=<value optimized out>, yyscanner=0x7fffd80eca40) at lexer.c:1212
#4  0x00000037dfa136fb in augl_parse (term=0x7fffefffe4e8, scanner=0x7fffd80eca40) at parser.c:1585
#5  0x00000037dfa146d9 in augl_parse_file (aug=0x7fffd80a1b50, name=<value optimized out>, term=0x7fffefffe4e8) at parser.y:359
#6  0x00000037dfa0ed7e in load_module_file (aug=0x7fffd80a1b50, filename=0x7fffd80e91a0 "/usr/share/augeas/lenses/dist/build.aug") at syntax.c:1951
#7  0x00000037dfa0f9bb in load_module (aug=0x7fffd80a1b50, name=<value optimized out>) at syntax.c:1979
#8  0x00000037dfa0fb30 in lookup_internal (aug=0x7fffd80a1b50, ctx_modname=0x7fffd80cfc60 "Shellvars", name=0x7fffd80d4c50 "Build.xchgs", bnd=0x7fffefffe658) at syntax.c:513
#9  0x00000037dfa0fc9b in ctx_lookup_bnd (info=0x7fffd80d4c70, ctx=0x7fffefffe730, name=0x7fffd80d4c50 "Build.xchgs") at syntax.c:545
#10 0x00000037dfa0ffb1 in ctx_lookup_type (term=0x7fffd80d4ca0, ctx=0x7fffefffe730) at syntax.c:565
#11 check_exp (term=0x7fffd80d4ca0, ctx=0x7fffefffe730) at syntax.c:1239
#12 0x00000037dfa0ef6d in check_decl (aug=0x7fffd80a1b50, filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug") at syntax.c:1299
#13 typecheck (aug=0x7fffd80a1b50, filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug") at syntax.c:1362
#14 load_module_file (aug=0x7fffd80a1b50, filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug") at syntax.c:1954
#15 0x00000037dfa0f9bb in load_module (aug=0x7fffd80a1b50, name=<value optimized out>) at syntax.c:1979
#16 0x00000037dfa0fb30 in lookup_internal (aug=0x7fffd80a1b50, ctx_modname=0x0, name=0x7fffd80cf370 "Shellvars.lns", bnd=0x7fffefffe8b8) at syntax.c:513
#17 0x00000037dfa11fdc in lens_lookup (aug=<value optimized out>, qname=<value optimized out>) at syntax.c:524
#18 0x00000037dfa1e2a8 in lens_from_name (aug=0x7fffd80a1b50, name=0x7fffd80cf370 "Shellvars.lns") at transform.c:535
#19 0x00000037dfa1f4be in transform_validate (aug=0x7fffd80a1b50, xfm=0x7fffd80cf5c0) at transform.c:602
#20 0x00000037dfa05e9b in aug_load (aug=0x7fffd80a1b50) at augeas.c:527
#21 0x00000037df609804 in ?? () from /usr/lib64/libnetcf.so.1
#22 0x00000037df607293 in ?? () from /usr/lib64/libnetcf.so.1
#23 0x00000000004a0a7d in interfaceOpenInterface (conn=0x7fffd8001610, auth=<value optimized out>, flags=<value optimized out>) at interface/netcf_driver.c:135
#24 0x00007ffff7ac160a in do_open (name=0x0, auth=0x0, flags=0) at libvirt.c:1283
#25 0x00007ffff7ac20a6 in virConnectOpen (name=0x7fffd80088b0 "") at libvirt.c:1427
#26 0x0000000000428ab9 in remoteDispatchOpen (server=0x703830, client=0x7ffff00011f0, conn=<value optimized out>, hdr=<value optimized out>, rerr=0x7fffefffec70, args=0x7fffefffec20, ret=0x7fffefffebc0) at remote.c:422
#27 0x000000000042aef7 in remoteDispatchClientCall (server=0x703830, client=0x7ffff00011f0, msg=0x7ffff00c17e0) at dispatch.c:529
#28 remoteDispatchClientRequest (server=0x703830, client=0x7ffff00011f0, msg=0x7ffff00c17e0) at dispatch.c:407
#29 0x000000000041b708 in qemudWorker (data=0x7ffff0000908) at libvirtd.c:1570
#30 0x00000037db607761 in start_thread () from /lib64/libpthread.so.0
#31 0x00000037daee151d in clone () from /lib64/libc.so.6
Comment 1 Daniel Veillard 2010-10-14 11:23:12 EDT
Unfortunately, yy_fatal_error really comes from the (f)lex-generated code, and calling exit() from a library is really bad behaviour. But it should be avoidable. Adding

  #define YY_FATAL_ERROR(msg) fprintf( stderr, "%s\n", msg );

to src/lexer.l might be sufficient to avoid this problem.

Daniel
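A macro that only prints lets the generated scanner keep running after an error it treats as fatal, so a slightly stronger variant may be needed. The following is a minimal sketch, not the Augeas code: scan_state, scan_fatal, fake_lex and scan_once are hypothetical names, and fake_lex merely stands in for the flex-generated scanner. It records the message and unwinds with longjmp() so that the library entry point can return an error instead of calling exit().

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical per-scan state. In a reentrant flex scanner something
 * like this could be reached through yyextra (an assumption here). */
struct scan_state {
    jmp_buf on_fatal;      /* where to resume after a fatal scanner error */
    char    errmsg[128];   /* copy of the message for the caller */
};

/* Record the message and unwind instead of calling exit(). */
static void scan_fatal(struct scan_state *st, const char *msg)
{
    snprintf(st->errmsg, sizeof st->errmsg, "%s", msg);
    longjmp(st->on_fatal, 1);
}

/* Override of the flex hook; it has to be defined before the
 * generated code uses it. */
#define YY_FATAL_ERROR(msg) scan_fatal(st, (msg))

/* Stand-in for the generated scanner: nread < 0 models a failed read. */
static int fake_lex(struct scan_state *st, long nread)
{
    if (nread < 0)
        YY_FATAL_ERROR("input in flex scanner failed");
    return 0;   /* pretend a token was scanned */
}

/* Library entry point: returns 0 on success, -1 on a fatal scanner error. */
int scan_once(struct scan_state *st, long nread)
{
    assert(st != NULL);
    st->errmsg[0] = '\0';
    if (setjmp(st->on_fatal) != 0)
        return -1;   /* reached via longjmp() from scan_fatal() */
    return fake_lex(st, nread);
}
```

A caller would check the return value of scan_once() and read st.errmsg on failure. Anything the scanner allocated before the jump still has to be released by the caller, which is part of why error recovery needs care.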
Comment 2 David Lutterkort 2010-10-14 18:47:29 EDT
This is really nasty, but a fix is a little harder than the above, especially since this error is very hard to test. It's easy to avoid the exit() in this case, but I'd also like to make sure things don't blow up during error recovery.

What happened, in a nutshell, is that reading from one of the *.aug lenses (/usr/share/augeas/lenses/dist/build.aug, to be specific) failed. Is there anything special about the build.aug file on your system? It would be great if I could reproduce the problem and write an actual test for the fix.
Comment 3 Stefan Berger 2010-10-14 19:19:20 EDT
Here is the content of the build.aug file:

[root@d941e-10 ~]# cat /usr/share/augeas/lenses/dist/build.aug
(*
Module: Build
  Generic functions to build lenses

Author: Raphael Pinson <firstname.lastname@example.org>

About: License
  This file is licensed under the LGPLv2+, like the rest of Augeas.
*)

module Build =

let eol = Util.eol

(* Generic constructions *)
let brackets (l:lens) (r:lens) (lns:lens) = l . lns . r

(* List constructions *)
let list (lns:lens) (sep:lens) = lns . ( sep . lns )+
let opt_list (lns:lens) (sep:lens) = lns . ( sep . lns )*

(* Labels *)
let xchg (m:regexp) (d:string) (l:string) = del m d . label l
let xchgs (m:string) (l:string) = xchg m m l

(* Keys *)
let key_value_line (kw: regexp) (sep:lens) (sto:lens) = [ key kw . sep . sto . eol ]
let key_value (kw: regexp) (sep:lens) (sto:lens) = [ key kw . sep . sto ]

I doubt I can come up with a test case to reproduce this in a reasonable timeframe... I am causing this error by running concurrent tests against libvirtd, and the error occurs only after some time. Could it be a concurrency issue? That is, libaugeas is called from multiple threads at the same time, but something in libaugeas cannot deal with concurrency?
Comment 4 Stefan Berger 2010-10-14 20:47:46 EDT
Two more things:

I exported YYDEBUG=1 and then nothing happened: no failure for a long time, at least.

This time I also saw the output below. The 'I/O error' message had appeared before, and I did see other weird things, but this time it is really 'close' to the 'input in flex scanner failed' message:

[New Thread 0x7fffcabff710 (LWP 11609)]
Detaching after fork from child process 11610.
Detaching after fork from child process 11612.
Detaching after fork from child process 11674.
I/O error : Bad file descriptor
input in flex scanner failed
[Thread 0x7fffcabff710 (LWP 11609) exited]
[Thread 0x7fffcb600710 (LWP 11521) exited]
[Thread 0x7fffef5fe710 (LWP 8897) exited]
[Thread 0x7fffeffff710 (LWP 8895) exited]
[Thread 0x7ffff57f0710 (LWP 8893) exited]
[Thread 0x7ffff61f1710 (LWP 8891) exited]
[Thread 0x7ffff6bf2710 (LWP 8890) exited]
[Thread 0x7ffff4def710 (LWP 8894) exited]

Program exited with code 02.

I am not sure who or what complains about an I/O error.
Comment 5 Stefan Berger 2010-10-14 21:21:39 EDT
Well, good news. It's NOT an augeas bug from what I can see. The bug is related to an fd getting closed twice in libvirt code. -> closing this bug
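For illustration, here is a minimal single-threaded sketch of how a descriptor that is closed twice can produce 'Bad file descriptor' in unrelated code. It is not the libvirt code path: double_close_demo is a hypothetical name, and /dev/null stands in for the files involved. It relies only on the POSIX rule that open() returns the lowest free descriptor number.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns the errno seen by the "innocent" reader, 0 if its read worked,
 * or -1 if the setup itself failed. */
int double_close_demo(void)
{
    char c;

    int stale = open("/dev/null", O_RDONLY);   /* descriptor owned by component A */
    if (stale < 0)
        return -1;
    close(stale);                               /* first, legitimate close */

    /* Component B opens its own file. POSIX hands out the lowest free
     * number, so in a single thread B gets the number A just released. */
    int victim = open("/dev/null", O_RDONLY);
    if (victim < 0)
        return -1;

    close(stale);                               /* the bug: A closes again */

    errno = 0;
    if (read(victim, &c, 1) < 0)
        return errno;                           /* EBADF when victim == stale */

    close(victim);
    return 0;
}
```

In a multi-threaded daemon the same sequence depends on timing between threads, which would fit a failure that shows up only occasionally under concurrent load.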
Comment 6 Daniel Veillard 2010-10-15 03:05:31 EDT
I'm reopening: if the lexer code may call exit(), it's still a bug from a library POV. Sure, this crash was caused by bad conditions, but we should still try to avoid calling exit() from below.

Daniel
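The bad condition itself can be reproduced in isolation, which may help with writing the test asked for in comment 2. The sketch below assumes the scanner reads through stdio and, as flex's default input routine does, treats a zero-length fread() with ferror() set as fatal. read_fails_after_close is a hypothetical helper, not part of Augeas.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the read fails the way a stdio-based scanner would detect
 * it, 0 if the read succeeds, -1 if the file cannot be opened. */
int read_fails_after_close(const char *path)
{
    char buf[16];
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    /* Fault injection: close the descriptor behind the stream's back. */
    close(fileno(fp));

    size_t n = fread(buf, 1, sizeof buf, fp);
    int failed = (n == 0 && ferror(fp)) ? 1 : 0;

    /* fclose() still frees the FILE; its own close() fails with EBADF.
     * This is safe only because nothing else opened a descriptor in between. */
    fclose(fp);
    return failed;
}
```

A test could apply the same injection to the stream handed to the lens parser and then check that the load call reports an error rather than terminating the process.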
Comment 7 Bug Zapper 2011-05-31 07:20:08 EDT
This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 8 Bug Zapper 2011-06-28 07:36:17 EDT
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. Thank you for reporting this bug and we are sorry it could not be fixed.