Bug 643071 - libvirtd terminating due to a libaugeas exit() call
Summary: libvirtd terminating due to a libaugeas exit() call
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: augeas
Version: 13
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: David Lutterkort
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-10-14 15:16 UTC by Stefan Berger
Modified: 2013-04-30 23:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-28 11:36:17 UTC
Type: ---
Embargoed:



Description Stefan Berger 2010-10-14 15:16:41 UTC
Description of problem:

libvirtd suddenly terminates with the error message

input in flex scanner failed

The stack trace shown below leads to libaugeas.


Version-Release number of selected component (if applicable):

# rpm -q -a | grep augeas
augeas-devel-0.7.3-1.fc13.x86_64
augeas-debuginfo-0.7.3-1.fc13.x86_64
augeas-libs-0.7.3-1.fc13.x86_64

How reproducible:

Not sure what exactly is triggering this. I am running several scripts in loops against libvirtd, and at some point they trigger this bug, apparently non-deterministically.

Actual results:

Thread 6 (Thread 0x7fffeffff710 (LWP 19852)):
#0  0x00000037dae35ef0 in exit () from /lib64/libc.so.6
#1  0x00000037dfa2882e in yy_fatal_error (msg=<value optimized out>, 
    yyscanner=<value optimized out>) at lexer.c:1961
#2  0x00000037dfa2a730 in yy_get_next_buffer (
    yylval_param=<value optimized out>, yylloc_param=<value optimized out>, 
    yyscanner=0x7fffd80eca40) at lexer.c:1371
#3  augl_lex (yylval_param=<value optimized out>, 
    yylloc_param=<value optimized out>, yyscanner=0x7fffd80eca40)
    at lexer.c:1212
#4  0x00000037dfa136fb in augl_parse (term=0x7fffefffe4e8, 
    scanner=0x7fffd80eca40) at parser.c:1585
#5  0x00000037dfa146d9 in augl_parse_file (aug=0x7fffd80a1b50, 
    name=<value optimized out>, term=0x7fffefffe4e8) at parser.y:359
#6  0x00000037dfa0ed7e in load_module_file (aug=0x7fffd80a1b50, 
    filename=0x7fffd80e91a0 "/usr/share/augeas/lenses/dist/build.aug")
    at syntax.c:1951
#7  0x00000037dfa0f9bb in load_module (aug=0x7fffd80a1b50, 
    name=<value optimized out>) at syntax.c:1979
#8  0x00000037dfa0fb30 in lookup_internal (aug=0x7fffd80a1b50, 
    ctx_modname=0x7fffd80cfc60 "Shellvars", name=0x7fffd80d4c50 "Build.xchgs", 
    bnd=0x7fffefffe658) at syntax.c:513
#9  0x00000037dfa0fc9b in ctx_lookup_bnd (info=0x7fffd80d4c70, 
    ctx=0x7fffefffe730, name=0x7fffd80d4c50 "Build.xchgs") at syntax.c:545
#10 0x00000037dfa0ffb1 in ctx_lookup_type (term=0x7fffd80d4ca0, 
    ctx=0x7fffefffe730) at syntax.c:565
#11 check_exp (term=0x7fffd80d4ca0, ctx=0x7fffefffe730) at syntax.c:1239
#12 0x00000037dfa0ef6d in check_decl (aug=0x7fffd80a1b50, 
    filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug")
    at syntax.c:1299
#13 typecheck (aug=0x7fffd80a1b50, 
    filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug")
    at syntax.c:1362
#14 load_module_file (aug=0x7fffd80a1b50, 
    filename=0x7fffd80cf100 "/usr/share/augeas/lenses/dist/shellvars.aug")
    at syntax.c:1954
#15 0x00000037dfa0f9bb in load_module (aug=0x7fffd80a1b50, 
    name=<value optimized out>) at syntax.c:1979
#16 0x00000037dfa0fb30 in lookup_internal (aug=0x7fffd80a1b50, 
    ctx_modname=0x0, name=0x7fffd80cf370 "Shellvars.lns", bnd=0x7fffefffe8b8)
    at syntax.c:513
#17 0x00000037dfa11fdc in lens_lookup (aug=<value optimized out>, 
    qname=<value optimized out>) at syntax.c:524
#18 0x00000037dfa1e2a8 in lens_from_name (aug=0x7fffd80a1b50, 
    name=0x7fffd80cf370 "Shellvars.lns") at transform.c:535
#19 0x00000037dfa1f4be in transform_validate (aug=0x7fffd80a1b50, 
    xfm=0x7fffd80cf5c0) at transform.c:602
#20 0x00000037dfa05e9b in aug_load (aug=0x7fffd80a1b50) at augeas.c:527
#21 0x00000037df609804 in ?? () from /usr/lib64/libnetcf.so.1
#22 0x00000037df607293 in ?? () from /usr/lib64/libnetcf.so.1
#23 0x00000000004a0a7d in interfaceOpenInterface (conn=0x7fffd8001610, 
    auth=<value optimized out>, flags=<value optimized out>)
    at interface/netcf_driver.c:135
#24 0x00007ffff7ac160a in do_open (name=0x0, auth=0x0, flags=0)
    at libvirt.c:1283
#25 0x00007ffff7ac20a6 in virConnectOpen (name=0x7fffd80088b0 "")
    at libvirt.c:1427
#26 0x0000000000428ab9 in remoteDispatchOpen (server=0x703830, 
    client=0x7ffff00011f0, conn=<value optimized out>, 
    hdr=<value optimized out>, rerr=0x7fffefffec70, args=0x7fffefffec20, 
    ret=0x7fffefffebc0) at remote.c:422
#27 0x000000000042aef7 in remoteDispatchClientCall (server=0x703830, 
    client=0x7ffff00011f0, msg=0x7ffff00c17e0) at dispatch.c:529
#28 remoteDispatchClientRequest (server=0x703830, client=0x7ffff00011f0, 
    msg=0x7ffff00c17e0) at dispatch.c:407
#29 0x000000000041b708 in qemudWorker (data=0x7ffff0000908) at libvirtd.c:1570
#30 0x00000037db607761 in start_thread () from /lib64/libpthread.so.0
#31 0x00000037daee151d in clone () from /lib64/libc.so.6

Comment 1 Daniel Veillard 2010-10-14 15:23:12 UTC
Unfortunately, yy_fatal_error really does come from the (f)lex-generated
code, and calling exit() there is really bad behaviour. But it
should be avoidable,


#define YY_FATAL_ERROR(msg) fprintf( stderr, "%s\n", msg );

in src/lexer.l might be sufficient to avoid this problem

Daniel

Comment 2 David Lutterkort 2010-10-14 22:47:29 UTC
This is really nasty, but a fix is a little harder than the above, especially since this error is very hard to test. It's easy to avoid the exit() in this case, but I'd also like to make sure things don't blow up during error recovery.

What happened in a nutshell is that reading from one of the *.aug lenses (/usr/share/augeas/lenses/dist/build.aug to be specific) failed.

Is there anything special about the build.aug file on your system? It would be great if I could reproduce the problem and write an actual test to fix it.
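
To illustrate the recovery concern in this comment: generated flex code is generally written on the assumption that the fatal-error hook does not return, so a hook that only prints and returns can leave the scanner running on bad state. One common alternative is to record the message and jump back to the caller. The following is only a minimal sketch of that pattern, not the augeas code and not flex-generated code: `scan_state`, `scan_fatal`, `fake_scan` and `scan_guarded` are hypothetical names, and `fake_scan` merely stands in for a scanner whose input read can fail.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Hypothetical per-scanner state. In a reentrant flex scanner, data like
 * this would typically hang off the scanner's "extra" pointer. */
struct scan_state {
    jmp_buf on_fatal;     /* where to resume after a fatal scanner error */
    char    errmsg[128];  /* last fatal message, empty if none */
};

/* Stand-in for the fatal-error hook: record the message and jump back to
 * the caller instead of calling exit(). This function never returns. */
static void scan_fatal(struct scan_state *st, const char *msg)
{
    snprintf(st->errmsg, sizeof st->errmsg, "%s", msg);
    longjmp(st->on_fatal, 1);
}

/* Stand-in for the generated scanner: counts lines and treats a read
 * error on the input stream as fatal. */
static int fake_scan(struct scan_state *st, FILE *in)
{
    int lines = 0;
    int c;

    while ((c = fgetc(in)) != EOF)
        if (c == '\n')
            lines++;
    if (ferror(in))
        scan_fatal(st, "input in flex scanner failed");
    return lines;
}

/* Returns the number of lines scanned, or -1 with st->errmsg set. */
int scan_guarded(struct scan_state *st, FILE *in)
{
    assert(st != NULL && in != NULL);
    st->errmsg[0] = '\0';
    if (setjmp(st->on_fatal) != 0)
        return -1;        /* reached via longjmp() from scan_fatal() */
    return fake_scan(st, in);
}
```

A caller would check for a return value of -1 and report `errmsg` as an ordinary error. In a real scanner the caller would still have to release the scanner's buffers after the jump, which is the part that is hard to test.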

Comment 3 Stefan Berger 2010-10-14 23:19:20 UTC
Here is the content of the build.aug file:

[root@d941e-10 ~]# cat /usr/share/augeas/lenses/dist/build.aug
(*
Module: Build
   Generic functions to build lenses

Author: Raphael Pinson <raphink>

About: License
  This file is licensed under the LGPLv2+, like the rest of Augeas.
*)


module Build =

let eol = Util.eol

(* Generic constructions *)
let brackets (l:lens) (r:lens) (lns:lens) = l . lns . r

(* List constructions *)
let list (lns:lens) (sep:lens) = lns . ( sep . lns )+
let opt_list (lns:lens) (sep:lens) = lns . ( sep . lns )*

(* Labels *)
let xchg (m:regexp) (d:string) (l:string) = del m d . label l

let xchgs (m:string) (l:string) = xchg m m l

(* Keys *)
let key_value_line (kw: regexp) (sep:lens) (sto:lens) =
                                   [ key kw . sep . sto . eol ]

let key_value (kw: regexp) (sep:lens) (sto:lens) =
                                   [ key kw . sep . sto ]


I doubt I can come up with a test case to reproduce this in a reasonable timeframe... I am causing this error by running concurrent tests against libvirtd, and the error only occurs after some time. Could it be a concurrency issue? That is, libaugeas is called from multiple threads at the same time but something in libaugeas cannot deal with concurrency?

Comment 4 Stefan Berger 2010-10-15 00:47:46 UTC
Two more things:

I exported YYDEBUG=1 and then nothing happened -- no failure in a long time at least.

This time I also saw the output below. The I/O error message had appeared before, along with other weird things, but this time it comes right before the 'input in flex scanner failed' message:

[New Thread 0x7fffcabff710 (LWP 11609)]
Detaching after fork from child process 11610.
Detaching after fork from child process 11612.
Detaching after fork from child process 11674.
I/O error : Bad file descriptor
input in flex scanner failed
[Thread 0x7fffcabff710 (LWP 11609) exited]
[Thread 0x7fffcb600710 (LWP 11521) exited]
[Thread 0x7fffef5fe710 (LWP 8897) exited]
[Thread 0x7fffeffff710 (LWP 8895) exited]
[Thread 0x7ffff57f0710 (LWP 8893) exited]
[Thread 0x7ffff61f1710 (LWP 8891) exited]
[Thread 0x7ffff6bf2710 (LWP 8890) exited]
[Thread 0x7ffff4def710 (LWP 8894) exited]

Program exited with code 02.

I am not sure who or what complains about an I/O error.

Comment 5 Stefan Berger 2010-10-15 01:21:39 UTC
Well, good news. 
It's NOT an augeas bug from what I can see. The bug is related to an fd getting closed twice in libvirt code.

-> closing this bug
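
For readers wondering how a double close elsewhere can make an unrelated read fail with "Bad file descriptor": descriptor numbers are reused, so a stale second close() can hit a file that another part of the program has just opened. The sketch below is not the libvirt code; it reproduces the effect in a single thread, where the interleaving is deterministic. The function names and the `/dev/null` and `/dev/zero` paths are chosen purely for illustration, and `close_once` is a hypothetical guard.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns the errno seen by the "victim" read, or 0 if the read succeeded. */
int stale_close_breaks_reader(void)
{
    char byte;

    int owner = open("/dev/null", O_RDONLY);  /* fd owned by the buggy code */
    assert(owner >= 0);
    close(owner);                             /* first, legitimate close */

    /* Another part of the program opens a file. open() returns the lowest
     * free descriptor, so it reuses the number that was just released. */
    int victim = open("/dev/zero", O_RDONLY);
    assert(victim == owner);

    close(owner);                             /* stale second close hits victim */

    errno = 0;
    ssize_t n = read(victim, &byte, 1);       /* victim's descriptor is gone */
    return n < 0 ? errno : 0;
}

/* Hypothetical guard: close at most once by invalidating the variable. */
int close_once(int *fd)
{
    int rc = 0;

    if (*fd >= 0) {
        rc = close(*fd);
        *fd = -1;
    }
    return rc;
}
```

On a POSIX system `stale_close_breaks_reader()` is expected to return EBADF. In a threaded daemon the same sequence depends on timing, which would fit the non-deterministic failures described in this report.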

Comment 6 Daniel Veillard 2010-10-15 07:05:31 UTC
I'm reopening. If the lexer code may call exit(), that is still a
bug from a library POV. Sure, this bug was triggered by bad conditions,
but we should try to avoid calling exit() from below.

Daniel

Comment 7 Bug Zapper 2011-05-31 11:20:08 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Bug Zapper 2011-06-28 11:36:17 UTC
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

