Bug 845660

Summary: kernel-3.5.0-2.fc17 disrupts nfs domains in ovirt/vdsm
Product: [Retired] oVirt Reporter: Jason Brooks <jbrooks>
Component: vdsm Assignee: Saggi Mizrahi <smizrahi>
Status: CLOSED CURRENTRELEASE QA Contact: Haim <hateya>
Severity: urgent Docs Contact:
Priority: urgent    
Version: unspecified CC: abaron, acathrow, amureini, atkac, bazulay, bretm, danken, dyasny, frank.enderle, fsimonce, fweyns, fxgsell, gansalmon, gpadgett, gspurgeon, gspurgeon, iheim, itamar, jason, jboggs, jclift, jeff, jonathan, jrankin, jscalia, kernel-maint, madhu.chinakonda, mburns, mcl, mgoldboi, mmello, pbrobinson, ricardo.arguello, rydekull, stephen.dart, wdh, yeylon, ykaul
Target Milestone: ---   
Target Release: 3.3.4   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: vdsm-4.10.0-10.fc17 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 873761 (view as bug list) Environment:
Last Closed: 2013-01-09 10:42:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 873761    
Attachments:
Description Flags
Engine and VDSM logs, from the two boxes involved. none
vdsm log from F17 host w/ 3.5 kernel none
core.7273.1344613714.dump.1.xz none

Description Jason Brooks 2012-08-03 19:29:18 UTC
Description of problem:

Following update to kernel-3.5.0-2 on two F17 x86_64 machines, the machines won't connect to oVirt 3.1 nfs domains. Returning to 3.4.6-2.fc17.x86_64 resolves the issue.

I don't know what's causing the issue, but I'm happy to help debug it.

Version-Release number of selected component (if applicable):

kernel-3.5.0-2.fc17.x86_64 

How reproducible:


Steps to Reproduce:
1. Update to this version of the kernel on F17 machine serving as oVirt host. 
2. 
3.
  
Actual results:

NFS domain is inaccessible. 

Expected results:


Additional info:

I've tried putting SELinux into permissive mode and taking down the firewall; neither has any effect. The affected host does mount the nfs share, but fails while attaching to it.

Comment 1 Mike Burns 2012-08-09 15:53:10 UTC
Any chance someone can look at this?  It's a critical issue for oVirt.

Comment 2 Federico Simoncelli 2012-08-09 22:20:30 UTC
(In reply to comment #0)
> Description of problem:
> 
> Following update to kernel-3.5.0-2 on two F17 x86_64 machines, the machines
> won't connect to oVirt 3.1 nfs domains. Returning to 3.4.6-2.fc17.x86_64
> resolves the issue.

Can you attach the relevant vdsm logs? Thanks.

Comment 3 Justin Clift 2012-08-09 22:51:14 UTC
Created attachment 603363 [details]
Engine and VDSM logs, from the two boxes involved.

Inside the .zip file are two .tar.bz2 archives: one for the ovirt engine log, and one for the vdsm log.

Comment 4 Justin Clift 2012-08-09 22:55:20 UTC
Just to clarify, by "two boxes involved" in my comment, I mean the two servers in my test environment here.  They're different boxes from Jason's, from the original report.

Further useful info may be in BZ 847083 too. (same problem, before we knew it was kernel version specific)

Comment 5 Jason Brooks 2012-08-10 00:19:46 UTC
Created attachment 603368 [details]
vdsm log from F17 host w/ 3.5 kernel

This is a vdsm log from an F17 ovirt host that had, while running a pre-3.5 kernel, been configured to use an NFS master data domain. Upon booting into the current 3.5 kernel, the host will no longer connect to the NFS domain. Booting back into the earlier kernel resolves the issue.

Comment 6 Federico Simoncelli 2012-08-10 16:16:37 UTC
One of the vdsm processes (from oop) crashed with the following backtrace:

#0  0x00007fa0f9398925 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007fa0f939a0d8 in __GI_abort () at abort.c:91
#2  0x00007fa0f93d864b in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fa0f94dbc28 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3  0x00007fa0f93df7ce in malloc_printerr (ptr=0x7fa09c00e000, str=0x7fa0f94dbce8 "double free or corruption (!prev)", action=3) at malloc.c:5027
#4  _int_free (av=0x7fa09c000020, p=0x7fa09c00dff0, have_lock=0) at malloc.c:3948
#5  0x00007fa0f632ee90 in ffi_call_unix64 () from /lib64/libffi.so.5
#6  0x00007fa0f632e8a0 in ffi_call () from /lib64/libffi.so.5
#7  0x00007fa0f6548cc3 in _call_function_pointer (argcount=1, resmem=0x7fa0adff66e0, restype=<optimized out>, atypes=<optimized out>, avalues=0x7fa0adff66c0, pProc=0x7fa0f93e3140 <__GI___libc_free>, flags=4361)
    at /usr/src/debug/Python-2.7.3/Modules/_ctypes/callproc.c:827
#8  _ctypes_callproc (pProc=pProc@entry=0x7fa0f93e3140 <__GI___libc_free>, argtuple=argtuple@entry=(<c_char_p at remote 0x1b72e60>,), flags=4361, argtypes=argtypes@entry=0x0, restype=
    <_ctypes.PyCSimpleType at remote 0x1c4fab0>, checker=0x0) at /usr/src/debug/Python-2.7.3/Modules/_ctypes/callproc.c:1174
#9  0x00007fa0f65423dd in PyCFuncPtr_call (self=<optimized out>, inargs=<optimized out>, kwds=0x0) at /usr/src/debug/Python-2.7.3/Modules/_ctypes/_ctypes.c:3913
#10 0x00007fa0fa081a7e in PyObject_Call (func=func@entry=<_FuncPtr(__name__='free') at remote 0x24a8ae0>, arg=arg@entry=(<c_char_p at remote 0x1b72e60>,), kw=kw@entry=0x0)
    at /usr/src/debug/Python-2.7.3/Objects/abstract.c:2529
#11 0x00007fa0fa1113e3 in do_call (nk=<optimized out>, na=1, pp_stack=0x7fa0adff6aa8, func=<_FuncPtr(__name__='free') at remote 0x24a8ae0>) at /usr/src/debug/Python-2.7.3/Python/ceval.c:4316
#12 call_function (oparg=<optimized out>, pp_stack=0x7fa0adff6aa8) at /usr/src/debug/Python-2.7.3/Python/ceval.c:4121
#13 PyEval_EvalFrameEx (f=f@entry=
    Frame 0x7fa09c00d5a0, for file /usr/share/vdsm/storage/fileUtils.py, line 272, in _createAlignedBuffer (self=<DirectFile(_closed=False, _writable=False, _mode='dr', _fd=3) at remote 0x1c82d90>, size=1024, pbuff=<c_char_p at remote 0x1b72e60>, ppbuff=<LP_c_char_p at remote 0x1b72cb0>, rc=0), throwflag=<optimized out>) at /usr/src/debug/Python-2.7.3/Python/ceval.c:2740

The relevant part for VDSM is vdsm/storage/fileUtils.py:272

259     @contextmanager
260     def _createAlignedBuffer(self, size):
261         pbuff = ctypes.c_char_p(0)
262         ppbuff = ctypes.pointer(pbuff)
263         # Because we usually have fixed sizes for our reads, caching
264         # buffers might give a slight performance boost.
265         rc = libc.posix_memalign(ppbuff, PAGESIZE, size)
266         if rc:
267             raise OSError(rc, "Could not allocate aligned buffer")
268         try:
269             ctypes.memset(pbuff, 0, size)
270             yield pbuff
271         finally:
272             libc.free(pbuff)

In conclusion, the NFS operation gets stuck because the helper died (why isn't vdsm detecting that the fd has been closed?).
I'm still not sure whether this is a VDSM issue that is exposed only by the newer kernel, or whether glibc and the kernel currently have an issue with posix_memalign+free.
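For reference while auditing the ctypes usage, here is a minimal sketch of the same allocate/zero/free pattern with the libc prototypes declared explicitly. It is not the vdsm code and not the fix for this bug; the name aligned_buffer and the choice of c_void_p with explicit argtypes are assumptions made for illustration only.

```python
import ctypes
import ctypes.util
import mmap
from contextlib import contextmanager

# Load libc; find_library("c") is expected to resolve to libc.so.6 on Linux.
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

# Declare the prototypes so ctypes does not have to guess argument widths.
libc.posix_memalign.argtypes = [
    ctypes.POINTER(ctypes.c_void_p),  # void **memptr
    ctypes.c_size_t,                  # size_t alignment
    ctypes.c_size_t,                  # size_t size
]
libc.posix_memalign.restype = ctypes.c_int
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


@contextmanager
def aligned_buffer(size, alignment=mmap.PAGESIZE):
    """Yield a zeroed, aligned buffer and free it exactly once (hypothetical helper)."""
    pbuff = ctypes.c_void_p()
    rc = libc.posix_memalign(ctypes.byref(pbuff), alignment, size)
    if rc:
        raise OSError(rc, "Could not allocate aligned buffer")
    try:
        ctypes.memset(pbuff.value, 0, size)
        yield pbuff
    finally:
        libc.free(pbuff)
        # Clear the pointer so an accidental second free() becomes free(NULL).
        pbuff.value = None
```

Any code that writes into the buffer (for example a read() into it) must stay within size bytes; a write past the end is one way to get the "double free or corruption (!prev)" abort from a single free().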

Comment 7 Federico Simoncelli 2012-08-10 16:22:21 UTC
Created attachment 603582 [details]
core.7273.1344613714.dump.1.xz

VDSM core dump file.
gdb /bin/python core.7273.1344613714.dump

Comment 8 Federico Simoncelli 2012-08-10 16:32:49 UTC
This is easily reproducible with:

$ uname -sr ; rpm -q glibc
Linux 3.5.0-2.fc17.x86_64
glibc-2.15-51.fc17.x86_64

$ cat python_crash.py 
import ctypes
libc = ctypes.CDLL("libc.so.6", use_errno=True)

pbuff = ctypes.c_char_p(0)
ppbuff = ctypes.pointer(pbuff)

SIZE = 100
libc.posix_memalign(ppbuff, libc.getpagesize(), SIZE)
ctypes.memset(pbuff, 0, SIZE)

libc.free(pbuff)
libc.free(pbuff)


$ python python_crash.py
*** glibc detected *** python: double free or corruption (fasttop): 0x0000000001c55000 ***
======= Backtrace: =========
/lib64/libc.so.6[0x35c307c7ce]
/lib64/libffi.so.5(ffi_call_unix64+0x4c)[0x35c5c05e90]
/lib64/libffi.so.5(ffi_call+0x1e0)[0x35c5c058a0]
/usr/lib64/python2.7/lib-dynload/_ctypes.so(_ctypes_callproc+0x3e3)[0x7ff17fdebcc3]
/usr/lib64/python2.7/lib-dynload/_ctypes.so(+0xa3dd)[0x7ff17fde53dd]
/lib64/libpython2.7.so.1.0(PyObject_Call+0x4e)[0x35d4a49a7e]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x1c93)[0x35d4ad93e3]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x87f)[0x35d4addb1f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x35d4addbf2]
/lib64/libpython2.7.so.1.0[0x35d4af6b9a]
/lib64/libpython2.7.so.1.0(PyRun_FileExFlags+0x92)[0x35d4af7992]
/lib64/libpython2.7.so.1.0(PyRun_SimpleFileExFlags+0xdb)[0x35d4af83ab]
/lib64/libpython2.7.so.1.0(Py_Main+0xc32)[0x35d4b09882]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x35c3021735]
python[0x4006f1]
======= Memory map: ========
00400000-00401000 r-xp 00000000 fd:02 546244                             /usr/bin/python2.7
00600000-00601000 r--p 00000000 fd:02 546244                             /usr/bin/python2.7
00601000-00602000 rw-p 00001000 fd:02 546244                             /usr/bin/python2.7
01b59000-01c72000 rw-p 00000000 00:00 0                                  [heap]
35c2800000-35c2820000 r-xp 00000000 fd:02 524867                         /usr/lib64/ld-2.15.so
35c2a1f000-35c2a20000 r--p 0001f000 fd:02 524867                         /usr/lib64/ld-2.15.so
35c2a20000-35c2a21000 rw-p 00020000 fd:02 524867                         /usr/lib64/ld-2.15.so
35c2a21000-35c2a22000 rw-p 00000000 00:00 0 
35c3000000-35c31ac000 r-xp 00000000 fd:02 527324                         /usr/lib64/libc-2.15.so
35c31ac000-35c33ac000 ---p 001ac000 fd:02 527324                         /usr/lib64/libc-2.15.so
35c33ac000-35c33b0000 r--p 001ac000 fd:02 527324                         /usr/lib64/libc-2.15.so
35c33b0000-35c33b2000 rw-p 001b0000 fd:02 527324                         /usr/lib64/libc-2.15.so
35c33b2000-35c33b7000 rw-p 00000000 00:00 0 
35c3400000-35c3416000 r-xp 00000000 fd:02 527412                         /usr/lib64/libpthread-2.15.so
35c3416000-35c3616000 ---p 00016000 fd:02 527412                         /usr/lib64/libpthread-2.15.so
35c3616000-35c3617000 r--p 00016000 fd:02 527412                         /usr/lib64/libpthread-2.15.so
35c3617000-35c3618000 rw-p 00017000 fd:02 527412                         /usr/lib64/libpthread-2.15.so
35c3618000-35c361c000 rw-p 00000000 00:00 0 
35c3800000-35c38fa000 r-xp 00000000 fd:02 527882                         /usr/lib64/libm-2.15.so
35c38fa000-35c3af9000 ---p 000fa000 fd:02 527882                         /usr/lib64/libm-2.15.so
35c3af9000-35c3afa000 r--p 000f9000 fd:02 527882                         /usr/lib64/libm-2.15.so
35c3afa000-35c3afb000 rw-p 000fa000 fd:02 527882                         /usr/lib64/libm-2.15.so
35c3c00000-35c3c03000 r-xp 00000000 fd:02 527494                         /usr/lib64/libdl-2.15.so
35c3c03000-35c3e02000 ---p 00003000 fd:02 527494                         /usr/lib64/libdl-2.15.so
35c3e02000-35c3e03000 r--p 00002000 fd:02 527494                         /usr/lib64/libdl-2.15.so
35c3e03000-35c3e04000 rw-p 00003000 fd:02 527494                         /usr/lib64/libdl-2.15.so
35c5000000-35c5015000 r-xp 00000000 fd:02 527885                         /usr/lib64/libgcc_s-4.7.0-20120507.so.1
35c5015000-35c5214000 ---p 00015000 fd:02 527885                         /usr/lib64/libgcc_s-4.7.0-20120507.so.1
35c5214000-35c5215000 rw-p 00014000 fd:02 527885                         /usr/lib64/libgcc_s-4.7.0-20120507.so.1
35c5c00000-35c5c07000 r-xp 00000000 fd:02 528449                         /usr/lib64/libffi.so.5.0.10
35c5c07000-35c5e06000 ---p 00007000 fd:02 528449                         /usr/lib64/libffi.so.5.0.10
35c5e06000-35c5e07000 r--p 00006000 fd:02 528449                         /usr/lib64/libffi.so.5.0.10
35c5e07000-35c5e08000 rw-p 00007000 fd:02 528449                         /usr/lib64/libffi.so.5.0.10
35d4a00000-35d4b6c000 r-xp 00000000 fd:02 546233                         /usr/lib64/libpython2.7.so.1.0
35d4b6c000-35d4d6c000 ---p 0016c000 fd:02 546233                         /usr/lib64/libpython2.7.so.1.0
35d4d6c000-35d4d6d000 r--p 0016c000 fd:02 546233                         /usr/lib64/libpython2.7.so.1.0
35d4d6d000-35d4daa000 rw-p 0016d000 fd:02 546233                         /usr/lib64/libpython2.7.so.1.0
35d4daa000-35d4dba000 rw-p 00000000 00:00 0 
35d5800000-35d5802000 r-xp 00000000 fd:02 533736                         /usr/lib64/libutil-2.15.so
35d5802000-35d5a01000 ---p 00002000 fd:02 533736                         /usr/lib64/libutil-2.15.so
35d5a01000-35d5a02000 r--p 00001000 fd:02 533736                         /usr/lib64/libutil-2.15.so
35d5a02000-35d5a03000 rw-p 00002000 fd:02 533736                         /usr/lib64/libutil-2.15.so
7ff17fb91000-7ff17fbd2000 rw-p 00000000 00:00 0 
7ff17fbd2000-7ff17fbd9000 r-xp 00000000 fd:02 532340                     /usr/lib64/python2.7/lib-dynload/_struct.so
7ff17fbd9000-7ff17fdd8000 ---p 00007000 fd:02 532340                     /usr/lib64/python2.7/lib-dynload/_struct.so
7ff17fdd8000-7ff17fdd9000 r--p 00006000 fd:02 532340                     /usr/lib64/python2.7/lib-dynload/_struct.so
7ff17fdd9000-7ff17fddb000 rw-p 00007000 fd:02 532340                     /usr/lib64/python2.7/lib-dynload/_struct.so
7ff17fddb000-7ff17fdf4000 r-xp 00000000 fd:02 532139                     /usr/lib64/python2.7/lib-dynload/_ctypes.so
7ff17fdf4000-7ff17fff4000 ---p 00019000 fd:02 532139                     /usr/lib64/python2.7/lib-dynload/_ctypes.so
7ff17fff4000-7ff17fff5000 r--p 00019000 fd:02 532139                     /usr/lib64/python2.7/lib-dynload/_ctypes.so
7ff17fff5000-7ff17fff9000 rw-p 0001a000 fd:02 532139                     /usr/lib64/python2.7/lib-dynload/_ctypes.so
7ff17fff9000-7ff186426000 r--p 00000000 fd:02 569679                     /usr/lib/locale/locale-archive
7ff186426000-7ff1864d9000 rw-p 00000000 00:00 0 
7ff1864da000-7ff186561000 rw-p 00000000 00:00 0 
7ff18657c000-7ff18657e000 rw-p 00000000 00:00 0 
7fff0cf06000-7fff0cf27000 rw-p 00000000 00:00 0                          [stack]
7fff0cf4e000-7fff0cf4f000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Aborted (core dumped)

Comment 9 Federico Simoncelli 2012-08-10 17:00:46 UTC
(In reply to comment #8)
> This is easily reproducible with:

The reproducer above is probably invalid: it calls libc.free twice, which would only model a real double free performed by either vdsm or ctypes/python, while the original is most likely a real corruption. Also, the glibc message looks slightly different (!prev vs fasttop):

*** glibc detected *** python: double free or corruption (fasttop): 0x0000000001c55000 ***

*** glibc detected *** python: double free or corruption (!prev): 0x00007fe710027000 ***

Comment 10 Jason Brooks 2012-08-20 21:26:27 UTC
Is there any further information I can provide to help advance this bug? The current stable version of oVirt is broken due to this. Perhaps this would be better dealt with as a vdsm bug?

Comment 11 Federico Simoncelli 2012-08-21 07:51:36 UTC
(In reply to comment #10)
> Is there any further information I can provide to help advance this bug? The
> current stable version of oVirt is broken due to this. Perhaps this would be
> better dealt with as a vdsm bug?

We need someone to volunteer to backport these (master branch) to the ovirt-3.1 branch:

8f226cf Change oop to be a new process instead of a fork
41ca78b Fixing broken compilation
b400488 fix logging
5bcb224 Add missing log object to CrabRPCServer

http://gerrit.ovirt.org

Comment 12 Federico Simoncelli 2012-08-21 14:01:20 UTC
As I suspected, this also affects the vdsm master branch (it's not related to the oop mechanism used). I just found a vdsm host using crabrpc with the same issue (kernel 3.5.1-1.fc17.x86_64).

I noticed this message:

kernel: [86611.703352] python[27692]: segfault at 18 ip 0000003fb307b06b sp 00007fff34c3b240 error 4 in libc-2.15.so[3fb3000000+1ab000]
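To make that kernel message easier to read, here is a small sketch that parses it and computes the offset of the faulting instruction inside libc. The decode_segfault name and the regular expression are illustrative assumptions; the bit meanings are the standard x86 page-fault error code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode).

```python
import re

LINE = ("python[27692]: segfault at 18 ip 0000003fb307b06b sp 00007fff34c3b240 "
        "error 4 in libc-2.15.so[3fb3000000+1ab000]")

# All numeric fields in this kernel message are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode one x86 'segfault at' kernel log line (hypothetical helper)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised segfault line")
    err = int(match.group("err"), 16)
    return {
        "object": match.group("obj"),
        "fault_address": int(match.group("addr"), 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset": int(match.group("ip"), 16) - int(match.group("base"), 16),
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LINE)
print(info["object"], hex(info["offset"]), hex(info["fault_address"]))
```

For the line above this gives an offset of 0x7b06b into libc-2.15.so and a user-mode read of address 0x18 on an unmapped page, i.e. a near-NULL pointer dereference inside libc. With the matching glibc debuginfo installed, the offset can be resolved to a function.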

Comment 14 Ayal Baron 2012-09-09 23:19:36 UTC
Fede, why is this on vdsm?

Comment 15 Federico Simoncelli 2012-09-12 14:36:58 UTC
(In reply to comment #14)
> Fede, why is this on vdsm?

I doubt that it's a kernel/glibc bug (posix_memalign/free are widely used). It could be a python/ctypes issue, but at the moment I'd try to figure out whether we're using them properly (for example, whether there's any path that might lead to a double free on the same pointer, etc.).

Comment 16 Mike Burns 2012-09-12 15:15:40 UTC
@abaron -- anything we can do to make this a higher priority?  It's a blocking issue for all ovirt-node use.  If it needs to go to the kernel team or we need to pull in people from a different team, I'll do that, but I need to know who to talk to.

Comment 18 Dan Kenigsberg 2012-09-24 11:31:40 UTC
Jason, would you please try Saggi's http://gerrit.ovirt.org/8143 ?

Comment 19 Jason Brooks 2012-09-24 21:40:28 UTC
(In reply to comment #18)
> Jason, would you please try Saggi's http://gerrit.ovirt.org/8143 ?

Maybe I'm doing it wrong, but I built a new rpm from vdsm master and, after modding the spec file to work with the version of libvirt that comes with F17, installed the package. It didn't fix the problem. Then, I changed the spec file back to its previous libvirt requirement, and also built a newer libvirt for my F17 host. I installed the packages and it didn't fix the problem.

Again, not sure if I'm testing this wrong...

Comment 20 Federico Simoncelli 2012-09-24 21:48:11 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > Jason, would you please try Saggi's http://gerrit.ovirt.org/8143 ?
> 
> Maybe I'm doing it wrong, but I built a new rpm from vdsm master and, after
> modding the spec file to work with the version of libvirt that comes with
> F17, installed the package. It didn't fix the problem. Then, I changed the
> spec file back to its previous libvirt requirement, and also built a newer
> libvirt for my F17 host. I installed the packages and it didn't fix the
> problem.
> 
> Again, not sure if I'm testing this wrong...

Could you give it a try with the new vdsm build?

* Mon Sep 24 2012 Federico Simoncelli <fsimonce> 4.10.0-9.fc17
- BZ#845660 Use the recommended alignment instead of using pagesize

http://koji.fedoraproject.org/koji/buildinfo?buildID=356119

Thanks.

Comment 21 Jason Brooks 2012-09-24 23:48:03 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > Jason, would you please try Saggi's http://gerrit.ovirt.org/8143 ?
> > 
> > Maybe I'm doing it wrong, but I built a new rpm from vdsm master and, after
> > modding the spec file to work with the version of libvirt that comes with
> > F17, installed the package. It didn't fix the problem. Then, I changed the
> > spec file back to its previous libvirt requirement, and also built a newer
> > libvirt for my F17 host. I installed the packages and it didn't fix the
> > problem.
> > 
> > Again, not sure if I'm testing this wrong...
> 
> Could you give it a try with the new vdsm build?

I tried with this build, and unfortunately, the issue remains. 

I updated an F17 host (different than that one I'd been testing my self-built pkgs on) with the vdsm build referenced below. With kernel 3.5.4, my host would not attach to my existing, gluster-based nfs data domain. I also tried to create a new data domain, nfs based but non-gluster, and the host running 3.5.4 showed the same behavior as reported earlier -- wouldn't attach.

I also tried with kernel 3.4.6 & this new vdsm build, and my gluster and non-gluster nfs domains, etc. -- all that continued to work normally.

> 
> * Mon Sep 24 2012 Federico Simoncelli <fsimonce> 4.10.0-9.fc17
> - BZ#845660 Use the recommended alignment instead of using pagesize
> 
> http://koji.fedoraproject.org/koji/buildinfo?buildID=356119
> 
> Thanks.

Comment 22 Federico Simoncelli 2012-09-25 11:12:39 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #19)
> > > (In reply to comment #18)
> > > > Jason, would you please try Saggi's http://gerrit.ovirt.org/8143 ?
> > > 
> > > Maybe I'm doing it wrong, but I built a new rpm from vdsm master and, after
> > > modding the spec file to work with the version of libvirt that comes with
> > > F17, installed the package. It didn't fix the problem. Then, I changed the
> > > spec file back to its previous libvirt requirement, and also built a newer
> > > libvirt for my F17 host. I installed the packages and it didn't fix the
> > > problem.
> > > 
> > > Again, not sure if I'm testing this wrong...
> > 
> > Could you give it a try with the new vdsm build?
> 
> I tried with this build, and unfortunately, the issue remains. 

I tried it myself too and indeed the issue is not solved yet.

# grep _PC_REC_XFER_ALIGN /usr/share/vdsm/storage/fileUtils.py
_PC_REC_XFER_ALIGN = 17
        alignment = libc.fpathconf(self.fileno(), _PC_REC_XFER_ALIGN)

# uname -r
3.5.4-1.fc17.x86_64

# ls -l /var/log/core/core.2187.1348563252.dump
-rw-------. 1 vdsm kvm 2573868 Sep 25 04:54 /var/log/core/core.2187.1348563252.dump
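For anyone checking what the recommended alignment evaluates to on a given mount, here is a minimal sketch of the query using Python's os.fpathconf rather than libc through ctypes. The recommended_alignment name and the page-size fallback are assumptions, not vdsm code; as noted above, switching to this alignment did not resolve the crash.

```python
import mmap
import os

# Prefer the name table exported by Python; 17 is the value shown in the
# grep output above and is used only if the name is missing.
_PC_REC_XFER_ALIGN = os.pathconf_names.get("PC_REC_XFER_ALIGN", 17)


def recommended_alignment(fd, fallback=mmap.PAGESIZE):
    """Return the recommended transfer alignment for fd (hypothetical helper)."""
    try:
        value = os.fpathconf(fd, _PC_REC_XFER_ALIGN)
    except (OSError, ValueError):
        # The query is not supported for this descriptor or platform.
        return fallback
    # A non-positive result means no usable limit was reported.
    return value if value > 0 else fallback
```

Calling it on a descriptor opened on the NFS domain and again on a local file shows whether the two file systems report different values.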

Comment 23 Bret McMillan 2012-10-02 00:16:07 UTC
Anything else others can do to help w/ this?  I've got a new home lab setup, but stuck on what appears to be this bug (ovirt 3.1 stable branch)...

Comment 24 Saggi Mizrahi 2012-10-04 14:03:14 UTC
http://gerrit.ovirt.org/#/c/8356/

Comment 25 Barak 2012-10-04 14:34:59 UTC
Jason, Bret -  could you please try the patch mentioned in comment #24?

Comment 26 Jason Brooks 2012-10-04 15:32:31 UTC
(In reply to comment #25)
> Jason, Bret -  could you please try the patch mentioned in comment #24?

OK, this is looking good. On my F17 setup, running kernel 3.5.4 with vdsm built w/ the above patch, my nfs iso domain is up.

I wasn't 100% positive about the right way to build a pkg w/ this patch, so let me confirm that:

I have vdsm from git: git clone http://gerrit.ovirt.org/p/vdsm.git, and that's up to date. 

Then I did: git fetch git://gerrit.ovirt.org/vdsm refs/changes/56/8356/3 && git checkout FETCH_HEAD -- That's the checkout line for the patch referenced above. So I ran that, then continued as directed on http://wiki.ovirt.org/wiki/Vdsm_Developers#Building_a_Vdsm_RPM.

I replaced the vdsm pkgs on my f17 ovirt 3.1 test box w/ those, rebooted, it's running 3.5.4, and my nfs iso domain is up.

Comment 27 Federico Simoncelli 2012-10-04 20:55:48 UTC
Fixed in vdsm-4.10.0-10.fc17

http://koji.fedoraproject.org/koji/buildinfo?buildID=358280

Jason can you give it a try? Thanks.

Comment 29 Jason Brooks 2012-10-04 21:09:04 UTC
(In reply to comment #27)
> Fixed in vdsm-4.10.0-10.fc17
> 
> http://koji.fedoraproject.org/koji/buildinfo?buildID=358280
> 
> Jason can you give it a try? Thanks.

My pleasure -- I just tested w/ vdsm 4.10.0-10 and kernel 3.5.4 on oVirt 3.1, and my nfs iso and data domains are both up.

Thank you!

Comment 30 Peter Robinson 2013-01-09 10:42:17 UTC
I believe this is now fixed