Bug 986601

Summary: [abrt] nbdkit-1.1.2-2.fc19: recv_request_send_reply: Process /usr/sbin/nbdkit was killed by signal 6 (SIGABRT)
Product: [Fedora] Fedora
Reporter: Michael S. <misc>
Component: nbdkit
Assignee: Richard W.M. Jones <rjones>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 19
CC: rjones
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: abrt_hash:d1336357808df833fdd87955072c90cc98a532f4
Fixed In Version: nbdkit-1.1.2-3.fc18
Doc Type: Bug Fix
Last Closed: 2013-08-02 03:27:20 UTC
Type: ---
Attachments:
Description               Flags
File: backtrace           none
File: cgroup              none
File: core_backtrace      none
File: dso_list            none
File: environ             none
File: limits              none
File: maps                none
File: open_fds            none
File: proc_pid_status     none
File: var_log_messages    none
fix storage of sockaddr   none

Description Michael S. 2013-07-20 21:33:58 UTC
Description of problem:
I tried to test the update. It may be caused by an inconsistent state on the libvirt or guestfish side.

I started it as non-root:
$ nbdkit -f /usr/lib64/nbdkit/plugins/nbdkit-example3-plugin.so

then:
guestfish --ro -a nbd://localhost
><fs> run
libguestfs: error: could not create appliance through libvirt: Unable to read from monitor: Connection reset by peer [code=38 domain=10]


But I did some tests involving Ctrl-C on both sides.

Version-Release number of selected component:
nbdkit-1.1.2-2.fc19

Additional info:
reporter:       libreport-2.1.5
backtrace_rating: 4
cmdline:        nbdkit -f /usr/lib64/nbdkit/plugins/nbdkit-example1-plugin.so
crash_function: recv_request_send_reply
executable:     /usr/sbin/nbdkit
kernel:         3.9.9-302.fc19.x86_64
runlevel:       N 5
uid:            500

Truncated backtrace:
Thread no. 1 (4 frames)
 #5 recv_request_send_reply at connections.c:429
 #6 _handle_single_connection at connections.c:82
 #7 handle_single_connection at connections.c:103
 #8 start_thread at sockets.c:220

Comment 1 Michael S. 2013-07-20 21:34:02 UTC
Created attachment 776274 [details]
File: backtrace

Comment 2 Michael S. 2013-07-20 21:34:04 UTC
Created attachment 776275 [details]
File: cgroup

Comment 3 Michael S. 2013-07-20 21:34:07 UTC
Created attachment 776276 [details]
File: core_backtrace

Comment 4 Michael S. 2013-07-20 21:34:10 UTC
Created attachment 776277 [details]
File: dso_list

Comment 5 Michael S. 2013-07-20 21:34:13 UTC
Created attachment 776278 [details]
File: environ

Comment 6 Michael S. 2013-07-20 21:34:15 UTC
Created attachment 776279 [details]
File: limits

Comment 7 Michael S. 2013-07-20 21:34:18 UTC
Created attachment 776280 [details]
File: maps

Comment 8 Michael S. 2013-07-20 21:34:23 UTC
Created attachment 776281 [details]
File: open_fds

Comment 9 Michael S. 2013-07-20 21:34:27 UTC
Created attachment 776282 [details]
File: proc_pid_status

Comment 10 Michael S. 2013-07-20 21:34:30 UTC
Created attachment 776283 [details]
File: var_log_messages

Comment 11 Richard W.M. Jones 2013-07-20 21:54:52 UTC
(In reply to Michael Scherer from comment #0)
> Description of problem:
> I tried to test the update. It may be caused by an inconsistent state on the
> libvirt or guestfish side.
> 
> I started it as non-root:
> $ nbdkit -f /usr/lib64/nbdkit/plugins/nbdkit-example3-plugin.so
> 
> then:
> guestfish --ro -a nbd://localhost
> ><fs> run
> libguestfs: error: could not create appliance through libvirt: Unable to
> read from monitor: Connection reset by peer [code=38 domain=10]

If you use the guestfish -v option you should get a lot more
detail about what's really going on here.  This is possibly
a separate bug in libvirt.

But in any case, nbdkit shouldn't crash even if you ^C it.

Comment 12 Michael S. 2013-07-20 22:20:18 UTC
Indeed, there is this log.

><fs> run
libguestfs: launch: backend=libvirt
libguestfs: launch: tmpdir=/tmp/libguestfseJV2h8
libguestfs: launch: umask=0002
libguestfs: launch: euid=500
libguestfs: libvirt version = 1000005 (1.0.5)
libguestfs: [00000ms] connect to libvirt
libguestfs: opening libvirt handle: URI = NULL, auth = virConnectAuthPtrDefault, flags = 0
libguestfs: successfully opened libvirt handle: conn = 0x7fc709a859b0
libguestfs: [02077ms] get libvirt capabilities
libguestfs: [02083ms] parsing capabilities XML
libguestfs: [02084ms] build appliance
libguestfs: command: run: supermin-helper
libguestfs: command: run: \ --verbose
libguestfs: command: run: \ -f checksum
libguestfs: command: run: \ /usr/lib64/guestfs/supermin.d
libguestfs: command: run: \ x86_64
supermin helper [00000ms] whitelist = (not specified), host_cpu = x86_64, kernel = (null), initrd = (null), appliance = (null)
supermin helper [00000ms] inputs[0] = /usr/lib64/guestfs/supermin.d
checking modpath /lib/modules/3.9.9-302.fc19.x86_64 is a directory
picked vmlinuz-3.9.9-302.fc19.x86_64 because modpath /lib/modules/3.9.9-302.fc19.x86_64 exists
checking modpath /lib/modules/3.9.8-300.fc19.x86_64 is a directory
picked vmlinuz-3.9.8-300.fc19.x86_64 because modpath /lib/modules/3.9.8-300.fc19.x86_64 exists
checking modpath /lib/modules/3.9.9-301.fc19.x86_64 is a directory
picked vmlinuz-3.9.9-301.fc19.x86_64 because modpath /lib/modules/3.9.9-301.fc19.x86_64 exists
supermin helper [00000ms] finished creating kernel
supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d
supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/base.img
supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/daemon.img
supermin helper [00000ms] visiting /usr/lib64/guestfs/supermin.d/hostfiles
supermin helper [00052ms] visiting /usr/lib64/guestfs/supermin.d/init.img
supermin helper [00052ms] visiting /usr/lib64/guestfs/supermin.d/udev-rules.img
supermin helper [00052ms] adding kernel modules
supermin helper [00088ms] finished creating appliance
libguestfs: checksum of existing appliance: d2ca7ac6cb24dfbbcacc5a4560fc754808221f87eae8c45e4816a8fa7a8365a4
libguestfs: command: run: qemu-img
libguestfs: command: run: \ create
libguestfs: command: run: \ -f qcow2
libguestfs: command: run: \ -b /var/tmp/.guestfs-500/root.25808
libguestfs: command: run: \ -o backing_fmt=raw
libguestfs: command: run: \ /tmp/libguestfseJV2h8/snapshot1
Formatting '/tmp/libguestfseJV2h8/snapshot1', fmt=qcow2 size=4294967296 backing_file='/var/tmp/.guestfs-500/root.25808' backing_fmt='raw' encryption=off cluster_size=65536 lazy_refcounts=off 
libguestfs: command: run: qemu-img
libguestfs: command: run: \ create
libguestfs: command: run: \ -f qcow2
libguestfs: command: run: \ -b nbd:localhost:10809
libguestfs: command: run: \ /tmp/libguestfseJV2h8/snapshot2
Formatting '/tmp/libguestfseJV2h8/snapshot2', fmt=qcow2 size=104857600 backing_file='nbd:localhost:10809' encryption=off cluster_size=65536 lazy_refcounts=off 
libguestfs: [02259ms] create libvirt XML
libguestfs: libvirt XML:\n<?xml version="1.0"?>\n<domain type="kvm" xmlns:qemu="http://libvirt.org/schemas/domain/qemu/1.0">\n  <name>guestfs-4p3mepfyp0avvxwo</name>\n  <memory unit="MiB">500</memory>\n  <currentMemory unit="MiB">500</currentMemory>\n  <vcpu>1</vcpu>\n  <clock offset="utc"/>\n  <os>\n    <type>hvm</type>\n    <kernel>/var/tmp/.guestfs-500/kernel.25808</kernel>\n    <initrd>/var/tmp/.guestfs-500/initrd.25808</initrd>\n    <cmdline>panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=xterm</cmdline>\n  </os>\n  <on_reboot>destroy</on_reboot>\n  <devices>\n    <controller type="scsi" index="0" model="virtio-scsi"/>\n    <disk device="disk" type="file">\n      <source file="/tmp/libguestfseJV2h8/snapshot2"/>\n      <target dev="sda" bus="scsi"/>\n      <driver name="qemu" type="qcow2"/>\n      <address type="drive" controller="0" bus="0" target="0" unit="0"/>\n    </disk>\n    <disk type="file" device="disk">\n      <source file="/tmp/libguestfseJV2h8/snapshot1"/>\n      <target dev="sdb" bus="scsi"/>\n      <driver name="qemu" type="qcow2" cache="unsafe"/>\n      <address type="drive" controller="0" bus="0" target="1" unit="0"/>\n      <shareable/>\n    </disk>\n    <serial type="unix">\n      <source mode="connect" path="/tmp/libguestfseJV2h8/console.sock"/>\n      <target port="0"/>\n    </serial>\n    <channel type="unix">\n      <source mode="connect" path="/tmp/libguestfseJV2h8/guestfsd.sock"/>\n      <target type="virtio" name="org.libguestfs.channel.0"/>\n    </channel>\n  </devices>\n  <qemu:commandline>\n    <qemu:env name="TMPDIR" value="/var/tmp"/>\n  </qemu:commandline>\n</domain>\n
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -Z /var/tmp/.guestfs-500
libguestfs: drwxr-xr-x. misc misc staff_u:object_r:user_tmp_t:s0   .
libguestfs: drwxrwxrwt. root root system_u:object_r:tmp_t:s0       ..
libguestfs: -rwxr-xr-x. misc misc staff_u:object_r:user_tmp_t:s0   checksum
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 initrd
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 initrd.25808
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 kernel
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 kernel.25808
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 root
libguestfs: -rw-r--r--. misc misc system_u:object_r:virt_content_t:s0 root.25808
libguestfs: command: run: ls
libguestfs: command: run: \ -a
libguestfs: command: run: \ -l
libguestfs: command: run: \ -Z /tmp/libguestfseJV2h8
libguestfs: drwxr-xr-x. misc misc staff_u:object_r:user_tmp_t:s0   .
libguestfs: drwxrwxrwt. root root system_u:object_r:tmp_t:s0       ..
libguestfs: srwxrwxr-x. misc misc staff_u:object_r:user_tmp_t:s0   console.sock
libguestfs: srwxrwxr-x. misc misc staff_u:object_r:user_tmp_t:s0   guestfsd.sock
libguestfs: -rw-r--r--. misc misc staff_u:object_r:user_tmp_t:s0   snapshot1
libguestfs: -rw-r--r--. misc misc staff_u:object_r:user_tmp_t:s0   snapshot2
libguestfs: -rwxrwxr-x. misc misc staff_u:object_r:user_tmp_t:s0   umask-check
libguestfs: [02265ms] launch libvirt guest
libguestfs: error: could not create appliance through libvirt: internal error process exited while connecting to monitor: nbd.c:nbd_receive_negotiate():L457: read failed
qemu-system-x86_64: -drive file=/tmp/libguestfseJV2h8/snapshot2,if=none,id=drive-scsi0-0-0-0,format=qcow2: could not open disk image /tmp/libguestfseJV2h8/snapshot2: Invalid argument
 [code=1 domain=10]

I suspected SELinux (as I am running in an enforcing login), but after setenforce 0, this still breaks.

Comment 13 Michael S. 2013-07-20 23:05:01 UTC
In fact, just doing:
$ nbdkit -f /usr/lib64/nbdkit/plugins/nbdkit-example3-plugin.so

and in another terminal:
$ cat /dev/random | nc localhost 10809

is sufficient to trigger the bug.
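
For readers without nc at hand, the reproducer above can be approximated in C. This is only a sketch under stated assumptions: the helper names connect_tcp and send_garbage are hypothetical, the bytes come from rand() rather than /dev/random, and port 10809 is the one shown in the libguestfs log in Comment 12. Note that if "localhost" resolves to ::1 first, the connection may arrive over IPv6, which would be the sockaddr_in6 case discussed in Comment 16.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Connect to host:port over TCP (IPv4 or IPv6, whichever resolves
 * and connects first).  Returns a socket fd, or -1 on failure.
 * Hypothetical helper, not part of nbdkit. */
int
connect_tcp (const char *host, const char *port)
{
  struct addrinfo hints, *res, *rp;
  int fd = -1;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  if (getaddrinfo (host, port, &hints, &res) != 0)
    return -1;

  for (rp = res; rp != NULL; rp = rp->ai_next) {
    fd = socket (rp->ai_family, rp->ai_socktype, rp->ai_protocol);
    if (fd == -1)
      continue;
    if (connect (fd, rp->ai_addr, rp->ai_addrlen) == 0)
      break;                    /* connected */
    close (fd);
    fd = -1;
  }
  freeaddrinfo (res);
  return fd;
}

/* Send up to `total` pseudo-random bytes on socket fd.
 * Returns the number of bytes actually sent; this is less than
 * `total` if the peer closes the connection or send() fails.
 * MSG_NOSIGNAL avoids SIGPIPE when the peer has gone away. */
size_t
send_garbage (int fd, size_t total)
{
  unsigned char buf[512];
  size_t sent = 0;

  assert (fd >= 0);
  while (sent < total) {
    size_t i, n = total - sent;
    ssize_t r;

    if (n > sizeof buf)
      n = sizeof buf;
    for (i = 0; i < n; i++)
      buf[i] = (unsigned char) (rand () & 0xff);
    r = send (fd, buf, n, MSG_NOSIGNAL);
    if (r <= 0)
      break;
    sent += (size_t) r;
  }
  return sent;
}
```

A caller would do something like connect_tcp ("localhost", "10809"), pass the returned fd to send_garbage with a few kilobytes as the total, and then close the fd, while watching whether the nbdkit process stays up.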

Comment 14 Michael S. 2013-07-20 23:18:50 UTC
So when running a build from git under valgrind, it doesn't crash. The only message I see is:
==4204== Thread 2:
==4204== Invalid write of size 2
==4204==    at 0x4A0A404: memcpy@@GLIBC_2.14 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4204==  Address 0x4c3fef8 is 0 bytes after a block of size 40 alloc'd
==4204==    at 0x4A08121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==4204==    by 0x404F02: tls_new_server_thread (tls.c:89)
==4204==    by 0x404764: start_thread (sockets.c:216)
==4204==    by 0x3112E07C52: start_thread (pthread_create.c:308)
==4204==    by 0x3112AF513C: clone (clone.S:113)
==4204== 

Running from git without valgrind still crashes.

Comment 15 Michael S. 2013-07-21 07:52:00 UTC
I discussed this with a friend of mine during the night and he found the problem right away, so I coded a patch based on what he suggested. It fixes the error reported by valgrind and I can no longer reproduce the crash.

Here is a patch against latest git HEAD.

Comment 16 Michael S. 2013-07-21 07:54:34 UTC
Created attachment 776370 [details]
fix storage of sockaddr

As he explained, we use struct sockaddr to store a sockaddr_in (which is OK, same size), but this does not work for sockaddr_in6, because it is bigger, so the memcpy writes past the end of the calloc'd block.
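
To illustrate the class of fix described above (this is a sketch, not the attached patch): POSIX provides struct sockaddr_storage, which is sized to hold an address of any supported family. On Linux, struct sockaddr and struct sockaddr_in are typically 16 bytes each while struct sockaddr_in6 is 28, so a copy into a plain struct sockaddr overflows for IPv6 peers. The names peer_info and peer_info_set below are hypothetical and do not come from the nbdkit source.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical per-connection record; not nbdkit's actual struct. */
struct peer_info {
  struct sockaddr_storage addr;  /* big enough for IPv4 and IPv6 */
  socklen_t addrlen;             /* number of valid bytes in addr */
};

/* Copy a peer address of length `len` into the record.
 * Returns 0 on success, or -1 if `len` would not fit, so the
 * copy can never write past the end of the storage. */
int
peer_info_set (struct peer_info *p, const struct sockaddr *sa, socklen_t len)
{
  assert (p != NULL && sa != NULL);
  if (len > sizeof p->addr)
    return -1;
  memset (&p->addr, 0, sizeof p->addr);
  memcpy (&p->addr, sa, len);
  p->addrlen = len;
  return 0;
}
```

The explicit length check matters as much as the larger buffer: it turns an unexpected address size into an error return instead of a heap overwrite.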

Comment 17 Richard W.M. Jones 2013-07-21 21:05:13 UTC
Oh dear, that's embarrassingly bad.  I've pushed your change
with some minor reformatting.

Comment 18 Richard W.M. Jones 2013-07-21 21:46:08 UTC
Thanks.  This is now fixed in Rawhide, F19 & F18 (updates-testing).

Comment 19 Fedora Update System 2013-07-21 21:55:32 UTC
nbdkit-1.1.2-3.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/nbdkit-1.1.2-3.fc18

Comment 20 Fedora Update System 2013-07-21 21:55:38 UTC
nbdkit-1.1.2-3.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/nbdkit-1.1.2-3.fc19

Comment 21 Fedora Update System 2013-07-23 01:06:14 UTC
Package nbdkit-1.1.2-3.fc19:
* should fix your issue,
* was pushed to the Fedora 19 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing nbdkit-1.1.2-3.fc19'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-13377/nbdkit-1.1.2-3.fc19
then log in and leave karma (feedback).

Comment 22 Fedora Update System 2013-08-02 03:27:20 UTC
nbdkit-1.1.2-3.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 23 Fedora Update System 2013-08-02 03:43:25 UTC
nbdkit-1.1.2-3.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.