Description of problem:
add-drive-opts fails to hot plug a disk after the appliance has been launched. Adding the disk before launching the appliance works. Hot plug also works on rhel7.1-x86_64.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. # guestfish -a test.raw
><fs> add-drive-opts test.img label:hotplug
libguestfs: error: internal_hot_add_drive: hot-add drive: '/dev/disk/guestfs/hotplug' did not appear after 30 seconds: this could mean that virtio-scsi (in qemu or kernel) or udev is not working
As described in step 1.
Disk can be hot plugged successfully.
><fs> add-drive-opts test.img label:hotplug
Yes, hotplug is known to not work on aarch64.
It will start to work when we enable virtio-scsi over PCI
(instead of virtio-scsi over virtio-mmio, as used now).
Adding Drew who may have an update.
Actually this works "by magic" now (even with virtio-mmio).
We should retest going into RHELSA7.3 now that they've moved to a PCIe hotplug model. I don't know if that will work, but perhaps Richard has an update.
I've just tested this again, and the results are mixed.
kernel 4.5.0-0.40.el7.aarch64 *NB* with acpi=off because of another bug
libguestfs-1.32.5-6.el7.aarch64 still using virtio-mmio
Hotplug add is working fine.
Hotplug remove fails. qemu crashes and the kernel dumps the following information.
[ 69.703520] qemu-kvm: unhandled level 2 translation fault (11) at 0x00000010, esr 0x92000006
[ 69.712449] pgd = fffffe7fe4b60000
[ 69.715849]  *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[ 69.725612] CPU: 2 PID: 1869 Comm: qemu-kvm Not tainted 4.5.0-0.40.el7.aarch64 #1
[ 69.733069] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Jan 26 2016
[ 69.740351] task: fffffe7ff152e800 ti: fffffe7ffad34000 task.ti: fffffe7ffad34000
[ 69.747808] PC is at 0x2aab3378ef4
[ 69.751197] LR is at 0x2aab321a0a4
[ 69.754591] pc : [<000002aab3378ef4>] lr : [<000002aab321a0a4>] pstate: 80000000
[ 69.761957] sp : 000003fffe88f2d0
[ 69.765267] x29: 000003fffe88f2d0 x28: 000002aad0d76000
[ 69.770583] x27: 000002aab34ff000 x26: 0000000000000000
[ 69.775904] x25: 000002aacf757750 x24: 000002aacf7fa2d0
[ 69.781225] x23: 000002aacfa74400 x22: 000002aae26e4800
[ 69.786545] x21: 000002aab34f35b8 x20: 000003fffe88f380
[ 69.791864] x19: 000002aab34ff000 x18: 0000000000000001
[ 69.797184] x17: 000003ff94be20a0 x16: 000002aab34fdf68
[ 69.802504] x15: 003b9aca00000000 x14: 001d24f7bc000000
[ 69.807826] x13: ffffffffa88e9f71 x12: 0000000000000018
[ 69.813148] x11: 65722e6d6f635f5f x10: 65722e6d6f635f5f
[ 69.818470] x9 : 0000000000000020 x8 : 6972645f74616864
[ 69.823794] x7 : 0000000000000004 x6 : 000002aab316ccf4
[ 69.829117] x5 : 00000000471e3447 x4 : 0000000000000000
[ 69.834437] x3 : 0000000000000c80 x2 : 00000000000001e6
[ 69.839760] x1 : 000002aab3419d38 x0 : 0000000000000000
If I'm understanding all that correctly, that is just qemu segfaulting, not
any kernel problem.
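For reference, the esr value and a couple of the registers in the dump above can be decoded by hand. The sketch below assumes the standard ARMv8-A ESR_ELx layout (exception class in bits 31:26, fault status code in bits 5:0, write-not-read in bit 6). The helper names `decode_esr` and `reg_ascii` are made up for illustration, and only a few exception classes are listed.

```python
# Sketch: decode the "esr 0x92000006" value from the kernel log above.
# Assumes the ARMv8-A ESR_ELx field layout; helper names are hypothetical.

EC_NAMES = {
    0x20: "instruction abort from a lower exception level",
    0x21: "instruction abort without a change in exception level",
    0x24: "data abort from a lower exception level",
    0x25: "data abort without a change in exception level",
}

# Upper four bits of the 6-bit fault status code; the low two bits are the level.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split an ESR value into the fields relevant to a data abort."""
    ec = (esr >> 26) & 0x3F          # exception class
    dfsc = esr & 0x3F                # data fault status code
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "fault": FAULT_KINDS.get(dfsc >> 2, "other"),
        "level": dfsc & 0x3,         # page table level of the fault
        "write": bool((esr >> 6) & 1),  # WnR: True for a write, False for a read
    }


def reg_ascii(value):
    """Interpret a 64-bit register value as 8 little-endian ASCII bytes."""
    return value.to_bytes(8, "little").decode("ascii", errors="replace")


if __name__ == "__main__":
    print(decode_esr(0x92000006))
    print(reg_ascii(0x65722E6D6F635F5F), reg_ascii(0x6972645F74616864))
```

The esr decodes to a data abort from a lower exception level (that is, from userspace), a level 2 translation fault on a read, which matches the kernel's own message. Together with the fault address of 0x00000010, that is consistent with qemu reading a field at a small offset from a NULL pointer. Registers x10/x11 and x8 happen to hold the ASCII fragments "__com.re" and "dhat_dri"; what string they came from is a guess.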
The stack trace inside qemu is:
Thread 1 (Thread 0x3ffb3406700 (LWP 2712)):
#0 qstring_get_str (qstring=0x0) at qobject/qstring.c:129
#1 0x000002aab63b9574 in qdict_get_str (qdict=<optimized out>,
key=key@entry=0x2aab6459d38 "id") at qobject/qdict.c:279
#2 0x000002aab625a0a4 in hmp_drive_del (mon=<optimized out>,
qdict=<optimized out>) at blockdev.c:2843
#3 0x000002aab61ac7e0 in handle_qmp_command (parser=<optimized out>,
tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/monitor.c:3922
#4 0x000002aab63bb0d0 in json_message_process_token (lexer=0x2aad6b5a340,
input=0x2aad6b00fa0, type=JSON_RCURLY, x=<optimized out>,
y=<optimized out>) at qobject/json-streamer.c:94
#5 0x000002aab63cfd44 in json_lexer_feed_char (
lexer=lexer@entry=0x2aad6b5a340, ch=<optimized out>,
flush=flush@entry=false) at qobject/json-lexer.c:310
#6 0x000002aab63cfe2c in json_lexer_feed (lexer=0x2aad6b5a340,
buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:360
#7 0x000002aab63bb1b0 in json_message_parser_feed (parser=<optimized out>,
buffer=<optimized out>, size=<optimized out>)
#8 0x000002aab61aad58 in monitor_qmp_read (opaque=<optimized out>,
buf=<optimized out>, size=<optimized out>)
#9 0x000002aab6261284 in qemu_chr_be_write_impl (len=<optimized out>,
buf=<optimized out>, s=<optimized out>) at qemu-char.c:389
#10 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>,
len=<optimized out>) at qemu-char.c:401
#11 0x000002aab6261660 in tcp_chr_read (
chan=<error reading variable: value has been optimized out>,
cond=<error reading variable: value has been optimized out>,
opaque@entry=<error reading variable: value has been optimized out>)
#12 0x000002aab638b5dc in qio_channel_fd_source_dispatch (
source=<optimized out>, callback=<optimized out>,
user_data=<optimized out>) at io/channel-watch.c:84
#13 0x000003ffb4dde508 in g_main_context_dispatch ()
#14 0x000002aab6332cec in glib_pollfds_poll () at main-loop.c:213
#15 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258
#16 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
#17 0x000002aab6178414 in main_loop () at vl.c:1934
#18 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
TBH I don't think we care about this very much, so this is just for your
information. I don't think we should try too hard to fix this unless the fix
turns out to be trivial. The reasons are (a) no one really cares about hotplug
with libguestfs and (b) we'll be moving to virtio-pci as soon as we can.
Comment 7 was filed as bug 1350889.
I'm closing this as current-release. I was able to hot-add and hot-remove a disk both with virsh attach-disk/detach-disk and with add-drive/remove-drive in libguestfs.
kernel (host & guest): kernel-4.5.0-13.el7.aarch64 (using ACPI)
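For anyone who wants to retest later, the libguestfs side of this can be scripted. This is only a rough sketch using the libguestfs Python bindings, not the exact procedure used above. The function name `hotplug_roundtrip` is made up, and the file names are the ones from the original report. It needs a libguestfs version that still supports hotplug and `remove_drive`.

```python
# Sketch: hot-add and hot-remove a labelled disk on a launched appliance.
# hotplug_roundtrip is a hypothetical helper; `g` is expected to behave
# like a guestfs.GuestFS handle.


def hotplug_roundtrip(g, base_image, extra_image, label="hotplug"):
    """Launch with base_image, hot-add extra_image, then hot-remove it."""
    g.add_drive_opts(base_image)                # added before launch (not hotplug)
    g.launch()
    g.add_drive_opts(extra_image, label=label)  # hot-add after launch
    g.remove_drive(label)                       # hot-remove by label


if __name__ == "__main__":
    import guestfs  # libguestfs Python bindings, installed separately

    g = guestfs.GuestFS(python_return_dict=True)
    try:
        hotplug_roundtrip(g, "test.raw", "test.img")
        print("hot-add and hot-remove succeeded")
    finally:
        g.close()
```

Any failure in either step surfaces as a RuntimeError from the bindings, so the script exiting cleanly means both operations worked.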