Bug 1221569 - aarch64: add-drive-opts fails to hot plug a disk
Summary: aarch64: add-drive-opts fails to hot plug a disk
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libguestfs
Version: 7.1
Hardware: aarch64
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Richard W.M. Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1237250 1326420
Blocks: 1212027 1288337 1301891
 
Reported: 2015-05-14 11:40 UTC by Hu Zhang
Modified: 2016-09-26 15:36 UTC (History)
7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1350889
Environment:
Last Closed: 2016-09-26 15:36:04 UTC
Target Upstream Version:



Description Hu Zhang 2015-05-14 11:40:46 UTC
Description of problem:
add-drive-opts fails to hot plug a disk. It works well before launching the appliance, and the same operation works on rhel7.1-x86_64.


Version-Release number of selected component (if applicable):
libguestfs-1.28.1-1.23.aa7a.aarch64
kernel-3.19.0-0.75.aa7a.aarch64

How reproducible:
Always

Steps to Reproduce:
1. # guestfish -a test.raw
><fs> run
><fs> add-drive-opts test.img  label:hotplug
libguestfs: error: internal_hot_add_drive: hot-add drive: '/dev/disk/guestfs/hotplug' did not appear after 30 seconds: this could mean that virtio-scsi (in qemu or kernel) or udev is not working                                                                    

Actual results:
As described in step 1: the drive fails to hot plug with the error above.

Expected results:
The disk can be hot plugged successfully:
><fs> add-drive-opts test.img  label:hotplug
><fs> list-disk-labels
hotplug: /dev/sdc

Additional info:
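For context, the "did not appear after 30 seconds" message comes from libguestfs waiting for udev to create the labelled device node after the hot-add. The sketch below is NOT the actual libguestfs code; it is a minimal illustration, with hypothetical names, of the kind of poll-with-timeout loop that produces an error like the one in step 1 when virtio-scsi or udev never surfaces the new disk.

```c
/* Hypothetical sketch of a "wait for the hot-added disk node" loop.
 * The path and timeout mirror the error message from
 * internal_hot_add_drive; the function names are invented here. */
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

/* Poll for 'path' to appear, up to 'timeout_s' seconds.
 * Returns 0 on success, -1 if the node never showed up. */
static int
wait_for_device (const char *path, int timeout_s)
{
  struct stat sb;

  for (int i = 0; i < timeout_s * 10; ++i) {
    if (stat (path, &sb) == 0)
      return 0;                 /* udev created the device node */
    usleep (100000);            /* poll every 100 ms */
  }

  fprintf (stderr,
           "hot-add drive: '%s' did not appear after %d seconds: "
           "this could mean that virtio-scsi (in qemu or kernel) "
           "or udev is not working\n", path, timeout_s);
  return -1;
}
```

On x86_64 the node appears almost immediately; on the aarch64 virtio-mmio configuration described above, the loop runs to the timeout and fails.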

Comment 2 Richard W.M. Jones 2015-05-15 14:56:17 UTC
Yes, hotplug is known not to work on aarch64.

It will start to work when we enable virtio-scsi over PCI
(instead of virtio-scsi over virtio-mmio, as used now).

Adding Drew who may have an update.

Comment 4 Richard W.M. Jones 2016-01-12 15:18:49 UTC
Actually this works "by magic" now (even with virtio-mmio).

Comment 5 Jon Masters 2016-04-14 09:13:21 UTC
We should retest going into RHELSA7.3 now that they've moved to a PCIe hotplug model. I don't know if that will work, but perhaps Richard has an update.

Comment 7 Richard W.M. Jones 2016-06-27 17:37:24 UTC
I've just tested this again, and the results are mixed.

kernel 4.5.0-0.40.el7.aarch64  *NB* with acpi=off because of another bug
qemu-kvm-rhev-2.6.0-8.el7.aarch64
libguestfs-1.32.5-6.el7.aarch64 still using virtio-mmio

Hotplug add is working fine.

Hotplug remove fails.  qemu crashes and the kernel dumps the following information.

[   69.703520] qemu-kvm[1869]: unhandled level 2 translation fault (11) at 0x00000010, esr 0x92000006
[   69.712449] pgd = fffffe7fe4b60000
[   69.715849] [00000010] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[   69.724119] 
[   69.725612] CPU: 2 PID: 1869 Comm: qemu-kvm Not tainted 4.5.0-0.40.el7.aarch64 #1
[   69.733069] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Jan 26 2016
[   69.740351] task: fffffe7ff152e800 ti: fffffe7ffad34000 task.ti: fffffe7ffad34000
[   69.747808] PC is at 0x2aab3378ef4
[   69.751197] LR is at 0x2aab321a0a4
[   69.754591] pc : [<000002aab3378ef4>] lr : [<000002aab321a0a4>] pstate: 80000000
[   69.761957] sp : 000003fffe88f2d0
[   69.765267] x29: 000003fffe88f2d0 x28: 000002aad0d76000 
[   69.770583] x27: 000002aab34ff000 x26: 0000000000000000 
[   69.775904] x25: 000002aacf757750 x24: 000002aacf7fa2d0 
[   69.781225] x23: 000002aacfa74400 x22: 000002aae26e4800 
[   69.786545] x21: 000002aab34f35b8 x20: 000003fffe88f380 
[   69.791864] x19: 000002aab34ff000 x18: 0000000000000001 
[   69.797184] x17: 000003ff94be20a0 x16: 000002aab34fdf68 
[   69.802504] x15: 003b9aca00000000 x14: 001d24f7bc000000 
[   69.807826] x13: ffffffffa88e9f71 x12: 0000000000000018 
[   69.813148] x11: 65722e6d6f635f5f x10: 65722e6d6f635f5f 
[   69.818470] x9 : 0000000000000020 x8 : 6972645f74616864 
[   69.823794] x7 : 0000000000000004 x6 : 000002aab316ccf4 
[   69.829117] x5 : 00000000471e3447 x4 : 0000000000000000 
[   69.834437] x3 : 0000000000000c80 x2 : 00000000000001e6 
[   69.839760] x1 : 000002aab3419d38 x0 : 0000000000000000 
[   69.845083] 

If I'm understanding all that correctly, that is just qemu segfaulting, not
any kernel problem.

The stack trace inside qemu is:

Thread 1 (Thread 0x3ffb3406700 (LWP 2712)):
#0  qstring_get_str (qstring=0x0) at qobject/qstring.c:129
#1  0x000002aab63b9574 in qdict_get_str (qdict=<optimized out>, 
    key=key@entry=0x2aab6459d38 "id") at qobject/qdict.c:279
#2  0x000002aab625a0a4 in hmp_drive_del (mon=<optimized out>, 
    qdict=<optimized out>) at blockdev.c:2843
#3  0x000002aab61ac7e0 in handle_qmp_command (parser=<optimized out>, 
    tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/monitor.c:3922
#4  0x000002aab63bb0d0 in json_message_process_token (lexer=0x2aad6b5a340, 
    input=0x2aad6b00fa0, type=JSON_RCURLY, x=<optimized out>, 
    y=<optimized out>) at qobject/json-streamer.c:94
#5  0x000002aab63cfd44 in json_lexer_feed_char (
    lexer=lexer@entry=0x2aad6b5a340, ch=<optimized out>, 
    flush=flush@entry=false) at qobject/json-lexer.c:310
#6  0x000002aab63cfe2c in json_lexer_feed (lexer=0x2aad6b5a340, 
    buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:360
#7  0x000002aab63bb1b0 in json_message_parser_feed (parser=<optimized out>, 
    buffer=<optimized out>, size=<optimized out>)
    at qobject/json-streamer.c:114
#8  0x000002aab61aad58 in monitor_qmp_read (opaque=<optimized out>, 
    buf=<optimized out>, size=<optimized out>)
    at /usr/src/debug/qemu-2.6.0/monitor.c:3938
#9  0x000002aab6261284 in qemu_chr_be_write_impl (len=<optimized out>, 
    buf=<optimized out>, s=<optimized out>) at qemu-char.c:389
#10 qemu_chr_be_write (s=<optimized out>, buf=<optimized out>, 
    len=<optimized out>) at qemu-char.c:401
#11 0x000002aab6261660 in tcp_chr_read (
    chan=<error reading variable: value has been optimized out>, 
    cond=<error reading variable: value has been optimized out>, 
    opaque=0x2aad6b50880, 
    opaque@entry=<error reading variable: value has been optimized out>)
    at qemu-char.c:2895
#12 0x000002aab638b5dc in qio_channel_fd_source_dispatch (
    source=<optimized out>, callback=<optimized out>, 
    user_data=<optimized out>) at io/channel-watch.c:84
#13 0x000003ffb4dde508 in g_main_context_dispatch ()
   from /lib64/libglib-2.0.so.0
#14 0x000002aab6332cec in glib_pollfds_poll () at main-loop.c:213
#15 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258
#16 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
#17 0x000002aab6178414 in main_loop () at vl.c:1934
#18 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
    at vl.c:4667


TBH I don't think we care about this very much, so this is just for your
information.  I don't think we should try too hard to fix this unless the fix
turns out to be trivial.  The reasons are (a) no one really cares about hotplug
with libguestfs and (b) we'll be moving to virtio-pci as soon as we can.
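Reading the backtrace: frame #0 is qstring_get_str(qstring=0x0), i.e. an accessor dereferencing a NULL struct pointer, which matches the fault address 0x00000010 (a field at a small offset inside NULL). The fragment below is NOT qemu's code; it is a self-contained sketch, with invented names, of the crash-prone accessor pattern and the defensive "try" variant that qemu's QDict API also offers (qdict_get_try_str), which tolerates a missing key instead of faulting.

```c
/* Hypothetical miniature of the qdict_get_str / qstring_get_str
 * crash pattern seen in the backtrace above.  All names invented. */
#include <stddef.h>
#include <string.h>

struct qstr {                    /* stand-in for a QString-like object */
  char pad[16];                  /* 'string' sits at offset 16, cf. the
                                    fault at address 0x00000010 */
  const char *string;
};

struct kv { const char *key; struct qstr *val; };

/* Lookup: returns NULL when the key is absent. */
static struct qstr *
dict_get (struct kv *d, size_t n, const char *key)
{
  for (size_t i = 0; i < n; ++i)
    if (strcmp (d[i].key, key) == 0)
      return d[i].val;
  return NULL;
}

/* Crash-prone accessor: blindly dereferences the lookup result,
 * like qstring_get_str(0x0) in frame #0 above. */
static const char *
dict_get_str (struct kv *d, size_t n, const char *key)
{
  return dict_get (d, n, key)->string;   /* SIGSEGV if key missing */
}

/* Defensive "try" variant: returns NULL instead of faulting. */
static const char *
dict_get_try_str (struct kv *d, size_t n, const char *key)
{
  struct qstr *q = dict_get (d, n, key);
  return q ? q->string : NULL;
}
```

In other words, hmp_drive_del appears to ask for an "id" key that the monitor command did not carry, and the unchecked accessor turns that into a segfault rather than an error reply.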

Comment 8 Richard W.M. Jones 2016-06-28 15:34:10 UTC
Comment 7 was filed as bug 1350889.

Comment 9 Andrew Jones 2016-09-26 15:36:04 UTC
I'm closing this as CURRENTRELEASE. I was able both to hot-add and to hot-remove a disk, using virsh attach-disk/detach-disk and using add-drive/remove-drive in libguestfs.

kernel (host & guest): kernel-4.5.0-13.el7.aarch64 (using ACPI)
qemu-kvm-rhev-2.6.0-23.el7.aarch64
libguestfs-1.32.7-3.el7.aarch64

