Bug 1254971
Summary:          RFE: support setting 'reconnect' parameter on TCP chardev backends (for USB redir and all other chardev users)
Product:          Red Hat Enterprise Linux 7
Component:        libvirt
Version:          7.2
Status:           CLOSED ERRATA
Severity:         low
Priority:         low
Reporter:         Fangge Jin <fjin>
Assignee:         Pavel Hrdina <phrdina>
QA Contact:       jiyan <jiyan>
CC:               areis, berrange, coli, dm, dyuan, hachen, jinzhao, jiyan, jsuchane, juzhang, knoel, kraxel, lmiksik, michen, mzhan, phrdina, rbalakri, virt-maint, xfu, xuzhang, yafu, yanqzhan, zpeng
Target Milestone: rc
Keywords:         FutureFeature, Reopened
Hardware:         x86_64
OS:               Linux
Fixed In Version: libvirt-3.7.0-1.el7
Doc Type:         If docs needed, set a value
Last Closed:      2018-04-10 10:33:22 UTC
Type:             Bug
Bug Blocks:       1401400
Description (Fangge Jin, 2015-08-19 11:13:26 UTC)
Created attachment 1064754 [details]
dmesg of guest
Created attachment 1064755 [details]
qemu log on source host
Created attachment 1064757 [details]
qemu log on target host
Sorry, there is a typo in step 1. It should be: 1. Plug a USB device on **a** host (10.66.4.208).

Moving to qemu-kvm: libvirt starts the new qemu with the correct arguments, but qemu disconnects from usbredirserver and never reconnects after migration is finished.

Is usbredirserver actually supported in the first place? As far as I know it is more of a debug tool ...

spice usb redirection supports live migration (and puts quite some effort into making it work).

(In reply to Gerd Hoffmann from comment #8)
> Is usbredirserver actually supported in the first place?
> As far as I know it is more of a debug tool ...
>
> spice usb redirection supports live migration (and puts quite some effort
> into making it work).

If usbredirserver is not supported, how do we test the usbredir tcp mode? The usbredirserver works well before migration.

I don't think tcp mode is supported either, and thus it doesn't need special testing.

The main difference between tcp and spice mode is that (a) spice supports migration and (b) a different network transport is used. So apart from migration support there shouldn't be much of a difference.

So, for QE purposes it might be useful to use tcp mode instead of spice mode, simply because it is probably easier to use usbredirserver in automated testing. Other than that I see little reason to pay much attention to usbredirserver.

(In reply to Gerd Hoffmann from comment #10)
> I don't think tcp mode is supported either, and thus it doesn't need
> special testing.
>
> The main difference between tcp and spice mode is that (a) spice supports
> migration and (b) a different network transport is used. So apart from
> migration support there shouldn't be much of a difference.
>
> So, for QE purposes it might be useful to use tcp mode instead of spice
> mode, simply because it is probably easier to use usbredirserver in
> automated testing. Other than that I see little reason to pay much
> attention to usbredirserver.
I'm closing as WONTFIX and will follow up with the doc team to check if this should be documented somewhere.

(In reply to Ademar Reis from comment #11)
> I'm closing as WONTFIX and will follow up with the doc team to check if
> this should be documented somewhere.

Maybe it's better to refuse migration with usbredir tcp mode. I met another issue when doing migration with usbredir tcp mode: migrating a guest with both a virtio disk and a usbredir tcp device, the guest crashed on the target host once it tried to mount the USB device after migration.

Test versions:
libvirt-0.10.2-55.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.482.el6.x86_64
usbredir-server-0.5.1-3.el6.x86_64

Steps to reproduce:

1. Plug a USB device on the host, and start usbredirserver:

# lsusb
...
Bus 001 Device 003: ID 0951:1624 Kingston Technology DataTraveler 101 II
# usbredirserver -p 4000 0951:1624

2. Start a guest with a virtio disk and a usb-redir device in connect mode:

...
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none'/>
  <source file='/var/lib/libvirt/images/rhel6.img'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
...
<redirdev bus='usb' type='tcp'>
  <source mode='connect' host='10.66.4.148' service='4000'/>
  <protocol type='raw'/>
</redirdev>
...

3. Do migration:

# virsh migrate guest qemu+ssh://10.66.4.148/system --live --verbose

4. Open the guest in virt-manager; the guest crashed once it tried to mount the USB device.

5. Check the qemu log; the guest crashed because an assertion failed:

...
qemu-kvm: /builddir/build/BUILD/qemu-kvm-0.12.1.2/usb-redir.c:1024: usbredir_chardev_read: Assertion `dev->read_buf == ((void *)0)' failed.

6. The coredump file is in the attachment.

Created attachment 1115031 [details]
coredump file
(In reply to yafu from comment #12)
> Maybe it's better to refuse migration with usbredir tcp mode. I met another
> issue when doing migration with usbredir tcp mode. Migrate the guest with
> both virtio disk and usbredir tcp mode; the guest crashed on the target
> host once it tried to mount the USB device after migration.

Makes sense, I agree. Gerd, can this be done?

Hi,
> > Maybe it's better to refuse migration with usbredir tcp mode. I met another
> > issue when do migration with usbredir tcp mode. Migrate the guest with both
> > virtio disk and usbredir tcp mode, the guest crashed on the target host
> > once it trying to mount the usb device after migration.
> >
>
> Makes sense, I agree. Gerd, can this be done?
Hmm, not so easy as things are handled indirectly via chardev.
usbredir doesn't know what the actual transport is ...
(In reply to Gerd Hoffmann from comment #15)
> Hmm, not so easy as things are handled indirectly via chardev.
> usbredir doesn't know what the actual transport is ...

So given this is not a valid use case for customers (tcp here should be used only for debugging) and the fix is complex, I'm closing it again.

> So given this is not a valid use case for customers

Really, this can be a real use case. There is an article about it (in Russian, though): https://habrahabr.ru/post/265065/

It is possible to use usbredirserver to export USB keys, which are still widely in use for Windows applications, into Windows guests, so tcp migration can be extremely useful. So, I'd like this bug reopened and fixed :-)

(In reply to Need Real Name from comment #17)
> It is possible to use usbredirserver to export usb keys, which are still
> widely in use for windows applications, into windows guests, so tcp
> migration can be extremely useful.

There is usb-host (for USB keys connected to the virtualization host). There is spice redirection (for USB keys connected to the user's machine). Why tcp redirection?

There is no user's machine here, i.e. the USB key is not plugged into it. The application server is protected by a USB key, or, in another scenario, the application is on a Windows terminal server. So currently we plug the keys into the host and pass them through to the VMs, but this prevents migration.

There is http://www.digi.com/products/usb-and-serial-connectivity/usb-over-ip-hubs/anywhereusb but its cost is too high to buy a spare... So we found usbredirserver: we can use one of the Linux servers we already have as a USB key server, and in case of its failure plug the keys into another server. But! No live migration, so no difference from just plugging the keys into the host...

Thank you!

Hello! Could you tell me, is my explanation clear enough? Thank you!

Btw, somebody uses this with Proxmox: https://github.com/kvaps/usbredirtools

(In reply to Need Real Name from comment #21)
> Could you tell me, is my explanation clear enough?

Yes, they build a kind of licensing server which exports all those USB dongles. The problem is that there is no easy way to make this fully guest-transparent; spice puts quite some effort into this.

What happens if you unplug the USB dongle, then re-plug it after a short time? I.e., would "unplug -> live migrate -> plug" work?

(In reply to Gerd Hoffmann from comment #23)
> What happens if you unplug the usb dongle, then re-plug it after a short
> time?
> i.e. would "unplug -> live migrate -> plug" work?

I asked a colleague to test this; he replied:

unplug:
virsh qemu-monitor-command --hmp manzan device_del usbredirdev1

then live migration, then on the new host plug:
virsh qemu-monitor-command --hmp manzan chardev-add socket,id=usbredirdev1,port=4000,host=192.168.22.31
virsh qemu-monitor-command --hmp manzan device_add usb-redir,chardev=usbredirdev1,id=usbredirdev1,bus=usb.0

works.

The problem here is that this requires additional scripting on both hosts, which makes ready-to-use scripts, let's say VirtualDomain from Pacemaker, useless :-(

Thank you!

> unplug:
> then on new host plug:
> works.

Good.

> Problem here is that this requires additional scripting, on both hosts,
> which makes ready-to-use scripts, let's say VirtualDomain from Pacemaker,
> useless :-(

Sure.
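For readers wanting to reproduce the workaround, the manual unplug -> migrate -> replug steps above could be gathered into one small script. This is only a rough sketch: the domain name (manzan), chardev id, redirserver address, and destination URI are taken from the comment above and must be adjusted for your environment; the destination host in DEST is an assumed example.

```shell
#!/bin/sh
# Sketch of the manual unplug -> live migrate -> replug workaround
# described above. All names and addresses are examples only.
DOM=manzan
DEST_HOST=192.168.22.32                     # assumed destination host
DEST=qemu+ssh://$DEST_HOST/system
REDIR_HOST=192.168.22.31                    # usbredirserver host (from the comment)
REDIR_PORT=4000
ID=usbredirdev1

# 1. Detach the usb-redir device on the source host.
virsh qemu-monitor-command --hmp "$DOM" "device_del $ID"

# 2. Live-migrate the guest.
virsh migrate --live "$DOM" "$DEST" --verbose

# 3. Re-create the chardev and device on the destination host.
ssh "root@$DEST_HOST" \
  "virsh qemu-monitor-command --hmp $DOM 'chardev-add socket,id=$ID,port=$REDIR_PORT,host=$REDIR_HOST' && \
   virsh qemu-monitor-command --hmp $DOM 'device_add usb-redir,chardev=$ID,id=$ID,bus=usb.0'"
```

As the commenter notes, needing this script on both hosts defeats generic resource agents such as Pacemaker's VirtualDomain, which is exactly why a libvirt-level `reconnect` setting was requested.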
The manual scripting issue is probably solvable by moving the unplug + usb-device-reset and re-plug into qemu. We do something similar in usb-host already. I'll have a look.

Waded through the code. The redirection code seems to do the correct thing already when it sees CHR_EVENT_{OPENED,CLOSED} events.

Typically qemu will run in server mode and listen for connects, or it gets passed in file handles from libvirt. tcp chardevs can also connect to a peer at startup, but that is rarely used and not supported very well. So qemu doesn't really have a concept of re-connecting: if a connection goes down -- bad luck, you have to power-cycle the virtual machine to reconnect (and that is without live migration involved yet).

Daniel, you've rewritten much of the socket code in qemu. Any opinion on this? Can we implement something like -chardev tcp,reconnect_interval=3sec ?

The tcp chardev already has the ability to do reconnects when operating as a client. I don't think that is wired up into libvirt yet, though.

(In reply to Daniel Berrange from comment #27)
> The tcp chardev already has the ability to do reconnects when operating as
> a client. I don't think that is wired up into libvirt yet though.

--verbose please. How can I kick a reconnect? monitor?

Just set the 'reconnect' flag on the chardev backend - it sets the timeout for reconnecting on non-server sockets when the remote end goes away. qemu will delay this many seconds and then attempt to reconnect, but it defaults to 0, hence there is no reconnect by default. Libvirt doesn't expose the "reconnect" attribute in XML though.

Looks like reconnect works (just tested on one host), but because it is not supported by libvirt (at least in the RHEL 7 version), we can't use it for live migration :-( Maybe it is possible to change the default reconnect value to non-zero?

Changing the default isn't a good idea, as this has a high chance of just trading one issue for another.
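To make the discussion above concrete: the option being described is the `reconnect` property of qemu's socket chardev backend, which takes the retry delay in seconds (0, the default, means never reconnect). A hand-written invocation might look like the following; the chardev id, host, and port are illustrative values matching the reproduction steps earlier in this report, not output from any specific system.

```shell
# Illustrative qemu invocation: a client-mode TCP chardev feeding a
# usb-redir device, retrying the connection every 5 seconds if the
# peer (usbredirserver) goes away.
qemu-kvm \
  -chardev socket,id=charredir0,host=10.66.4.148,port=4000,reconnect=5 \
  -device usb-redir,chardev=charredir0,id=redir0
```

This is exactly the knob that libvirt's later `<reconnect enabled='yes' timeout='5'/>` element (verified below in this report) translates into on the qemu command line.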
There is a special libvirt syntax to pass additional command line switches to qemu, see http://blog.vmsplice.net/2011/04/how-to-pass-qemu-command-line-options.html

You can try this to tweak the reconnect value:

<qemu:commandline>
  <qemu:arg value='-set'/>
  <qemu:arg value='chardev.charredir0.reconnect=10'/>
</qemu:commandline>

In any case, making the reconnect configurable goes into libvirt territory; reassigning ...

Hello! Unfortunately we failed to set the reconnect parameter using <qemu:commandline>; maybe we are doing something wrong, but.. Thank you!

Upstream patches posted: https://www.redhat.com/archives/libvir-list/2017-August/msg00818.html

Upstream commit:

commit 3ba6b532d11736fe82fedd53244f3c334e911b7c
Author: Pavel Hrdina <phrdina>
Date:   Fri Aug 25 18:57:15 2017 +0200

    qemu: implement chardev source reconnect

v3.6.0-223-g3ba6b532d1

Test env components:
qemu-kvm-rhev-2.10.0-3.el7.x86_64
kernel-3.10.0-742.el7.x86_64
libvirt-3.8.0-1.el7.x86_64

Test scenarios:
1. 'tcp' USB redirected device with 'connect' mode and a 'reconnect' element ('enabled' equals 'yes' and 'timeout' is set)
2. 'tcp' USB redirected device with 'connect' mode and a 'reconnect' element ('enabled' equals 'no')
3. 'tcp' channel device with 'connect' mode and without any 'reconnect' configuration

Scenario-1: USB redirected device with 'connect' mode, 'enabled' equals 'yes' and 'timeout' is set

1. Prepare physical hostA with a USB device plugged in:

(hostA)# lsusb
Bus 002 Device 051: ID 058f:6387 Alcor Micro Corp. Flash Drive
(hostA)# usbredirserver -p 4000 058f:6387

2.
Prepare physical hostB, used as the migration source machine, with a VM named 'pc' that has a USB redirected device configured; start the VM:

(hostB)# virsh dumpxml pc --inactive |grep redir -A5
<redirdev bus='usb' type='tcp'>
  <source mode='connect' host='hostA IP' service='4000'>
    <reconnect enabled='yes' timeout='5'/>
  </source>
  <protocol type='raw'/>
  <address type='usb' bus='0' port='1'/>
</redirdev>
(hostB)# virsh start pc
Domain pc started

3. After starting VM 'pc', check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids

4. Prepare physical hostC, used as the migration destination machine; migrate VM 'pc' from hostB to hostC:

(hostB)# virsh migrate --live pc qemu+ssh://hostC IP/system --verbose
root@hostC IP's password:
Migration: [100 %]

5. Check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids
usbredirhost: device disconnected
usbredirparser: error data len 33 != header len 0 ep 00

6. Check the VM 'pc' on hostC:

(hostC)# virsh list --all |grep pc
 13    pc    running
(hostC)# virsh console pc
Connected to domain pc
Escape character is ^]
Last login: Tue Oct 24 17:26:01 on tty1
# lsusb
Bus 001 Device 003: ID 058f:6387 Alcor Micro Corp. Flash Drive

7. Check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids
usbredirhost: device disconnected
usbredirparser: error data len 33 != header len 0 ep 00
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids

This shows that reconnection succeeds, as configured in the 'reconnect' element.

Scenario-2: USB redirected device with 'connect' mode, 'enabled' equals 'no'

1.
Prepare physical hostA with a USB device plugged in:

(hostA)# lsusb
Bus 002 Device 051: ID 058f:6387 Alcor Micro Corp. Flash Drive
(hostA)# usbredirserver -p 4000 058f:6387

2. Prepare physical hostB, used as the migration source machine, with a VM named 'pc' that has a USB redirected device configured; start the VM:

(hostB)# virsh dumpxml pc --inactive |grep redir -A5
<redirdev bus='usb' type='tcp'>
  <source mode='connect' host='hostA IP' service='4000'>
    <reconnect enabled='no'/>
  </source>
  <protocol type='raw'/>
  <address type='usb' bus='0' port='1'/>
</redirdev>
(hostB)# virsh start pc
Domain pc started

3. After starting VM 'pc', check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids

4. Prepare physical hostC, used as the migration destination machine; migrate VM 'pc' from hostB to hostC:

(hostB)# virsh migrate --live pc qemu+ssh://hostC IP/system --verbose
root@hostC IP's password:
Migration: [100 %]

5. Check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids
usbredirhost: device disconnected
usbredirparser: error data len 33 != header len 0 ep 00

6. Check the VM 'pc' on hostC:

(hostC)# virsh list --all |grep pc
 13    pc    running
(hostC)# virsh console pc
Connected to domain pc
Escape character is ^]
Last login: Tue Oct 24 17:26:01 on tty1
# lsusb
No USB redirected device

7. Check the info returned by usbredirserver on hostA:

(hostA)# usbredirserver -p 4000 058f:6387
usbredirparser: Peer version: qemu usb-redir guest 2.10.0, using 64-bits ids
usbredirhost: device disconnected
usbredirparser: error data len 33 != header len 0 ep 00

This shows that no reconnection happens, as configured in the 'reconnect' element.

Scenario-3:

1. Prepare physical hostA with the server socket program running (for the code, please refer to the attachment):

(hostA)# ./server

2.
Prepare physical hostB, used as the migration source machine, with a VM named 'pc' that has a channel device configured; start the VM:

(hostB)# virsh dumpxml pc --inactive |grep channel -A6
<channel type='tcp'>
  <source mode='connect' host='hostA' service='2445'/>
  <protocol type='raw'/>
  <target type='virtio' name='test1'/>
  <address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
(hostB)# virsh start pc
Domain pc started

3. After starting VM 'pc', check the info returned by the program on hostA:

(hostA)# ./server
connect from hostB IP
(hostA)# netstat -anlp |grep 2445
tcp  0  0 0.0.0.0:2445   0.0.0.0:*     LISTEN       16188/./server
tcp  0  0 HostA:2445     hostB:56830   ESTABLISHED  16188/./server

4. Kill the 'server' program:

# kill -9 16188
# netstat -anlp |grep 2445
No output

5. Restart the 'server' program:

(hostA)# ./server

6. Try to send several characters from the guest to hostA through the channel:

(hostB)# virsh console pc
Connected to domain pc
Escape character is ^]
Last login: Wed Oct 25 11:10:02 on ttyS0
# echo abdadsafds >/dev/vport0p1
-bash: echo: write error: Interrupted system call

This shows that no reconnection happens, as there is no 'reconnect' configuration.

All the results are as expected; move this bug to verified.
Additional info:

# cat server.c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

#define PORT 2445
#define MAXSOCKFD 10

int main(void)
{
    int sockfd, newsockfd, is_connected[MAXSOCKFD], fd;
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(struct sockaddr_in);
    fd_set readfds;
    char buffer[256];
    char msg[] = "Welcome to server!";

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket");
        exit(1);
    }

    bzero(&addr, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        exit(1);
    }

    if (listen(sockfd, 3) < 0) {
        perror("listen");
        exit(1);
    }

    for (fd = 0; fd < MAXSOCKFD; fd++)
        is_connected[fd] = 0;

    while (1) {
        FD_ZERO(&readfds);
        FD_SET(sockfd, &readfds);
        for (fd = 0; fd < MAXSOCKFD; fd++)
            if (is_connected[fd])
                FD_SET(fd, &readfds);

        if (!select(MAXSOCKFD, &readfds, NULL, NULL, NULL))
            continue;

        for (fd = 0; fd < MAXSOCKFD; fd++) {
            if (!FD_ISSET(fd, &readfds))
                continue;
            if (sockfd == fd) {
                /* new client: accept, greet, remember */
                if ((newsockfd = accept(sockfd, (struct sockaddr *)&addr, &addr_len)) < 0)
                    perror("accept");
                write(newsockfd, msg, sizeof(msg));
                is_connected[newsockfd] = 1;
                printf("connect from %s\n", inet_ntoa(addr.sin_addr));
            } else {
                /* existing client: print data or detect close */
                bzero(buffer, sizeof(buffer));
                if (read(fd, buffer, sizeof(buffer)) <= 0) {
                    printf("connect closed.\n");
                    is_connected[fd] = 0;
                    close(fd);
                } else {
                    printf("%s", buffer);
                }
            }
        }
    }
}

# gcc server.c -o server

Hi Pavel,

I tested another scenario when verifying this bug: migrate a VM with a 'tcp' channel in 'connect' mode but without any 'reconnect' configuration from hostB to hostC; after migration, the VM can still send messages to hostA,
which runs a server socket program. Could you help to check whether this is normal or a bug? Thanks in advance.

Steps to reproduce:

1. Prepare physical hostA with the server socket program running (the program is shown in comment 37):

(hostA)# ./server

2. Prepare physical hostB, used as the migration source machine, with a VM named 'pc' that has a channel device configured; start the VM:

(hostB)# virsh dumpxml pc --inactive |grep channel -A6
<channel type='tcp'>
  <source mode='connect' host='hostA' service='2445'/>
  <protocol type='raw'/>
  <target type='virtio' name='test1'/>
  <address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>
(hostB)# virsh start pc
Domain pc started

3. After starting VM 'pc', check the info returned by the program on hostA:

(hostA)# ./server
connect from hostB IP
(hostA)# netstat -anlp |grep 2445
tcp  0  0 0.0.0.0:2445   0.0.0.0:*     LISTEN       16188/./server
tcp  0  0 HostA:2445     hostB:56830   ESTABLISHED  16188/./server

4. Try to send several characters from the guest to hostA through the channel, and check the info on hostA:

(hostB)# virsh console pc
Connected to domain pc
Escape character is ^]
Last login: Wed Oct 25 11:10:02 on ttyS0
# echo abdadsafds >/dev/vport0p1

(hostA)# ./server
connect from hostB IP
dsdankdnasiofew

5. Prepare physical hostC, used as the migration destination machine; migrate VM 'pc' from hostB to hostC:

(hostB)# virsh migrate --live pc qemu+ssh://hostC IP/system --verbose
root@hostC IP's password:
Migration: [100 %]

6. Check the info on hostA:

# ./server
connect from hostB IP
dsdankdnasiofew
connect from hostC IP
connect closed.

7. Check the VM 'pc' on hostC, and try to send several characters from the guest to hostA through the channel:

(hostC)# virsh list --all |grep pc
 13    pc    running
(hostC)# virsh console pc
Connected to domain pc
Escape character is ^]
Last login: Tue Oct 24 17:26:01 on tty1
# echo xewqicrecervqf >/dev/vport0p1

8.
Check the info on hostA:

# ./server
connect from hostB IP
dsdankdnasiofew
connect from hostC IP
connect closed.
xewqicrecervqf

I wouldn't say that's a bug. If you don't configure reconnect at all, it's up to the hypervisor to use some default. In the case of migration, QEMU probably tries to reconnect to the server.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704