Bug 1227874
Summary: | virtserialport doesn't generates SIGIO signals correctly | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Lukáš Doktor <ldoktor> | ||||||||
Component: | qemu-kvm-rhev | Assignee: | Laurent Vivier <lvivier> | ||||||||
Status: | CLOSED NOTABUG | QA Contact: | Virtualization Bugs <virt-bugs> | ||||||||
Severity: | unspecified | Docs Contact: | |||||||||
Priority: | unspecified | ||||||||||
Version: | 7.1 | CC: | amit.shah, hannsj_uhl, huding, juzhang, knoel, ldoktor, lvivier, qzhang, thuth, virt-maint, xfu | ||||||||
Target Milestone: | rc | ||||||||||
Target Release: | 7.3 | ||||||||||
Hardware: | ppc64le | ||||||||||
OS: | Linux | ||||||||||
Whiteboard: | |||||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||||
Doc Text: | Story Points: | --- | |||||||||
Clone Of: | Environment: | ||||||||||
Last Closed: | 2016-08-09 10:25:47 UTC | Type: | Bug | ||||||||
Regression: | --- | Mount Type: | --- | ||||||||
Documentation: | --- | CRM: | |||||||||
Verified Versions: | Category: | --- | |||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||
Embargoed: | |||||||||||
Bug Depends On: | |||||||||||
Bug Blocks: | 1359843 | ||||||||||
Attachments: |
|
Description
Lukáš Doktor
2015-06-03 17:03:07 UTC
Created attachment 1034414 [details]
virttest log of a 2 variant test where the 1st one passes and the 2nd one fails
Is this specific to ppc? Does it reproduce on x86? I run the cases for several times, the test failed and does not generates SIGIO signals correctly. However it seems the test failed on other events (IN HUP) in my side. As this case failed and the signals have problem, I will ack it first, any problem please correct me. BTW, I also run the test on x86, the first time it passed. Failed at the second time, failed event is IN OUT HUP. Created attachment 1045367 [details]
test_log_power
Created attachment 1045368 [details]
test_log_x86
The Qunfang problem might not be related to this one. I took a look and the test does not read-outs the previously set signals, which might affect the results. The fix for this is here: https://github.com/autotest/tp-qemu/pull/416 I included this fix and re-tested it on ppc64be host with ppc64le guest with those results: https://kvmqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHELSA-devels/view/PPC64BEguestLEhost/job/30_run_test_PPC64LE2BE/13/testReport/junit/(root)/VirtTest/type_specific_io_github_autotest_qemu_virtio_console_spread_linear_specifiable_virtserialport_virtio_console_smoke_sigio/ 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.async('vs1', True, 0)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 1 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.async('vs1', True, 0)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLOUT)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLOUT | select.POLLIN)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:16 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLHUP | select.POLLIN)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:16 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10 09:37:16 virtio_con L2002 ERROR| Original traceback: Traceback (most recent call last): File "/home/jenkins/autotest/client/shared/error.py", line 141, in new_fn return fn(*args, **kwargs) File "/home/jenkins/virt-test/test-providers.d/downloads/io-github-autotest-qemu/qemu/tests/virtio_console.py", line 344, in test_sigio guest_worker.cmd("virt.get_sigio_poll_return('%s')" % (port.name), 10) File "/home/jenkins/virt-test/virttest/qemu_virtio_port.py", line 296, in cmd (cmd, self.vm.name, data)) VirtioPortException: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, output: virt.get_sigio_poll_return('vs1') FAIL: Events: Expected: IN HUP [(3, 17)] Which means it handled some signals properly, but it did not generate signal when the host disconnected. I never saw this failure on x86 (Fedora), so I guess it's ppc64 specific in deed. Although I tried this on arm64 and it works fine: https://kvmqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHELSA-devels/view/ARM64-stable/job/30_run_test_ARM64-stable/5/ Can you reproduce the problem with a 2.6 version qemu? qemu-kvm-rhev-2.6.0-15.el7.ppc64le Avocado 38.0 # avocado_ppc64le virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio JOB ID : 0152c4f48158603ba8456149d72851dda0792b78 JOB LOG : /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/job.log TESTS : 1 (1/1) type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio: \** Message: pygobject_register_sinkfunc is deprecated (GstObject) ** Message: pygobject_register_sinkfunc is deprecated (GstObject) -** Message: pygobject_register_sinkfunc is deprecated (GstObject) FAIL (106.14 s) RESULTS : PASS 0 | ERROR 0 | FAIL 1 | SKIP 0 | WARN 0 | INTERRUPT 0 JOB HTML : /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/html/results.html TESTS TIME : 106.14 s Same results as in comment #5: 07:24:42 DEBUG| Host does not support OpenVSwitch: Command 'ovs-vswitchd' could not be found in any of the PATH dirs: ['/bin', '/sbin', '/usr/bin', '/usr/local/sbin', '/usr/libexec', '/root/bin', '/usr/sbin', '/usr/local/bin'] 07:24:42 DEBUG| Checking image file /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2 07:24:42 DEBUG| Copying /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2.backup -> /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2 07:24:44 WARNI| avocado-vt-vm1 is not alive. Can't query the register status 07:24:59 DEBUG| /usr/lib64/python2.7/site-packages/gobject/constants.py:24: Warning: g_boxed_type_register_static: assertion 'g_type_from_name (name) == 0' failed 07:24:59 DEBUG| import gobject._gobject 07:25:46 DEBUG| Encoding video file /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/test-results/1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio/screendumps_avocado-vt-vm1_20042_iter0.ogg 07:25:46 INFO | Video creation failed for /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/test-results/1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio/screendumps_avocado-vt-vm1_20042_iter0: multifilesrc 07:25:46 DEBUG| Thread quit. Used to failed to get register info from guest 20160725-072401-TgOoKC5h for 6 times. 07:25:47 INFO | Running 'true' 07:25:47 INFO | Running 'ps -o comm 1' 07:25:47 DEBUG| [stdout] COMMAND 07:25:47 DEBUG| [stdout] systemd 07:25:47 ERROR| 07:25:47 ERROR| Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado_vt/test.py:420 07:25:47 ERROR| Traceback (most recent call last): 07:25:47 ERROR| File "/usr/lib/python2.7/site-packages/avocado_vt/test.py", line 206, in runTest 07:25:47 ERROR| raise exceptions.TestFail(details) 07:25:47 ERROR| TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output: 07:25:47 ERROR| virt.get_sigio_poll_return('vs1') 07:25:47 ERROR| FAIL: Events: Expected: IN HUP 07:25:47 ERROR| [(3, 17)] 07:25:47 ERROR| [context: Executing test: test_sigio] 07:25:47 ERROR| 07:25:47 ERROR| FAIL 1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio -> TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output: virt.get_sigio_poll_return('vs1') FAIL: Events: Expected: IN HUP [(3, 17)] [context: Executing test: test_sigio] 07:25:47 INFO | But the second time it works: # avocado_ppc64le virtio_console.spread_linear.specifiae.virtserialport.virtio_console_smoke.sigio JOB ID : 2687f4aede7fc3da65017803a8f756825e3ac763 JOB LOG : /root/avocado/job-results/job-2016-07-25T07.56-2687f4a/job.log TESTS : 1 (1/1) type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio: \** Message: pygobject_register_sinkfunc is deprecated (GstObject) PASS (46.12 s) RESULTS : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0 JOB HTML : /root/avocado/job-results/job-2016-07-25T07.56-2687f4a/html/results.html TESTS TIME : 46.12 s And the third time fails as described with missing OUT event: 2016-07-25 08:00:04,188 stacktrace L0041 ERROR| Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado_vt/test.py:420 2016-07-25 08:00:04,188 stacktrace L0044 ERROR| Traceback (most recent call last): 2016-07-25 08:00:04,188 stacktrace L0044 ERROR| File "/usr/lib/python2.7/site-packages/avocado_vt/test.py", line 206, in runTest 2016-07-25 08:00:04,188 stacktrace L0044 ERROR| raise exceptions.TestFail(details) 2016-07-25 08:00:04,188 stacktrace L0044 ERROR| TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output: 2016-07-25 08:00:04,188 stacktrace L0044 ERROR| virt.get_sigio_poll_return('vs1') 2016-07-25 08:00:04,189 stacktrace L0044 ERROR| FAIL: Events: Expected: IN OUT 2016-07-25 08:00:04,189 stacktrace L0044 ERROR| [(3, 5)] 2016-07-25 08:00:04,189 stacktrace L0044 ERROR| [context: Executing test: test_sigio] 2016-07-25 08:00:04,189 stacktrace L0045 ERROR| 2016-07-25 08:00:04,189 test L0580 ERROR| FAIL 1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio -> TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output: virt.get_sigio_poll_return('vs1') FAIL: Events: Expected: IN OUT [(3, 5)] I'm trying to reproduce this bug out of avocado. Steps to Reproduce: 1. Start guest with at least one virtserialport /usr/libexec/qemu-kvm -name avocado-vt-vm1 \ -nodefaults -nographic \ -machine pseries \ -chardev stdio,mux=on,signal=off,id=serial_id_serial0 \ -mon chardev=serial_id_serial0,mode=readline,default \ -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 \ -device virtio-serial-pci,id=virtio_serial_pci3,bus=pci.0,addr=06,disable-legacy=off,disable-modern=on \ -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1,server,nowait \ -device virtserialport,chardev=devvs1,name=vs1,id=vs1 \ -drive id=drive_image1,if=none,format=qcow2,file=$IMAGE \ -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=08,disable-legacy=off,disable-modern=on \ -m 2048 \ -smp 2,maxcpus=2,cores=1,threads=1,sockets=2 2. Set the port to async mode and create sigio handler which stores the emitted signals I start the following program inside the guest: ------8<----- poll.c ------------------------- /* cc -o poll poll.c */ #include<stdio.h> #include<fcntl.h> #include<signal.h> #include <unistd.h> #include <poll.h> int fd; void handler(int sig) { int ret; struct pollfd fds; fds.fd = fd; fds.events = POLLIN | POLLOUT; fds.revents = 0; ret = poll(&fds, 1, 0); printf("EVENT ret %d revents %x\n", ret, fds.revents); } int main(void) { signal(SIGIO, handler); fd = open("/dev/vport1p1", O_RDWR); fcntl(fd, F_SETOWN, getpid()); fcntl(fd, F_SETFL, O_ASYNC); for(;;) pause(); } ------8<----- poll.c ------------------------- 3. Check if no signals are emitted I check there is no output for the previous "poll" program. 4. Open the port on the host side I run the following script on the host: ----8<---- open.py ---------------------- #!/usr/bin/python import socket sock = socket.socket(socket.AF_UNIX) sock.connect("/tmp/virtio_port-vs1"); raw_input("Press Enter to exit...") ----8<---- open.py ---------------------- 5. Verify it emitted POLLOUT I check the "poll" program emits POLLOUT: EVENT ret 1 revents 4 Lukáš, is that the test case you describe? but I'm not able to reproduce the problem: what is wrong? Are you sure the timing of your avocado test script is ok: the open on the host happens after the "O_ASYNC" in the guest? Hello Laurent, the avocado tests does basically this: 0. set's the sigio handler to read signals of registered fds and append them to their mask p = select.poll() map(p.register, self.poll_fds.keys()) masks = p.poll(1) print masks for mask in masks: self.poll_fds[mask[0]][1] |= mask[1] 1. if the port was open on host, closes it self.sock.shutdown(socket.SHUT_RDWR) self.sock.close() 2. cleans all the previously generated polls on guest fd = self._open([port])[0] maskstr = self.pollmask_to_str(self.poll_fds[fd][1]) if (self.poll_fds[fd][0] ^ self.poll_fds[fd][1]): emaskstr = self.pollmask_to_str(self.poll_fds[fd][0]) print "FAIL: Events: " + maskstr + " Expected: " + emaskstr else: print "PASS: Events: " + maskstr self.poll_fds[fd][1] = 0 3. enables async on guest self.catch_signal = True os.kill(os.getpid(), signal.SIGUSR1) self.use_config.wait() fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_ASYNC) self.poll_fds[fd] = [exp_val, 0] 4. cleans all the previously generated polls on guest (again) 5. sets the expect signals to be select.POLLOUT on guest (which also cleans the previously generated signals) fd = self._open([port])[0] self.poll_fds[fd] = [poll_value, 0] print "PASS: Events: " + self.pollmask_to_str(poll_value) 6. opens port on host self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) self.sock.settimeout(1) self.sock.connect(self.hostfile) self.sock.setsockopt(1, socket.SO_SNDBUF, SOCKET_SIZE) self.sock.settimeout(None) self.port_was_opened = True 7. queries for poll returns on guest The step 7 reports unexpected status. Unfortunately the test does not log the guest output, which lists the actual generated signals. I sent a PR to fix it: https://github.com/autotest/tp-qemu/pull/724 I'll try to reproduce it with this fix to get the actual signals... Hello Laurent, the problem is in test itself, it does not wait for the signals to arrive, therefor sometimes the signals are not yet emitted, but with short timeout it passes. With the fix I noticed rare failure later in the test, but I had no time to properly analyze it. Let me update this BZ when I have the proper fix (next week). I'm sorry for bugging you with this problem, one would think that 6 years old test would work correctly :-) Indeed, this is not a ppc64 bug. The fix to the guest script used in this check can be found here: https://github.com/avocado-framework/avocado-vt/pull/651 and with it everything works well (~13 runs). |