RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 1227874 - virtserialport doesn't generates SIGIO signals correctly
Summary: virtserialport doesn't generates SIGIO signals correctly
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.1
Hardware: ppc64le
OS: Linux
unspecified
unspecified
Target Milestone: rc
: 7.3
Assignee: Laurent Vivier
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: RHV4.1PPC
TreeView+ depends on / blocked
 
Reported: 2015-06-03 17:03 UTC by Lukáš Doktor
Modified: 2016-08-09 10:25 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-09 10:25:47 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
virttest log of a 2 variant test where the 1st one passes and the 2nd one fails (48.80 KB, text/plain)
2015-06-03 17:05 UTC, Lukáš Doktor
no flags Details
test_log_power (40.34 KB, text/plain)
2015-07-02 06:54 UTC, Qunfang Zhang
no flags Details
test_log_x86 (32.38 KB, text/plain)
2015-07-02 06:54 UTC, Qunfang Zhang
no flags Details

Description Lukáš Doktor 2015-06-03 17:03:07 UTC
Description of problem:
I'm running the "virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio" virt-test and sometimes it fails with incorrect signal after opening the port on host side.

Version-Release number of selected component (if applicable):
qemu-kvm-rhev-2.3.0-1.el7

How reproducible:
~50%

Steps to Reproduce:
1. Start guest with at least one virtserialport
2. Set the port to async mode and create sigio handler which stores the emitted signals
3. Check if no signals are emitted
4. Open the port on the host side
5. Verify it emitted POLLOUT

Actual results:
POLLOUT is emitted only in ~50% of test executions

Expected results:
POLLOUT is emitted and the test passes

Additional info:
The sigio test has multiple steps and the POLLOUT is the first one. When this one passes, all remaining checks (POLLOUT|POLLIN, POLLHUP|POLLIN, ...) works fine.

Comment 1 Lukáš Doktor 2015-06-03 17:05:36 UTC
Created attachment 1034414 [details]
virttest log of a 2 variant test where the 1st one passes and the 2nd one fails

Comment 3 Amit Shah 2015-06-11 12:00:04 UTC
Is this specific to ppc?  Does it reproduce on x86?

Comment 5 Qunfang Zhang 2015-07-02 06:51:45 UTC
I run the cases for several times, the test failed and does not generates SIGIO signals correctly. However it seems the test failed on other events (IN HUP) in my side. As this case failed and the signals have problem, I will ack it first, any problem please correct me.

BTW, I also run the test on x86, the first time it passed. Failed at the second time, failed event is IN OUT HUP.

Comment 6 Qunfang Zhang 2015-07-02 06:54:04 UTC
Created attachment 1045367 [details]
test_log_power

Comment 7 Qunfang Zhang 2015-07-02 06:54:57 UTC
Created attachment 1045368 [details]
test_log_x86

Comment 8 Lukáš Doktor 2015-07-17 13:47:07 UTC
The Qunfang problem might not be related to this one. I took a look and the test does not read-outs the previously set signals, which might affect the results. The fix for this is here:

https://github.com/autotest/tp-qemu/pull/416

I included this fix and re-tested it on ppc64be host with ppc64le guest with those results:

https://kvmqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHELSA-devels/view/PPC64BEguestLEhost/job/30_run_test_PPC64LE2BE/13/testReport/junit/(root)/VirtTest/type_specific_io_github_autotest_qemu_virtio_console_spread_linear_specifiable_virtserialport_virtio_console_smoke_sigio/

09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.async('vs1', True, 0)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 1
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.async('vs1', True, 0)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLOUT)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLOUT | select.POLLIN)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:15 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:16 qemu_virti L0310 DEBUG| Executing 'virt.set_pool_want_return('vs1', select.POLLHUP | select.POLLIN)' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:16 qemu_virti L0310 DEBUG| Executing 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, timeout: 10
09:37:16 virtio_con L2002 ERROR| Original traceback:
Traceback (most recent call last):
  File "/home/jenkins/autotest/client/shared/error.py", line 141, in new_fn
    return fn(*args, **kwargs)
  File "/home/jenkins/virt-test/test-providers.d/downloads/io-github-autotest-qemu/qemu/tests/virtio_console.py", line 344, in test_sigio
    guest_worker.cmd("virt.get_sigio_poll_return('%s')" % (port.name), 10)
  File "/home/jenkins/virt-test/virttest/qemu_virtio_port.py", line 296, in cmd
    (cmd, self.vm.name, data))
VirtioPortException: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: virt-tests-vm1, output:
virt.get_sigio_poll_return('vs1')
FAIL: Events:   Expected: IN HUP
[(3, 17)]

Which means it handled some signals properly, but it did not generate signal when the host disconnected.

I never saw this failure on x86 (Fedora), so I guess it's ppc64 specific in deed.

Comment 9 Lukáš Doktor 2015-07-17 13:50:57 UTC
Although I tried this on arm64 and it works fine:

https://kvmqe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHELSA-devels/view/ARM64-stable/job/30_run_test_ARM64-stable/5/

Comment 12 David Gibson 2016-07-12 02:33:16 UTC
Can you reproduce the problem with a 2.6 version qemu?

Comment 13 Laurent Vivier 2016-07-25 12:11:12 UTC
qemu-kvm-rhev-2.6.0-15.el7.ppc64le
Avocado 38.0

# avocado_ppc64le virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio
JOB ID     : 0152c4f48158603ba8456149d72851dda0792b78
JOB LOG    : /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/job.log
TESTS      : 1
 (1/1) type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio: \** Message: pygobject_register_sinkfunc is deprecated (GstObject)
** Message: pygobject_register_sinkfunc is deprecated (GstObject)
-** Message: pygobject_register_sinkfunc is deprecated (GstObject)
FAIL (106.14 s)
RESULTS    : PASS 0 | ERROR 0 | FAIL 1 | SKIP 0 | WARN 0 | INTERRUPT 0
JOB HTML   : /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/html/results.html
TESTS TIME : 106.14 s

Same results as in comment #5:

07:24:42 DEBUG| Host does not support OpenVSwitch: Command 'ovs-vswitchd' could not be found in any of the PATH dirs: ['/bin', '/sbin', '/usr/bin', '/usr/local/sbin', '/usr/libexec', '/root/bin', '/usr/sbin', '/usr/local/bin']
07:24:42 DEBUG| Checking image file /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2
07:24:42 DEBUG| Copying /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2.backup -> /var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-ppc64le.qcow2
07:24:44 WARNI| avocado-vt-vm1 is not alive. Can't query the register status
07:24:59 DEBUG| /usr/lib64/python2.7/site-packages/gobject/constants.py:24: Warning: g_boxed_type_register_static: assertion 'g_type_from_name (name) == 0' failed
07:24:59 DEBUG|   import gobject._gobject
07:25:46 DEBUG| Encoding video file /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/test-results/1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio/screendumps_avocado-vt-vm1_20042_iter0.ogg
07:25:46 INFO | Video creation failed for /root/avocado/job-results/job-2016-07-25T07.23-0152c4f/test-results/1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio/screendumps_avocado-vt-vm1_20042_iter0: multifilesrc
07:25:46 DEBUG| Thread quit. Used to failed to get register info from guest 20160725-072401-TgOoKC5h for 6 times.
07:25:47 INFO | Running 'true'
07:25:47 INFO | Running 'ps -o comm 1'
07:25:47 DEBUG| [stdout] COMMAND
07:25:47 DEBUG| [stdout] systemd
07:25:47 ERROR| 
07:25:47 ERROR| Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado_vt/test.py:420
07:25:47 ERROR| Traceback (most recent call last):
07:25:47 ERROR|   File "/usr/lib/python2.7/site-packages/avocado_vt/test.py", line 206, in runTest
07:25:47 ERROR|     raise exceptions.TestFail(details)
07:25:47 ERROR| TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output:
07:25:47 ERROR| virt.get_sigio_poll_return('vs1')
07:25:47 ERROR| FAIL: Events:   Expected: IN HUP 
07:25:47 ERROR| [(3, 17)]
07:25:47 ERROR|     [context: Executing test: test_sigio]
07:25:47 ERROR| 
07:25:47 ERROR| FAIL 1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio -> TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output:
virt.get_sigio_poll_return('vs1')
FAIL: Events:   Expected: IN HUP 
[(3, 17)]
    [context: Executing test: test_sigio]
07:25:47 INFO | 

But the second time it works:

# avocado_ppc64le virtio_console.spread_linear.specifiae.virtserialport.virtio_console_smoke.sigio
JOB ID     : 2687f4aede7fc3da65017803a8f756825e3ac763
JOB LOG    : /root/avocado/job-results/job-2016-07-25T07.56-2687f4a/job.log
TESTS      : 1
 (1/1) type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio: \** Message: pygobject_register_sinkfunc is deprecated (GstObject)
PASS (46.12 s)
RESULTS    : PASS 1 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 0
JOB HTML   : /root/avocado/job-results/job-2016-07-25T07.56-2687f4a/html/results.html
TESTS TIME : 46.12 s

And the third time fails as described with missing OUT event:

2016-07-25 08:00:04,188 stacktrace       L0041 ERROR| Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado_vt/test.py:420
2016-07-25 08:00:04,188 stacktrace       L0044 ERROR| Traceback (most recent call last):
2016-07-25 08:00:04,188 stacktrace       L0044 ERROR|   File "/usr/lib/python2.7/site-packages/avocado_vt/test.py", line 206, in runTest
2016-07-25 08:00:04,188 stacktrace       L0044 ERROR|     raise exceptions.TestFail(details)
2016-07-25 08:00:04,188 stacktrace       L0044 ERROR| TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output:
2016-07-25 08:00:04,188 stacktrace       L0044 ERROR| virt.get_sigio_poll_return('vs1')
2016-07-25 08:00:04,189 stacktrace       L0044 ERROR| FAIL: Events:   Expected: IN OUT 
2016-07-25 08:00:04,189 stacktrace       L0044 ERROR| [(3, 5)]
2016-07-25 08:00:04,189 stacktrace       L0044 ERROR|     [context: Executing test: test_sigio]
2016-07-25 08:00:04,189 stacktrace       L0045 ERROR| 
2016-07-25 08:00:04,189 test             L0580 ERROR| FAIL 1-type_specific.io-github-autotest-qemu.virtio_console.spread_linear.specifiable.virtserialport.virtio_console_smoke.sigio -> TestFail: test_sigio failed: Failed to execute 'virt.get_sigio_poll_return('vs1')' on virtio_console_guest.py, vm: avocado-vt-vm1, output:
virt.get_sigio_poll_return('vs1')
FAIL: Events:   Expected: IN OUT 
[(3, 5)]

Comment 14 Laurent Vivier 2016-07-26 13:40:12 UTC
I'm trying to reproduce this bug out of avocado.

Steps to Reproduce:
1. Start guest with at least one virtserialport

/usr/libexec/qemu-kvm -name avocado-vt-vm1 \
         -nodefaults -nographic \
         -machine pseries \
         -chardev stdio,mux=on,signal=off,id=serial_id_serial0 \
         -mon chardev=serial_id_serial0,mode=readline,default \
         -device spapr-vty,reg=0x30000000,chardev=serial_id_serial0 \
         -device virtio-serial-pci,id=virtio_serial_pci3,bus=pci.0,addr=06,disable-legacy=off,disable-modern=on \
         -chardev socket,id=devvs1,path=/tmp/virtio_port-vs1,server,nowait \
         -device virtserialport,chardev=devvs1,name=vs1,id=vs1 \
         -drive id=drive_image1,if=none,format=qcow2,file=$IMAGE \
         -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,bus=pci.0,addr=08,disable-legacy=off,disable-modern=on \
         -m 2048 \
         -smp 2,maxcpus=2,cores=1,threads=1,sockets=2

2. Set the port to async mode and create sigio handler which stores the emitted signals

I start the following program inside the guest:

------8<----- poll.c -------------------------
/* cc -o poll poll.c */
#include<stdio.h>
#include<fcntl.h>
#include<signal.h>
#include <unistd.h>
#include <poll.h>

int fd;

void handler(int sig)
{
    int ret;
    struct pollfd fds;

    fds.fd = fd;
    fds.events = POLLIN | POLLOUT;
    fds.revents = 0;

    ret = poll(&fds, 1, 0);

    printf("EVENT ret %d revents %x\n", ret, fds.revents);
}

int main(void)
{
      signal(SIGIO, handler);
      fd = open("/dev/vport1p1", O_RDWR);
      fcntl(fd, F_SETOWN, getpid());
      fcntl(fd, F_SETFL, O_ASYNC);

      for(;;)
            pause();
}

------8<----- poll.c -------------------------

3. Check if no signals are emitted

  I check there is no output for the previous "poll" program.

4. Open the port on the host side

I run the following script on the host:

----8<---- open.py ----------------------
#!/usr/bin/python
import socket
sock = socket.socket(socket.AF_UNIX)
sock.connect("/tmp/virtio_port-vs1");
raw_input("Press Enter to exit...")
----8<---- open.py ----------------------

5. Verify it emitted POLLOUT

I check the "poll" program emits POLLOUT:

EVENT ret 1 revents 4

Lukáš, is that the test case you describe?
       but I'm not able to reproduce the problem: what is wrong?

Are you sure the timing of your avocado test script is ok: the open on the host happens after the "O_ASYNC" in the guest?

Comment 15 Lukáš Doktor 2016-08-03 07:44:14 UTC
Hello Laurent,

the avocado tests does basically this:

0. set's the sigio handler to read signals of registered fds and append them to their mask
            p = select.poll()
            map(p.register, self.poll_fds.keys())

            masks = p.poll(1)
            print masks
            for mask in masks:
                self.poll_fds[mask[0]][1] |= mask[1]
1. if the port was open on host, closes it
            self.sock.shutdown(socket.SHUT_RDWR)
            self.sock.close()
2. cleans all the previously generated polls on guest
        fd = self._open([port])[0]

        maskstr = self.pollmask_to_str(self.poll_fds[fd][1])
        if (self.poll_fds[fd][0] ^ self.poll_fds[fd][1]):
            emaskstr = self.pollmask_to_str(self.poll_fds[fd][0])
            print "FAIL: Events: " + maskstr + "  Expected: " + emaskstr
        else:
            print "PASS: Events: " + maskstr
        self.poll_fds[fd][1] = 0
3. enables async on guest
                self.catch_signal = True
                os.kill(os.getpid(), signal.SIGUSR1)
                self.use_config.wait()
                fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_ASYNC)
                self.poll_fds[fd] = [exp_val, 0]
4. cleans all the previously generated polls on guest (again)
5. sets the expect signals to be select.POLLOUT on guest (which also cleans the previously generated signals)
        fd = self._open([port])[0]
        self.poll_fds[fd] = [poll_value, 0]
        print "PASS: Events: " + self.pollmask_to_str(poll_value)
6. opens port on host
                self.sock = socket.socket(socket.AF_UNIX,
                                          socket.SOCK_STREAM)
                self.sock.settimeout(1)
                self.sock.connect(self.hostfile)
                self.sock.setsockopt(1, socket.SO_SNDBUF, SOCKET_SIZE)
                self.sock.settimeout(None)
                self.port_was_opened = True
7. queries for poll returns on guest

The step 7 reports unexpected status. Unfortunately the test does not log the guest output, which lists the actual generated signals. I sent a PR to fix it: https://github.com/autotest/tp-qemu/pull/724 I'll try to reproduce it with this fix to get the actual signals...

Comment 16 Lukáš Doktor 2016-08-07 15:15:44 UTC
Hello Laurent, the problem is in test itself, it does not wait for the signals to arrive, therefor sometimes the signals are not yet emitted, but with short timeout it passes.

With the fix I noticed rare failure later in the test, but I had no time to properly analyze it. Let me update this BZ when I have the proper fix (next week).

I'm sorry for bugging you with this problem, one would think that 6 years old test would work correctly :-)

Comment 17 Lukáš Doktor 2016-08-09 10:25:47 UTC
Indeed, this is not a ppc64 bug. The fix to the guest script used in this check can be found here: https://github.com/avocado-framework/avocado-vt/pull/651 and with it everything works well (~13 runs).


Note You need to log in before you can comment on or make changes to this bug.