Bug 1570502

Summary: alsa->pulse: triggers hang on poll() in snd1_pcm_wait_nocheck ()
Product: [Fedora] Fedora
Component: alsa-lib
Version: 28
Hardware: Unspecified
OS: Unspecified
Status: CLOSED EOL
Severity: unspecified
Priority: unspecified
Type: Bug
Reporter: aalba6675 <ascanio.alba7>
Assignee: Jaroslav Kysela <jkysela>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: jkysela
Last Closed: 2019-05-28 19:56:54 UTC

Description aalba6675 2018-04-23 06:13:58 UTC
Description of problem:
Using alsa->pulse I have triggered something that looks like BZ #534130 again.

The application pjsua (from https://github.com/pjsip/pjproject), compiled to use ALSA, hangs in poll() inside snd1_pcm_wait_nocheck() when it closes a capture device that goes through alsa->pulse.

pjsua used to work under Fedora 27. The same hang occurs both with the binary compiled on F27 and with one recompiled under Fedora 28.


Version-Release number of selected component (if applicable):
alsa-lib-1.1.6-2.fc28.x86_64
alsa-plugins-pulseaudio-1.1.6-3.fc28.x86_64


How reproducible:
Always

Steps to Reproduce:
1. Build pjsua on F27/F28 with alsa sound driver
2. Run pjsua, make a call, and try to exit gracefully

Actual results:
Hang 

(gdb) bt
#0  0x00007ff0f0cba929 in poll () at /lib64/libc.so.6
#1  0x00007ff0f164e293 in snd1_pcm_wait_nocheck () at /lib64/libasound.so.2
#2  0x00007ff0f1698274 in snd_pcm_ioplug_drain () at /lib64/libasound.so.2
#3  0x00007ff0f457a868 in ca_thread_func (arg=0x15236a8) at ../src/pjmedia-audiodev/alsa_dev.c:596
#4  0x00007ff0f2c1dca9 in thread_main (param=0x15367c0) at ../src/pj/os_core_unix.c:541
#5  0x00007ff0f1902564 in start_thread () at /lib64/libpthread.so.0
#6  0x00007ff0f0cc531f in clone () at /lib64/libc.so.6
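
For readers unfamiliar with frame #0: a poll() call with a negative timeout blocks until one of the watched descriptors reports an event. The backtrace suggests, but does not prove, that the wait was entered this way and that the pulse plugin's descriptor never became ready. The sketch below only illustrates the general difference between an idle descriptor and a ready one using a plain pipe; wait_readable() and demo_poll_timeout() are hypothetical helpers, not part of alsa-lib or pjproject.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not from alsa-lib or pjproject):
 * wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * A negative timeout_ms would make poll() block until an event arrives. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;              /* poll() itself failed (e.g. EINTR) */
    if (rc == 0)
        return 0;               /* timeout expired, no event */
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Returns 0 if an idle pipe times out and a written pipe becomes readable. */
static int demo_poll_timeout(void)
{
    int fds[2];
    int idle, ready;

    if (pipe(fds) != 0)
        return -1;

    /* Nothing has been written yet, so a bounded wait times out. */
    idle = wait_readable(fds[0], 100);

    /* After one byte is written, the read end reports POLLIN. */
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ready = wait_readable(fds[0], 100);

    close(fds[0]);
    close(fds[1]);
    return (idle == 0 && ready == 1) ? 0 : -1;
}
```

With a timeout of -1 instead of 100, the first wait_readable() call in the demo would never return, which is the kind of behaviour seen in the hang.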


Expected results:
App exits gracefully


Additional info:
Works fine with native ALSA devices, i.e., after blacklisting the device from PulseAudio in pavucontrol.

At the point of the hang the app is closing a capture device; the thread function that blocks is:

static int ca_thread_func (void *arg)
{
    struct alsa_stream* stream = (struct alsa_stream*) arg;
    snd_pcm_t* pcm             = stream->ca_pcm;
    int size                   = stream->ca_buf_size;
    snd_pcm_uframes_t nframes  = stream->ca_frames;
    void* user_data            = stream->user_data;
    char* buf                  = stream->ca_buf;
    pj_timestamp tstamp;
    int result;
    struct sched_param param;
    pthread_t* thid;

    thid = (pthread_t*) pj_thread_get_os_handle (pj_thread_this());
    param.sched_priority = sched_get_priority_max (SCHED_RR);
    PJ_LOG (5,(THIS_FILE, "ca_thread_func(%u): Set thread priority "
                          "for audio capture thread.",
                          (unsigned)syscall(SYS_gettid)));
    result = pthread_setschedparam (*thid, SCHED_RR, &param);
    if (result) {
        if (result == EPERM)
            PJ_LOG (5,(THIS_FILE, "Unable to increase thread priority, "
                                  "root access needed."));
        else
            PJ_LOG (5,(THIS_FILE, "Unable to increase thread priority, "
                                  "error: %d",
                                  result));
    }

    pj_bzero (buf, size);
    tstamp.u64 = 0;

    TRACE_((THIS_FILE, "ca_thread_func(%u): Started",
            (unsigned)syscall(SYS_gettid)));

    snd_pcm_prepare (pcm);

    while (!stream->quit) {
        pjmedia_frame frame;

        pj_bzero (buf, size);
        result = snd_pcm_readi (pcm, buf, nframes);
        if (result == -EPIPE) {
            PJ_LOG (4,(THIS_FILE, "ca_thread_func: overrun!"));
            snd_pcm_prepare (pcm);
            continue;
        } else if (result < 0) {
            PJ_LOG (4,(THIS_FILE, "ca_thread_func: error reading data!"));
        }
        if (stream->quit)
            break;

        frame.type = PJMEDIA_FRAME_TYPE_AUDIO;
        frame.buf = (void*) buf;
        frame.size = size;
        frame.timestamp.u64 = tstamp.u64;
        frame.bit_info = 0;

        result = stream->ca_cb (user_data, &frame);
        if (result != PJ_SUCCESS || stream->quit)
            break;

        tstamp.u64 += nframes;
    }
    snd_pcm_drain (pcm);
    TRACE_((THIS_FILE, "ca_thread_func: Stopped"));

    return PJ_SUCCESS;
}

The hang appears to be in the final snd_pcm_drain(pcm) call, which matches frame #3 of the backtrace (alsa_dev.c:596).

Comment 1 aalba6675 2018-04-23 06:59:39 UTC
F28 versions:
portaudio-19-27.fc28.x86_64
alsa-plugins-pulseaudio-1.1.6-3.fc28.x86_64


Reverting these shared objects to their F27 versions makes the application work again:

portaudio-19-26.fc27.x86_64
/usr/lib64/pulse-11.1/modules:
libalsa-util.so  module-alsa-card.so  module-alsa-sink.so  module-alsa-source.so

alsa-plugins-pulseaudio-1.1.5-1.fc27.x86_64
/usr/lib64/alsa-lib/:
libasound_module_conf_pulse.so  libasound_module_ctl_pulse.so  libasound_module_pcm_pulse.so

Comment 2 aalba6675 2018-04-23 07:57:15 UTC
False positive, alas. Further testing shows that the Fedora 27 binaries exhibit the same problem.

Comment 3 Ben Cotton 2019-05-02 21:54:18 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 reaches end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Ben Cotton 2019-05-28 19:56:54 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.