Description of problem:
Every few minutes a client connection is closed. /var/log/messages shows:

traps: ocserv-worker[4869] general protection ip:7fdbbb36be37 sp:7ffd7a58e740 error:0 in libc-2.17.so[7fdbbb335000+1b7000]

Version-Release number of selected component (if applicable):
ocserv-0.11.3-1.el7.x86_64 on CentOS 7.

How reproducible:
Can't get rid of it on my server.

Steps to Reproduce:
1. Start ocserv
2. Connect with an OpenConnect client
3. Wait a few minutes

Actual results:
Connection is closed after a few minutes, although the client will usually automatically reopen the connection.

Expected results:
Stable connection.

Additional info:
When ocserv is run in the foreground with the '-f' option, this message is shown instead:

k: ipc.pb-c.c:437: udp_fd_msg__free_unpacked: Assertion `message->base.descriptor == &udp_fd_msg__descriptor' failed.

Apparently the assertion failure causes a crash when run as daemon, probably because stderr is not open then.

A little debugging shows that the assertion failure only happens when udp_fd_msg__free_unpacked() is called from dtls_pull() in worker-vpn.c, not from any of the other locations (in worker-misc.c). Some additional debugging info in udp_fd_msg__free_unpacked() for example gives:

message=0x7f518acad0c0
message->base.descriptor=0x7f518acb19a0
&udp_fd_msg__descriptor=0x7f5189e9d000

With the '-d6' option the error is always directly preceded by this information:

ocserv[12277]: main: <ip>:51169: unexpected DTLS content type: 23; possibly a firewall disassociated a UDP session
ocserv[12277]: main[roel]: <ip>:64207 sending (socket) message 10 to worker
ocserv[12277]: main[roel]: <ip>:64207 passed UDP socket from <ip>:51169
ocserv[12311]: worker[roel]: <ip> worker received message udp fd of 102 bytes
ocserv[12311]: worker[roel]: <ip> received another a UDP fd!
ocserv[12311]: worker[roel]: <ip> received new UDP fd and connected to peer

Should dtls_pull() indeed only be called with a UdpFdMsg?
If it is called with another message type, it is probably only the assert() in ipc.pb-c.c that will give problems.
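For illustration, here is a minimal sketch of the kind of check that the failed assertion performs. It does not use the real protobuf-c headers: every `demo_` type and function below is a hypothetical stand-in, and the actual generated code in ipc.pb-c.c may differ in detail. The point is only that the comparison is by descriptor address, so a pointer to any other message type (or to stale memory) fails it, which would be consistent with the differing addresses printed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the protobuf-c descriptor and message base. */
typedef struct { const char *name; } demo_descriptor;
typedef struct { const demo_descriptor *descriptor; } demo_message_base;
typedef struct { demo_message_base base; int hint; } demo_udp_fd_msg;

/* One static descriptor object per message type, as in generated code. */
static const demo_descriptor demo_udp_fd_msg_descriptor = { "UdpFdMsg" };
static const demo_descriptor demo_other_descriptor = { "OtherMsg" };

/* Allocate a message tagged with the given descriptor (NULL on failure). */
static demo_udp_fd_msg *demo_msg_new(const demo_descriptor *desc)
{
    demo_udp_fd_msg *m = malloc(sizeof(*m));
    if (m != NULL) {
        m->base.descriptor = desc;
        m->hint = 0;
    }
    return m;
}

/* The check itself: identity of the descriptor pointer, not its contents. */
static bool demo_descriptor_matches(const demo_udp_fd_msg *m)
{
    return m != NULL && m->base.descriptor == &demo_udp_fd_msg_descriptor;
}

/* Same shape as the failing call: assert on the descriptor, then release. */
static void demo_udp_fd_msg_free_unpacked(demo_udp_fd_msg *m)
{
    if (m == NULL)
        return;
    assert(m->base.descriptor == &demo_udp_fd_msg_descriptor);
    free(m);
}
```

A failed assert() calls abort(), and the check is compiled out entirely when NDEBUG is defined, so whether a mismatched pointer is caught at this point depends on how the binary was built.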
There is a very similar crash fixed in the upstream git repository. Could you verify that it addresses your issue? If yes, I'll try to make a release soon and push it to EPEL. https://gitlab.com/ocserv/ocserv/
Thanks for the suggestion, Nikos. Were you referring to the 'recv_from_new_fd: update tmsg pointer' fix (2ffd80509d7b0f8b07e3f978fbdabb34c08b414d)? I first built ocserv_0_11_3 from source and was able to reproduce the problem. I then built the master HEAD (5825a2cd3e6d429e732b01870818c83bd6d1035a) and had it running for a few hours without being able to reproduce the problem (so far). So you seem to be right about that fix!
ocserv-0.11.4-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-7e1133dae0
I assume that this ticket can be closed now?
ocserv-0.11.4-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-7e1133dae0
The bug is on auto-pilot now. It will close once 0.11.4 is pushed to stable.
ocserv-0.11.4-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.