Bug 1363817 - ocserv-worker thread crashes every few minutes
Summary: ocserv-worker thread crashes every few minutes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora EPEL
Classification: Fedora
Component: ocserv
Version: epel7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nikos Mavrogiannopoulos
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-08-03 16:05 UTC by Roel van de Kraats
Modified: 2016-08-08 21:17 UTC

Fixed In Version: ocserv-0.11.4-1.el7
Clone Of:
Environment:
Last Closed: 2016-08-08 21:17:07 UTC
Type: Bug
Embargoed:


Description Roel van de Kraats 2016-08-03 16:05:37 UTC
Description of problem:
Every few minutes a client connection is closed. /var/log/messages shows:
traps: ocserv-worker[4869] general protection ip:7fdbbb36be37 sp:7ffd7a58e740 error:0 in libc-2.17.so[7fdbbb335000+1b7000]

Version-Release number of selected component (if applicable):
ocserv-0.11.3-1.el7.x86_64 on CentOS 7.

How reproducible:
Always on my server; I have not been able to make it go away.

Steps to Reproduce:
1. Start ocserv
2. Connect with an OpenConnect client
3. Wait a few minutes

Actual results:
Connection is closed after a few minutes, although the client will usually automatically reopen the connection.

Expected results:
Stable connection.

Additional info:

When ocserv is run in the foreground with the '-f' option, this message is shown instead:
k: ipc.pb-c.c:437: udp_fd_msg__free_unpacked: Assertion `message->base.descriptor == &udp_fd_msg__descriptor' failed.

Apparently the assertion failure is what crashes the worker when ocserv runs as a daemon, probably because stderr is not open in that mode.

A little debugging shows that the assertion fails only when udp_fd_msg__free_unpacked() is called from dtls_pull() in worker-vpn.c, not from any of the other call sites (in worker-misc.c). For example, some additional debug output added to udp_fd_msg__free_unpacked() gives:
message=0x7f518acad0c0 message->base.descriptor=0x7f518acb19a0 &udp_fd_msg__descriptor=0x7f5189e9d000
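
For readers unfamiliar with this kind of check: the assertion compares the descriptor pointer stored in the message header with the address of the single static descriptor for the expected message type, so the mismatch printed above is exactly what makes it fail. Below is a minimal standalone C sketch of that comparison. The names `fake_descriptor`, `fake_message`, `udp_fd_desc`, `other_desc` and `descriptor_matches` are hypothetical stand-ins for illustration, not the real protobuf-c or ocserv definitions, and the sketch returns a boolean instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a generated message descriptor. */
struct fake_descriptor {
    const char *name;
};

/* Hypothetical stand-in for a message whose header points at its descriptor. */
struct fake_message {
    struct {
        const struct fake_descriptor *descriptor;
    } base;
};

/* One static descriptor per message type; identity is decided by address. */
const struct fake_descriptor udp_fd_desc = { "UdpFdMsg" };
const struct fake_descriptor other_desc  = { "OtherMsg" };

/* Returns true only if the message header points at the expected descriptor.
 * A freed, overwritten, or differently typed message fails this comparison. */
bool descriptor_matches(const struct fake_message *m,
                        const struct fake_descriptor *expected)
{
    return m != NULL && m->base.descriptor == expected;
}
```

Calling `descriptor_matches()` before freeing a message would turn a mismatch into a loggable error instead of an abort; whether that is appropriate for ocserv is not something this sketch claims.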

With the '-d6' option the error is always directly preceded by this information:
ocserv[12277]: main: <ip>:51169: unexpected DTLS content type: 23; possibly a firewall disassociated a UDP session
ocserv[12277]: main[roel]: <ip>:64207 sending (socket) message 10 to worker
ocserv[12277]: main[roel]: <ip>:64207 passed UDP socket from <ip>:51169
ocserv[12311]: worker[roel]: <ip> worker received message udp fd of 102 bytes
ocserv[12311]: worker[roel]: <ip> received another a UDP fd!
ocserv[12311]: worker[roel]: <ip> received new UDP fd and connected to peer

Should dtls_pull() indeed only ever be called with a UdpFdMsg? If it is called with another message type, it is probably only the assert() in ipc.pb-c.c that causes the problem.

Comment 1 Nikos Mavrogiannopoulos 2016-08-04 06:33:30 UTC
A very similar crash has been fixed in the upstream git repository. Could you verify that it addresses your issue? If so, I'll try to make a release soon and push it to EPEL.

https://gitlab.com/ocserv/ocserv/

Comment 2 Roel van de Kraats 2016-08-04 19:06:14 UTC
Thanks for the suggestion, Nikos. Were you referring to the 'recv_from_new_fd: update tmsg pointer' fix (2ffd80509d7b0f8b07e3f978fbdabb34c08b414d)?

I first built ocserv_0_11_3 from source and was able to reproduce the problem. I then built the master HEAD (5825a2cd3e6d429e732b01870818c83bd6d1035a) and had it running for a few hours without being able to reproduce the problem (so far). So you seem to be right about that fix!
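
The commit title points at a pointer that was not updated after the message it referred to was replaced. The upstream diff is not reproduced here; the following is only a generic C illustration of that class of bug and its usual remedy, which is to pass the pointer by address so the callee can update it. `demo_msg` and `replace_msg` are hypothetical names and do not correspond to ocserv code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical message type used only for this illustration. */
struct demo_msg {
    int id;
};

/* Frees the caller's current message and stores a freshly allocated one
 * through the caller's own pointer. Returns 0 on success, -1 on failure. */
int replace_msg(struct demo_msg **pmsg, int new_id)
{
    struct demo_msg *fresh = malloc(sizeof(*fresh));
    if (fresh == NULL)
        return -1;      /* on failure the caller's message is left untouched */

    fresh->id = new_id;
    free(*pmsg);        /* free(NULL) is a no-op, so an empty slot is fine */
    *pmsg = fresh;      /* the update a stale-pointer bug leaves out */
    return 0;
}
```

If the callee instead received the pointer by value, the caller would keep referring to freed memory, and a later free or descriptor check on it would behave unpredictably.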

Comment 3 Fedora Update System 2016-08-05 11:38:04 UTC
ocserv-0.11.4-1.el7 has been submitted as an update to Fedora EPEL 7. https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-7e1133dae0

Comment 4 Roel van de Kraats 2016-08-05 19:39:32 UTC
I assume that this ticket can be closed now?

Comment 5 Fedora Update System 2016-08-05 22:17:57 UTC
ocserv-0.11.4-1.el7 has been pushed to the Fedora EPEL 7 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2016-7e1133dae0

Comment 6 Nikos Mavrogiannopoulos 2016-08-08 12:28:21 UTC
The bug is on auto-pilot now. It will close once 0.11.4 is pushed to stable.

Comment 7 Fedora Update System 2016-08-08 21:17:05 UTC
ocserv-0.11.4-1.el7 has been pushed to the Fedora EPEL 7 stable repository. If problems still persist, please make note of it in this bug report.

