Created attachment 1194946 [details] verbose openconnect output (see end of file for failure) Following the F25 update to gnutls-3.5.3-1, VPN connections established through OpenConnect (against an AnyConnect VPN provided by a Cisco ASA device) started to fail, with the remote side appearing dead to OpenConnect: DTLS Dead Peer Detection detected dead peer! After re-establishment, the connection works again for some time, then the process is repeated. The connection only fails if it's actually used for something. Not quite sure yet what exactly triggers it. Logging into a RHEV/oVirt web interface seems quite reliable, but I've also seen it happen during DNF package installs. Disabling DTLS (--no-dtls to openconnect) makes things work again, as does downgrading of gnutls to the previous 3.5.2 version. Looking at the upstream changelog, 3.5.3 appears to introduce a new DTLS sliding window implementation, maybe related? Version-Release number of selected component (if applicable): gnutls-3.5.3-1.fc25.x86_64 openconnect-7.07-2.fc25.x86_64 Steps to Reproduce: 1. connect to AnyConnect VPN using OpenConnect 2. use it for some time (not sure what exactly triggers it, doesn't take long though) 3. connection dies with "DTLS Dead Peer Detection detected dead peer!" Additional info: I'm well aware that Cisco's DTLS implementation is quite non-standard, but grepping through the GnuTLS code, it seems to me that the intent is to support it (as DTLS0.9), which is why I'm filing this bug against GnuTLS. OpenConnect does not show the issue when using gnutls-3.5.2-1.fc25 or when built against OpenSSL.
Hi, if you downgrade to 3.5.2 does it work? Do you have any debugging output when this issue happens?
The best would be if you could provide some gnutls debugging output while that issue occurs. To get that you can set GNUTLS_DEBUG_LEVEL=6 environment variable prior to running openconnect.
You can also add --gnutls-debug=6 to the openconnect command line. It might be worth getting a packet capture from the real network, showing the DTLS packets. The sequence numbers are in the clear, aren't they? And the only obvious DTLS change I see between 3.5.2 and 3.5.3 is related to the sequence number sliding window — so we hopefully don't even *need* to decrypt the packets to work out what the problem is.
David reproduced the issue and send the log back. I'll try to figure the issue from that.
REC[0x15970d0]: Decrypted Packet[1.798] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.803] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.800] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.805] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.802] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.807] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.804] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.808] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.806] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.810] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.809] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.812] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.814] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.816] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.818] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.820] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.822] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.824] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.826] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.828] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.830] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.832] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.834] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.836] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.838] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.840] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.842]: Application Data REC[0x15970d0]: Discarded duplicate message[1.844]: Application Data REC[0x15970d0]: Discarded duplicate message[1.846]: Application Data REC[0x15970d0]: Discarded duplicate message[1.848]: Application Data REC[0x15970d0]: Discarded duplicate message[1.849]: Application Data REC[0x15970d0]: Discarded duplicate message[1.851]: Application Data REC[0x15970d0]: Discarded duplicate message[1.853]: Application Data REC[0x15970d0]: Discarded duplicate message[1.855]: Application Data REC[0x15970d0]: Discarded duplicate message[1.857]: Application Data REC[0x15970d0]: Discarded duplicate message[1.859]: Application Data REC[0x15970d0]: Discarded duplicate message[1.861]: Application Data REC[0x15970d0]: Discarded duplicate message[1.863]: Application Data REC[0x15970d0]: Discarded duplicate message[1.865]: Application Data REC[0x15970d0]: Discarded duplicate message[1.867]: Application Data REC[0x15970d0]: Discarded duplicate message[1.869]: Application Data REC[0x15970d0]: Discarded duplicate message[1.871]: Application Data REC[0x15970d0]: Discarded duplicate message[1.873]: Application Data REC[0x15970d0]: Discarded duplicate message[1.875]: Application Data REC[0x15970d0]: Discarded duplicate message[1.877]: Application Data REC[0x15970d0]: Discarded duplicate message[1.879]: Application Data REC[0x15970d0]: Discarded duplicate message[1.881]: Application Data REC[0x15970d0]: Discarded duplicate message[1.883]: Application Data REC[0x15970d0]: Discarded duplicate message[1.885]: Application Data REC[0x15970d0]: Discarded duplicate message[1.887]: Application Data REC[0x15970d0]: Discarded duplicate message[1.888]: Application Data REC[0x15970d0]: Discarded duplicate message[1.890]: Application Data REC[0x15970d0]: Discarded duplicate message[1.892]: Application Data REC[0x15970d0]: Discarded duplicate message[1.894]: Application Data REC[0x15970d0]: Discarded duplicate message[1.896]: Application Data REC[0x15970d0]: Discarded duplicate message[1.898]: Application Data REC[0x15970d0]: Discarded duplicate message[1.900]: Application Data REC[0x15970d0]: Discarded duplicate message[1.902]: Application Data REC[0x15970d0]: Discarded duplicate message[1.904]: Application Data REC[0x15970d0]: Discarded duplicate message[1.906]: Application Data REC[0x15970d0]: Discarded duplicate message[1.908]: Application Data REC[0x15970d0]: Discarded duplicate message[1.910]: Application Data REC[0x15970d0]: Discarded duplicate message[1.912]: Application Data REC[0x15970d0]: Discarded duplicate message[1.914]: Application Data REC[0x15970d0]: Discarded duplicate message[1.916]: Application Data REC[0x15970d0]: Discarded duplicate message[1.918]: Application Data REC[0x15970d0]: Discarded duplicate message[1.920]: Application Data REC[0x15970d0]: Discarded duplicate message[1.922]: Application Data REC[0x15970d0]: Discarded duplicate message[1.924]: Application Data REC[0x15970d0]: Discarded duplicate message[1.926]: Application Data REC[0x15970d0]: Discarded duplicate message[1.928]: Application Data REC[0x15970d0]: Discarded duplicate message[1.929]: Application Data REC[0x15970d0]: Discarded duplicate message[1.931]: Application Data REC[0x15970d0]: Discarded duplicate message[1.933]: Application Data REC[0x15970d0]: Decrypted Packet[1.811] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.813] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.935]: Application Data REC[0x15970d0]: Decrypted Packet[1.815] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.817] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.937]: Application Data REC[0x15970d0]: Discarded duplicate message[1.939]: Application Data REC[0x15970d0]: Decrypted Packet[1.821] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.941]: Application Data REC[0x15970d0]: Decrypted Packet[1.823] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.943]: Application Data REC[0x15970d0]: Decrypted Packet[1.825] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.827] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.945]: Application Data REC[0x15970d0]: Decrypted Packet[1.829] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.947]: Application Data REC[0x15970d0]: Discarded duplicate message[1.949]: Application Data REC[0x15970d0]: Decrypted Packet[1.831] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.833] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.951]: Application Data REC[0x15970d0]: Decrypted Packet[1.835] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.953]: Application Data REC[0x15970d0]: Decrypted Packet[1.837] Application Data(23) with length: 1267 REC[0x15970d0]: Decrypted Packet[1.839] Application Data(23) with length: 1267 REC[0x15970d0]: Discarded duplicate message[1.955]: Application Data REC[0x15970d0]: Discarded duplicate message[1.841]: Application Data ... and it never recovers...
https://gitlab.com/gnutls/gnutls/commit/3ca24f24d9322256bc4c6d3bd237f8b98f073030
gnutls-3.5.3-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f
Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you only act on this when a packet passes the MAC validation, why do you not *always* advance the window so that it ends at the latest-received packet? And if you are marking packets received before their MAC is checked, we have bigger things to talk about...
gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f
(In reply to David Woodhouse from comment #8) > Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you > only act on this when a packet passes the MAC validation, why do you not > *always* advance the window so that it ends at the latest-received packet? What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is advanced to the latest 5, 4, 2 would be discarded. > And if you are marking packets received before their MAC is checked, we have > bigger things to talk about... That's after MAC is checked.
(In reply to Nikos Mavrogiannopoulos from comment #10) > (In reply to David Woodhouse from comment #8) > > Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you > > only act on this when a packet passes the MAC validation, why do you not > > *always* advance the window so that it ends at the latest-received packet? > > What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is > advanced to the latest 5, 4, 2 would be discarded. The way this usually works is the window advances so that the *END* of the window is the last-received packet. So the *end* of the window is advanced to the latest 6. The bitmap then represents the packets leading up to that. cf. http://git.infradead.org/users/dwmw2/openconnect.git/blob/HEAD:/esp.c#l33
Created attachment 1195938 [details] Import sliding window code from OpenConnect Here's my code from OpenConnect. We *always* advance the sliding window so it ends at the latest received packet. We never discard a packet because it's "too new".
Updated patch in https://gitlab.com/gnutls/gnutls/merge_requests/55
gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.