Bug 1370881 - gnutls-3.5.3 breaks OpenConnect using DTLS-0.9 (Cisco AnyConnect VPN)
Summary: gnutls-3.5.3 breaks OpenConnect using DTLS-0.9 (Cisco AnyConnect VPN)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: gnutls
Version: 25
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Nikos Mavrogiannopoulos
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-08-27 20:55 UTC by Lars Seipel
Modified: 2016-09-06 18:23 UTC (History)
3 users (show)

Fixed In Version: gnutls-3.5.3-2.fc25
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-06 18:23:59 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
verbose openconnect output (see end of file for failure) (183.98 KB, text/plain)
2016-08-27 20:55 UTC, Lars Seipel
no flags Details
Import sliding window code from OpenConnect (13.29 KB, patch)
2016-08-30 14:39 UTC, David Woodhouse
no flags Details | Diff

Description Lars Seipel 2016-08-27 20:55:59 UTC
Created attachment 1194946 [details]
verbose openconnect output (see end of file for failure)

Following the F25 update to gnutls-3.5.3-1, VPN connections established through OpenConnect (against an AnyConnect VPN provided by a Cisco ASA device) started to fail, with the remote side appearing dead to OpenConnect:

DTLS Dead Peer Detection detected dead peer!

After re-establishment, the connection works again for some time, then the process is repeated. The connection only fails if it's actually used for something. Not quite sure yet what exactly triggers it. Logging into a RHEV/oVirt web interface seems quite reliable, but I've also seen it happen during DNF package installs.

Disabling DTLS (--no-dtls to openconnect) makes things work again, as does downgrading of gnutls to the previous 3.5.2 version. Looking at the upstream changelog, 3.5.3 appears to introduce a new DTLS sliding window implementation, maybe related?

Version-Release number of selected component (if applicable):
gnutls-3.5.3-1.fc25.x86_64
openconnect-7.07-2.fc25.x86_64


Steps to Reproduce:
1. connect to AnyConnect VPN using OpenConnect
2. use it for some time (not sure what exactly triggers it, doesn't take long though)
3. connection dies with "DTLS Dead Peer Detection detected dead peer!"


Additional info:
I'm well aware that Cisco's DTLS implementation is quite non-standard, but grepping through the GnuTLS code, it seems to me that the intent is to support it (as DTLS0.9), which is why I'm filing this bug against GnuTLS.

OpenConnect does not show the issue when using gnutls-3.5.2-1.fc25 or when built against OpenSSL.

Comment 1 Nikos Mavrogiannopoulos 2016-08-29 06:32:21 UTC
Hi, if you downgrade to 3.5.2 does it work? Do you have any debugging output when this issue happens?

Comment 2 Nikos Mavrogiannopoulos 2016-08-29 06:34:08 UTC
The best would be if you could provide some gnutls debugging output while that issue occurs. To get that you can set GNUTLS_DEBUG_LEVEL=6 environment variable prior to running openconnect.

Comment 3 David Woodhouse 2016-08-29 07:49:44 UTC
You can also add --gnutls-debug=6 to the openconnect command line.

It might be worth getting a packet capture from the real network, showing the DTLS packets. The sequence numbers are in the clear, aren't they? And the only obvious DTLS change I see between 3.5.2 and 3.5.3 is related to the sequence number sliding window — so we hopefully don't even *need* to decrypt the packets to work out what the problem is.

Comment 4 Nikos Mavrogiannopoulos 2016-08-29 08:19:52 UTC
David reproduced the issue and send the log back. I'll try to figure the issue from that.

Comment 5 David Woodhouse 2016-08-29 08:29:12 UTC
REC[0x15970d0]: Decrypted Packet[1.798] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.803] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.800] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.805] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.802] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.807] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.804] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.808] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.806] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.810] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.809] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.812] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.814] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.816] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.818] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.820] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.822] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.824] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.826] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.828] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.830] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.832] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.834] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.836] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.838] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.840] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.842]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.844]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.846]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.848]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.849]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.851]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.853]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.855]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.857]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.859]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.861]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.863]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.865]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.867]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.869]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.871]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.873]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.875]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.877]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.879]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.881]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.883]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.885]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.887]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.888]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.890]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.892]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.894]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.896]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.898]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.900]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.902]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.904]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.906]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.908]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.910]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.912]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.914]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.916]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.918]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.920]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.922]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.924]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.926]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.928]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.929]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.931]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.933]: Application Data
REC[0x15970d0]: Decrypted Packet[1.811] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.813] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.935]: Application Data
REC[0x15970d0]: Decrypted Packet[1.815] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.817] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.937]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.939]: Application Data
REC[0x15970d0]: Decrypted Packet[1.821] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.941]: Application Data
REC[0x15970d0]: Decrypted Packet[1.823] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.943]: Application Data
REC[0x15970d0]: Decrypted Packet[1.825] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.827] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.945]: Application Data
REC[0x15970d0]: Decrypted Packet[1.829] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.947]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.949]: Application Data
REC[0x15970d0]: Decrypted Packet[1.831] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.833] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.951]: Application Data
REC[0x15970d0]: Decrypted Packet[1.835] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.953]: Application Data
REC[0x15970d0]: Decrypted Packet[1.837] Application Data(23) with length: 1267
REC[0x15970d0]: Decrypted Packet[1.839] Application Data(23) with length: 1267
REC[0x15970d0]: Discarded duplicate message[1.955]: Application Data
REC[0x15970d0]: Discarded duplicate message[1.841]: Application Data
... and it never recovers...

Comment 7 Fedora Update System 2016-08-29 13:06:42 UTC
gnutls-3.5.3-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f

Comment 8 David Woodhouse 2016-08-29 16:31:58 UTC
Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you only act on this when a packet passes the MAC validation, why do you not *always* advance the window so that it ends at the latest-received packet? 

And if you are marking packets received before their MAC is checked, we have bigger things to talk about...

Comment 9 Fedora Update System 2016-08-29 22:55:32 UTC
gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-66cab49b1f

Comment 10 Nikos Mavrogiannopoulos 2016-08-30 04:35:42 UTC
(In reply to David Woodhouse from comment #8)
> Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you
> only act on this when a packet passes the MAC validation, why do you not
> *always* advance the window so that it ends at the latest-received packet? 

What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is advanced to the latest 5, 4, 2 would be discarded.

> And if you are marking packets received before their MAC is checked, we have
> bigger things to talk about...

That's after MAC is checked.

Comment 11 David Woodhouse 2016-08-30 09:53:47 UTC
(In reply to Nikos Mavrogiannopoulos from comment #10)
> (In reply to David Woodhouse from comment #8)
> > Not sure I understand DTLS_WINDOW_RECEIVED() being conditional. Assuming you
> > only act on this when a packet passes the MAC validation, why do you not
> > *always* advance the window so that it ends at the latest-received packet? 
> 
> What if first packet is 1 and you receive 3, 6, 5, 4, 2. If window is
> advanced to the latest 5, 4, 2 would be discarded.

The way this usually works is the window advances so that the *END* of the window is the last-received packet.

So the *end* of the window is advanced to the latest 6. The bitmap then represents the packets leading up to that.

cf. http://git.infradead.org/users/dwmw2/openconnect.git/blob/HEAD:/esp.c#l33

Comment 12 David Woodhouse 2016-08-30 14:39:48 UTC
Created attachment 1195938 [details]
Import sliding window code from OpenConnect

Here's my code from OpenConnect. We *always* advance the sliding window so it ends at the latest received packet. We never discard a packet because it's "too new".

Comment 13 David Woodhouse 2016-08-31 11:31:03 UTC
Updated patch in https://gitlab.com/gnutls/gnutls/merge_requests/55

Comment 14 Fedora Update System 2016-09-06 18:23:53 UTC
gnutls-3.5.3-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.