Bug 439771
Description
IBM Bug Proxy
2008-03-31 11:17:53 UTC
Created attachment 299701 [details]
Openswan responder logs
Created attachment 299702 [details]
Openswan initiator Security Association Database
Created attachment 299703 [details]
Strongswan initiator logs
Created attachment 299704 [details]
Openswan initiator logs
Created attachment 299705 [details]
Strongswan responder Security Association Database
Created attachment 299706 [details]
Strongswan responder logs
Is strongswan able to negotiate with racoon2 using this configuration?

------- Comment From tchicks.com 2008-04-03 11:51 EDT-------
I've never tested with racoon2. I've actually never used it either. I will look into this.

Save yourself a LOT of trouble (that we went through) in configuring racoon2, and look at openswan-2.6.x/testing/pluto/interop-ikev2-racoon2-* for valid configurations.

For the case where Openswan is the initiator, I've determined that Openswan is at fault for generating bogus keys. First, it starts the prf+ counter from 0 instead of 1. Second, it does not clone the st_ni/st_nr values from the parent SA, so the seed for the prf+ is null. Finally, the SA key order is the wrong way around.

Created attachment 300892 [details]
Fix child SA key generation
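For readers following the three key-generation problems described above, here is a minimal Python sketch of how RFC 4306 (sections 2.13 and 2.17) derives child SA keys. It is not Openswan's code. HMAC-SHA1 as the PRF, the function names, and the "outbound"/"inbound" labels are assumptions made for illustration only.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA1 stands in for the negotiated PRF (assumption for this sketch)."""
    return hmac.new(key, data, hashlib.sha1).digest()


def prf_plus(key: bytes, seed: bytes, length: int) -> bytes:
    """prf+ per RFC 4306 2.13: T1 = prf(K, S | 0x01), Tn = prf(K, Tn-1 | S | n)."""
    out = b""
    block = b""
    counter = 1  # the counter octet starts at 1, not 0
    while len(out) < length:
        if counter > 255:
            raise ValueError("prf+ is only defined for up to 255 blocks")
        block = prf(key, block + seed + bytes([counter]))
        out += block
        counter += 1
    return out[:length]


def child_sa_keys(sk_d, ni, nr, encr_len, integ_len, is_initiator):
    """Split KEYMAT = prf+(SK_d, Ni | Nr) in the order given by RFC 4306 2.17.

    Keys for the initiator-to-responder direction come first, and within each
    direction the encryption key precedes the integrity key.
    """
    half = encr_len + integ_len
    keymat = prf_plus(sk_d, ni + nr, 2 * half)  # seed is Ni | Nr, never empty
    i2r, r2i = keymat[:half], keymat[half:]

    def split(part):
        return {"encr": part[:encr_len], "integ": part[encr_len:]}

    outbound, inbound = (i2r, r2i) if is_initiator else (r2i, i2r)
    return {"outbound": split(outbound), "inbound": split(inbound)}
```

Whatever the field names in a given implementation, the check is the same: the initiator's outbound keys must equal the responder's inbound keys, and the reverse.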
------- Comment From tchicks.com 2008-04-07 20:30 EDT-------
Hi Herbert - I have gotten strongswan to act as the initiator and successfully negotiate with racoon2 using the same config that I posted earlier. I was also able to get openswan to act as the initiator and negotiate with racoon2. The only change I had to make to the previous openswan config was to add an "ike=ENCR-HASH-PRF" line to ipsec.conf.

While performing these tests, I discovered a bug with openswan initiating with racoon2, but I am not sure that it is related to this bug. Upon receiving v2R1, Pluto crashes hard whenever there is no "ike=ENCR-HASH-PRF" line in ipsec.conf. I don't see any obvious errors in the logs, either. I will create a new bug for this.

Please try that last situation against openswan 2.6.10.

Herbert: Thanks for the patch. I applied the first and the last bit. When I set ike=3des and esp=3des on strongswan, the interop now works. The middle bit of your patch seems wrong to me:

- if(role == INITIATOR) {
+ if(role != INITIATOR) {
[...]
st->st_esp.our_keymat = ikeymat.ptr;
st->st_esp.peer_keymat= rkeymat.ptr;

Your patch would mean that if we are NOT initiator, we would set st->st_esp.our_keymat to ikeymat.ptr instead of rkeymat.ptr.

I think what remains of the interop issue now is Openswan not recognising "aes-XXX" and strongswan not sending "aes".

(In reply to comment #13)
> Please try that last situation against openswan 2.6.10

That should be 2.6.11, which should make it to the ftp servers in a few hours.

> I applied the first and the last bit. When I set ike=3des and esp=3des on
> strongswan, the interop now works.

Did you pass any data through it?

> The middle bit of your patch seems wrong to me:
>
> - if(role == INITIATOR) {
> + if(role != INITIATOR) {
> [...]
> st->st_esp.our_keymat = ikeymat.ptr;
> st->st_esp.peer_keymat= rkeymat.ptr;
>
> Your patch would mean that if we are NOT initiator, we would set
> st->st_esp.our_keymat to ikeymat.ptr instead of rkeymat.ptr.
Well, I'm travelling right now so I can't do a detailed examination, but the reason I made the change is that without it the keys were the wrong way around, and when I compared the RFC to the code I found that the if statement was the wrong way around too. Of course I might have been wrong. If you can pass data through it then it should be fine. Otherwise you may still need this.

Yes I do, see:
openswan-2.6.11/testing/interop-ikev2-strongswan-03-psk-initiator/east-console.txt

westinit sets a firewall rule for plaintext traffic to be blocked. east-console.txt shows a ping reply after the tunnel is up. interop-ikev2-strongswan-02-psk-responder shows the packet is lost through....

On the -03 test case I also see on the strongswan side:

received netlink error: File exists (17)
unable to install source route for 192.0.2.254

Re: #17 - I am not sure if I trust that the ping packet actually initiated the connection. tcpdump does not seem to work with our netkey based kernel required for this interop test.

Re: #16 - I've tested the 4 PSK interop cases, and I see no difference before and after I apply the middle bit of your patch, so I have not applied it. It will take some more testing to find the real issue and the real fix. Also, ip xfrm state shows nothing for these test cases, so I don't think we actually got a tunnel set up. That probably has to do with the netlink error strongswan gives?

Paul, you need to pass data with a peer other than Openswan to prove that Openswan is correct. When I tested with strongswan (its stable version) with Openswan as the initiator, I was able to get the SAs established, as was visible through ip xfrm state. However, the key ordering was the wrong way around on Openswan's side without the middle bit. So if you can pass data, say with racoon, then it should be fine. Otherwise I still think this patch is needed.

(In reply to comment #21)
> Paul, you need to pass data with a peer other than Openswan to prove that
> Openswan is correct.

I understand.
Hence our 5 interop tests for racoon and 5 interop tests for strongswan.

> When I tested with strongswan (its stable version) with Openswan as the
> initiator, I was able to get the SAs established as was visible
> through ip xfrm state. However, the key ordering was the wrong way around on
> Openswan's side without the middle bit.

I was not able to reproduce that, as can be seen with the output of the interop testcases in testing/pluto/interop-*

> So if you can pass data say with racoon then it should be fine. Otherwise I
> still think this patch is needed.

It did not pass data either way. I will do more testing today to see if I can determine the full issue.

It also clearly breaks the key material for klips. Our tests are using klips, not netkey, and also use firewall rules to determine test results that won't work using netkey.

I did reproduce data flowing with your INITIATOR modification, but I do think you fixed it at the wrong spot (which may be fine for now for Red Hat, as they do not use klips). I'm still looking into this bug and where to address the issue properly.

We applied Herbert's workaround for 2.6.12.....

------- Comment From tchicks.com 2008-04-22 16:41 EDT-------
openswan-2.6.12 fixes interop problems when openswan is the initiator. However, the negotiation still fails when strongswan is the initiator. It looks to me that openswan is violating the "Attribute Negotiation" section of RFC 4306:

------
3.3.6. Attribute Negotiation

During security association negotiation, initiators present offers to responders. Responders MUST select a single complete set of parameters from the offers (or reject all offers if none are acceptable). If there are multiple proposals, the responder MUST choose a single proposal number and return all of the Proposal substructures with that Proposal number. If there are multiple Transforms with the same type, the responder MUST choose a single one. Any attributes of a selected transform MUST be returned unmodified.
The initiator of an exchange MUST check that the accepted offer is consistent with one of its proposals, and if not that response MUST be rejected.
------

Strongswan's first proposal is aes128-sha1-modp2048 and openswan agrees to it. However, strongswan attaches a key-length transform attribute to the ENCR transform to specify aes128. When openswan responds, it fails to include the key-length attribute, and strongswan does the correct thing and rejects the proposal selection. By configuring openswan to only use 3des in parent and child SAs, the negotiation is handled correctly, since no key-length attributes are used.

Here's the tcpdump output showing the failure to include the attribute:

------
15:22:52.225909 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 616) eal1.ltc.austin.ibm.com.isakmp > eal5.ltc.austin.ibm.com.isakmp: [bad udp cksum 80d9!] isakmp 2.0 msgid 00000000 cookie 3a0a591d2f674290->0000000000000000: parent_sa ikev2_init[I]:
    (sa: len=216
        (p: #1 protoid=isakmp transform=4 len=44
            (t: #1 type=encr id=aes (type=keylen value=0080))
            (t: #2 type=integ id=hmac-sha )
            (t: #3 type=prf id=hmac-sha )
            (t: #4 type=dh id=modp2048 ))
        (p: #2 protoid=isakmp transform=19 len=172
            (t: #1 type=encr id=aes (type=keylen value=0080))
            (t: #2 type=encr id=aes (type=keylen value=00c0))
            (t: #3 type=encr id=aes (type=keylen value=0100))
            (t: #4 type=encr id=3des )
            (t: #5 type=integ id=#12 )
            (t: #6 type=integ id=hmac-sha )
            (t: #7 type=integ id=hmac-md5 )
            (t: #8 type=integ id=#13 )
            (t: #9 type=integ id=#14 )
            (t: #10 type=prf id=#5 )
            (t: #11 type=prf id=hmac-sha )
            (t: #12 type=prf id=hmac-md5 )
            (t: #13 type=prf id=#6 )
            (t: #14 type=prf id=#7 )
            (t: #15 type=dh id=modp2048 )
            (t: #16 type=dh id=modp1536 )
            (t: #17 type=dh id=modp1024 )
            (t: #18 type=dh id=modp4096 )
            (t: #19 type=dh id=modp8192 )))
    (v2ke: len=256 group=modp2048)
    (nonce: len=16 nonce=(c7c3c2a54518c49f8f0f6b4cccc26e81) )
    (n: prot_id=#0 type=16389(nat_detection_destination_ip))
    (n: prot_id=#0 type=16388(nat_detection_source_ip))
15:22:52.243445 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: UDP (17), length: 400) eal5.ltc.austin.ibm.com.isakmp > eal1.ltc.austin.ibm.com.isakmp: [udp sum ok] isakmp 2.0 msgid 00000000 cookie 3a0a591d2f674290->55d1b8608c0a579a: parent_sa ikev2_init[]:
    (sa: len=40
        (p: #1 protoid=isakmp transform=4 len=40
            (t: #1 type=encr id=aes )
            (t: #2 type=integ id=hmac-sha )
            (t: #3 type=prf id=hmac-sha )
            (t: #4 type=dh id=modp2048 )))
    (v2ke: len=256 group=modp2048)
    (nonce[C]: len=16 nonce=(391c53b548a10bbcdcf688b282e28c08) )
    (v2vid: len=12 vid=OEw[[^[pTC@N)
------

------- Comment From tchicks.com 2008-04-22 19:26 EDT-------
After taking a closer look at this bug, I don't think openswan-2.6.12 was fixed correctly when openswan is acting as the initiator, either. The default DH group was switched to modp2048 for IKEv2, but openswan still doesn't pay attention to the Notification Data field in an INVALID_KE_PAYLOAD notification.

If modp2048 is removed from the possible DH groups in the racoon2 config (leaving only modp1024), openswan cannot negotiate with racoon2. This is because racoon2 rejects openswan's KE payload (which uses modp2048 now) and tells openswan that it wants to use modp1024. Openswan continues to try to get racoon2 to use modp2048.

If openswan doesn't take the INVALID_KE_PAYLOAD Notification Data into account on its next IKE_SA_INIT, then I guess we should be defaulting to modp1024, since it is a MUST- and modp2048 is only a SHOULD+.

Created attachment 303405 [details]
pcap of openswan not honoring racoon2's INVALID_KE_PAYLOAD
Racoon2 is trying to let openswan know that it wants to use modp1024, but
openswan ignores the request. 0x0002 can be seen in the INVALID_KE_PAYLOAD's
Notification Data field specifying modp1024.
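As a rough illustration of what honoring the notification would involve, the sketch below decodes the two-octet, big-endian DH group number that RFC 4306 (section 3.10.1) places in the INVALID_KE_PAYLOAD Notification Data. The function names and the retry policy are hypothetical and are not taken from Openswan or racoon2.

```python
import struct

# IANA IKEv2 Diffie-Hellman group numbers used in this thread.
DH_GROUPS = {2: "modp1024", 5: "modp1536", 14: "modp2048"}


def requested_group(notification_data: bytes) -> int:
    """Decode the DH group carried in INVALID_KE_PAYLOAD Notification Data."""
    if len(notification_data) != 2:
        raise ValueError("expected a two-octet DH group number")
    (group,) = struct.unpack("!H", notification_data)
    return group


def next_ke_group(notification_data: bytes, locally_allowed: set) -> int:
    """Pick the group for the retried IKE_SA_INIT (hypothetical policy).

    The initiator retries with the responder's group only if local policy
    also permits it; otherwise the exchange cannot succeed.
    """
    group = requested_group(notification_data)
    if group not in locally_allowed:
        raise ValueError(f"responder wants group {group}, not allowed locally")
    return group


# The pcap shows 0x0002 in the Notification Data, i.e. modp1024.
group = next_ke_group(b"\x00\x02", locally_allowed={2, 5, 14})
print(DH_GROUPS[group])
```

Retrying with the same group that was just rejected, as described in the comment above, can never converge if the responder only accepts modp1024.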
OK, Openswan is definitely in the wrong here. It is violating section 5.3 of RFC 3602 by omitting the key length attribute with both IKEv1 and IKEv2. I'm looking into this issue.

Created attachment 303894 [details]
Add AES key length to SADB.
Created attachment 303895 [details]
Transmit IKEv2 key length attribute
Created attachment 303896 [details]
Receive IKEv2 key length attribute
With these three patches I'm able to interoperate with Strongswan in both
directions.
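For reference, a small sketch of the attribute these patches deal with. In IKEv2 (RFC 4306 section 3.3.5) the Key Length attribute has type 14 and uses the fixed-size TV format, so aes128 carries the four octets 80 0e 00 80 after the ENCR transform. The helper names below are illustrative only; they are not functions from Openswan or Strongswan.

```python
import struct

KEY_LENGTH_ATTR = 14   # IKEv2 transform attribute type "Key Length"
ENCR_3DES = 3          # IKEv2 transform IDs (type 1, encryption)
ENCR_AES_CBC = 12


def encode_key_length(bits: int) -> bytes:
    """Encode the Key Length attribute in TV format (AF bit set, 2-octet value)."""
    return struct.pack("!HH", 0x8000 | KEY_LENGTH_ATTR, bits)


def response_is_consistent(offered: set, accepted: tuple) -> bool:
    """Initiator-side check from RFC 4306 3.3.6 (simplified to one transform).

    Each entry is (transform_id, key_length_or_None). Attributes must come
    back unmodified, so AES without a key length does not match an AES-128
    offer.
    """
    return accepted in offered


offer = {(ENCR_AES_CBC, 128), (ENCR_AES_CBC, 192), (ENCR_3DES, None)}
print(encode_key_length(128).hex())
print(response_is_consistent(offer, (ENCR_AES_CBC, None)))  # the failing case
print(response_is_consistent(offer, (ENCR_AES_CBC, 128)))
```

This also shows why the 3des-only workaround succeeded: a fixed-length cipher has no key length attribute to drop.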
Created attachment 303963 [details]
Transmit IKEv2 key length attribute
This update fixes a crash with debugging enabled.
Created attachment 303964 [details]
Receive IKEv2 key length attribute
As above.
------- Comment From tchicks.com 2008-04-29 12:42 EDT-------
Hi Herbert - I'm not able to interop with strongswan if openswan is the initiator and is using aes192.

openswan connection:
--------
conn openswan-strongswan
    left=9.3.190.198
    right=9.3.190.194
    ike=aes192
    phase2alg=aes192
    ikev2=insist
    authby=secret
    auto=add
--------

strongswan connection:
--------
conn openswan-strongswan
    left=9.3.190.198
    right=9.3.190.194
    keyexchange=ikev2
    authby=secret
    auto=add
--------

output (from openswan side):
--------
[root@eal5 ~]# ipsec auto --verbose --up openswan-strongswan
002 "openswan-strongswan" #1: initiating v2 parent SA
133 "openswan-strongswan" #1: STATE_PARENT_I1: initiate
002 "openswan-strongswan" #1: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
133 "openswan-strongswan" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
002 "openswan-strongswan" #2: transition from state STATE_PARENT_I1 to state STATE_PARENT_I2
134 "openswan-strongswan" #2: STATE_PARENT_I2: sent v2I2, expected v2R2 {auth=IKEv2 cipher=aes_192 integ=sha1 prf=oakley_md5 group=modp1536}
218 "openswan-strongswan" #2: STATE_PARENT_I2: INVALID_ID_INFORMATION
--------

error in /var/log/secure:
--------
eal5 pluto[4131]: packet from 9.3.190.194:500: IKEv2 mode no peer ID (hisID)
--------

If I change the ike line in the openswan ipsec.conf to "ike=aes192-sha1" so that it does not send a proposal with a prf of md5, the negotiation works fine.

So I tried configuring a connection from openswan to racoon2. I got the error above with "ike=aes192". Then I changed it to "ike=aes192-sha1", which worked with strongswan, but it results in the error above, too.

------- Comment From tchicks.com 2008-04-30 14:58 EDT-------
Hello again Herbert - It looks like these patches also break the ability to negotiate SAs when using 3des.
connection:
------------
conn openswan_i386-openswan_i386
    left=fc00:0:0:105::22
    right=fc00:0:0:105::24
    ikev2=insist
    ike=3des
    phase2alg=3des
    authby=secret
    auto=add
------------

bringup:
------------
[root@eal5 i386]# ipsec auto --verbose --up openswan_i386-openswan_i386
002 "openswan_i386-openswan_i386" #1: initiating v2 parent SA
133 "openswan_i386-openswan_i386" #1: STATE_PARENT_I1: initiate
002 "openswan_i386-openswan_i386" #1: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
133 "openswan_i386-openswan_i386" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
002 "openswan_i386-openswan_i386" #2: transition from state STATE_PARENT_I1 to state STATE_PARENT_I2
134 "openswan_i386-openswan_i386" #2: STATE_PARENT_I2: sent v2I2, expected v2R2 {auth=IKEv2 cipher=oakley_3des_cbc_192 integ=sha1 prf=oakley_sha group=modp1536}
------------

interesting part in log of responder:
------------
Apr 30 13:52:52 eal3 pluto[28031]: ASSERTION FAILED at /usr/src/redhat/BUILD/openswan-2.6.12/programs/pluto/crypt_utils.c:56: space->start + howbig < space->len
Apr 30 13:52:52 eal3 pluto[28018]: closing helper(2) pid=28031 fd=9 exit=6
------------

That's right. That's why we hadn't yet merged in this patch. The problem seems to be that the patch sets the key length to -1, combined with an if(!keylen). The key size of -1 is then cast into an unsigned variable, causing the keysize to become 65535. You can see this in the debug information in out_attr(). The assert you see is openswan complaining that it cannot build such a huge packet. Also, out_attr is an ikev1 function and should not be called from ikev2.

I think this problem only happens when one side is not setting any esp/keysize, so Herbert never saw the problem with strongswan, which always sends the key length attribute.

------- Comment From tchicks.com 2008-04-30 17:12 EDT-------
Red Hat - Can we get confirmation that a fix for this bug is targeted for the zstream release? Thanks!
Created attachment 304445 [details]
Handle fixed-length keys correctly.
Indeed, I'd only tested AES :)
Actually the first version did handle this correctly, but somewhere along the
line I changed it to use the value -1 and forgot to update the exit code to
turn it back to zero. This patch fixes the problem for me with 3DES.
I couldn't reproduce the AES192 problem though. Could you double-check that
you've got my second set of patches in this bug report, and not the first?
Thanks!
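The arithmetic behind the 65535 key size mentioned earlier in this thread can be reproduced in a few lines. This is only a sketch of the failure mode as described here (a -1 sentinel slipping past a zero test and then being stored in an unsigned 16-bit field); the function names are hypothetical and the real patch is in the attachment above.

```python
NO_KEY_LENGTH = -1  # sentinel for fixed-length ciphers such as 3DES (assumed)


def as_uint16(value: int) -> int:
    """What a C cast to a 16-bit unsigned integer does to the value."""
    return value & 0xFFFF


def key_length_buggy(keylen: int):
    """Mirrors `if (!keylen)`: only 0 is treated as "no attribute"."""
    if not keylen:
        return None
    return as_uint16(keylen)


def key_length_fixed(keylen: int):
    """Turn the sentinel back into "no attribute" before anything is sent."""
    if keylen <= 0:
        return None
    return as_uint16(keylen)


print(key_length_buggy(NO_KEY_LENGTH))  # 65535
print(key_length_fixed(NO_KEY_LENGTH))  # None
print(key_length_fixed(192))            # 192
```

A 65535-bit key cannot fit in the packet buffer, which matches the assertion failure in crypt_utils.c reported above.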
------- Comment From tchicks.com 2008-05-06 14:59 EDT-------
I verified the fixed key length patch between an i386 and a ppc machine. Everything looks good there.

As for the aes192 problem, I double-checked to make sure that I didn't grab any of the obsoleted patches. I'm still getting that error, though. When you use the aes192 configs that I have above, do you get a successful negotiation?

Did you also apply the patch in BZ442955? That could explain the problem.

openswan 2.6.13 fixes all issues in this bug (except aes192 being a valid cipher).

------- Comment From tchicks.com 2008-05-14 14:31 EDT-------
I have tried building with and without the patch in BZ442955, on x86 and x86_64, and still have the aes192 problem.

Hmm, what version of strongswan are you using? Could you enable debugging on both sides and attach the result? Thanks!

------- Comment From tchicks.com 2008-05-16 16:13 EDT-------
strongswan-4.1.11

openswan-2.6.12-2.el5.src.rpm built with these patches applied:

From RH439771:
add_aes_key_length_to_sadb.patch
transmit_ikev2_key_length_attribute.patch
receive_ikev2_key_length_attribute.patch
handle_fixed_length_keys_correctly.patch

From RH442955:
add_conf_only_esp_support.patch

openswan config:
------------------------
version 2.0     # conforms to second version of ipsec.conf specification

config setup
    # Debug-logging controls: "none" for (almost) none, "all" for lots.
    # klipsdebug=none
    plutodebug="all"
    # For Red Hat Enterprise Linux and Fedora, leave protostack=netkey
    protostack=netkey
    nat_traversal=yes

conn openswan-strongswan
    left=9.3.190.198
    right=9.3.190.194
    ike=aes192
    phase2alg=aes192
    ikev2=insist
    authby=secret
    auto=add
------------------------

strongswan config:
------------------------
config setup
    # plutodebug=all
    charondebug="dmn 4, mgr 4, ike 4, chd 4, job 4, cfg 4, knl 4, net 4, enc 2, lib 4"
    # crlcheckinterval=600
    # strictcrlpolicy=yes
    # cachecrls=yes
    # nat_traversal=yes
    # charonstart=no
    plutostart=no

conn openswan-strongswan
    left=9.3.190.198
    right=9.3.190.194
    keyexchange=ikev2
    authby=secret
    auto=add
------------------------

I couldn't turn on full debugging on the strongswan (responder) side, because writing so much data to the log caused timing issues that made the negotiation hang before hitting this problem.

secrets shared by both:
------------------------
9.3.190.198 9.3.190.194 : PSK "psk"
------------------------

command:
------------------------
[root@eal5 ~]# ipsec auto --verbose --up openswan-strongswan
002 "openswan-strongswan" #1: initiating v2 parent SA
133 "openswan-strongswan" #1: STATE_PARENT_I1: initiate
002 "openswan-strongswan" #1: transition from state STATE_IKEv2_START to state STATE_PARENT_I1
133 "openswan-strongswan" #1: STATE_PARENT_I1: sent v2I1, expected v2R1
010 "openswan-strongswan" #1: STATE_PARENT_I1: retransmission; will wait 20s for response
002 "openswan-strongswan" #2: transition from state STATE_PARENT_I1 to state STATE_PARENT_I2
134 "openswan-strongswan" #2: STATE_PARENT_I2: sent v2I2, expected v2R2 {auth=IKEv2 cipher=aes_192 integ=md5 prf=oakley_sha group=modp1536}
218 "openswan-strongswan" #2: STATE_PARENT_I2: INVALID_ID_INFORMATION
------------------------

I will attach the logs.

Created attachment 305744 [details]
Openswan (initiator) logs
Created attachment 305745 [details]
Strongswan (responder) logs
You can try logging to a tmpfs mount.

We put up an openswan-2.6.13rc1.tar.gz, but we're still running tests on it before doing a full release. We did see strongswan-openswan interops work on this release though.

Created attachment 305940 [details]
Use the correct algorithm for ID hashing
Doh! I was looking at the child SA but you're having a problem with the parent
SA.
In fact, the bug was rather obvious since there is even a comment which says
that it's wrong :) This patch makes it work for me. Thanks!
Confirmed, will be in 2.6.14. I remember putting in the comment with Antony.... Grrr

2.6.14rc7 was built to address the bug reported.

------- Comment From tchicks.com 2008-06-05 15:08 EDT-------
I have verified these bug fixes between openswan-2.6.14rc7 on ppc and i386 and strongswan-4.1.11 on i386. I built openswan-2.6.14rc7 from source available at openswan.org. I haven't been able to find the rpm built by Red Hat.

Note that 2.6.14rc10 is on there now.

------- Comment From tchicks.com 2008-06-18 18:11 EDT-------
Changing status to FIXEDAWAITINGTEST on IBM's side.

------- Comment From tchicks.com 2008-06-18 18:12 EDT-------
Changing status to TESTED on IBM's side.

------- Comment From tchicks.com 2008-06-18 18:13 EDT-------
Changing status to ACCEPTED on IBM's side.

------- Comment From tchicks.com 2008-06-18 18:14 EDT-------
Changing status to CLOSED on IBM's side.

~~ Attention Partners RHEL 5.4 Partner Alpha Released! ~~

RHEL 5.4 Partner Alpha has been released on partners.redhat.com. There should be a fix present that addresses this particular request. Please test and report back your results here, at your earliest convenience. Our Public Beta release is just around the corner!

If you encounter any issues, please set the bug back to the ASSIGNED state and describe the issues you encountered. If you have verified the request functions as expected, please set your Partner ID in the Partner field above to indicate successful test results. Do not flip the bug status to VERIFIED. Further questions can be directed to your Red Hat Partner Manager.

Thanks!

Partners, this particular request is of a notably high priority. In order to make the most of this Alpha release, please report back initial test results before the scheduled Beta drop. That way if you encounter any issues, we can work to get additional corrections in before we launch our Public Beta release. Speak with your Partner Manager for additional dates and information.
Thank you for your cooperation in this effort.

Is there any response about retesting the bug with the latest openswan packages? The new openswan-2.6.21-4.el5 adds NSS support, which requires the bug to be retested in order to verify that there is no regression.

IBM, could you please retest for this issue, using the latest Beta bits?

~~ Attention - RHEL 5.4 Beta Released! ~~

RHEL 5.4 Beta has been released! There should be a fix present in the Beta release that addresses this particular request. Please test and report back results here, at your earliest convenience. RHEL 5.4 General Availability release is just around the corner!

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Please do not flip the bug status to VERIFIED. Only post your verification results, and if available, update the Verified field with the appropriate value.

Questions can be posted to this bug or directed to your customer or partner representative.

------- Comment From tyhicks.ibm.com 2009-07-09 17:51 EDT-------
Sorry for the lag - this bug is closed on the IBM side and I wasn't giving it any attention. Unfortunately, it isn't very likely that I'll have the bandwidth to recreate the environment and verify this for 5.4.

IBM, understandable. By chance, do you know if there is any remaining openswan testing planned for execution by anyone other than yourself? Please update this bug if you do eventually find the cycles to re-test. Thanks.

~~ Attention Partners - RHEL 5.4 Snapshot 1 Released! ~~

RHEL 5.4 Snapshot 1 has been released on partners.redhat.com. If you have already reported your test results, you can safely ignore this request. Otherwise, please notice that there should be a fix available now that addresses this particular request.
Please test and report back your results here, at your earliest convenience. The RHEL 5.4 exception freeze is quickly approaching.

If you encounter any issues while testing Beta, please describe the issues you have encountered and set the bug into NEED_INFO. If you encounter new issues, please clone this bug to open a new issue and request it be reviewed for inclusion in RHEL 5.4 or a later update, if it is not of urgent severity.

Do not flip the bug status to VERIFIED. Instead, please set your Partner ID in the Verified field above if you have successfully verified the resolution of this issue.

Further questions can be directed to your Red Hat Partner Manager or other appropriate customer representative.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1350.html