Bug 1342841

Summary: pluto daemon did not start after upgrade to 3.15-5.3 (invalid opcode in nss)
Product: [Fedora] Fedora EPEL
Component: libreswan
Version: el6
Hardware: All
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
Reporter: Michal Bruncko <michal.bruncko>
Assignee: Paul Wouters <pwouters>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: emaldona, hkario, kengert, nss-nspr-maint, pwouters, rrelyea, tis
Type: Bug
Last Closed: 2017-07-16 21:42:13 UTC

Description Michal Bruncko 2016-06-05 22:33:48 UTC
Description of problem:

After a system upgrade to RHEL 6.8, which upgrades libreswan from libreswan-3.13-1.el6.x86_64 to libreswan-3.15-5.3.el6.x86_64, the pluto daemon did not come up.


Version-Release number of selected component (if applicable):
libreswan-3.15-5.3.el6.x86_64


How reproducible:
always


Actual results:
# /etc/init.d/ipsec start
Starting pluto IKE daemon for IPsec: .....                 [FAILED]

from /var/log/messages:
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock Hash Engine not detected.
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock Hash Engine not detected.
Jun  6 00:17:30 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:31 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:31 vpnc1 kernel: : padlock: VIA PadLock Hash Engine not detected.
Jun  6 00:17:31 vpnc1 kernel: : padlock: VIA PadLock not detected.
Jun  6 00:17:31 vpnc1 kernel: : pluto[21205] trap invalid opcode ip:7f0159103d60 sp:7fff53dd6678 error:0 in libfreeblpriv3.so[7f01590b1000+72000]

from /var/log/secure:
Jun  6 00:12:54 vpnc1 ipsec__plutorun: Starting Pluto subsystem...
Jun  6 00:12:54 vpnc1 pluto[12206]: NSS DB directory: sql:/etc/ipsec.d
Jun  6 00:12:54 vpnc1 pluto[12206]: NSS initialized
Jun  6 00:12:54 vpnc1 pluto[12206]: libcap-ng support [enabled]
Jun  6 00:12:54 vpnc1 pluto[12206]: FIPS HMAC integrity verification test passed
Jun  6 00:12:54 vpnc1 pluto[12206]: FIPS: pluto daemon NOT running in FIPS mode
Jun  6 00:12:54 vpnc1 pluto[12206]: Linux audit support [enabled]
Jun  6 00:12:54 vpnc1 pluto[12206]: Linux audit activated
Jun  6 00:12:54 vpnc1 pluto[12206]: Starting Pluto (Libreswan Version 3.15 XFRM(netkey) KLIPS NSS DNSSEC FIPS_CHECK LABELED_IPSEC LIBCAP_NG LINUX_AUDIT XAUTH_PAM NETWORKMANAGER CURL(non-NSS) LDAP(non-NSS)) pid:12206
Jun  6 00:12:54 vpnc1 pluto[12206]: core dump dir: /var/run/pluto
Jun  6 00:12:54 vpnc1 pluto[12206]: secrets file: /etc/ipsec.secrets
Jun  6 00:12:54 vpnc1 pluto[12206]: leak-detective disabled
Jun  6 00:12:54 vpnc1 pluto[12206]: NSS crypto [enabled]
Jun  6 00:12:54 vpnc1 pluto[12206]: XAUTH PAM support [enabled]
Jun  6 00:12:54 vpnc1 pluto[12206]:    NAT-Traversal support  [enabled]
Jun  6 00:12:54 vpnc1 pluto[12206]: ike_alg_register_enc(): Activating OAKLEY_TWOFISH_CBC_SSH: Ok
Jun  6 00:12:54 vpnc1 pluto[12206]: ike_alg_register_enc(): Activating OAKLEY_TWOFISH_CBC: Ok
Jun  6 00:12:54 vpnc1 pluto[12206]: ike_alg_register_enc(): Activating OAKLEY_SERPENT_CBC: Ok
Jun  6 00:12:54 vpnc1 pluto[12206]: ike_alg_register_enc(): Activating OAKLEY_AES_CBC: Ok
Jun  6 00:12:54 vpnc1 pluto[12206]: ike_alg_register_enc(): Activating OAKLEY_AES_CTR: Ok
Jun  6 00:12:54 vpnc1 ipsec__plutorun: !pluto failure!:  exited with error status 132 (signal 4)
Jun  6 00:12:54 vpnc1 ipsec__plutorun: restarting IPsec after pause...

The same messages then repeat over and over, because the init script restarts pluto in a loop with "sleep 10".
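
For reference, the "exited with error status 132 (signal 4)" line follows the usual shell convention that a status above 128 means the process was killed by signal (status - 128). On Linux, signal 4 is SIGILL, which matches the kernel's "trap invalid opcode" message. A minimal sketch of that decoding; the helper name decode_status is made up for illustration and is not part of libreswan:

```shell
# Hypothetical helper: decode a shell-style exit status.
# Values above 128 conventionally mean "killed by signal (status - 128)".
decode_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l N prints the name of signal number N
        echo "signal $sig ($(kill -l "$sig"))"
    else
        echo "exit code $status"
    fi
}

decode_status 132
```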


Expected results:
pluto will start correctly.


Additional info:
Downgrading back to libreswan-3.13-1.el6.x86_64, while keeping all other packages from RHEL 6.8 in place, resolved the issue.

Comment 1 Paul Wouters 2016-06-06 22:07:36 UTC
hmm, looks like the error happens in the NSS library.

Is this system running in FIPS mode?

Comment 2 Michal Bruncko 2016-06-06 22:36:39 UTC
> Is this system running in FIPS mode?

not really: 

# cat /proc/sys/crypto/fips_enabled
0

Comment 3 Michal Bruncko 2016-06-06 22:53:50 UTC
> looks like the error happens in the NSS library.

Yes, libfreeblpriv3.so is part of the nss-softokn-freebl package, specifically nss-softokn-freebl-3.14.3-23.el6_7.i686 on my system. That package was not upgraded during the last system upgrade to RHEL 6.8, which is why I opened the bug report against libreswan.

Comment 4 Paul Wouters 2016-06-07 18:27:32 UTC
perhaps the nss migration is not happening properly?

Can you try:

- stop ipsec service
- upgrade the package
- run ipsec checknss
- start ipsec service

and tell me if the problem is still happening?

Comment 5 Michal Bruncko 2016-06-07 20:03:17 UTC
Your steps seem to work. I tested them on UAT:

# ipsec checknss
Migrating NSS db to sql:/etc/ipsec.d
database already upgraded.
NSS upgrade complete

# /etc/init.d/ipsec start
Starting pluto IKE daemon for IPsec: .                     [  OK  ]

from /var/log/messages:
Jun  7 21:58:25 vpnc2 kernel: : padlock: VIA PadLock not detected.
Jun  7 21:58:25 vpnc2 kernel: : padlock: VIA PadLock Hash Engine not detected.
Jun  7 21:58:25 vpnc2 kernel: : Intel AES-NI instructions are not detected.
Jun  7 21:58:25 vpnc2 kernel: : Intel AES-NI instructions are not detected.
Jun  7 21:58:25 vpnc2 kernel: : padlock: VIA PadLock not detected.
Jun  7 21:58:25 vpnc2 kernel: : sha512_ssse3: Neither AVX nor SSSE3 is available/usable.
Jun  7 21:58:25 vpnc2 kernel: : sha256_ssse3: Neither AVX nor SSSE3 is available/usable.
Jun  7 21:58:25 vpnc2 kernel: : Intel AES-NI instructions are not detected.
Jun  7 21:58:25 vpnc2 kernel: : Intel PCLMULQDQ-NI instructions are not detected.
Jun  7 21:58:25 vpnc2 kernel: : sha256_ssse3: Neither AVX nor SSSE3 is available/usable.
Jun  7 21:58:25 vpnc2 kernel: : sha512_ssse3: Neither AVX nor SSSE3 is available/usable.

Tonight I will do the same on the production VPN box and let you know.

thank you!

Comment 6 Paul Wouters 2016-06-07 23:47:57 UTC
If it said "already upgraded", then it did not actually do anything, though.

If you are re-testing this bug, note that starting the ipsec service normally should cause the nss db to be converted from dbm to sqlite. You can tell by the filenames. key3.db and cert8.db are the old ones and key4.db and cert9.db are the new ones. So for proper testing, if you now have key4/cert9 files, you should remove those as those are the converted ones.
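
The file-name check described above could be scripted roughly as follows. This is only a sketch: the function name nss_db_format is invented for illustration, and it looks at file names alone without opening the databases.

```shell
# Hypothetical helper: report which NSS database format a directory holds,
# judging only by the file names mentioned above.
nss_db_format() {
    dir=$1
    old=no
    new=no
    # key3.db + cert8.db: old dbm format
    [ -e "$dir/cert8.db" ] && [ -e "$dir/key3.db" ] && old=yes
    # key4.db + cert9.db: new sqlite format
    [ -e "$dir/cert9.db" ] && [ -e "$dir/key4.db" ] && new=yes
    case "$old$new" in
        yesyes) echo "both" ;;
        yesno)  echo "dbm" ;;
        noyes)  echo "sql" ;;
        *)      echo "none" ;;
    esac
}

nss_db_format /etc/ipsec.d
```

A result of "both" would suggest the conversion has already run, so the key4/cert9 files would need to be removed before re-testing the migration.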

Comment 7 Michal Bruncko 2016-06-11 22:33:14 UTC
It seems I get different results on the PROD (vpnc1) and TEST/BACKUP (vpnc2) systems. On TEST the daemon starts correctly without any additional step beyond the upgrade itself.
On PROD, with the same package versions and the same config, I get:

pluto[21205] trap invalid opcode ip:7f0159103d60 sp:7fff53dd6678 error:0 in libfreeblpriv3.so[7f01590b1000+72000]


But now I have found this: https://bugs.centos.org/view.php?id=10930

It is a similar report involving "libfreeblpriv3.so", and the issue is reproducible on systems running on the Xen hypervisor. That is exactly my case:

PROD (vpnc1) runs on XenServer, while the TEST system runs on Proxmox, which is based on KVM. That is why I get different results, even though both systems share exactly the same /etc/ipsec.d/ folder and other related files.
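
As a side note, the kernel trap line quoted earlier in this comment gives both the faulting instruction pointer (ip) and the base address of the library mapping (the first number in the square brackets), so the offset of the faulting instruction inside that mapping can be worked out by subtraction. A small sketch; fault_offset is a made-up helper name, and treating the result as an offset into the library file assumes the mapping starts at the beginning of the file:

```shell
# Hypothetical helper: offset of a faulting ip inside a mapped library.
# Arguments are hex strings without the 0x prefix, as printed by the kernel.
fault_offset() {
    ip=$1
    base=$2
    printf '0x%x\n' $((0x$ip - 0x$base))
}

# Values taken from the trap line above
fault_offset 7f0159103d60 7f01590b1000   # prints 0x52d60
```

The result lies below the mapping size of 0x72000 from the same line, as expected, and could be looked up in a disassembly of libfreeblpriv3.so to see which instruction was rejected.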

Comment 8 Paul Wouters 2016-06-14 14:46:09 UTC
Can you try and add

Environment=NSS_DISABLE_HW_GCM=1

to the service file, by either adding it to /lib/systemd/system/ipsec.service or by copying that service file into /etc/systemd/system/ and then issue:

systemctl daemon-reload 
systemctl start ipsec.service

Comment 9 Paul Wouters 2016-06-14 14:46:54 UTC
this might be rhbz#1249426

Comment 10 Tuomo Soini 2016-06-14 14:50:20 UTC
Sorry, RHEL 6 still uses upstart/sysvinit. So:

Add the following lines to the file /etc/sysconfig/ipsec:

NSS_DISABLE_HW_GCM=1
export NSS_DISABLE_HW_GCM
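
If the workaround has to be rolled out to several machines, the two lines could be appended idempotently with something like the sketch below. The function name add_nss_workaround is hypothetical; it takes the file path as an argument so that it can be tried on a scratch file first.

```shell
# Hypothetical helper: append the workaround to a sysconfig-style file,
# unless a NSS_DISABLE_HW_GCM assignment is already present.
add_nss_workaround() {
    conf=$1
    if ! grep -q '^NSS_DISABLE_HW_GCM=' "$conf" 2>/dev/null; then
        printf '%s\n' 'NSS_DISABLE_HW_GCM=1' 'export NSS_DISABLE_HW_GCM' >> "$conf"
    fi
}

# Example (as root): add_nss_workaround /etc/sysconfig/ipsec
```

Running it a second time leaves the file unchanged, so it is safe to call from a provisioning script.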

Comment 11 Michal Bruncko 2016-06-14 21:24:00 UTC
Yes, with this setting pluto started successfully with libreswan-3.15-5.3.el6.x86_64 on the Xen-virtualized VM.

Comment 13 Paul Wouters 2016-06-15 14:09:24 UTC
An nss update is scheduled that addresses this issue.

Comment 14 Paul Wouters 2017-07-16 21:42:13 UTC
This was addressed in rhbz#1337821 with an nss package update

*** This bug has been marked as a duplicate of bug 1337821 ***