Bug 1564681 - Ansible Installer with FIPS enabled fails to install Hawkular-Metrics
Summary: Ansible Installer with FIPS enabled fails to install Hawkular-Metrics
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Hawkular
Version: 3.6.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.11.z
Assignee: Ruben Vargas Palma
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1590451
Blocks:
Reported: 2018-04-06 21:12 UTC by Richard Foyle
Modified: 2021-03-11 17:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-11 11:43:07 UTC
Target Upstream Version:


Attachments
hawkular metrics log (91.52 KB, text/plain)
2018-04-19 16:31 UTC, Richard Foyle
hawkular metrics log #2 (9.40 MB, text/plain)
2018-04-19 16:33 UTC, Richard Foyle

Comment 1 Richard Foyle 2018-04-06 21:14:50 UTC
Made description private since it contains host name

Comment 5 Richard Foyle 2018-04-10 20:40:23 UTC
CUT AND PASTE ERROR: the workaround is:

For an OCP v3.6 installation in a FIPS-enabled secure environment, we changed the following two library files as a workaround, after which OCP v3.6 installed successfully. The workaround replaces the MD5 hashing used by htpasswd with SHA-256. If you need more information from us, please let us know.

Please provide a fix for the installer playbook: the OCP metrics password should no longer be hashed by "htpasswd" with MD5, and a FIPS-compliant approach should be used instead.


==================================================================
Changed file /usr/lib/python2.7/site-packages/ansible/modules/web_infrastructure/htpasswd.py
==================================================================
def main():
    arg_spec = dict(
        path=dict(required=True, aliases=["dest", "destfile"]),
        name=dict(required=True, aliases=["username"]),
        password=dict(required=False, default=None, no_log=True),
        crypt_scheme=dict(required=False, default="des_crypt"),  # ==> changed from "apr_md5_crypt" to "des_crypt"

===================================================
Changed file /usr/lib/python2.7/site-packages/passlib/handlers/digests.py
====================================================
vi /usr/lib/python2.7/site-packages/passlib/handlers/digests.py

#hex_md5     = create_hex_hash(hashlib.md5,      "md5")     # ==> comment out this line
hex_md5     = create_hex_hash(hashlib.sha256,   "sha256")   # ==> add this line

       return hashlib.md5(data).hexdigest()  # ==> change to "return hashlib.sha256(data).hexdigest()"
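
A minimal sketch for checking which digests the local Python build will actually construct, before patching any library files. It assumes that a FIPS-enforcing OpenSSL makes hashlib refuse MD5 by raising ValueError, which should be confirmed on the target host. The helper name digest_available is hypothetical and is not part of Ansible or passlib.

```python
import hashlib


def digest_available(name):
    """Return True if hashlib can construct the named digest on this host."""
    try:
        hashlib.new(name, b"")
    except ValueError:
        # Raised for unknown algorithm names and (assumption) for algorithms
        # that a FIPS-enforcing OpenSSL refuses to provide, such as MD5.
        return False
    return True


if __name__ == "__main__":
    for name in ("md5", "sha256"):
        print("%s available: %s" % (name, digest_available(name)))
```

Running this with the interpreter that Ansible uses on the affected host shows whether MD5 is blocked there while SHA-256 remains usable.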

Comment 8 Richard Foyle 2018-04-12 21:29:08 UTC
The link provided is for FIPS and not FIPS 140-2. Oracle doc for FIPS 140-2: https://docs.oracle.com/cd/E53394_01/html/E54966/fips-refs.html

Need clarification: which algorithm should be used for the install?

Comment 9 Richard Foyle 2018-04-12 21:29:48 UTC
The FIPS 140-2 document is here: https://csrc.nist.gov/csrc/media/publications/fips/140/2/final/documents/fips1402.pdf

Comment 12 Richard Foyle 2018-04-13 16:14:02 UTC
John,

Metrics is layered on EAP and the mechanism for JGroups authentication uses MD5 too. So both would need an appropriate supported encryption algorithm.

For background on FIPS 140-2 and Microsoft's take on it, see:
https://blogs.technet.microsoft.com/secguide/2014/04/07/why-were-not-recommending-fips-mode-anymore/

We need to use a tested algorithm from the official doc: https://csrc.nist.gov/csrc/media/publications/fips/140/2/final/documents/fips1402.pdf

Here is the shortcut list from Oracle:
https://docs.oracle.com/cd/E53394_01/html/E54966/fips-refs.html

Upstream doc on JGroups AUTH:
https://developer.jboss.org/wiki/JGroupsAUTH

Comment 16 Richard Foyle 2018-04-19 13:11:14 UTC
With the installer workaround the web console comes up but does not have data.

Comment 17 John Sanda 2018-04-19 14:58:14 UTC
(In reply to Richard Foyle from comment #16)
> With the installer workaround the web console comes up but does not have
> data.

Can I get the logs for hawkular-metrics? And what happens if you try to run curl from another pod:

$ oc exec <pod> -- curl -k https://hawkular-metrics/hawkular/metrics/status
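
A minimal sketch for interpreting the body returned by that status request. It assumes the endpoint answers with a JSON object containing a "MetricsService" field whose value is "STARTED" once the service is up; that field name and value are an assumption here and should be checked against the actual response. The function name metrics_service_started is hypothetical.

```python
import json


def metrics_service_started(status_body):
    """Return True if a status response body reports a started service.

    Assumption: the body is a JSON object with a "MetricsService" field.
    """
    try:
        status = json.loads(status_body)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page or an empty response.
        return False
    return isinstance(status, dict) and status.get("MetricsService") == "STARTED"


if __name__ == "__main__":
    sample = '{"MetricsService": "STARTED"}'
    print(metrics_service_started(sample))
```

An empty or non-JSON body, as would result from a dropped TLS connection, is treated as "not started" rather than raising.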

Comment 18 Richard Foyle 2018-04-19 16:31:21 UTC
Created attachment 1424218 [details]
hawkular metrics log

requested hawkular metrics log from case

Comment 19 Richard Foyle 2018-04-19 16:33:09 UTC
Created attachment 1424219 [details]
hawkular metrics log #2

second hawkular metrics log

Comment 20 Richard Foyle 2018-04-19 16:38:37 UTC
attached logs as requested.

-Rich

Comment 21 John Sanda 2018-04-19 17:23:55 UTC
Thanks for the logs. It is what I was expecting. The first log in comment 18 shows that hawkular-metrics is up and running normally. The second log, in comment 19, shows the problem when sending a request to the /status endpoint. Here is the error from the log:

----------------------------------------------
DEBUG [io.undertow.request.security] (default task-6) Attempting to authenticate HttpServerExchange{ GET /hawkular/metrics/status request {Connection=[close], Accept-Encoding=[identity], User-Agent=[Python-urllib/2.7], Host=[localhost:8080]} response {X-Powered-By=[Undertow/1], Server=[JBoss-EAP/7]}}, authentication required: false
2018-04-12 11:48:26,340 DEBUG [io.undertow.request.security] (default task-6) Authentication outcome was NOT_ATTEMPTED with method io.undertow.security.impl.CachedAuthenticatedSessionMechanism@7b4d73a9 for HttpServerExchange{ GET /hawkular/metrics/status request {Connection=[close], Accept-Encoding=[identity], User-Agent=[Python-urllib/2.7], Host=[localhost:8080]} response {X-Powered-By=[Undertow/1], Server=[JBoss-EAP/7]}}
2018-04-12 11:48:30,757 DEBUG [io.undertow.request.io] (default I/O-13) UT005013: An IOException occurred: java.io.IOException: javax.net.ssl.SSLException: Inbound closed before receiving peer's close_notify: possible truncation attack?
	at io.undertow.protocols.ssl.SslConduit.notifyReadClosed(SslConduit.java:612)
	at io.undertow.protocols.ssl.SslConduit.closed(SslConduit.java:977)
	at io.undertow.protocols.ssl.SslConduit.close(SslConduit.java:1072)
	at io.undertow.protocols.ssl.SslConduit.doUnwrap(SslConduit.java:793)
	at io.undertow.protocols.ssl.SslConduit.read(SslConduit.java:565)
	at org.xnio.conduits.ConduitStreamSourceChannel.read(ConduitStreamSourceChannel.java:127)
	at io.undertow.server.protocol.http.HttpReadListener.handleEventWithNoRunningRequest(HttpReadListener.java:154)
	at io.undertow.server.protocol.http.HttpReadListener.handleEvent(HttpReadListener.java:132)
	at io.undertow.server.protocol.http.HttpReadListener.handleEvent(HttpReadListener.java:57)
	at org.xnio.ChannelListeners.invokeChannelListener(ChannelListeners.java:92)
	at org.xnio.conduits.ReadReadyHandler$ChannelListenerHandler.readReady(ReadReadyHandler.java:66)
	at io.undertow.protocols.ssl.SslConduit$SslReadReadyHandler.readReady(SslConduit.java:1122)
	at io.undertow.protocols.ssl.SslConduit$1.run(SslConduit.java:166)
	at org.xnio.nio.WorkerThread.safeRun(WorkerThread.java:592)
	at org.xnio.nio.WorkerThread.run(WorkerThread.java:472)
Caused by: javax.net.ssl.SSLException: Inbound closed before receiving peer's close_notify: possible truncation attack?
	at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1666)
	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1634)
	at sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:1561)
	at io.undertow.protocols.ssl.SslConduit.notifyReadClosed(SslConduit.java:610)
	... 14 more
----------------------------------------------

If it turns out that JGroups does not support any FIPS-approved algorithms, would backporting bug 1560695 be a viable option? Let me provide a little more background on that.

Concurrent schema updates in Cassandra should be avoided. Schema changes are applied by Hawkular Metrics at startup. When the replica count for Hawkular Metrics is greater than 1, we needed a coordination mechanism to ensure schema updates were executed serially. We decided to use Infinispan and JGroups for coordination. It never worked as intended and led to other problems, hence bug 1560695.

Comment 29 Richard Foyle 2018-06-20 16:37:23 UTC
The customer is on 3.7.44 now. Please let us know when that backport is available.

Comment 47 Mister X 2021-03-11 17:21:13 UTC
Hi,

I am deploying 3.11 on CentOS 7, and the deployment is failing on TASK [openshift_metrics : generate htpasswd file for hawkular metrics].

My logs show that "crypt_scheme" is "apr_md5_crypt".
I tried to change it manually. This bug seems quite old, but apparently I'm hitting it on 3.11 as well.

TASK [openshift_metrics : generate htpasswd file for hawkular metrics] **************************************************************************************************************************************************************
Thursday 11 March 2021  15:54:45 +0100 (0:00:00.247)       0:22:13.203 ******** 
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ImportError: No module named passlib.apache
fatal: [os-master.int.example.com]: FAILED! => {"changed": false, "msg": "Failed to import the required Python library (passlib) on os-master.int.example.com's Python /usr/bin/python2. Please read module documentation and install in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter"}

PLAY RECAP **************************************************************************************************************************************************************************************************************************
localhost                  : ok=12   changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0   
os-infra.int.example.com : ok=119  changed=70   unreachable=0    failed=0    skipped=151  rescued=0    ignored=0   
os-master.int.example.com : ok=706  changed=323  unreachable=0    failed=1    skipped=829  rescued=0    ignored=0   
os-monitoring.int.example.com : ok=119  changed=70   unreachable=0    failed=0    skipped=151  rescued=0    ignored=0   
os-node.int.example.com : ok=119  changed=70   unreachable=0    failed=0    skipped=151  rescued=0    ignored=0   


INSTALLER STATUS ********************************************************************************************************************************************************************************************************************
Initialization               : Complete (0:00:16)
Health Check                 : Complete (0:00:46)
Node Bootstrap Preparation   : Complete (0:03:15)
etcd Install                 : Complete (0:00:32)
Master Install               : Complete (0:04:31)
Master Additional Install    : Complete (0:00:36)
Node Join                    : Complete (0:00:35)
GlusterFS Install            : Complete (0:04:47)
Hosted Install               : Complete (0:00:45)
Cluster Monitoring Operator  : Complete (0:01:14)
Web Console Install          : Complete (0:04:01)
Console Install              : Complete (0:00:39)
Metrics Install              : In Progress (0:00:16)
        This phase can be restarted by running: playbooks/openshift-metrics/config.yml
Thursday 11 March 2021  15:54:45 +0100 (0:00:00.616)       0:22:13.819 ******** 
=============================================================================== 
openshift_web_console : Verify that the console is running ----------------------------------------------------------------------------------------------------------------------------------------------------------------- 232.55s
openshift_control_plane : Wait for all control plane pods to come up and become ready -------------------------------------------------------------------------------------------------------------------------------------- 125.07s
openshift_storage_glusterfs : Wait for GlusterFS pods ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- 85.44s
openshift_node : Install node, clients, and conntrack packages -------------------------------------------------------------------------------------------------------------------------------------------------------------- 73.54s
openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created --------------------------------------------------------------------------------------------------------------------------------------- 61.66s
Run health checks (install) - EL -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 44.91s
openshift_storage_glusterfs : Wait for deploy-heketi pod -------------------------------------------------------------------------------------------------------------------------------------------------------------------- 42.65s
openshift_storage_glusterfs : Wait for copy job to finish ------------------------------------------------------------------------------------------------------------------------------------------------------------------- 32.24s
openshift_console : Waiting for console rollout to complete ----------------------------------------------------------------------------------------------------------------------------------------------------------------- 31.35s
openshift_storage_glusterfs : Wait for glusterblock provisioner pod --------------------------------------------------------------------------------------------------------------------------------------------------------- 21.66s
openshift_storage_glusterfs : Wait for heketi pod --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 21.64s
openshift_node : install needed rpm(s) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 13.70s
openshift_node : Install Ceph storage plugin dependencies ------------------------------------------------------------------------------------------------------------------------------------------------------------------- 12.78s
openshift_manage_node : Wait for sync DS to set annotations on all nodes ---------------------------------------------------------------------------------------------------------------------------------------------------- 11.57s
openshift_node_group : Wait for the sync daemonset to become ready and available -------------------------------------------------------------------------------------------------------------------------------------------- 11.12s
openshift_storage_glusterfs : Create heketi DB volume ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 8.82s
openshift_hosted : Create OpenShift router ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 8.48s
openshift_manageiq : Configure role/user permissions ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 7.94s
openshift_node : Install NFS storage plugin dependencies --------------------------------------------------------------------------------------------------------------------------------------------------------------------- 7.83s
openshift_storage_glusterfs : Load heketi topology --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 6.51s


Failure summary:


  1. Hosts:    os-master.int.example.com
     Play:     OpenShift Metrics
     Task:     generate htpasswd file for hawkular metrics
     Message:  Failed to import the required Python library (passlib) on os-master.int.example.com's Python /usr/bin/python2. Please read module documentation and install in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter
TASK [openshift_metrics : generate htpasswd file for hawkular metrics] **************************************************************************************************************************************************************
task path: /root/openshift-ansible/roles/openshift_metrics/tasks/generate_hawkular_certificates.yaml:25
Wednesday 10 March 2021  17:03:16 +0100 (0:00:00.244)       0:08:55.870 ******* 
Using module file /usr/lib/python2.7/site-packages/ansible/modules/web_infrastructure/htpasswd.py
Pipelining is enabled.
<localhost> ESTABLISH LOCAL CONNECTION FOR USER: root
<localhost> EXEC /bin/sh -c '/usr/bin/python2 && sleep 0'
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_htpasswd_payload_lDsZcB/ansible_htpasswd_payload.zip/ansible/modules/web_infrastructure/htpasswd.py", line 105, in <module>
ImportError: No module named passlib.apache
fatal: [os-master.int.example.com]: FAILED! => {
    "changed": false, 
    "invocation": {
        "module_args": {
            "attributes": null, 
            "backup": null, 
            "content": null, 
            "create": true, 
            "crypt_scheme": "apr_md5_crypt", 
            "delimiter": null, 
            "directory_mode": null, 
            "follow": false, 
            "force": null, 
            "group": null, 
            "mode": null, 
            "name": "hawkular", 
            "owner": null, 
            "password": "VALUE_SPECIFIED_IN_NO_LOG_PARAMETER", 
            "path": "/tmp/tmp.VBvaTK6Q5h/hawkular-metrics.htpasswd", 
            "regexp": null, 
            "remote_src": null, 
            "selevel": null, 
            "serole": null, 
            "setype": null, 
            "seuser": null, 
            "src": null, 
            "state": "present", 
            "unsafe_writes": null
        }
    }, 
    "msg": "Failed to import the required Python library (passlib) on os-master.int.example.com's Python /usr/bin/python2. Please read module documentation and install in the appropriate location. If the required library is installed, but Ansible is using the wrong Python interpreter, please consult the documentation on ansible_python_interpreter"
}



Mar 11 15:54:45 os-master ansible-htpasswd: Invoked with directory_mode=None force=None remote_src=None path=/tmp/tmp.hbipT2KKFB/hawkular-metrics.htpasswd owner=None follow=False group=None unsafe_writes=None create=True state=present content=NOT_LOGGING_PARAMETER serole=None setype=None selevel=None regexp=None password=NOT_LOGGING_PARAMETER src=None name=hawkular seuser=None delimiter=None mode=None attributes=None backup=None crypt_scheme=apr_md5_crypt


KR,

Comment 48 Mister X 2021-03-11 17:25:37 UTC
The workaround from comment #5 didn't work for me.

I found the /usr/lib/python2.7/site-packages/ansible/modules/web_infrastructure/htpasswd.py on 3.11
I didn't find the /usr/lib/python2.7/site-packages/passlib/handlers/digests.py on 3.11



KR,
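
The traceback in comment 47 reports "ImportError: No module named passlib.apache" under /usr/bin/python2, which would also be consistent with passlib/handlers/digests.py not being found on that host. A minimal sketch for confirming whether passlib is importable by a given interpreter follows; the helper name module_importable is hypothetical, and the script has to be run with the same interpreter that Ansible uses for the result to be meaningful.

```python
import importlib
import sys


def module_importable(name):
    """Return True if `name` can be imported by the running interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Run with the interpreter named in the error, e.g. /usr/bin/python2.
    print("interpreter: %s" % sys.executable)
    for name in ("passlib", "passlib.apache"):
        print("%s importable: %s" % (name, module_importable(name)))
```

If both lines report False, the task is failing before any crypt_scheme is applied, so changing the scheme alone would not be expected to help.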

