Bug 1759152 - [rh-nodejs10] npm --version fail -> Error: Cannot find module '../lib/utils/unsupported.js' in container
Summary: [rh-nodejs10] npm --version fail -> Error: Cannot find module '../lib/utils/u...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Software Collections
Classification: Red Hat
Component: nodejs
Version: unspecified
Hardware: s390x
OS: Unspecified
urgent
high
Target Milestone: ---
: 3.3
Assignee: Jan Staněk
QA Contact: Lukáš Zachar
URL:
Whiteboard:
Depends On:
Blocks: 1744771
TreeView+ depends on / blocked
 
Reported: 2019-10-07 13:44 UTC by Lukáš Zachar
Modified: 2019-10-15 17:26 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-14 15:01:08 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
Strace output of the npm --version (149.35 KB, text/plain)
2019-10-08 12:10 UTC, Honza Horak
no flags Details
Proposed patch for reverting statx that is working on kernel 4.11 only (6.47 KB, patch)
2019-10-09 23:03 UTC, Honza Horak
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3039 0 None None None 2019-10-14 15:01:11 UTC

Internal Links: 1760294

Description Lukáš Zachar 2019-10-07 13:44:58 UTC
Description of problem:

When rh-nodejs10-npm is update to the latest version it becomes broken, however only inside the container. On machine itself npm --version works (same version)

Version-Release number of selected component (if applicable):
rh-nodejs10-npm-6.9.0-10.16.3.3.el7.s390x
e.g. 

How reproducible:
always

Steps to Reproduce:
1. podman run --user 0 -ti registry.access.redhat.com/rhscl/python-27-rhel7 bash
2. (app-root)yum update rh-nodejs10-npm
3. (app-root)rpm -qf /opt/rh/rh-nodejs10/root/usr/bin/npm
rh-nodejs10-npm-6.9.0-10.16.3.3.el7.s390x
4. npm --version

Actual results:
internal/modules/cjs/loader.js:638
    throw err;
    ^

Error: Cannot find module '../lib/utils/unsupported.js'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
    at Function.Module._load (internal/modules/cjs/loader.js:562:25)
    at Module.require (internal/modules/cjs/loader.js:692:17)
    at require (internal/modules/cjs/helpers.js:25:18)
    at /opt/rh/rh-nodejs10/root/usr/bin/npm:19:21
    at Object.<anonymous> (/opt/rh/rh-nodejs10/root/usr/bin/npm:152:3)
    at Module._compile (internal/modules/cjs/loader.js:778:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
    at Module.load (internal/modules/cjs/loader.js:653:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:593:12)


Additional info:
:latest is rhscl/python-27-rhel7:2.7-67.1568971388

(app-root) env
MANPATH=/opt/rh/python27/root/usr/share/man:/opt/rh/rh-nodejs10/root/usr/share/man:/opt/rh/httpd24/root/usr/share/man:
HOSTNAME=35d2f42fae7a
APP_ROOT=/opt/app-root
PIP_NO_CACHE_DIR=off
TERM=xterm
LIBRARY_PATH=/opt/rh/httpd24/root/usr/lib64
PYTHONUNBUFFERED=1
DESCRIPTION=Python 2.7 available as container is a base platform for building and running various Python 2.7 applications and frameworks. Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python's elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.
NODEJS_SCL=rh-nodejs10
X_SCLS=python27 rh-nodejs10 httpd24 
LC_ALL=en_US.UTF-8
PYTHONIOENCODING=UTF-8
LD_LIBRARY_PATH=/opt/rh/python27/root/usr/lib64:/opt/rh/rh-nodejs10/root/usr/lib64:/opt/rh/httpd24/root/usr/lib64
VIRTUAL_ENV=/opt/app-root
PYTHON_VERSION=2.7
PATH=/opt/app-root/bin:/opt/rh/python27/root/usr/bin:/opt/rh/rh-nodejs10/root/usr/bin:/opt/rh/httpd24/root/usr/bin:/opt/rh/httpd24/root/usr/sbin:/opt/app-root/src/.local/bin/:/opt/app-root/src/bin:/opt/app-root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
STI_SCRIPTS_URL=image:///usr/libexec/s2i
PWD=/opt/app-root/src
STI_SCRIPTS_PATH=/usr/libexec/s2i
LANG=en_US.UTF-8
SUMMARY=Platform for building and running Python 2.7 applications
PS1=(app-root)
PLATFORM=el7
HOME=/opt/app-root/src
SHLVL=1
PYTHONPATH=/opt/rh/rh-nodejs10/root/usr/lib/python2.7/site-packages
XDG_DATA_DIRS=/opt/rh/python27/root/usr/share:/usr/local/share:/usr/share
PKG_CONFIG_PATH=/opt/rh/python27/root/usr/lib64/pkgconfig:/opt/rh/httpd24/root/usr/lib64/pkgconfig
container=oci
_=/usr/bin/env

Comment 2 Honza Horak 2019-10-07 13:47:42 UTC
There are also these two different tracebacks I got:

# npm install npm
fs.js:114
    throw err;
    ^

Error: EINVAL: invalid argument, readlink '/opt/rh'
    at Object.realpathSync (fs.js:1486:9)
    at toRealPath (internal/modules/cjs/loader.js:241:13)
    at Function.Module._findPath (internal/modules/cjs/loader.js:323:22)
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:633:25)
    at Function.Module._load (internal/modules/cjs/loader.js:562:25)
    at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3)


# npm --version
internal/modules/cjs/loader.js:638
    throw err;
    ^

Error: Cannot find module '/opt/rh/rh-nodejs10/root/usr/bin/npm'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
    at Function.Module._load (internal/modules/cjs/loader.js:562:25)
    at Function.Module.runMain (internal/modules/cjs/loader.js:831:12)
    at startup (internal/bootstrap/node.js:283:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:622:3)


I really don't understand why it is not always the same, even for the same command..

Comment 3 Honza Horak 2019-10-07 14:03:07 UTC
This issue is currently blocking the containers rebuild, so raising the priority.

Info found so far:

* this issue is only visible in the newest update:

$> rpm -qa|grep node
rh-nodejs10-runtime-3.2-2.el7.s390x
rh-nodejs10-npm-6.9.0-10.16.3.3.el7.s390x
rh-nodejs10-nodejs-10.16.3-3.el7.s390x

* it only is reproducible only inside a s390x container image (no matter what image it is -- python, nodejs, ...)

Comment 4 Honza Horak 2019-10-07 15:10:06 UTC
I tried to reproduce in the mock, but it works there fine. So what can be different in the container image?

Comment 5 Honza Horak 2019-10-07 15:24:40 UTC
Interesting finding: it starts to work once I run the container as --privileged. Not sure why yet.

Comment 7 Honza Horak 2019-10-08 06:52:53 UTC
Some more info I found:
* the problem started to occur with v10.16, which got a rebased npm from 6.4.1 to 6.9.0, so I expect the change happened in npm between those two versions
* RHOAR nodejs builds behave the same, so it is probably not related 
* I don't understand that while --privileged helps, buy using capabilities separately does not help: docker run -ti --rm -u 0 --cap-add=chown --cap-add=dac_override --cap-add=dac_read_search --cap-add=fowner --cap-add=fsetid --cap-add=kill --cap-add=setgid --cap-add=setuid --cap-add=setpcap --cap-add=linux_immutable --cap-add=net_bind_service --cap-add=net_broadcast --cap-add=net_admin --cap-add=net_raw --cap-add=ipc_lock --cap-add=ipc_owner --cap-add=sys_module --cap-add=sys_rawio --cap-add=sys_chroot --cap-add=sys_ptrace --cap-add=sys_pacct --cap-add=sys_admin --cap-add=sys_boot --cap-add=sys_nice --cap-add=sys_resource --cap-add=sys_time --cap-add=sys_tty_config --cap-add=mknod --cap-add=lease --cap-add=audit_write --cap-add=audit_control --cap-add=setfcap --cap-add=mac_override --cap-add=mac_admin --cap-add=syslog --cap-add=wake_alarm --cap-add=block_suspend registry.access.redhat.com/rhscl/nodejs-10-rhel7 bash

Comment 8 Honza Horak 2019-10-08 12:10:55 UTC
Created attachment 1623505 [details]
Strace output of the npm --version

So, the trick is --security-opt=seccomp=unconfined option (used also in privileged), with this option it works:

docker run -ti --rm -u 0  --security-opt=seccomp=unconfined  registry.access.redhat.com/rhscl/nodejs-10-rhel7 bash

Attaching strace output. Still not sure what happens there though.

Comment 9 Honza Horak 2019-10-08 14:19:56 UTC
Another finding:
I started with working npm (6.4.1) and node (10.10) and installed a specific version of npm via locally:

  npm install npm.1

That resulted in a file /node_modules/npm/bin/npm-cli.js.

Then, I installed just the node package from rhoar, and found out that:

  /usr/bin/node /node_modules/npm/bin/npm-cli.js --version

worked with 10.15.3 and stopped working with 10.16.0. So, based on that, it looks like the problem is somewhere in the node itself, between version 10.15.3 and 10.16.0.

Worth mentioning that the npm-cli.js breaks pretty soon, during one of the require() calls. What is weird, that it is not always the same call.

Comment 12 Honza Horak 2019-10-09 06:53:27 UTC
Dan Horak, we're a bit lost here, I'd appreciate if you or somebody from the team could take a look if you see something suspicious -- we're having issues on s390x in a container (seccomp is somehow involved here, because when turning it off for the container, it starts working). The issue is that we get various nodejs tracebacks, but a common thing is that the node cannot load libraries. Any help is appreciated.

Comment 14 Dan Horák 2019-10-09 07:17:18 UTC
Is it when running the image on RHEL-7 host? How about RHEL-7 Alt?

Comment 15 Petr Kubat 2019-10-09 08:24:23 UTC
Not sure what host Honza tried the images on, but for the CI (where it fails as well afaik) we are running RHEL-ALT-7.6-updates-20190722.1 + RHEL-7.6-updates-20190403.1 according to QE

Comment 17 Honza Horak 2019-10-09 16:19:01 UTC
A minimal reproducer with a ubi7/s2i-base image:

#> docker run -ti -u 0 registry.access.redhat.com/ubi7/s2i-base bash

bash-4.2# yum -y update rh-nodejs10-nodejs

bash-4.2# rpm -qa|grep rh-nodejs
rh-nodejs10-runtime-3.2-2.el7.s390x
rh-nodejs10-nodejs-10.16.3-3.el7.s390x
rh-nodejs10-npm-6.9.0-10.16.3.3.el7.s390x

bash-4.2# scl enable rh-nodejs10 -- npm --version
internal/modules/cjs/loader.js:638
    throw err;
    ^

Error: Cannot find module '../lib/utils/unsupported.js'
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:636:15)
    at Function.Module._load (internal/modules/cjs/loader.js:562:25)
    at Module.require (internal/modules/cjs/loader.js:692:17)
    at require (internal/modules/cjs/helpers.js:25:18)
    at /opt/rh/rh-nodejs10/root/usr/bin/npm:19:21
    at Object.<anonymous> (/opt/rh/rh-nodejs10/root/usr/bin/npm:152:3)
    at Module._compile (internal/modules/cjs/loader.js:778:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:789:10)
    at Module.load (internal/modules/cjs/loader.js:653:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:593:12)

Comment 18 sam roberts 2019-10-09 19:42:03 UTC
https://github.com/nodejs/node/issues/19702

https://github.com/npm/npm/issues/20206

https://github.com/nodejs/node/issues/19997

are previous instances of this. They all seem to have been "fixed" by installing an official node package, or using some other mechanism to get a fresh/unbroken install of npm, indicating that there might be a problem with the redhat build of node, or npm (I'm not sure how RH builds its packages, and how or if it partitions them into).

It doesn't appear to be a node.js bug, per se, but I don't find any statement about a root cause.

I hope this is useful.

Comment 19 sam roberts 2019-10-09 19:46:03 UTC
to be clear, I mentioned "redhat build of node" not in the sense of the compile/link of the node binary, but in the sense that node contains npm as a dependency, so node.js builds are one way to get the npm cli.

Comment 20 Milad Farazmand 2019-10-09 20:15:26 UTC
I have tried the above steps on a RHEL container running on S390 Ubuntu host and it works fine.

A very similar issue was being looked into a few month ago by our team where the same error was being produced on a Docker container.
If I remember correctly it was related to how Docker handles systems calls between different host kernels.
It had trouble doing so in container hence the "path" to a module was not resolved correctly. I'm going to get some more info on this and make sure they are related or not.

Comment 21 Honza Horak 2019-10-09 22:24:27 UTC
Git bisect showed me that the difference is made by this patch:
https://github.com/nodejs/node/commit/7e5ef4a0e1

Before this commit, it works fine, after that I start to see this issue. Now trying to look closer what libuv change caused the difference.

Comment 22 Honza Horak 2019-10-09 23:03:17 UTC
Created attachment 1624103 [details]
Proposed patch for reverting statx that is working on kernel 4.11 only

It looks like the problem is caused by libuv upstram PR:
https://github.com/nodejs/node/pull/26707

Which is also this commit:
https://github.com/nodejs/node/commit/7e5ef4a0e1

And in nodejs later this commit:
https://github.com/nodejs/node/commit/7e5ef4a0e1

Based on the man page and the PR above, statx() is available on kernel 4.11 and later, so reverting this change for older kernel is probably ok.

Comment 24 Milad Farazmand 2019-10-10 01:19:07 UTC
Thanks for the info Honza, 
I can confirm that this is the same issue our team has dealt with before. As an alternative solution to reverting the change, using RHEL 8 or upgrading the docker daemon seems to also solve the problem.

Comment 27 Vít Ondruch 2019-10-10 05:56:31 UTC
(In reply to Milad Farazmand from comment #24)
> Thanks for the info Honza, 
> I can confirm that this is the same issue our team has dealt with before. As
> an alternative solution to reverting the change, using RHEL 8 or upgrading
> the docker daemon seems to also solve the problem.

So can we say this is kernel issue? Do you think it would be useful to let kernel guys take look into this?

Comment 28 Honza Horak 2019-10-10 06:11:56 UTC
Reported to nodejs upstram as https://github.com/nodejs/node/issues/29916

Comment 32 Milad Farazmand 2019-10-10 12:23:45 UTC
(In reply to Vít Ondruch from comment #27)
> (In reply to Milad Farazmand from comment #24)
> > Thanks for the info Honza, 
> > I can confirm that this is the same issue our team has dealt with before. As
> > an alternative solution to reverting the change, using RHEL 8 or upgrading
> > the docker daemon seems to also solve the problem.
> 
> So can we say this is kernel issue? Do you think it would be useful to let
> kernel guys take look into this?

It may not be a kernel issue but instead a Docker problem. It has been addresses in this ticket:
https://github.com/docker/for-linux/issues/208

Comment 34 michael_dawson 2019-10-10 15:00:56 UTC
Vit, is the docker version that is being used to test tied to the test infrastructure or the OS version being tested. If it is the former is updating the version of docker in the test infra an option?

Comment 35 Lukáš Zachar 2019-10-10 15:57:31 UTC
I've checked the logs and we were using slightly older version of docker (docker-1.13.1-102.git7f2769b.el7.s390x).
But we should be running on latest publicly available docker from RHEL-Extras.

Comment 38 errata-xmlrpc 2019-10-14 15:01:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3039


Note You need to log in before you can comment on or make changes to this bug.