Bug 2259320 - Unable to load undici builtin installed from nodejs-undici
Summary: Unable to load undici builtin installed from nodejs-undici
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: nodejs
Version: rawhide
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Jan Staněk
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2024-01-20 08:43 UTC by Zephyr Lykos
Modified: 2024-02-09 02:31 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-02-09 02:31:54 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Fedora Package Sources nodejs-undici pull-request 3 0 None None None 2024-01-26 06:54:37 UTC
Fedora Package Sources nodejs20 pull-request 5 0 None None None 2024-01-26 06:54:37 UTC
Github nodejs cjs-module-lexer pull 91 0 None open Support building for externally shared js builtins 2024-01-29 13:32:21 UTC
Github nodejs undici pull 2643 0 None Merged Support building for externally shared js builtins 2024-01-29 13:32:22 UTC

Description Zephyr Lykos 2024-01-20 08:43:33 UTC
Undici (https://github.com/nodejs/undici) is the underlying library for Node.js's fetch (https://fetch.spec.whatwg.org/#fetch-method) API implementation. This is usually vendored into `libnode.so`, but we use the `--shared-builtin-undici` flag to eliminate prebuilt binaries for llhttp and split it into a standalone package as `nodejs-undici` in Fedora.

However, the internal module resolution behavior changed in some way in Node 20, and the `AddExternalizedBuiltin` call for Undici in `src/node_builtins.cc` does not recognize our import and returns errors instead.

Reproducible: Always

Steps to Reproduce:
1. Spin up a fresh rawhide installation
2. dnf install -y nodejs
3. node -e Request
Actual Results:  
node:internal/bootstrap/realm:431
  if (!mod) throw new TypeError(`Missing internal module '${id}'`);
            ^

TypeError: Missing internal module 'internal/deps/./lib/client'
    at requireBuiltin (node:internal/bootstrap/realm:431:19)
    at requireWithFallbackInDeps (node:internal/bootstrap/realm:443:10)
    at node:internal/deps/undici/undici:3:16
    at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/realm:401:7)
    at requireBuiltin (node:internal/bootstrap/realm:432:14)
    at lazyUndici (node:internal/process/pre_execution:315:14)
    at get (node:internal/process/pre_execution:324:16)
    at [eval]:1:1
    at runScriptInThisContext (node:internal/vm:144:10)
    at node:internal/process/execution:109:14

[Process exited with exit code 1]

Expected Results:  
No output, exit code 0

Comment 1 Jan Staněk 2024-01-22 13:39:09 UTC
Thanks for the report, I'll take look into it.

This is one of the issues I'm hoping a Fedora CI will help prevent, once the open discussions there are resolved (and the tests are enabled): https://gitlab.com/redhat/centos-stream/tests/nodejs

Comment 2 Jan Staněk 2024-01-22 14:17:51 UTC
Tried it in the container (registry.fedoraproject.org/fedora:rawhide):

> # node --expose-internals -e 'const {default: undici} = import("node:internal/deps/undici/undici");'
> node:internal/bootstrap/realm:431
>   if (!mod) throw new TypeError(`Missing internal module '${id}'`);
>                   ^
>
> TypeError: Missing internal module 'internal/deps/./lib/client'
>     at requireBuiltin (node:internal/bootstrap/realm:431:19)
>     at requireWithFallbackInDeps (node:internal/bootstrap/realm:443:10)
>     at node:internal/deps/undici/undici:3:16
>     at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/realm:401:7)
>     at BuiltinModule.compileForPublicLoader (node:internal/bootstrap/realm:337:10)
>     at loadBuiltinModule (node:internal/modules/helpers:101:7)
>     at ModuleLoader.builtinStrategy (node:internal/modules/esm/translators:452:18)
>     at callTranslator (node:internal/modules/esm/loader:285:14)
>     at ModuleLoader.moduleProvider (node:internal/modules/esm/loader:291:30)
>     at async link (node:internal/modules/esm/module_job:76:21)
>
> Node.js v20.11.0

It looks to me that when undici (index.js) tries to import it's `lib/client`, it cannot find it; that file exists, but I guess the correct search path should be `internal/deps/undici/lib/client` (a directory below where it looks now).

Comment 3 Jan Staněk 2024-01-22 15:58:06 UTC
Further observation: I'm packaging undici differently than upstream nodejs imports it into it's source. Upstream does the following (from `tools/dep_updaters/update-undici.sh`):

> "$NODE" "$NPM" install --no-bin-link --ignore-scripts
> "$NODE" "$NPM" run build:node
> "$NODE" "$NPM" prune --production
...
> mv deps/undici/src/undici-fetch.js deps/undici/undici.js

The problem is that the `undici-fetch.js` later used as "the module" for the externalized builtin is generated via the `build:node` script, which invokes `npx esbuild.4 …`. Esbuild is not listed as a undici dependency, thus is not available for our build.

IIRC I tried to workaround it by simply not running the esbuild, instead pointing the `--shared-builtin` flag to undici's `index.js`. That seemed to work, but not anymore.

---

The more future-proof version would probably be to add the esbuild to dev dependencies and then following the node updater script – that should give us parity with whatever the upstream project expects. However, I would still be interested in what changed between the nodejs versions and why the relative imports stopped working.

Comment 4 Michael Dawson 2024-01-22 16:41:26 UTC
Seems like there was a collision when I posted this earlier:

I'm testing out the Node.js upstream following these steps

1) clone https://github.com/nodejs/node.git
2) checkout out the latest v20.x (git checkout v20.x)
3) copied deps/undici/undici.js to a different directory and name
4) configure with ./configure --shared-builtin-undici/undici-path=/newpull/io.js/undici2.js
5) built/ran full test suite with make test

Seemed to built/pass tests.

Validated that it was using the externalized version of undici by renaming undici2.js and running "./node" and it failed as expected with a message that the externalized builtin could not be found

Comment 5 Michael Dawson 2024-01-22 17:02:25 UTC
I further validated that I can build/test having removed undici under the deps completely in the Node.js project, so there is no change in the `Node.js` project.

From the comments above I can see that what likely changed was how undici is built, which caused the build of the undici package to not package correctly.

I think this is likely related - https://github.com/nodejs/undici/pull/2342/, it removed `esbuild` as a dep and then used `npx to run esbuild` instead.


> The more future-proof version would probably be to add the esbuild to dev dependencies and then following the node updater script – that should give us parity with whatever the upstream project expects. However, I would still be interested in what changed between the nodejs versions and why the relative imports stopped working.

I agree that using the updater script it the best way to ensuring things keep in sync.

Comment 6 Michael Dawson 2024-01-22 17:05:27 UTC
> IIRC I tried to workaround it by simply not running the esbuild, instead pointing the `--shared-builtin` flag to undici's `index.js`. That seemed to work, but not anymore.


I think that might also have had the side effect of exposing more from undici than is done in Node.js.

Comment 7 Jan Staněk 2024-01-23 15:48:24 UTC
I have proposed adding the necessary bundler to devDependencies upstream: https://github.com/nodejs/undici/pull/2633; it was turned down since it apparently breaks another project. We do not have that problem, so I'm modifying the `undici-sources.sh` to add the esbuild to devDependencies, so that I can recreate the correct file while building the RPM. PR incoming once I test locally that it works.

Comment 8 Zephyr Lykos 2024-01-24 15:43:25 UTC
Are we fine with (only) a bundled .js file in the final package?

I'm working on this, and Node doesn't seem to like externalized builtins importing more externalized paths, since it is loading the externalized module as a string and doesn't forward nor reference it by its path (src/node_builtins.cc:BuiltinLoader::AddExternalizedBuiltin).

So we could:
a) only ship the bundled version within the package
   (same as upstream, but guaranteed to break users who use `require('undici')` or any subpaths without a local installation of undici)
b) use a custom loader as the entrypoint to load the unbundled version in /usr/lib/node_modules
   (could have some unknown quirks, which I'm not certain it will happen or not.
    If so, it's most likely due to the hardcoded path in the loader,
    since we have to somehow reference the module without using relative paths)
c) ship the bundled version and the unbundled version at the same time
   (impossible to break anything but comes with 2x space usage)

I prefer (b), since:
- imo it's cleaner than (a)
- the error trace actually maps to the original source
- plus, we don't need to add esbuild to BR

Either choosing (b) or (c) requires a rebuild on nodejs.


---

A custom loader poc:

undici-fetch.js
```javascript
'use strict';

const moduleRoot = "%{nodejs_sitelib}/%{npm_name}"; // should be expanded to real path when building
module.exports = require(moduleRoot);
```

Comment 9 Zephyr Lykos 2024-01-24 15:49:22 UTC
*edit: the previous `moduleRoot` should be `moduleRoot + '/index-fetch.js'`, since it's a different entrypoint being built in the bundle.

Comment 10 Zephyr Lykos 2024-01-24 15:57:18 UTC
I also wonder why they used WASM instead of a NAPI extension for llhttp in such a performance-sensitive context. (it even came with a SIMD vector-optimized build!)
Native is approximately 1.75x-2.5x faster than the WASM implementation on the same codebase.

Comment 11 Richard Lau 2024-01-24 15:59:40 UTC
(In reply to Jan Staněk from comment #7)
> I have proposed adding the necessary bundler to devDependencies upstream:
> https://github.com/nodejs/undici/pull/2633; it was turned down since it
> apparently breaks another project. We do not have that problem, so I'm
> modifying the `undici-sources.sh` to add the esbuild to devDependencies, so
> that I can recreate the correct file while building the RPM. PR incoming
> once I test locally that it works.

W.r.t. https://github.com/nodejs/undici/pull/2633 the "breaks another project" was that previously esbuild didn't have pre-compiled binaries for AIX (an IBM OS) (https://github.com/evanw/esbuild/issues/3549) which meant that npm installs of undici failed citgm runs in upstream Node.js. As of esbuild.10 (and https://github.com/evanw/esbuild/pull/3550) that has been rectified so undici could possibly go back to declaring esbuild as a devDependency again without breaking citgm runs in upstream Node.js.

Comment 12 Zephyr Lykos 2024-01-26 06:38:16 UTC
I propose this fix:

rpms/nodejs20:
diff --git a/nodejs20.spec b/nodejs20.spec
index f1f9fa6..786c302 100644
--- a/nodejs20.spec
+++ b/nodejs20.spec
@@ -547,7 +547,7 @@ export PATH="${cwd}/.bin:$PATH"
            %{!?with_bundled_zlib:--shared-zlib} \
            %{!?with_bootstrap:--shared-builtin-cjs_module_lexer/lexer-path %{nodejs_default_sitelib}/cjs-module-lexer/lexer.js} \
            %{!?with_bootstrap:--shared-builtin-cjs_module_lexer/dist/lexer-path %{nodejs_default_sitelib}/cjs-module-lexer/dist/lexer.js} \
-           %{!?with_bootstrap:--shared-builtin-undici/undici-path %{nodejs_default_sitelib}/undici/index.js} \
+           %{!?with_bootstrap:--shared-builtin-undici/undici-path %{nodejs_default_sitelib}/undici/loader.js} \
            --shared-brotli \
            --shared-libuv \
            --with-intl=small-icu \



/usr/lib/node_modules/undici/loader.js:
```javascript
'use strict';

module.exports = require('node:module').createRequire('/usr/lib/node_modules/undici/loader.js')('./index-fetch.js');
```

Comment 13 Jan Staněk 2024-01-26 10:35:41 UTC
(In reply to Zephyr Lykos from comment #12)
> I propose this fix:
> 
> rpms/nodejs20:
> diff --git a/nodejs20.spec b/nodejs20.spec
> index f1f9fa6..786c302 100644
> --- a/nodejs20.spec
> +++ b/nodejs20.spec
> @@ -547,7 +547,7 @@ export PATH="${cwd}/.bin:$PATH"
>             %{!?with_bundled_zlib:--shared-zlib} \
>             %{!?with_bootstrap:--shared-builtin-cjs_module_lexer/lexer-path
> %{nodejs_default_sitelib}/cjs-module-lexer/lexer.js} \
>            
> %{!?with_bootstrap:--shared-builtin-cjs_module_lexer/dist/lexer-path
> %{nodejs_default_sitelib}/cjs-module-lexer/dist/lexer.js} \
> -           %{!?with_bootstrap:--shared-builtin-undici/undici-path
> %{nodejs_default_sitelib}/undici/index.js} \
> +           %{!?with_bootstrap:--shared-builtin-undici/undici-path
> %{nodejs_default_sitelib}/undici/loader.js} \
>             --shared-brotli \
>             --shared-libuv \
>             --with-intl=small-icu \
> 
> 
> 
> /usr/lib/node_modules/undici/loader.js:
> ```javascript
> 'use strict';
> 
> module.exports =
> require('node:module').createRequire('/usr/lib/node_modules/undici/loader.
> js')('./index-fetch.js');
> ```

As I wrote on the PR (sidenote: Should really start to read all the emails before replying.), I have nothing against the idea in general; however, I'm not comfortable with maintaining an (yet another unsupported) approach different from upstream.

Rigth now, I'm working on:
1. Downstream application of the "move esbuild to devDependencies"; makes us not depend on the upstream conflict, and will be removed in favor of upstream if anyone pushes the change through.
2. Run the esbuild and include the extra file in the RPM (the (c) solution from above). Although wasting space, it will be as close to upstream as possible, which should minimize chance of breakage.

Any other changes I need to see at least discussed upstream before entertaining the idea of maintaining them separately.

TL;DR: We should package only the bare minimum essentials that differ from how upstream handles things; if that is not optimal/pretty/could be done better, try to improve the upstream way of doing things first.

---

To be clear, I'm glad that there is activity and the proposal looks very good (or at least interesting) to me. Kudos to you. :-)

Comment 14 Jan Staněk 2024-01-26 14:08:56 UTC
So, my proposed fixes, which should be as minimal as possible (i.e. no updates for undici yet):

nodejs-undici: https://src.fedoraproject.org/rpms/nodejs-undici/pull-request/4# - Moves esbuild to devDependencies, uses that to create undici-fetch.js during build.
nodejs20: https://src.fedoraproject.org/rpms/nodejs20/pull-request/6 – Utilizes the undici-fetch.js from ^; needs that to land first.

Rawhide is currently a bit wonky for me; I cannot seem to be able to build nodejs20 due to some issues with annobin/GCC 14. I managed to get it working in F39 chroot, though.

---

Personal note: Although I'm very glad I have the attention of others on this and related issues, I feel a bit overwhelmed right now with all NodeJS work I have on my plate. Apologies in case I'm stepping on anybody's toes without realising; it's not intentional.

Comment 15 Fedora Update System 2024-02-09 02:29:10 UTC
FEDORA-2024-ff9e7b78fc (nodejs20-20.11.0-4.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-ff9e7b78fc

Comment 16 Fedora Update System 2024-02-09 02:31:54 UTC
FEDORA-2024-ff9e7b78fc (nodejs20-20.11.0-4.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.