Bug 2060513
| Summary: | Nodejs includes some pre-compiled wasm code | |||
|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Honza Horak <hhorak> | |
| Component: | nodejs | Assignee: | NodeJS Packaging SIG <nodejs-sig> | |
| Status: | NEW --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 37 | CC: | fweimer, midawson, mrunge, nodejs-sig, rlau, sgallagh, thoger, thrcka, zsvetlik | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2133785 (view as bug list) | Environment: | ||
| Last Closed: | Type: | Bug | ||
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 2111861, 2121075, 2121093, 2121095, 2133785 | |||
|
Description
Honza Horak
2022-03-03 16:37:47 UTC
The upstream issue for this is https://github.com/nodejs/node/issues/42199, reported on behalf of another Linux distribution. So far the feature using undici has only landed in upstream's master branch and Node.js 17. It hasn't yet landed in LTS Node.js (e.g. 16, 14, 12) but fetch was a heavily requested feature so there is likely to be a community push to include it in Node.js 16 (although it will be behind a run-time flag). On current estimated schedules [1] we'd be looking at May 2022 at the earliest for landing in Node.js 16 as that's the next penciled in Node.js 16 semver-minor release. [1] https://github.com/nodejs/Release/issues/658 From my look at undici today to get a bit more up to speed: 1) The source for the generated undici.js file is available in this repo within the Node.js organization - https://github.com/nodejs/undici 2) The file which is in Node.js itself can be generated by: a) clone https://github.com/nodejs/undici b) cd to undici c) check out the corresponding version as identified in appropriate update in Node.js repo - https://github.com/nodejs/node/commit/dace5a804ead8bda32bf3d31af8323ede9e0b0a8 which identifies the specific hash that was used to build the file d) npm install e) npm run build:node 3) When I compared the result of doing that against what was checked it, it diff reported it as exactly the same 4) There is package-lock.json in the undici repo which I think would ensure the same versions of the tools were used to build the file From that it seems like there is a reasonable likelihood that the same file content would result each time. I repeated the process on one of the community s390 machines (which is BE) and it worked and generated the same file. Digging a bit deeper, I think the generation of the undici.js file only bundles/packages JavaScript and WASM that was already built/checked into the undici repository in https://github.com/nodejs/undici/tree/main/lib/llhttp. Building the WASM itself include: 1) The C source for llhttp (which is vendored in from (https://github.com/nodejs/llhttp) is in https://github.com/nodejs/undici/tree/main/deps/llhttp 2) The docker file https://github.com/nodejs/undici/blob/main/build/Dockerfile also pulls in a version of wasi-sdk 3) running npm run build:wasm rebuilds the wasm in a docker container and copies back out to the external file system (I had trouble with this due to userid mismatch) Once I ran this I ended up with wasm files that diff told me were the same as the ones that were checked in. The net of this however, is that suggesting that the WASM be built as part of the Node.js build process is not going to be reasonable. Not even considering the added complexity it would add to the overall build process, it requires docker, and that does not run on all of the supported platforms. A few questions I'd have so far: 1) Would having a copy of the undici repo vectored into the Node.js repo improve the situation? the undici repo is already in the Node.js org and it seems like the hash for the version in the undici repo is available. I think that is what the upstream issue https://github.com/nodejs/node/issues/42199 is asking for but not quite sure how it helps. 2) How do we figure out if this case falls into https://fedoraproject.org/wiki/User:Vondruch/Draft_RawhideGuidelines#Use_of_pregenerated_code, or https://fedoraproject.org/wiki/User:Vondruch/Draft_RawhideGuidelines#No_inclusion_of_pre-built_binaries_or_libraries? I can see arguments for either (In reply to Honza Horak from comment #0) > Draft_RawhideGuidelines#No_inclusion_of_pre-built_binaries_or_libraries This is likely a better link for this: https://docs.fedoraproject.org/en-US/packaging-guidelines/what-can-be-packaged/#prebuilt-binaries-or-libraries Looking at the motivations from the link: > Compiler Flags: Pre-packaged program binaries and program libraries not built from the source code were probably not compiled with standard Fedora compiler flags for security and optimization. I don't think this one applies in the WASM case as as there should be no Fedora specific flags etc. as the same WASM should just run across any platform and there are no OS specific options as farm as I know. > Security: Pre-packaged program binaries and program libraries not built from the source code could contain parts that are malicious, dangerous, or just broken. Also, these are functionally impossible to patch. This seems to be the part that would be a concern. The impossible to patch, is valid unless we have validated that we know how to build/replace. The malicious, dangerous part I assume is because it's hard/impossible to review what was committed. In the current context it is possible to review the original source since it's available so the concern would only be that whoever build/PR'd in the updated file maliciously added files that do not match the source that is claimed to be what was used. broken I don't think applies, since the functionality will be tested by the community test suite just as well as if the c code had been compiled instead of the process to build the WASM added. An option might be to build a separate undici package and have the Node.js packages BuildRequires: those to pull in the binaries produced from that package. (In reply to Michael Dawson from comment #6) > Looking at the motivations from the link: > > > Compiler Flags: Pre-packaged program binaries and program libraries not built from the source code were probably not compiled with standard Fedora compiler flags for security and optimization. > > I don't think this one applies in the WASM case as as there should be no > Fedora specific flags etc. as the same WASM should just run across any > platform and there are no OS specific options as farm as I know. > Maybe none *today*, but that doesn't mean never. > > Security: Pre-packaged program binaries and program libraries not built from the source code could contain parts that are malicious, dangerous, or just broken. Also, these are functionally impossible to patch. > > This seems to be the part that would be a concern. The impossible to patch, > is valid unless we have validated that we know how to build/replace. A patch to replace one pre-compiled binary with another is itself problematic (both in terms of the size of the patch and that it will ALSO be running afoul of this policy) > The > malicious, dangerous part I assume is because it's hard/impossible to review > what was committed. In the current context it is possible to review the > original source since it's available so the concern would only be that > whoever build/PR'd in the updated file maliciously added files that do not > match the source that is claimed to be what was used. Or that the compiler used was maliciously replaced, etc. > broken I don't think > applies, since the functionality will be tested by the community test suite > just as well as if the c code had been compiled instead of the process to > build the WASM added. In this case, it usually means something like "it works fine in the upstream environment but doesn't play well with older/newer runtime libraries". > In this case, it usually means something like "it works fine in the upstream environment but doesn't play well with older/newer runtime libraries".
I don't quite follow. WASM runs in the V8 runtime and would not directly depend on any older/newer libraries
> Maybe none *today*, but that doesn't mean never.
The whole purpose of WASM is that is be able to run across different environments. Any OS specific flags, etc. I think would apply to the runtime (V8 within Node.js in this case) on which it runs, otherwise the whole model is broken.
https://github.com/nodejs/node/pull/42246 is adding the source code to the core repo itself, and a script that automates updates. file / libmagic identifies wasm files by having \0asm as first four bytes: https://github.com/file/file/blob/master/magic/Magdir/webassembly Base64 encoded versions would start with AGFzb. Searching for strings (or files) starting with that may be a way to detect wasm blobs being added to nodejs sources. From 16.14.0:
```
$ find . -path ./.git -prune -o -print -exec file {} \; |grep WebAssembly
./deps/v8/test/fuzzer/wasm/regress-1115280.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/fuzzer/wasm/regress-1127717.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/fuzzer/wasm/regress-1191853.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/fuzzer/wasm_async/regression-761784.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/fuzzer/wasm_async/valid.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/fuzzer/wasm_async/regress-1115431.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/test/mjsunit/wasm/incrementer.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/callback.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/finalize.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/global.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/hello.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/memory.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/reflect.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/serialize.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/table.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/threads.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/trap.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/hostref.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/multi.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./deps/v8/third_party/wasm-api/example/start.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/fixtures/es-modules/simple.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/fixtures/shared-memory.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/fixtures/simple.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/cant_dotdot.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/clock_getres.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/exitcode.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/fd_prestat_get_refresh.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/follow_symlink.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/getentropy.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/getrusage.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/gettimeofday.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/notdir.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/preopen_populates.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/read_file.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/read_file_twice.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/stdin.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/symlink_escape.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/symlink_loop.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/write_file.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/create_symlink.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/freopen.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/link.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/main_args.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/ftruncate.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/poll.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/readdir.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
./test/wasi/wasm/stat.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)
```
And searching for strings:
```
$ grep -lr AGFzb
deps/cjs-module-lexer/dist/lexer.js
deps/cjs-module-lexer/dist/lexer.mjs
```
From that, it doesn't look as if any of the *.wasm files are actually used at runtime. I think it's just the cjs-module-lexer embedded code that we need to worry about.
Current search with "grep -lr AGFzb" on community source shows: ``` cjs-module-lexer/dist/lexer.mjs cjs-module-lexer/dist/lexer.js undici/src/lib/llhttp/llhttp_simd.wasm.js undici/src/lib/llhttp/llhttp.wasm.js undici/undici.js ``` which I think does capture the instances of wasm at this point. This bug appears to have been reported against 'rawhide' during the Fedora Linux 37 development cycle. Changing version to 37. |