User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/113.0
Build Identifier:

Working properly on Fedora 37, but after upgrade to 38, running the fahclient init script causes:

CommandLine Error: Option 'use-dbg-addr' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted (core dumped)

This appears to be a bug in LLVM 16.0.0 that's fixed in 16.0.1. It's unknown what other applications are also broken by this bug.

Reproducible: Always

Steps to Reproduce:
1. Install Fedora 38
2. Install Folding@Home client RPM from https://foldingathome.org/start-folding/?lng=en-US
3. Start fahclient service with
   service fahclient start
   or by running
   /usr/bin/FAHClient /etc/fahclient/config.xml --run-as fahclient --pid-file=/var/run/fahclient.pid

Actual Results:
CommandLine Error: Option 'use-dbg-addr' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
Aborted (core dumped)

strace excerpt:
openat(AT_FDCWD, "/lib64/libSPIRV-Tools-opt.so", O_RDONLY|O_CLOEXEC) = 4
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
newfstatat(4, "", {st_mode=S_IFREG|0755, st_size=2110640, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 2068536, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 4, 0) = 0x7fbe6c406000
mmap(0x7fbe6c45b000, 1400832, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x55000) = 0x7fbe6c45b000
mmap(0x7fbe6c5b1000, 270336, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1ab000) = 0x7fbe6c5b1000
mmap(0x7fbe6c5f3000, 53248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 4, 0x1ec000) = 0x7fbe6c5f3000
close(4) = 0
mprotect(0x7fbe6c842000, 53248, PROT_READ) = 0
mprotect(0x7fbe6c9c4000, 241664, PROT_READ) = 0
mprotect(0x7fbe6c5f3000, 49152, PROT_READ) = 0
mprotect(0x7fbe6c8a7000, 16384, PROT_READ) = 0
mprotect(0x7fbe6cd6a000, 8192, PROT_READ) = 0
mprotect(0x7fbe6cd7a000, 4096, PROT_READ) = 0
mprotect(0x7fbe6cd8f000, 4096, PROT_READ) = 0
mprotect(0x7fbe6cec4000, 4096, PROT_READ) = 0
mprotect(0x7fbe5c65d000, 700416, PROT_READ) = 0
mprotect(0x7fbe6ceaa000, 4096, PROT_READ) = 0
mprotect(0x7fbe635bc000, 7102464, PROT_READ) = 0
mprotect(0x7fbe6cdb9000, 8192, PROT_READ) = 0
mprotect(0x7fbe6cdd1000, 4096, PROT_READ) = 0
mprotect(0x7fbe67cce000, 3153920, PROT_READ) = 0
mprotect(0x7fbe6cdee000, 4096, PROT_READ) = 0
mprotect(0x7fbe6cca8000, 188416, PROT_READ) = 0
futex(0x7fbe6c8506fc, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7fbe6c850708, FUTEX_WAKE_PRIVATE, 2147483647) = 0
brk(0x241a000) = 0x241a000
brk(0x243b000) = 0x243b000
lseek(2, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
newfstatat(2, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x2), ...}, AT_EMPTY_PATH) = 0
brk(0x245e000) = 0x245e000
write(2, ": CommandLine Error: Option '", 29: CommandLine Error: Option ') = 29
write(2, "use-dbg-addr", 12use-dbg-addr) = 12
write(2, "' registered more than once!\n", 29' registered more than once!
) = 29
write(2, "LLVM ERROR: inconsistency in reg"..., 60LLVM ERROR: inconsistency in registered CommandLine options
) = 60
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid() = 2120428
getpid() = 2120428
tgkill(2120428, 2120428, SIGABRT) = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=2120428, si_uid=0} ---
+++ killed by SIGABRT (core dumped) +++
Aborted (core dumped)

Expected Results:
The init script should start normally

I was able to fix this, but the exact required steps, cause and fix are not clear.

1. Built LLVM 16.0.1 from Rawhide srpm, but there were four minor tests that failed on my system and I had to use --nocheck to bypass them
2. Installed newly built LLVM 16.0.1 rpms
3. Built spirv-llvm-translator from Rawhide srpm, editing SOURCES/0001-Fix-standalone-builds-with-LLVM_LINK_LLVM_DYLIB-ON.patch as described at https://www.linuxquestions.org/questions/slackware-14/llvm-16-and-static-libraries-4175723703/#post6422334
4. Installed newly built spirv-llvm-translator rpm
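
For readers unfamiliar with this class of error, the message reads like a process-wide option table that refuses a second registration of the same name. The toy model below only illustrates that failure mode. It is not LLVM's code: `OptionRegistry` and the owner labels are hypothetical, and the idea that two copies of the same code ended up in one process is an assumption suggested by the name of the patch edited above, not something this report establishes.

```python
class OptionRegistry:
    """Toy stand-in for a process-wide command-line option table."""

    def __init__(self):
        self._options = {}

    def register(self, name, owner):
        # A second registration of the same name is treated as fatal,
        # mirroring the wording of the error message in this report.
        if name in self._options:
            raise RuntimeError(
                f"Option '{name}' registered more than once! "
                f"(first by {self._options[name]}, again by {owner})"
            )
        self._options[name] = owner


registry = OptionRegistry()
registry.register("use-dbg-addr", owner="first copy of the code")
try:
    # Hypothetical second copy of the same code loaded into the process.
    registry.register("use-dbg-addr", owner="second copy of the code")
except RuntimeError as err:
    print(f"CommandLine Error: {err}")
```

In this sketch the failure depends only on what gets loaded into the process, not on what the program does afterwards, which would be consistent with the crash appearing at startup in the strace excerpt.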
@bjackson0971 , if this has been fixed with LLVM 16.0.1, could you check if the following update fixes the issue for you, please? https://bodhi.fedoraproject.org/updates/FEDORA-2023-36b95f852a
(In reply to Tulio Magno Quites Machado Filho from comment #1)
> @bjackson0971 , if this has been fixed with LLVM 16.0.1, could you
> check if the following update fixes the issue for you, please?
> https://bodhi.fedoraproject.org/updates/FEDORA-2023-36b95f852a

The crash still happens with this llvm and llvm-libs 16.0.1-1.fc38 build and the stock spirv-llvm-translator package. I have to also install my patched spirv-llvm-translator srpm build to fix the crash.

I also tried downgrading llvm and llvm-libs to 16.0.0-2.fc38 and kept my patched spirv-llvm-translator, and that also fixes it. It appears the problem is actually in the translator package.
@bjackson0971 , Could you confirm which version of spirv-llvm-translator is installed when the issue happens, please?
(In reply to Tulio Magno Quites Machado Filho from comment #3)
> @bjackson0971 , Could you confirm which version of
> spirv-llvm-translator is installed when the issue happens, please?

Version spirv-llvm-translator-16.0.0-1.fc38.x86_64 is the only stock version available for Fedora 38 and the crash happens with it installed. My patched srpm build is the only fix I've found.
FEDORA-2023-36b95f852a has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-36b95f852a
I've updated the spirv-llvm-translator downstream patch, which should, together with llvm 16.0.1, address the issue.
Confirmed that spirv-llvm-translator-16.0.0-2.fc38 fixes the crash with both llvm-16.0.0-2.fc38 and llvm-16.0.1-1.fc38.
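
The combinations reported in this bug so far can be tabulated to see which package decides the outcome. The sketch below just encodes the results stated in the comments above; the labels "patched local build" and "16.0.1 (local build)" are shorthand for the reporter's own srpm builds, and the first row assumes the original report used the stock translator, which the reporter said is the only stock version for Fedora 38.

```python
from collections import defaultdict

# Each entry: (llvm build, spirv-llvm-translator build, crashed?)
observations = [
    ("16.0.0", "16.0.0-1.fc38", True),                 # original report
    ("16.0.1-1.fc38", "16.0.0-1.fc38", True),          # update + stock translator
    ("16.0.1 (local build)", "patched local build", False),
    ("16.0.0-2.fc38", "patched local build", False),
    ("16.0.0-2.fc38", "16.0.0-2.fc38", False),
    ("16.0.1-1.fc38", "16.0.0-2.fc38", False),
]


def outcomes_by(index):
    """Group the set of observed outcomes by one column of the table."""
    grouped = defaultdict(set)
    for row in observations:
        grouped[row[index]].add(row[2])
    return dict(grouped)


# A package "decides" the outcome if each of its builds always gives
# the same result, whatever the other package was.
translator_decides = all(len(v) == 1 for v in outcomes_by(1).values())
llvm_decides = all(len(v) == 1 for v in outcomes_by(0).values())

print("translator build decides outcome:", translator_decides)  # True
print("llvm build decides outcome:", llvm_decides)              # False
```

With these six data points, llvm 16.0.1-1.fc38 appears on both sides (crash and no crash), while every translator build gives a single outcome, which matches the reporter's conclusion that the problem is in the translator package.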
FEDORA-2023-36b95f852a has been pushed to the Fedora 38 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2023-36b95f852a`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2023-36b95f852a

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
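
After upgrading, `rpm -q spirv-llvm-translator` prints the installed package as name-version-release.arch, which can be compared against the build confirmed to work earlier in this bug (16.0.0-2.fc38). The helper below is a minimal sketch of that check on the printed string. `parse_nvra` and `has_fixed_build` are hypothetical names, the comparison is a plain numeric one rather than RPM's own version-comparison algorithm, and treating any later build as fixed is an assumption.

```python
def parse_nvra(package):
    """Split 'name-version-release.arch' into its four parts."""
    nvr, arch = package.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


# First build reported as fixing the crash in this bug: 16.0.0-2.fc38
FIXED = ((16, 0, 0), 2)


def has_fixed_build(package):
    """Return True if the string names spirv-llvm-translator >= 16.0.0-2."""
    name, version, release, _arch = parse_nvra(package)
    if name != "spirv-llvm-translator":
        return False
    version_key = tuple(int(part) for part in version.split("."))
    release_number = int(release.split(".", 1)[0])
    return (version_key, release_number) >= FIXED


for pkg in (
    "spirv-llvm-translator-16.0.0-1.fc38.x86_64",
    "spirv-llvm-translator-16.0.0-2.fc38.x86_64",
):
    print(pkg, "->", "fixed build" if has_fixed_build(pkg) else "affected build")
```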
This seems to fix my pyopencl program ( https://github.com/ali1234/vhs-teletext ) - but the performance is AWFUL - at least 3/4 times slower than f37.
(In reply to Dr. David Alan Gilbert from comment #9)
> This seems to fix my pyopencl program (
> https://github.com/ali1234/vhs-teletext ) - but the performance is AWFUL -
> at least 3/4 times slower than f37.

David, could you report this performance issue in a new bug, please?

We'll need some details, e.g.:
1. How can I reproduce this slowdown? i.e. which steps do I have to execute. A small reproducer is ideal.
2. If you can profile the code before and after, even better.
3. Details about the execution, e.g. processor, OS used before and after.
(In reply to Tulio Magno Quites Machado Filho from comment #10)
> (In reply to Dr. David Alan Gilbert from comment #9)
> > This seems to fix my pyopencl program (
> > https://github.com/ali1234/vhs-teletext ) - but the performance is AWFUL -
> > at least 3/4 times slower than f37.
>
> David, could you report this performance issue in a new bug, please?

Sure, will do - what component would you like it against?

> We'll need some details, e.g.:
> 1. How can I reproduce this slowdown? i.e. which steps do I have to execute.
> A small reproducer is ideal.

It's tricky, since I've only got the one OpenCL application I've been using regularly and have perf numbers for; it is open but you need a datafile to process with it.

> 2. If you can profile the code before and after, even better.

There's very little host CPU usage (before or after), so I assume it's one of:
a) The SPIR code generated (except that I tried forcing the old code in and that's still slow as far as I can tell)
b) The translation of the SPIR to the native Radeon
c) Something else in the environment (but I have tried downgrading the kernel to f37)

Tips on profiling of the GPU behaviour are welcome.

> 3. Details about the execution, e.g. processor, OS used before and after.

Sure.

Dave
(In reply to Dr. David Alan Gilbert from comment #11)
> Sure, will do - what component would you like it against?

LLVM is fine. We can change that later as we get more details.

> It's tricky, since I've only got the one OpenCL application I've been using
> regularly and have perf numbers for; it is open but you need a datafile to
> process with it.

No problem. All we need is to reproduce and debug the issue.
FEDORA-2023-36b95f852a has been pushed to the Fedora 38 stable repository.
If the problem still persists, please make note of it in this bug report.
(In reply to Tulio Magno Quites Machado Filho from comment #12)
> (In reply to Dr. David Alan Gilbert from comment #11)
> > Sure, will do - what component would you like it against?
>
> LLVM is fine. We can change that later as we get more details.
>
> > It's tricky, since I've only got the one OpenCL application I've been using
> > regularly and have perf numbers for; it is open but you need a datafile to
> > process with it.
>
> No problem. All we need is to reproduce and debug the issue.

This was a red herring; so it's actually all fine - sorry for the noise.
(The speed is data dependent, and I'd previously seen ranges of 700-1200 lps on this test. The recovered data on the day I upgraded to f38 triggered a case of ~200 lps; I'd never seen anything anywhere near that bad before. Bad luck that it happened the same day.)
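
As a quick sanity check on the figures quoted above, the slowdown implied by the usual 700-1200 lps range against the ~200 lps outlier works out as follows. This is only arithmetic on the numbers given in the comment; the variable names are illustrative and the ~200 figure is approximate.

```python
usual_lps = (700, 1200)  # range previously seen on this test
slow_lps = 200           # approximate rate seen on the day of the upgrade

# Slowdown factor at each end of the usual range
factors = tuple(rate / slow_lps for rate in usual_lps)
print(factors)  # (3.5, 6.0)
```

So a single unusually slow input is enough to look like a 3.5x to 6x regression, even though nothing changed in the toolchain.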
(In reply to Dr. David Alan Gilbert from comment #14)
> This was a red herring; so it's actually all fine - sorry for the noise.
> (The speed is data dependent, and I'd previously seen ranges of 700-1200 lps
> on this test. The recovered data on the day I upgraded to f38 triggered a
> case of ~200 lps; I'd never seen anything anywhere near that bad before.
> Bad luck that it happened the same day.)

Great! Thanks for the update!