Description of problem:
I recompiled the kernel without NUMA support (while trying to debug another NUMA issue) and ran 'numactl --hardware', which crashed with a segmentation fault.

The problem is that /sys/devices/system/node does not exist on non-NUMA systems, but numactl ignores the ENOENT error it receives when trying to access this path. The bug lies in the read_distance_table() function just above, at distance.c:52. In particular, this loop:

    61         for (nd = 0;; nd++) {
    62                 char fn[100];
    63                 FILE *dfh;
    64                 sprintf(fn, "/sys/devices/system/node/node%d/distance", nd);
    65                 dfh = fopen(fn, "r");
    66                 if (!dfh) {
    67                         if (errno == ENOENT)
    68                                 err = 0;
    69                         if (!err && nd<maxnode)
    70                                 continue;
    71                         else
    72                                 break;
    73                 }

There are no /sys/devices/system/node/node*/distance files on this system, so fopen (line 65) fails on every iteration until the loop finally breaks (line 72). The function then continues with:

    89         free(line);
    90         if (err) {
    91                 numa_warn(W_distance,
    92                           "Cannot parse distance information in sysfs: %s",
    93                           strerror(errno));
    94                 free(table);
    95                 return err;
    96         }

The problem is that err is 0 because line 68 above treats ENOENT as a non-error, so the function skips this warning and continues with:

    106         distance_table = table;
    107         return 0;
    108 }

And table is still set to 0x0 (the initial value):

    (gdb) p distance_table
    $4 = (int *) 0x0

This then leads to the segfault and crash here, at distance.c:117:

    return distance_table[a * distance_numnodes + b];

Why is ENOENT treated like a non-error?

    67                         if (errno == ENOENT)
    68                                 err = 0;

Version-Release number of selected component:
numactl-2.0.9-1.fc20

Additional info:
reporter:         libreport-2.2.0
backtrace_rating: 4
cmdline:          numactl --hardware
crash_function:   numa_distance
executable:       /usr/bin/numactl
kernel:           3.13.7-200.nonuma.fc20.x86_64
runlevel:         N 5
type:             CCpp
uid:              0

Truncated backtrace:
Thread no. 1 (3 frames)
 #0 numa_distance at distance.c:117
 #1 print_distances at numactl.c:201
 #2 hardware at numactl.c:294
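The failure mode described above (the loader returns 0 while distance_table is still NULL) can be illustrated with a small standalone sketch. This is not the numactl source and not the upstream fix: the names read_distance_table_sketch and numa_distance_sketch, the base-path parameter, and the simplified parsing are all assumptions made for illustration. The sketch still skips ENOENT so that sparse node numbering works, but it fails when no distance file was found at all, and the accessor refuses to index a NULL table.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the library globals named in the report. */
int *distance_table = NULL;
int distance_numnodes = 0;

/*
 * Simplified loader (illustration only). Probes <base>/node<N>/distance
 * for N in [0, maxnode). ENOENT on a single node is tolerated, but if no
 * file could be opened at all the function returns -1 instead of 0.
 */
int read_distance_table_sketch(const char *base, int maxnode)
{
    assert(base != NULL);

    int found = 0;
    int *table = calloc((size_t)maxnode * (size_t)maxnode, sizeof *table);
    if (!table)
        return -1;

    for (int nd = 0; nd < maxnode; nd++) {
        char fn[256];
        snprintf(fn, sizeof fn, "%s/node%d/distance", base, nd);

        FILE *dfh = fopen(fn, "r");
        if (!dfh) {
            if (errno == ENOENT)
                continue;           /* hole in the node numbering */
            free(table);
            return -1;              /* any other error is fatal */
        }

        /* Read up to maxnode whitespace-separated integers into row nd. */
        for (int i = 0; i < maxnode; i++) {
            if (fscanf(dfh, "%d", &table[nd * maxnode + i]) != 1)
                break;
        }
        fclose(dfh);
        found++;
    }

    if (found == 0) {               /* the case the reported loop misses */
        free(table);
        return -1;
    }

    free(distance_table);
    distance_table = table;
    distance_numnodes = maxnode;
    return 0;
}

/* Accessor that returns 0 ("unknown") instead of dereferencing NULL. */
int numa_distance_sketch(int a, int b)
{
    if (!distance_table)
        return 0;
    if (a < 0 || b < 0 || a >= distance_numnodes || b >= distance_numnodes)
        return 0;
    return distance_table[a * distance_numnodes + b];
}
```

Calling read_distance_table_sketch() with a base directory that does not exist returns -1 and leaves distance_table as NULL, and numa_distance_sketch() then returns 0 rather than crashing.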
Created attachment 878426 [details] File: backtrace
Created attachment 878427 [details] File: cgroup
Created attachment 878428 [details] File: core_backtrace
Created attachment 878429 [details] File: dso_list
Created attachment 878430 [details] File: environ
Created attachment 878431 [details] File: exploitable
Created attachment 878432 [details] File: limits
Created attachment 878433 [details] File: maps
Created attachment 878434 [details] File: open_fds
Created attachment 878435 [details] File: proc_pid_status
Created attachment 878436 [details] File: var_log_messages
A scratch build of the non-NUMA kernel is here (until koji cleans it up): http://koji.fedoraproject.org/koji/taskinfo?taskID=6670131
Created attachment 918201 [details] patch to check for NUMA A patch from upstream to check for NUMA and avoid a segfault on non-NUMA systems http://blog.gmane.org/gmane.linux.kernel.numa/month=20131123
numactl-2.0.9-2.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/numactl-2.0.9-2.fc20
Package numactl-2.0.9-2.fc20:
* should fix your issue,
* was pushed to the Fedora 20 testing repository,
* should be available at your local mirror within two days.

Update it with:
# su -c 'yum update --enablerepo=updates-testing numactl-2.0.9-2.fc20'
as soon as you are able to.

Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-9089/numactl-2.0.9-2.fc20
then log in and leave karma (feedback).
I tested again with a non-NUMA custom kernel and verified that the update no longer segfaults, but instead exits cleanly with a nice error message:

[root@localhost ~]# uname -r
3.15.8-200.nonuma.fc20.x86_64
[root@localhost ~]# rpm -q numactl
numactl-2.0.9-1.fc20.x86_64
[root@localhost ~]# numactl --hardware
available: 0 nodes ()
Segmentation fault
[root@localhost ~]# yum update numactl*
...
[root@localhost ~]# rpm -q numactl
numactl-2.0.9-2.fc20.x86_64
[root@localhost ~]# numactl --hardware
No NUMA available on this system
numactl-2.0.9-2.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.