Bug 974679

Summary: Incorrect example for sched_getaffinity relying on EINVAL return.
Product: [Fedora] Fedora
Reporter: Carlos O'Donell <codonell>
Component: man-pages
Assignee: Peter Schiffer <pschiffe>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 18
CC: fweimer, pschiffe
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-07-04 14:16:35 UTC
Type: Bug
Bug Blocks: 974685

Description Carlos O'Donell 2013-06-14 19:29:18 UTC
Description of problem:
The man-pages package contains a patch to the sched_getaffinity manual page that includes an example program that incorrectly relies on EINVAL being returned by sched_getaffinity.

Version-Release number of selected component (if applicable):
man-pages-3.35-4.fc17.noarch

How reproducible:
`man 2 sched_getaffinity`

Actual results:
Man page has the example program:
~~~
EXAMPLE
          #define _GNU_SOURCE

          #include <sched.h>
          #include <stdio.h>
          #include <errno.h>

          int main(void)
          {
               cpu_set_t *mask;
               size_t size;
               int i;
               int nrcpus = 1024;

       realloc:
               mask = CPU_ALLOC(nrcpus);
               size = CPU_ALLOC_SIZE(nrcpus);
               CPU_ZERO_S(size, mask);
               if ( sched_getaffinity(0, size, mask) == -1 ) {
                       CPU_FREE(mask);
                       if (errno == EINVAL &&
                           nrcpus < (1024 << 8)) {
                              nrcpus = nrcpus << 2;
                              goto realloc;
                       }
                       perror("sched_getaffinity");
                       return -1;
               }

               for ( i = 0; i < nrcpus; i++ ) {
                       if ( CPU_ISSET_S(i, size, mask) ) {
                               printf("CPU %d is set\n", (i+1));
                       }
               }

               CPU_FREE(mask);

               return 0;
          }
~~~

Expected results:
Should have the example:
~~~
          #define _GNU_SOURCE

          #include <sched.h>
          #include <stdio.h>
          #include <errno.h>
          #include <unistd.h>

          int main(void)
          {
               cpu_set_t *mask;
               size_t size;
               int i;
               int nrcpus = sysconf(_SC_NPROCESSORS_ONLN);

               if (nrcpus == -1) {
                       perror("sysconf");
                       return -1;
               }

               mask = CPU_ALLOC(nrcpus);
               size = CPU_ALLOC_SIZE(nrcpus);
               CPU_ZERO_S(size, mask);
               if ( sched_getaffinity(0, size, mask) == -1 ) {
                       perror("sched_getaffinity");
                       return -1;
               }

               for ( i = 0; i < nrcpus; i++ ) {
                       if ( CPU_ISSET_S(i, size, mask) ) {
                               printf("CPU %d is set\n", (i+1));
                       }
               }

               CPU_FREE(mask);

               return 0;
          }
~~~

Additional info:
The only easy way to determine the number of online cpus is to use sysconf (_SC_NPROCESSORS_ONLN);

The GNU C Library does not guarantee that sched_getaffinity will return EINVAL if the mask is not the right size. Future versions of glibc will never return EINVAL, and will always copy only the cpus that were requested by the user (which has always been the intent of the library).

If you want the affinity for all of the cpus then you must use sysconf to determine how many there might be and then request that amount.

The value returned by sysconf is constant for the lifetime of the process.

The Linux kernel online cpu mask is also constant.

Notes:

This has been fixed in F18, F19, and rawhide.

The example could go upstream, but should not use the EINVAL method to determine the kernel's cpu mask size.

Comment 1 Fedora End Of Life 2013-07-04 02:43:59 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 2 Carlos O'Donell 2013-07-04 13:41:14 UTC
Peter,

Please note that I didn't file this against F18, because AFAICT in F18 the example patch was dropped and is no longer present.

Comment 3 Peter Schiffer 2013-07-04 13:51:09 UTC
Oh, I'm sorry, I've missed the note. So in that case, can I close this bug as CURRENTRELEASE?

Comment 4 Carlos O'Donell 2013-07-04 14:04:39 UTC
(In reply to Peter Schiffer from comment #3)
> Oh, I'm sorry, I've missed the note. So in that case, can I close this bug
> as CURRENTRELEASE?

Can you confirm that the example is missing from F19 and rawhide?

Then I think it's OK to close as CURRENTRELEASE.

Comment 5 Peter Schiffer 2013-07-04 14:16:35 UTC
Yes, the example is missing in F19 and rawhide. Closing as CURRENTRELEASE.

Comment 6 Florian Weimer 2015-05-18 18:30:57 UTC
(In reply to Carlos O'Donell from comment #0)
> The only easy way to determine the number of online cpus is to use sysconf
> (_SC_NPROCESSORS_ONLN);

Just noting in case someone else stumbles upon this bug report: This is not the number of CPUs that is relevant to the sched_getaffinity system call. I have a system that reports _NPROCESSORS_ONLN and _NPROCESSORS_CONF as 40 (and /proc/cpuinfo and /proc/stat match that), yet calling sched_getaffinity with small arguments fails:

[pid  3420] sched_getaffinity(0, 8, 0x146b010) = -1 EINVAL (Invalid argument)
[pid  3420] sched_getaffinity(0, 16, 0x146b010) = -1 EINVAL (Invalid argument)
[pid  3420] sched_getaffinity(0, 32, {ffffffffff, 0, 0, 0}) = 32

The kernel seems to operate with a nr_cpu_ids value of 240:

kernel: setup_percpu: NR_CPUS:5120 nr_cpumask_bits:240 nr_cpu_ids:240 nr_node_ids:2
kernel:         RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=240.
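
The buffer sizes in the trace line up with these numbers if one assumes the kernel rejects any buffer holding fewer bits than nr_cpu_ids. The sketch below only illustrates that arithmetic; mask_bytes and mask_large_enough are hypothetical helper names, and the comparison is an assumed model of the kernel-side check, not a copy of kernel code.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Bytes that CPU_ALLOC_SIZE reserves for a mask covering `cpus` CPUs. */
size_t mask_bytes(int cpus)
{
    return CPU_ALLOC_SIZE(cpus);
}

/* Hypothetical model (an assumption): the call is accepted only when the
 * buffer has at least one bit per possible CPU id. */
int mask_large_enough(size_t bytes, unsigned nr_cpu_ids)
{
    return bytes * 8 >= nr_cpu_ids;
}
```

A mask sized for the 40 CPUs that sysconf reports is 8 bytes (64 bits), which is fewer than 240 bits, so the first call fails. The 16-byte attempt (128 bits) fails for the same reason, and the 32-byte buffer (256 bits) is the first one that fits.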

nr_cpu_ids is not directly exposed to user space, I think, so the EINVAL behavior is pretty much required.
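
For illustration, a caller that relies on the EINVAL behavior seen in the trace might grow the mask until the call succeeds. This is a minimal sketch, not the man page example: get_affinity_growing is a hypothetical helper, and the starting guess of 64 CPUs and the upper bound of 1 << 18 are arbitrary assumptions.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not part of glibc): fetch the calling thread's
 * affinity mask, doubling the mask size while the call fails with EINVAL.
 * On success, returns 0 and hands the mask and its size in bytes to the
 * caller, who must release the mask with CPU_FREE(). On failure, returns
 * -1 with errno set. */
int get_affinity_growing(cpu_set_t **mask_out, size_t *size_out)
{
    int nrcpus = 64;            /* arbitrary starting guess (assumption) */
    const int limit = 1 << 18;  /* arbitrary upper bound (assumption) */

    while (nrcpus <= limit) {
        cpu_set_t *mask = CPU_ALLOC(nrcpus);
        size_t size = CPU_ALLOC_SIZE(nrcpus);

        if (mask == NULL) {
            errno = ENOMEM;
            return -1;
        }
        CPU_ZERO_S(size, mask);

        if (sched_getaffinity(0, size, mask) == 0) {
            *mask_out = mask;
            *size_out = size;
            return 0;
        }

        /* Save errno before freeing, then retry only on EINVAL. */
        int saved = errno;
        CPU_FREE(mask);
        if (saved != EINVAL) {
            errno = saved;
            return -1;
        }
        nrcpus *= 2;
    }

    errno = EINVAL;
    return -1;
}
```

After a successful call, CPU_COUNT_S(size, mask) gives the number of CPUs in the mask, and CPU_ISSET_S(i, size, mask) tests CPU i for i from 0 up to size * 8 - 1.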

Comment 8 Carlos O'Donell 2015-05-20 02:50:57 UTC
(In reply to Florian Weimer from comment #6)
> nr_cpu_ids is not directly exposed to user space, I think, so the EINVAL
> behavior is pretty much required.

No, EINVAL is not required. There is no race: we spoke with Kosaki Motohiro about this, and the kernel always allocates a static mask (nr_cpumask_bits), which is a kernel compile-time constant.
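
If the mask size is fixed as described, one way to observe it from user space is the return value of the raw system call, which is 32 in the trace from comment #6. The following is a minimal sketch under stated assumptions: it is Linux-specific, it assumes the raw system call returns the number of bytes copied (the glibc wrapper returns 0 on success instead), and kernel_cpumask_bytes is a hypothetical name. The 1024-word buffer is an arbitrary size chosen to be larger than any mask likely to be encountered.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for the affinity mask using a
 * deliberately oversized buffer and report how many bytes it copied.
 * Returns -1 with errno set on failure. */
long kernel_cpumask_bytes(void)
{
    /* Buffer length is a multiple of sizeof(unsigned long) by construction. */
    unsigned long buf[1024];

    return syscall(SYS_sched_getaffinity, 0, sizeof(buf), buf);
}
```

The returned byte count could then be passed as the size argument to later sched_getaffinity calls instead of a size derived from sysconf.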