Bug 976368

Summary: glibc: printf() not properly handling pthread cancelling
Product: Red Hat Enterprise Linux 8
Reporter: Nils Olav Selåsdal <nos>
Component: glibc
Assignee: glibc team <glibc-bugzilla>
Status: CLOSED UPSTREAM
QA Contact: qe-baseos-tools-bugs
Severity: medium
Priority: unspecified
Version: 8.2
CC: ashankar, codonell, dj, fweimer, mnewsome, nos, pfrankli
Target Milestone: rc
Keywords: Triaged
Target Release: 8.0
Hardware: i686
OS: Linux
Last Closed: 2020-06-08 14:04:15 UTC
Type: Bug

Description Nils Olav Selåsdal 2013-06-20 12:56:46 UTC
Description of problem:

It seems printf() doesn't properly handle pthread cancellation requests.

According to http://pubs.opengroup.org/onlinepubs/9699919799//functions/V2_chap02.html printf() is either a cancellation point or is not allowed to be one; in either case it should not leave stdout in an inconsistent state.

Version-Release number of selected component (if applicable):
glibc-2.12-1.80.el6_3.7.i686

Steps to Reproduce:
The following program, compiled with g++ t.cpp -pthread
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

void *my_routine(void *arg) {
  int i;
  for (i = 0; i < 200; i++) {
    printf("%d\n", i);
  }
  return NULL;
}

int main(void) {
  pthread_t thread;
  if (pthread_create(&thread, NULL, my_routine, NULL)) {
    fprintf(stderr, "Cannot create pthread\n");
    return 1;
  }
  usleep(0);
  pthread_cancel(thread);
  pthread_join(thread, NULL);
  //fflush(stdout);
  sleep(1);
  return 0;
}

will occasionally print a number twice, suggesting that printf() has been cancelled in mid operation - not properly cleaning up its internals, e.g. as in this output:

$ ./a.out 
0
1
2
3
4
5
6
7
8
9
10
11
12
13
13

Comment 2 Carlos O'Donell 2013-06-20 14:11:42 UTC
(In reply to Nils Olav Selåsdal from comment #0)
> Description of problem:
> 
> It seems printf() doesn't properly handle pthread cancellation requests.

Agreed, it looks like the stream state isn't fully consistent. It would appear that the write completes, but the cancellation is acted upon before the write can be recorded as complete. Then, when we flush the stream before thread exit, we flush the output a second time (recording it correctly this time). Thus I suspect it's always a double flush at the end of the list of numbers.

The immediate workaround is to disable cancellation around printf.
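
A minimal sketch of that workaround, assuming a small wrapper is acceptable at each call site. The name printf_nocancel is hypothetical and not part of glibc; the wrapper relies only on the POSIX calls pthread_setcancelstate() and pthread_testcancel().

```c
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not a glibc API): run printf with cancellation
   disabled so a pending request cannot be acted upon inside stdio. */
static int printf_nocancel(const char *fmt, ...)
{
    int old_state;
    int ignored;
    int n;
    va_list ap;

    /* Hold back any cancellation request while stdio is running. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);

    va_start(ap, fmt);
    n = vprintf(fmt, ap);
    va_end(ap);

    /* Restore whatever cancelability state the caller had. */
    pthread_setcancelstate(old_state, &ignored);

    /* Explicit cancellation point: a request that arrived during the
       call is acted upon here, after vprintf has returned. */
    pthread_testcancel();
    return n;
}
```

In the reproducer, calling printf_nocancel("%d\n", i) in place of printf("%d\n", i) keeps the thread cancellable, but only between calls into stdio.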

> will occasionally print a number twice, suggesting that printf() has been
> cancelled in mid operation - not properly cleaning up its internals, e.g. as
> in this output:

That is not correct. Asynchronous cancellation is not the default state for glibc. Therefore printf can't be interrupted mid-operation. However, printf must ensure that its internal state is consistent *before* calling any other operation that may be a cancellation point (since that would cause the cancellation to trigger).
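
To illustrate deferred cancellation in isolation, here is a small sketch that is independent of stdio. The names worker and demo_deferred_cancel, and the three flags, are made up for this example. The thread is cancelled before it is allowed to proceed, yet it keeps running until it reaches an explicit cancellation point.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int go;            /* set by the caller after pthread_cancel() */
static atomic_int before_point;  /* set just before the cancellation point */
static atomic_int after_point;   /* set only if the thread gets past it */

static void *worker(void *arg)
{
    (void)arg;
    /* Busy-wait without calling any cancellation point: the pending
       request cannot be acted upon in this loop. */
    while (!atomic_load(&go))
        ;
    atomic_store(&before_point, 1);
    pthread_testcancel();          /* the request is acted upon here */
    atomic_store(&after_point, 1); /* not reached when cancelled */
    return NULL;
}

/* Returns 1 if the thread was cancelled exactly at pthread_testcancel(),
   0 if not, and -1 if the thread could not be created. */
static int demo_deferred_cancel(void)
{
    pthread_t t;
    void *result = NULL;

    atomic_store(&go, 0);
    atomic_store(&before_point, 0);
    atomic_store(&after_point, 0);

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    pthread_cancel(t);     /* request is now pending */
    atomic_store(&go, 1);  /* only now let the worker leave its loop */
    pthread_join(t, &result);

    return result == PTHREAD_CANCELED
        && atomic_load(&before_point) == 1
        && atomic_load(&after_point) == 0;
}
```

The same rule applies inside printf: the request can only take effect where printf itself calls into a cancellation point, which is why its state has to be consistent before that call.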

How many times do you have to run this to see the double output? A hundred? A million?

Comment 3 Nils Olav Selåsdal 2013-06-20 14:34:45 UTC
> How many times do you have to run this to see the double output? A hundred?
> A million?

This happened 3 times within 20 runs now.

Comment 7 Carlos O'Donell 2015-01-14 22:06:53 UTC
It is not likely we will fix this for RHEL 6 given the rework required to make IO streams completely safe to cancel. The best workaround I can recommend is either not to do IO or to expect that stdio may be unusable afterwards.

I'm moving the bug to RHEL 7 because we want to track this, but the work is going to be done upstream first after the fixes for cancellation are done. The RHEL 6 cancellation implementation is fundamentally flawed and a rewrite was required to make it work correctly and conform to POSIX.

Comment 11 Carlos O'Donell 2019-06-07 19:47:53 UTC
Given that we don't have a solution for this issue yet, and RHEL 7 is entering Maintenance Phase 1, I'm going to move this to RHEL 8 which is tracking closer to upstream glibc. We can continue to monitor upstream to see if we will get fixes for cancellation point issues.

Comment 12 Florian Weimer 2020-06-08 14:04:15 UTC
Carlos and I looked at the libio implementation (behind printf), and we believe that this issue will be addressed once this upstream bug has been fixed:

https://sourceware.org/bugzilla/show_bug.cgi?id=12683

The libio code assumes the intended new behavior: If cancellation is acted upon, no side effect from I/O has happened, so the buffer pointer update is not needed.

In the current cancellation implementation, the libio code has no way of knowing whether the side effect happened or not when cancellation is acted upon. There is no way to leave the buffer pointers in a state that is consistent with both cases (I/O or no I/O).
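
The dilemma can be sketched with a toy model. This is an illustration only: toy_stream and the toy_* functions are hypothetical and do not mirror glibc's libio code. A flush is "cancelled" inside the write call, and the io_happened argument stands for the fact the stream code cannot observe. Whichever order the bookkeeping uses, one of the two cases goes wrong.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only; not glibc's libio data structures. */
struct toy_stream {
    char buf[32];     /* pending, unflushed bytes */
    size_t used;      /* number of pending bytes in buf */
    char sink[64];    /* stands in for the output file */
    size_t sink_len;
};

enum toy_order { UPDATE_AFTER_WRITE, UPDATE_BEFORE_WRITE };

/* Simulated write: append bytes to the sink, truncating if full. */
static void toy_write(struct toy_stream *s, const char *data, size_t len)
{
    if (len > sizeof s->sink - s->sink_len)
        len = sizeof s->sink - s->sink_len;
    memcpy(s->sink + s->sink_len, data, len);
    s->sink_len += len;
}

/* A flush that is cancelled inside the write call. Nothing after the
   write runs, so with UPDATE_AFTER_WRITE the count keeps its old value. */
static void toy_flush_cancelled(struct toy_stream *s, enum toy_order order,
                                int io_happened)
{
    size_t len = s->used;
    if (order == UPDATE_BEFORE_WRITE)
        s->used = 0;
    if (io_happened)
        toy_write(s, s->buf, len);
    /* cancellation acted upon here */
}

/* Ordinary flush, as performed later at exit. */
static void toy_flush(struct toy_stream *s)
{
    toy_write(s, s->buf, s->used);
    s->used = 0;
}

/* Buffer "13\n", run a cancelled flush, then the exit flush. Copies the
   final output into out (out_size must be at least 1), returns its length. */
static size_t toy_scenario(enum toy_order order, int io_happened,
                           char *out, size_t out_size)
{
    struct toy_stream s;
    size_t n;

    memset(&s, 0, sizeof s);
    memcpy(s.buf, "13\n", 3);
    s.used = 3;
    toy_flush_cancelled(&s, order, io_happened);
    toy_flush(&s);

    n = s.sink_len < out_size - 1 ? s.sink_len : out_size - 1;
    memcpy(out, s.sink, n);
    out[n] = '\0';
    return n;
}
```

In this model, updating after the write duplicates "13" when the I/O did happen, which matches the output in the description, while updating before the write loses "13" when the I/O did not happen.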