Created attachment 962749
Test program describing the bug

Description of problem:
There is a race condition in qemuGetProcessInfo() (qemu_driver.c). It occurs when the /proc file for the qemu process can be opened (fopen() succeeds), but the file is gone by the time it is read (fscanf() fails). The two failure modes are handled inconsistently: a failed fopen() is tolerated, but a failed fscanf() after a successful fopen() is reported as an error.

Version-Release number of selected component (if applicable):
1.2.2

How reproducible:
Occasionally. Has been observed in some OpenStack CI builds.

Steps to Reproduce:
Can be simulated with the attached test program. The steps are:
1. Create a subprocess.
2. Open /proc/{pid}/stat.
3. Wait for subprocess to exit.
4. Try to read from the descriptor opened in step #2.

Actual results:
qemuGetProcessInfo() returns -1 and sets errno = -EINVAL.

Expected results:
qemuGetProcessInfo() should return 0 and set output arguments to 0, as it does when fopen() fails.

Additional info:
Logs from OpenStack CI builds:
http://logstash.openstack.org/#eyJmaWVsZHMiOiBbXSwgInNlYXJjaCI6ICJtZXNzYWdlOlwib3BlcmF0aW9uIGZhaWxlZDogY2Fubm90IHJlYWQgY3B1dGltZSBmb3IgZG9tYWluXCIgQU5EIHRhZ3M6XCJzY3JlZW4tbi1jcHUudHh0XCJcbiIsICJ0aW1lZnJhbWUiOiAiODY0MDAwIiwgImdyYXBobW9kZSI6ICJjb3VudCIsICJvZmZzZXQiOiAwfQ==

Sample libvirt log from one of the failed jobs:
http://logs.openstack.org/17/135917/3/check/gate-tempest-dsvm-neutron-src-python-neutronclient/1880f95/logs/libvirt/libvirtd.txt.gz#_2014-11-27_06_46_39_818
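The four reproduction steps can be sketched roughly as follows. This is not the attached test program, only a minimal illustration written for this report; the function name read_stat_after_exit() and the pipe used to keep the child alive are assumptions made for the example. On Linux the read is expected to fail once the child has been reaped, but the exact error may vary.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the fscanf() result for a read attempted
 * after the child is gone, or -2 if the setup itself failed. */
int read_stat_after_exit(void)
{
    int gate[2];
    char path[64];
    FILE *fp;
    pid_t pid;
    int reported_pid = 0;
    int matched;

    if (pipe(gate) < 0)
        return -2;

    /* Step 1: create a subprocess that stays alive until released. */
    pid = fork();
    if (pid < 0) {
        close(gate[0]);
        close(gate[1]);
        return -2;
    }
    if (pid == 0) {
        char c;
        close(gate[1]);
        /* Blocks until the parent closes its write end of the pipe. */
        if (read(gate[0], &c, 1) < 0)
            _exit(1);
        _exit(0);
    }
    close(gate[0]);

    /* Step 2: open /proc/{pid}/stat while the child still exists. */
    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");

    /* Step 3: release the child and wait for it to exit. */
    close(gate[1]);
    waitpid(pid, NULL, 0);

    if (!fp)
        return -2;

    /* Step 4: try to read from the stream opened in step 2. */
    matched = fscanf(fp, "%d", &reported_pid);
    fclose(fp);
    return matched;
}
```

A caller can compare the return value against 1 (the number of fields requested): any other value means the read did not succeed even though fopen() did, which is the window the race falls into.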
Created attachment 962752
Proposed patch
Please send the patch upstream to libvir-list. Patches in attachments are harder to apply and get fewer eyes looking at them, so they tend to be forgotten in comparison to patches directly on the mailing list.
Patch sent to mailing list, as requested.
Fixed in git:

commit ff018e686a8a412255bc34d3dc558a1bcf74fac5
Author: Eduardo Costa <eduardobmc>
Date:   Mon Dec 1 18:24:20 2014 -0200

    Fix race condition in qemuGetProcessInfo

    There is a race condition between the fopen and fscanf calls
    in qemuGetProcessInfo. If fopen succeeds, there is a small
    possibility that the file no longer exists before reading from it.
    Now, if either fopen or fscanf calls fail, the function will behave
    just as only fopen had failed.

    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1169055

    Signed-off-by: Eric Blake <eblake>
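For readers who do not have the tree at hand, the behavior described in the commit message can be illustrated with a simplified sketch. This is not the libvirt code: get_process_info() and its parameters are hypothetical, and only the utime and stime fields of /proc/{pid}/stat are parsed. The point is that a failed fopen() and a failed fscanf() take the same path, zeroing the outputs and returning 0.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-in for the fixed logic: treat a failed
 * fopen() and a failed fscanf() identically. Values are in clock ticks. */
int get_process_info(pid_t pid,
                     unsigned long long *utime,
                     unsigned long long *stime)
{
    char path[64];
    unsigned long long usertime = 0, systime = 0;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");

    /* Skip fields 1-13 of /proc/{pid}/stat, then read utime and stime.
     * Assumes the command name contains no ')' character. */
    if (!fp ||
        fscanf(fp,
               "%*d (%*[^)]) %*c %*d %*d %*d %*d %*d "
               "%*u %*lu %*lu %*lu %*lu %llu %llu",
               &usertime, &systime) != 2) {
        /* The process may have gone away at either point: report zeros. */
        usertime = 0;
        systime = 0;
    }

    if (fp)
        fclose(fp);

    *utime = usertime;
    *stime = systime;
    return 0;
}
```

With this shape, a caller polling a process that is shutting down sees zeroed statistics rather than an error, regardless of whether the process vanished before or after the open.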