Description Richard W.M. Jones
2012-02-15 09:21:31 UTC
+++ This bug was initially created as a clone of Bug #790528 +++
So, this error seems to originate in the code that creates hard links for launching the appliance VM that libguestfs uses to do its job. The relevant code is here:
https://github.com/libguestfs/libguestfs/blob/master/src/appliance.c
The comments at the top make it sound as if this activity _should_ be thread/concurrency safe.
However, maybe not.
Guestfs is using a read lock on the checksum file to avoid walking on itself when creating the links. Oz did something similar to avoid having two concurrent Oz instances download the same ISO file into the same canonical location.
The issue with Oz, which may come into play here, is that we (Factory) are a single-process, multi-threaded application, and multiple threads of the same process seem to be able to acquire a lock on the same file at the same time without error.
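To illustrate the behaviour described above, here is a minimal sketch, assuming the lock in question is a POSIX fcntl-style record lock (these are owned by the process, not by the thread). It is not the libguestfs or Oz code; the function name and the temporary file are made up for the demonstration. Two threads each open the same file and request a non-blocking exclusive lock, and on Linux both requests are expected to succeed.

```python
import fcntl
import os
import tempfile
import threading


def demo_same_process_locks():
    """Return whether each of two threads obtained an exclusive lock on one file."""
    results = [None, None]
    # The barrier keeps both threads holding their lock at the same moment.
    barrier = threading.Barrier(2)

    with tempfile.NamedTemporaryFile() as tmp:

        def worker(index):
            # Each thread uses its own file descriptor for the same file.
            fd = os.open(tmp.name, os.O_RDWR)
            try:
                try:
                    # Non-blocking exclusive lock; raises OSError on conflict.
                    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    results[index] = True
                except OSError:
                    results[index] = False
                barrier.wait(timeout=5)
            finally:
                os.close(fd)

        threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    return results


if __name__ == "__main__":
    print(demo_same_process_locks())
```

If the two workers were separate processes instead of threads, the second exclusive request would be expected to fail, which is the protection the lock is presumably meant to give.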
So, perhaps we have two threads doing g.launch() at the same time. Both libguestfs instances try, and succeed, in acquiring a read lock on the checksum file (as they are both part of the same process). One creates the kernel.$PID link first, and the second one fails because the link is already there.
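The collision itself can be sketched as follows, assuming the link name is derived only from the process ID and that the link is a hard link, as mentioned at the top of this report. The directory layout and function names are hypothetical, and two sequential calls stand in for the two threads, since both threads see the same PID.

```python
import os
import tempfile


def link_for_current_process(cache_dir):
    """Create kernel.<PID> as a hard link to kernel inside cache_dir."""
    src = os.path.join(cache_dir, "kernel")
    # Every thread in a process gets the same value from os.getpid().
    dst = os.path.join(cache_dir, "kernel.%d" % os.getpid())
    os.link(src, dst)
    return dst


def demo_link_collision():
    """Return True if the second link attempt fails because the name exists."""
    with tempfile.TemporaryDirectory() as cache_dir:
        with open(os.path.join(cache_dir, "kernel"), "wb") as f:
            f.write(b"placeholder")
        link_for_current_process(cache_dir)      # "first thread" succeeds
        try:
            link_for_current_process(cache_dir)  # "second thread", same PID
        except FileExistsError:
            return True
        return False


if __name__ == "__main__":
    print(demo_link_collision())
```

A name that is unique per handle or per thread rather than per process would avoid this particular failure, though whether that fits the actual appliance code is a separate question.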
Am going to ask Mr. Jones to weigh in on my largely uninformed speculation above to see if there's any sense in it.
If this is indeed the case, we may be able to work around the race in the short term by launching and then stopping a dummy libguestfs instance during Factory startup, before we start the server and "go multithreaded".
Comment 1 Richard W.M. Jones
2012-02-15 19:34:17 UTC