203938 – yum-updatesd creates a stale yum PID lock and therefore hoggs yum usability

Bug 203938 - yum-updatesd creates a stale yum PID lock and therefore hoggs yum usability

Summary: yum-updatesd creates a stale yum PID lock and therefore hoggs yum usability

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	yum
Sub Component:
Version:	rawhide
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Jeremy Katz
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	FC6Blocker
TreeView+	depends on / blocked

Reported:	2006-08-24 16:41 UTC by Stewart Adam
Modified:	2018-04-11 08:58 UTC (History)
CC List:	6 users (show)
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2007-08-03 19:29:10 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
strace (14.65 KB, text/plain) 2007-07-06 15:28 UTC, Doug Ledford	no flags	Details
View All

Description Stewart Adam 2006-08-24 16:41:11 UTC

Description of problem:
the yum-updatesd service is very useful, but it seems to create a yum lock on my
system.

yum-updatesd does the refreshing properly, the applet notifies me about updates
when I login to Gnome, but the lock never seems to go away - My CPU usage isn't
high (maybe 5 - 15% average), so it's not updating the metadata or some other
activity, yet if I try to run
$   yum update
I get:
"Existing lock /var/run/yum.pid: another copy is running. Aborting."
The PID listed points to the seemingly stale yum-updatesd process.

Version-Release number of selected component (if applicable):
yum-2.9.5-1
yum-updatesd-2.9.5-1

How reproducible:
Sometimes

Steps to Reproduce:
1. Get yum-updatesd to make a yum lock. Maybe 'refresh' on the applet helps
start this?
2. When yum-updatesd makes the lock, run 'yum update'
3. When the lock message is received, wait a few minutes for it to end
4. Run 'yum update' again.
  
Actual results:
The lock's still there.

Expected results:
The stale (?) lock should be removed, or if the process is not stale, let the
user know that yum-updatesd is busy doing something.

Additional info:

Comment 1 Jeremy Katz 2006-09-05 22:51:54 UTC

It shouldn't be left around unless yum-updatesd is actually doing something --
can you verify that it hasn't crashed and if it's still running, try to strace it?

Comment 2 Stewart Adam 2006-09-06 21:27:27 UTC

How can I strace a running process?

Comment 3 Jeremy Katz 2006-09-11 19:28:24 UTC

ps axu | grep yum-updatesd
strace -p YOUR_PID_HERE

Comment 4 Stewart Adam 2006-09-11 21:24:08 UTC

Hmmm... I notices that previous version of yum would stall when updating the
metadata cache, and the CPU would stay at 100% - This seems to be fixed since
the Yum updates about a week ago, maybe it was a similar issue. I'll re-enable
yum-updatesd and see how it goes.

Comment 5 Stewart Adam 2006-09-18 23:05:00 UTC

I'm closing this, seeing as it hasn't happened since I reenabled yum-updatesd.

Comment 6 Stewart Adam 2006-09-20 21:07:30 UTC

OK, it's jsut happened. Yum is locked, so I take a look at the /var/run/yum.pid
file. It shows "2581". Si I run:
$  ps 2581
and it shows:

2581 ?        S      0:44 /usr/bin/python /usr/sbin/yum-updatesd

There's the yum-updatesd. my CPU's low, nothing is going on and my RAM usage
isn't high as it normally when I run Yum (I only have 256MB).

This is the trace:

poll([{fd=6, events=POLLIN}, {fd=5, events=POLLIN}], 2, 100) = 0
gettimeofday({1158786060, 854046}, NULL) = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x831ddf8, FUTEX_WAKE, 1)         = 0
futex(0x83a70b8, FUTEX_WAKE, 1)         = 0
gettimeofday({1158786060, 856111}, NULL) = 0
poll( <unfinished ...>
Process 2581 detached
I cut it, but that poll, gettimeofday, and many futexs repeat over and over with
changing timeofday numbers.

Comment 7 Jeremy Katz 2006-09-21 13:38:41 UTC

This seems to be happening for a few people, but I'm not reproducing it here.  

Can you think of anything weird that you might be doing?  I'll work on getting a
yum-updatesd that does a lot more logging around it's locking and unlocking once
I get into the office so that hopefully we can figure out the codepath that's
being hit.

Comment 8 Jeremy Katz 2006-09-21 14:42:04 UTC

Okay, http://people.redhat.com/~katzj/yumupd.py replaces
/usr/share/yum-cli/yumupd.py.  With it, you should get a lot of things in syslog
from yum-updatesd about and around the pid file locking.  If you can reproduce
with that running, please grep yum-updatesd out of /var/log/messages and send
all of the messages to me.

Comment 9 Stewart Adam 2006-09-21 20:55:05 UTC

I've found another issue. This one isn't the same, because in all the other
cases puplet wasn't running, but in this case I noticed it was, but no icon had
appeared in my taskbar. Yumex / yum was locked, and so I decided to run 'puplet'
and I got:
[user@FC6T2 ~]$ puplet
Traceback (most recent call last):
  File "/usr/bin/puplet", line 179, in updates_avail_handler
    num =
self._countUpdates(self.updatesObject.GetUpdateInfo(dbus_interface="edu.duke.linux.yum"))
  File "/usr/lib/python2.4/site-packages/dbus/proxies.py", line 102, in __call__
    reply_message = self._connection.send_with_reply_and_block(message, timeout)
  File "dbus_bindings.pyx", line 455, in
dbus_bindings.Connection.send_with_reply_and_block
dbus_bindings.DBusException: Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/dbus/service.py", line 304, in _message_cb
    retval = candidate_method(self, *args, **keywords)
  File "/usr/share/yum-cli/yumupd.py", line 542, in GetUpdateInfo
    syslog.syslog(syslog.LOG_INFO, "lock file removed after check")
  File "/usr/share/yum-cli/yumupd.py", line 488, in getUpdateInfo
    # if we can't get the lock, return what we have if we can
  File "/usr/share/yum-cli/yumupd.py", line 383, in populateUpdates
    pkgDict['type'] = md['type']
  File "/usr/share/yum-cli/yumupd.py", line 345, in populateUpdateMetadata
    syslog.syslog(syslog.LOG_INFO, "need to unlock pid file from refresh updates
end -- this one is missing")
  File "/usr/lib/python2.4/site-packages/yum/__init__.py", line 1498, in
getPackageObject
    raise Errors.DepError, 'Package tuple %s could not be found in packagesack'
% str(pkgtup)
DepError: Package tuple ('eclipse-ecj', 'i386', '1', '3.2.0', '6.fc6') could not
be found in packagesac

It seems that there was an error, but puplet didn't quit until I did ctrl + c
and killed it. So then I did 'killall puplet' which killed the stale one that
hadn't run from the terminal, the automaticly-starting one. Then Yumex was OK.

Anyhow, when it happens and puplet isn't around I'll post again.

Comment 10 Stewart Adam 2006-09-21 21:08:42 UTC

Edit: Sorry, my mistake - Yumex and yum WAS still locked after killing puplet...
But another issue nontheless...
Anyways, now that puplet is gone, and yum is still locked:
[user@FC6T2 ~]$ ps $(cat /var/run/yum.pid)
  PID TTY      STAT   TIME COMMAND
10310 ?        S      1:08 /usr/bin/python /usr/sbin/yum-updatesd

So it's used a little CPU time, it's done something at least, but still stale. I
restarted yum-updatesd right after changing the yum python file, which was all
of five minutes ago. There are no syslog messages.

Comment 11 Jeremy Katz 2006-09-22 02:52:09 UTC

That's actually _very_ helpful!

I've just uploaded a new file to the same URL -- can you try again and see if
you can reproduce again?  In either case, getting the yum-updatesd lines from
/var/log/messages would help

Comment 12 Stewart Adam 2006-09-22 20:50:34 UTC

I'll try again - Sometimes it takes a day, others a week so it might be a few
days before I post again. Anyways, just incase it helps, it's been working fine
and here's a sample from my "messages" log file:
Sep 22 14:08:05 FC6T2 yum-updatesd: unlock pid on close
Sep 22 15:08:03 FC6T2 yum-updatesd: locking pid file from refresh updates
Sep 22 15:08:04 FC6T2 yum-updatesd: need to unlock pid file from refresh updates
end -- this one is missing
[ repeats, every hour or so ]

Comment 13 Jeremy Katz 2006-09-25 21:05:03 UTC

Committed this to CVS -- if you hit the stale lock again, add a comment to reopen

Comment 14 Stewart Adam 2006-10-30 02:12:41 UTC

First time I can confirm with FC6. Logs show:
Oct 23 18:58:14 fc6 yum-updatesd: error getting update info: Cannot open/read
repomd.xml file for repository: core
I was fooling around with the metadata around that time...

Anyways, now yum update's not working again, the .run pid is pointing to the PID
of yum-updatesd, and it's showing nothing but a FUTEX WAIT again.

Comment 15 Doug Ledford 2007-05-23 15:02:57 UTC

This happens consistently for me with every reboot.  In fact, if I remove the
yum.pid file, I can do things with yum like run update, but the next time
yum-updatesd hits a refresh interval, it recreates the pid file, and I'm locked
again.  I have to keep removing the file.  This is made worse if you install the
yum-refresh-updatesd optional package, as every single yum command then requires
you to remove the pid file since it pings yum-updatesd and then yum-updatesd
recreates the pid file.  I straced the application and I'm getting the same
futex(), gettimeofday(), poll() loop that was previously posted.  This is
happening with the latest fc7test4 + updates as of this morning.

Comment 16 Victor Bogado 2007-06-27 13:36:31 UTC

I am having this problem also, my machine is with FC7. I usually have to stop
yum-updatesd to run yum. I also have configured (or tried) yum-updatesd to
download the updates before hand so when I do update I can save time, but this
does not seem to work correctly and last but not least the applet is not showing
up in my main user. 

I am not sure if those problems are indeed related to each other, I think is
relevant to state that this machine is updated from FC6. 

I did the strace and it showed the same loop.

Comment 17 Doug Ledford 2007-07-06 15:25:34 UTC

This still happens for me, but I've done a little tinkering and have a few
updates.  I updated to yum-3.2.1-1.fc7, which is current, before doing my tests.

So, the problem appears to be dbus related.  If I have the emit_by=dbus in the
config file, yum-updatesd goes into a permanent loop and locks the subsystem
from start up.  If I change that to be emit_by=email, then yum-updatesd starts
fine and runs fine, no locks, and just does it's job...up until I start puplet
and it has to handle incoming dbus messages from puplet.  So, basically, as long
as yum-updatesd doesn't need to communicate over dbus, I'm ok, but once it does,
it locks up and won't work any more.

So, in an effort to track this down, I launched yum-updatesd with emit_by=email,
waited for it to settle down after the initial startup was over, then attached
strace -f to the pid, then started puplet, then waited for the strace to devolve
into the continuous poll loop, then killed the strace and started looking
through it.  What I found is that the continuous poll loop and forever locked
yum process is the result of an unhandled exception.  Basically, there's a
*huge* amount of strace data from the thing checking the databases, then it
forks off a process, the process has an exception and exits, and yum-updatesd is
stuck in an infinite loop polling the pipe between what used to be the forked
process and itself.  Since there are a number of things that might cause this,
I'm including all the yum related rpms I have installed, my yum.conf and
yum-updatesd.conf, and the relevant part of the strace (the whole thing is huge,
if you need more, just let me know).

[root@firewall pluginconf.d]# rpm -qa | grep yum
yum-updatesd-3.2.1-1.fc7
yum-merge-conf-1.1.5-1.fc7
yum-utils-1.1.5-1.fc7
yum-3.2.1-1.fc7
yum-priorities-1.1.5-1.fc7
yum-kernel-module-1.1.5-1.fc7
yum-metadata-parser-1.1.0-2.fc7
yum-changelog-1.1.5-1.fc7
yum-protectbase-1.1.5-1.fc7
yum-allowdowngrade-1.1.5-1.fc7
yum-fedorakmod-1.1.5-1.fc7
[root@firewall pluginconf.d]# 

[root@firewall yum]# more yum-updatesd.conf 
[main]
# how often to check for new updates (in seconds)
run_interval = 3600
# how often to allow checking on request (in seconds)
updaterefresh = 600

# how to send notifications (valid: dbus, email, syslog)
emit_via = email

# automatically install updates
do_update = no
# automatically download updates
do_download = yes
# automatically download deps of updates
do_download_deps = yes

[root@firewall etc]# more yum.conf 
[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
exactarch=1
obsoletes=1
gpgcheck=1
plugins=1
metadata_expire=1800

# PUT YOUR REPOS HERE OR IN separate files named file.repo
# in /etc/yum.repos.d
[root@firewall etc]# 


I'll attach the strace as a file to this report.  I edited it down, but the
basic flow prior to the edit was that the forked process read a *BUNCH* of data
over a pipe, then it did what's in the strace.  I got the last few pipe reads in
the strace I'm attaching.

Comment 18 Doug Ledford 2007-07-06 15:27:06 UTC

Oh, and I guess I should mention that given that this process is blowing up,
dbus may be a red herring.  It may just be that starting puplet triggers an
attempt to update the cache, and it's the attempt to update the cache that's the
*real* culprit.  If my system wasn't already up to date, and it attempted to
update the cache on startup, that may cause it to blow up as well.

Comment 19 Doug Ledford 2007-07-06 15:28:17 UTC

Created attachment 158667 [details]
strace

the cleaned up strace

Comment 20 Doug Ledford 2007-07-06 16:15:19 UTC

Running this with the -f flag and triggering the update, here is the exception I
get:

Exception in thread UpdateDownloadThread:
Traceback (most recent call last):
  File "/usr/lib64/python2.5/threading.py", line 460, in __bootstrap
    self.run()
  File "/usr/share/yum-cli/yumupd.py", line 264, in run
    (result, msgs) = self.updd.buildTransaction()
  File "/usr/lib/python2.5/site-packages/yum/__init__.py", line 528, in
buildTransaction
    self.plugins.run('postresolve', rescode=rescode, restring=restring)
  File "/usr/lib/python2.5/site-packages/yum/plugins.py", line 164, in run
    func(conduitcls(self, self.base, conf, **kwargs))
  File "/usr/lib/yum-plugins/fedorakmod.py", line 251, in postresolve_hook
    avaModules = c.getRepos().getPackageSack().searchProvides("kernel-modules")
  File "/usr/lib/python2.5/site-packages/yum/packageSack.py", line 240, in
searchProvides
    return self._computeAggregateListResult("searchProvides", name)
  File "/usr/lib/python2.5/site-packages/yum/packageSack.py", line 369, in
_computeAggregateListResult
    sackResult = apply(method, args)
  File "/usr/lib/python2.5/site-packages/yum/sqlitesack.py", line 525, in
searchProvides
    return self.searchPrco(name, "provides")
  File "/usr/lib/python2.5/site-packages/yum/sqlitesack.py", line 441, in searchPrco
    cur = cache.cursor()
ProgrammingError: SQLite objects created in a thread can only be used in that
same thread.The object was created in thread id 46912496262992 and this is
thread id 1084229968

Comment 21 Doug Ledford 2007-07-06 17:06:02 UTC

Since it appears the fedorakmod is the plugin that was attempting to access the
SQLite object incorrectly, I removed it and tried things again.  With the
fedorakmod plugin removed, things work OK for me now.  I'll set things back to
emit_via=dbus and confirm it works ok that way too.

Comment 22 Doug Ledford 2007-07-06 17:08:06 UTC

Yep, dbus operation is fine now.

Comment 23 Doug Ledford 2007-07-09 18:55:13 UTC

Further update:  Now that yum-updatesd no longer leaves a stale lock behind due
to the download thread throwing an exception, I still don't have a properly
working yum-updatesd (it acts like it's checking things, but even though there
were updates today, it had not downloaded them as my config tells it to) and
puplet still doesn't give me any kind of status indicator in the notification area.

Comment 24 Seth Vidal 2007-08-03 19:29:10 UTC

Doug could you give yum and yum-updatesd in rawhide a try? I think that should
resolve this. Reopen if it doesn't.

Note You need to log in before you can comment on or make changes to this bug.