Bug 225203 - Reading /sys/hypervisor/uuid in Dom0 hangs if XenStoreD isn't running
Summary: Reading /sys/hypervisor/uuid in Dom0 hangs if XenStoreD isn't running
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Markus Armbruster
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 224494
Reported: 2007-01-29 19:07 UTC by Daniel Berrangé
Modified: 2010-08-10 11:52 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-09-10 08:29:19 UTC
Target Upstream Version:
Embargoed:


Attachments
Crude but simple patch to avoid the hang (429 bytes, patch)
2007-02-01 09:20 UTC, Markus Armbruster

Description Daniel Berrangé 2007-01-29 19:07:40 UTC
Description of problem:
Attempting to read from /sys/hypervisor/uuid in Dom0 will hang indefinitely if
XenStored hasn't been launched.

Version-Release number of selected component (if applicable):
uname -r   2.6.18-1.2747.el5xen
kernel-xen-2.6.18-1.2747.el5

Confirmed same behaviour on 32 & 64 bit kernels.

How reproducible:
Always, provided XenD has *never* been started since boot

Steps to Reproduce:
1. chkconfig xend off
2. Reboot Dom0 into kernel-xen
3. # cat /sys/hypervisor/uuid
  
Actual results:
Hangs indefinitely

Expected results:
Prints 0000000000000000000000000000000000 (16 of them)

Additional info:
SysRq+t  shows the following trace:

Jan 29 07:07:47 dhcp-5-234 kernel: cat           D ffff88004c4c3d98     0   388    344                     (NOTLB)
Jan 29 07:07:47 dhcp-5-234 kernel:  ffff88004c4c3d98  ffff88004c4c3d38  0000014f00000002  0000000000000008
Jan 29 07:07:47 dhcp-5-234 kernel:  ffff88007406a820  ffffffff804b9a00  00000000000d403f  ffff88007406aa08
Jan 29 07:07:47 dhcp-5-234 kernel:  ffff88007406a820  ffffffffffffffff
Jan 29 07:07:47 dhcp-5-234 kernel: Call Trace:
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff80219798>] vsnprintf+0x33b/0x59e
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff803926bd>] read_reply+0x85/0xf5
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff80294dbe>] autoremove_wake_function+0x0/0x2e
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff8030512d>] inode_has_perm+0x56/0x63
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff803928a5>] xs_talkv+0xba/0x176
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff80392ab1>] xs_single+0x3e/0x43
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff80392d64>] xenbus_read+0x3c/0x53
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff8038f0b9>] uuid_show+0x1e/0x7c
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff802e8f4c>] sysfs_read_file+0xa5/0x13f
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff8020b3ae>] vfs_read+0xcb/0x171
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff802116b4>] sys_read+0x45/0x6e
Jan 29 07:07:47 dhcp-5-234 kernel:  [<ffffffff8025c65d>] tracesys+0xa7/0xb2

Comment 1 Markus Armbruster 2007-01-29 19:14:47 UTC
Possible work-around: read /sys/hypervisor/uuid only if xenstored is running,
check with kill -0 `cat /var/run/xenstore.pid` or equivalent.
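
The check above could be wrapped in a small shell function. This is only a sketch, not a tested fix: the function name read_xen_uuid and the overridable XENSTORE_PID_FILE and XEN_UUID_FILE variables are assumptions made for illustration; only the two default paths come from this report. It is also racy, since xenstored could die between the check and the read.

```shell
#!/bin/sh
# Hypothetical helper: read the Xen UUID only when xenstored appears alive.
# Both variables are illustrative overrides; the defaults are the paths
# named in this bug report.
XENSTORE_PID_FILE="${XENSTORE_PID_FILE:-/var/run/xenstore.pid}"
XEN_UUID_FILE="${XEN_UUID_FILE:-/sys/hypervisor/uuid}"

read_xen_uuid() {
    # No readable pid file: assume xenstored was never started.
    [ -r "$XENSTORE_PID_FILE" ] || return 1
    pid=$(cat "$XENSTORE_PID_FILE") || return 1
    [ -n "$pid" ] || return 1
    # kill -0 sends no signal; it only tests whether the process exists
    # (and that we may signal it, so run this as root).
    kill -0 "$pid" 2>/dev/null || return 1
    cat "$XEN_UUID_FILE"
}
```

A caller would treat a non-zero return as "UUID not available right now" rather than blocking on the read.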


Comment 2 Bret McMillan 2007-01-29 19:26:59 UTC
Ick.  At that point, I'd almost just go w/ using the bios stuff we need for
fully-virt to pick up UUID's.

I've got emails out on that to see if all we need is to have xenstore running
for that to show up in HAL... it may be a lot more robust to go off of that.

Comment 3 Bret McMillan 2007-01-29 19:29:13 UTC
Proposing this as a blocker for RHEL5 GA: either we fix this issue, or
bz #224494 becomes a RHEL5 blocker w/ some work-around.

Comment 4 Daniel Berrangé 2007-01-29 19:35:11 UTC
Aside from the option mentioned by Markus in comment #1, the other possible
workaround is to look for the UUID in the SMBIOS / dmidecode data first, only
trying the /sys/hypervisor/uuid file if SMBIOS data is not present. A bare-metal
or Dom0 kernel/host will almost always have SMBIOS data available, so by
preferring SMBIOS you should minimise the chances of hitting this bug.
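
A rough sketch of that lookup order, for illustration only: the function name get_system_uuid and the DMIDECODE and XEN_UUID_FILE override variables are made up here, and it assumes the installed dmidecode supports the "-s system-uuid" option (older releases may not) and is run as root.

```shell
#!/bin/sh
# Hypothetical helper: prefer the SMBIOS UUID, fall back to the Xen sysfs file.
# DMIDECODE and XEN_UUID_FILE are illustrative overrides, not existing settings.
DMIDECODE="${DMIDECODE:-dmidecode}"
XEN_UUID_FILE="${XEN_UUID_FILE:-/sys/hypervisor/uuid}"

get_system_uuid() {
    # Ask SMBIOS first; treat any failure as "no SMBIOS data".
    uuid=$("$DMIDECODE" -s system-uuid 2>/dev/null) || uuid=""
    if [ -n "$uuid" ]; then
        printf '%s\n' "$uuid"
        return 0
    fi
    # Only reached when SMBIOS gave nothing. This read can still hang in
    # Dom0 if xenstored is not running.
    [ -r "$XEN_UUID_FILE" ] && cat "$XEN_UUID_FILE"
}
```

The sysfs fallback keeps the original behaviour for guests without SMBIOS data, so the hang is made less likely rather than ruled out.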


Comment 5 Suzanne Logcher 2007-01-31 19:16:46 UTC
Bret is working on this fix in bug 224494 in rhn_register.
Moved this bugzilla to 5.1.
Note that depending on Bret's fix, this issue may be mute.

Comment 6 Markus Armbruster 2007-02-01 09:20:12 UTC
Created attachment 147079
Crude but simple patch to avoid the hang

The read hangs because communication with xenstored blocks.  Sending the
request can block (ring buffer filled up already), receiving the reply will
block.	

This is a crude but simple patch to fail the read with EAGAIN right away unless
xenstored is known to have started.  The read will still hang if xenstored
starts okay, but later dies for whatever reason.  Note that stopping xend does
not kill xenstored.  Death of xenstored loses all xenstore contents, which
makes Xen quite unhappy.
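
With a kernel that fails the read right away, a userspace caller could retry a few times instead of blocking. The following is a sketch under that assumption; the function name read_with_retry and its arguments are hypothetical, and on an unpatched kernel the first read would still hang.

```shell
#!/bin/sh
# Hypothetical helper: retry a read that fails fast (e.g. with EAGAIN).
# Arguments: FILE [TRIES] [DELAY_SECONDS]
read_with_retry() {
    file=$1
    tries=${2:-3}
    delay=${3:-1}
    while [ "$tries" -gt 0 ]; do
        # cat exits non-zero when the read fails; its error text is dropped.
        if content=$(cat "$file" 2>/dev/null); then
            printf '%s\n' "$content"
            return 0
        fi
        tries=$((tries - 1))
        # Wait before the next attempt, but not after the last one.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

For example, "read_with_retry /sys/hypervisor/uuid 5 2" would give xenstored up to roughly eight seconds to come up before giving up.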

Comment 7 RHEL Program Management 2007-04-25 21:56:17 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Markus Armbruster 2007-04-30 19:50:58 UTC
Comment #5 says `this issue may be mute'.  It blocks bz#224494, which is now
CLOSED.  This suggests to me that it is indeed moot (and doesn't block).  Is it
moot or not?

I'm not sure who can answer this, and I'm making this a NEEDINFO from Pete just
because bz#224494 is assigned to him.


Comment 9 RHEL Program Management 2007-09-07 19:56:12 UTC
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time.  This request will be
reviewed for a future Red Hat Enterprise Linux release.

Comment 10 Markus Armbruster 2007-09-10 08:29:19 UTC
This bug lingered in comment#8's NEEDINFO for months, until comment#9 switched
it back to ASSIGNED without answering my question.  Instead of setting NEEDINFO
again, I now simply assume that the issue is indeed moot and resolve it WONTFIX.
 If it is not, please reopen the bug.


Comment 11 Andrew Gormanly 2010-04-14 15:30:51 UTC
Not moot, have just hit this on a production server with patches up to 2010-03-17 10:25 GMT applied.

Comment 12 Umesh 2010-08-10 11:06:07 UTC
Hi Markus,

I am facing the same issue on RHEL 5.

It leaves unwanted processes running in the background, as follows.

  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  396 cat /sys/hypervisor/uuid
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
  668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print progname ":\n"?????   progname="";????       }????
 
-----------------
After some time the server hangs because of this.

Kindly let me know whether installing this patch will resolve the issue.
Please tell me how to install this patch on RHEL 5.

Thanks
Umesh

(In reply to comment #6)
> Created an attachment (id=147079)
> Crude but simple patch to avoid the hang
> The read hangs because communication with xenstored blocks.  Sending the
> request can block (ring buffer filled up already), receiving the reply will
> block.	
> This is a crude but simple patch to fail the read with EAGAIN right away unless
> xenstored is known to have started.  The read will still hang if xenstored
> starts okay, but later dies for whatever reason.  Note that stopping xend does
> not kill xenstored.  Death of xenstored loses all xenstore contents, which
> makes Xen quite unhappy.

Comment 13 Umesh 2010-08-10 11:36:21 UTC
(In reply to comment #12)
> Hi,
> I am facing the same issue on RHEL 5.
> It leaves unwanted processes running in the background, as follows.
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   396 cat /sys/hypervisor/uuid
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
>   668 awk -v progname=/etc/cron.hourly/mcelog.cron progname {?????   print
> progname ":\n"?????   progname="";????       }????
1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
 1036 /bin/bash /usr/bin/run-parts /etc/cron.hourly
1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond
 1588 crond

> ---------------------------------------------------

> After some time the server hangs because of these processes, and memory usage rises because of them.
> Kindly let me know whether installing this patch will resolve the issue.
My server is running in cluster mode.
> Please tell me how to install this patch on RHEL 5.

> Thanks

> Umesh

> (In reply to comment #6)
> > Created an attachment (id=147079)
> > Crude but simple patch to avoid the hang
> > The read hangs because communication with xenstored blocks.  Sending the
> > request can block (ring buffer filled up already), receiving the reply will
> > block.	
> > This is a crude but simple patch to fail the read with EAGAIN right away unless
> > xenstored is known to have started.  The read will still hang if xenstored
> > starts okay, but later dies for whatever reason.  Note that stopping xend does
> > not kill xenstored.  Death of xenstored loses all xenstore contents, which
> > makes Xen quite unhappy.

