Bug 253634

Summary: Install issues with rhel5 ga smp guest on rhel5.1 dom0
Product: Red Hat Enterprise Linux 5
Reporter: Gurhan Ozen <gozen>
Component: xorg-x11-fonts
Assignee: Søren Sandmann Pedersen <sandmann>
Status: CLOSED WONTFIX
QA Contact: desktop-bugs <desktop-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 5.1
CC: areis, clalance, gozen, jburke, kem, mrezanin, nitin.a.kamble, shaohui.zheng, wilfred.yu, xen-maint
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: i386
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-02 13:21:35 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 222082, 391221
Attachments:
Description                                                        Flags
/proc/cpuinfo output of the box                                    none
/proc/cpuinfo output on the box where this problem didn't occur.   none
dmesg output of the Dom0 host                                      none
xm dmesg on dom0                                                   none
dmesg from guest                                                   none
Requested /tmp/anaconda.log                                        none
Requested /tmp/syslog                                              none
Requested ps -ef output                                            none

Description Gurhan Ozen 2007-08-20 22:26:27 UTC
Description of problem:
   This has happened consistently to me and mjenner, so I am opening a bug about
it. When installing a RHEL5 GA SMP guest on the latest release candidate 5.1
dom0, the guest installation freezes at some random point while installing
packages. I get no feedback from the installation; it just freezes there.
Note that UP guests from the same distro installed just fine.
   There is no information in xend.log about this.

Version-Release number of selected component (if applicable):
On dom0:
# rpm -qa | egrep 'xen|libvirt'
libvirt-python-0.2.3-7.el5
kernel-xen-2.6.18-41.el5
xen-3.0.3-36.el5
xen-libs-3.0.3-36.el5
libvirt-0.2.3-7.el5

Use rhel5 ga i386 guest.

How reproducible:
Every time.

Steps to Reproduce:
1. virt-install -n rhel5ga_hvm -f /var/lib/xen/images/rhel5ga_hvm.img \
     -c /var/lib/xen/images/isos/rhel5ga_i386.iso -s 10 -r 1024 --hvm --vnc --vcpus=2
  
Actual results:
 Installation freezes while installing packages

Expected results:
  Should complete installation.

Additional info:

Comment 1 Stephen Tweedie 2007-08-21 11:43:38 UTC
What hardware/CPU are you using?  Thanks.


Comment 2 Gurhan Ozen 2007-08-21 14:47:49 UTC
(In reply to comment #1)
> What hardware/CPU are you using?  Thanks.
> 

Sorry for not giving that information in the bug report. This *only* happens
with an Intel i386 dom0 and i386 HVM SMP guests.

Even a UP guest in the same setup is fine.



Comment 3 Gurhan Ozen 2007-08-22 16:06:46 UTC
An update to this bug. If you install a UP guest and change the vcpus value in
its config file and restart, then it does work.
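
The config change described in this comment can be sketched as follows. This is only an illustration under stated assumptions: Xen guest config files use Python-style `key = value` assignments, and the helper name `set_vcpus` and the sample config text are hypothetical. The thread does not name the config file or its location, so the sketch works on the file's text rather than on a path.

```python
import re


def set_vcpus(config_text: str, vcpus: int) -> str:
    """Return config_text with its vcpus setting replaced, or appended if absent.

    Hypothetical helper: assumes a Xen guest config made of
    Python-style "key = value" lines.
    """
    if vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    # Match an uncommented "vcpus = ..." assignment on a line of its own.
    pattern = re.compile(r"^[ \t]*vcpus[ \t]*=.*$", re.MULTILINE)
    new_line = f"vcpus = {vcpus}"
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text, count=1)
    # No existing setting: append one, keeping a trailing newline.
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"
```

For the workflow in this comment, the guest would be installed with one vCPU, the config text rewritten with `set_vcpus(text, 2)`, and the guest restarted.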



Comment 6 Don Domingo 2007-08-22 23:11:18 UTC
proposed release note for RHEL5.1 release notes updates:
<quote>
(x86) When installing Red Hat Enterprise Linux 5 on a fully virtualized SMP
guest, the installation will freeze. This occurs when the host is running Red
Hat Enterprise Linux 5.1.

To avoid this, set the guest to use UP by editing the corresponding vcpus value
in the config file prior to installing Red Hat Enterprise Linux 5.
</quote>

can you provide me with the exact vcpus value (e.g. "vcpus=blah") and the exact
name/location of the config file that needs to be revised for the workaround?
thanks!

Comment 8 Stephen Tweedie 2007-08-23 20:41:35 UTC
Has this been observed on more than one box?  Can you please attach
/proc/cpuinfo for the affected system[s]?  Thanks.


Comment 9 Gurhan Ozen 2007-08-24 14:26:06 UTC
Created attachment 172419 [details]
/proc/cpuinfo output of the box

Comment 10 Gurhan Ozen 2007-08-24 14:27:43 UTC
Yes, mjenner has seen the same issue as well. I just attached the
/proc/cpuinfo output of the box.

Comment 11 Martin Jenner 2007-08-24 14:47:50 UTC
My observations/testing 

- This issue does not happen on AMD systems at all (none that I have tested on)
- I have only seen this on my Intel test systems, which are
 -  lagrande
 -  broadwater




Comment 13 Gurhan Ozen 2007-08-24 20:39:30 UTC
Another update. I can't reproduce this on an SGI Woodcrest box with 4 gig of
memory. I am attaching that particular box' cpuinfo . 

I think Clovertown-1 box, which is where i have been seeing this problem, is a
preproduction/prototype box. Martin, can you confirm this?

Comment 14 Gurhan Ozen 2007-08-24 20:40:35 UTC
Created attachment 172455 [details]
/proc/cpuinfo output on the box where this problem didn't occur.

Comment 15 Nitin Kamble 2007-08-24 22:54:52 UTC
Can you send the dmesg log of host and guest?

Did you see it on all Intel boxes, or just few?

Comment 16 Gurhan Ozen 2007-08-27 18:33:02 UTC
(In reply to comment #13)
> Another update. I can't reproduce this on an SGI Woodcrest box with 4 gig of
> memory. I am attaching that particular box' cpuinfo . 

  Sorry that should've read 16 gigs memory.
> 
> I think Clovertown-1 box, which is where i have been seeing this problem, is a
> preproduction/prototype box. Martin, can you confirm this?



Comment 17 Gurhan Ozen 2007-08-28 16:17:30 UTC
(In reply to comment #15)
> Can you send the dmesg log of host and guest?
> 
> Did you see it on all Intel boxes, or just few?

We saw this on at least a couple of boxes, but at least one other box worked.
The working one was a Woodcrest box. I'll be attaching the dmesg logs shortly.

Comment 18 Gurhan Ozen 2007-08-28 16:20:47 UTC
Created attachment 177001 [details]
dmesg output of the Dom0 host

Comment 19 Gurhan Ozen 2007-08-28 16:27:44 UTC
Created attachment 177021 [details]
xm dmesg on dom0

Comment 20 Gurhan Ozen 2007-08-28 16:31:51 UTC
Created attachment 177041 [details]
dmesg from guest

Comment 21 Nitin Kamble 2007-08-29 00:26:11 UTC
>	I tried to reproduce this issue; it does not occur on the
>xen-unstable tree. It does not occur on Woodcrest with
>RHEL5.1, either. It can be reproduced on Broadwater. There is
>a small difference from the reporter's description: the guest OS
>hangs when it tries to probe the video card, not at a random point.
>See the attachment.
>	RH disabled the console of the guest OS, so I cannot get
>any useful information. As the report said, a UP guest is OK.
>
>-
>Shaohui


Comment 22 Shaohui 2007-09-14 02:44:34 UTC
Our developer thinks that this is a time virtualization issue and that there is
no perfect solution at this point. We will introduce a simple solution in the
future. Until then, please:
1. Do not make a logical processor (LP) too busy (e.g. by pinning multiple
   virtual processors (VPs) to the same LP).
2. If that is required, use "clock=pit" as a workaround.

During my testing, I found that it works sometimes.
Shaohui
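
One way to apply the "clock=pit" workaround persistently is to append it to the guest's kernel boot line. The sketch below assumes the HVM guest boots through a GRUB legacy style config whose entries contain `kernel ...` lines; the helper name `add_kernel_param` and the sample entry are hypothetical and are not taken from the attachments. It does not cover the installer boot itself, where the parameter would have to be given on the boot command line.

```python
def add_kernel_param(grub_text: str, param: str = "clock=pit") -> str:
    """Append param to every 'kernel' line that does not already carry it.

    Hypothetical helper: assumes GRUB legacy style text in which each
    boot entry has an indented 'kernel <image> <args...>' line.
    """
    out = []
    for line in grub_text.splitlines():
        words = line.split()
        # Only touch kernel lines, and only if the parameter is missing.
        if words and words[0] == "kernel" and param not in words:
            line = line.rstrip() + " " + param
        out.append(line)
    trailing = "\n" if grub_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

The function is idempotent, so running it twice over the same text leaves a single `clock=pit` on each kernel line.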



Comment 28 RHEL Program Management 2008-03-11 19:42:13 UTC
This request was previously evaluated by Red Hat Product Management
for inclusion in the current Red Hat Enterprise Linux release, but
Red Hat was unable to resolve it in time.  This request will be
reviewed for a future Red Hat Enterprise Linux release.

Comment 29 Don Domingo 2008-04-02 02:15:30 UTC
Hi,
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:
http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.

Cheers,
Don

Comment 31 Michal Novotny 2009-10-05 09:51:15 UTC
I was unable to reproduce this one using RHEL 5.4 dom0 and RHEL 5 GA guest on i386 for both host and guest systems on RHTS system. Could you please try to reproduce this one with RHEL 5.4 host machine ? Thanks

Michal

Comment 32 Gurhan Ozen 2009-10-07 09:56:12 UTC
(In reply to comment #31)
> I was unable to reproduce this one using RHEL 5.4 dom0 and RHEL 5 GA guest on
> i386 for both host and guest systems on RHTS system. Could you please try to
> reproduce this one with RHEL 5.4 host machine ? Thanks
> 
> Michal  

It doesn't happen at each and every box, only happens in certain Intel boxen. 

The box that had this issue is now at RHTS and I can reproduce it with 5.4 dom0. The hostname is intel-s3e8132-01.rhts.bos.redhat.com , currently it's reserved, so feel free to jump in and poke around.

Comment 33 Michal Novotny 2009-10-07 10:27:59 UTC
(In reply to comment #32)
> (In reply to comment #31)
> > I was unable to reproduce this one using RHEL 5.4 dom0 and RHEL 5 GA guest on
> > i386 for both host and guest systems on RHTS system. Could you please try to
> > reproduce this one with RHEL 5.4 host machine ? Thanks
> > 
> > Michal  
> 
> It doesn't happen at each and every box, only happens in certain Intel boxen. 
> 
> The box that had this issue is now at RHTS and I can reproduce it with 5.4
> dom0. The hostname is intel-s3e8132-01.rhts.bos.redhat.com , currently it's
> reserved, so feel free to jump in and poke around.  

Ok, on what boxes does it happen exactly? Not just intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the issue for comparison of parameters and to get to know what could be the problem - ie. if some processor type, model, etc...

Comment 34 Jiri Denemark 2009-10-07 13:12:49 UTC
Hmm, sometimes it hangs during X server start, sometimes it doesn't hang at all. I'll check that with newer RHEL versions. I guess it could be a bug in cirrus_vga emulation in dom0 or something in the corresponding X driver in RHEL 5 GA. Nothing suspicious can be seen in dom0 logs.

Comment 35 Gurhan Ozen 2009-10-08 14:50:45 UTC
(In reply to comment #33)

> Ok, on what boxes does it happen exactly? Not just
> intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> issue for comparison of parameters and to get to know what could be the problem
> - ie. if some processor type, model, etc...  

I don't know the pattern really, it had only happened to me on that one box when I opened this bug but mjenner had run into it on other boxes too. Also on comment #11, he states that he only ran into this on lagrande and broadwater boxes. 

I could schedule a lot of jobs in rhts and fish for more if you want to.

Comment 36 Michal Novotny 2009-10-09 08:55:16 UTC
(In reply to comment #35)
> (In reply to comment #33)
> 
> > Ok, on what boxes does it happen exactly? Not just
> > intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> > issue for comparison of parameters and to get to know what could be the problem
> > - ie. if some processor type, model, etc...  
> 
> I don't know the pattern really, it had only happened to me on that one box
> when I opened this bug but mjenner had run into it on other boxes too. Also on
> comment #11, he states that he only ran into this on lagrande and broadwater
> boxes. 
> 
> I could schedule a lot of jobs in rhts and fish for more if you want to.  

That would be great, Gurhan. I've never been able to reproduce it on the RHTS machine I reserved, and a single machine doesn't help much: if we know about more affected machines, we can compare their parameters and narrow down where the problem is, but with just one machine we can't tell whether it's a hardware issue, as it seems to be, at least according to comment #32...

Jirka, in addition to your comment #34, I would like to ask you whether you can find something suspicious in guest's X server log if you think it may be a bug in cirrus_vga emulation. In dom0 it's emulation for domU but in domU there is the emulated environment so logs in domU should contain some evidence of errors I think.

Michal

Comment 37 Michal Novotny 2009-12-14 10:26:15 UTC
(In reply to comment #35)
> (In reply to comment #33)
> 
> > Ok, on what boxes does it happen exactly? Not just
> > intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> > issue for comparison of parameters and to get to know what could be the problem
> > - ie. if some processor type, model, etc...  
> 
> I don't know the pattern really, it had only happened to me on that one box
> when I opened this bug but mjenner had run into it on other boxes too. Also on
> comment #11, he states that he only ran into this on lagrande and broadwater
> boxes. 
> 
> I could schedule a lot of jobs in rhts and fish for more if you want to.  

Any update on this? Have you run into this problem again?

Michal

Comment 39 Miroslav Rezanina 2010-04-21 14:03:39 UTC
I retested this on intel-s3e8132-01.rhts.eng.bos.redhat.com. The installation gets stuck during package copying even with a 5.5 dom0 and a 5.5 HVM domU. The guest itself keeps running, and after some (long) time the copying continues.

Comment 40 Miroslav Rezanina 2010-04-23 10:55:42 UTC
Testing shows that the guest is working without problems: network and disk work, manual execution of commands is OK, and switching between consoles works. The problem may be in anaconda or in the X server (I observed a freeze during monitor probing once, and the graphical console is not redrawn after the hang).

More testing also shows that installing a 5.5 domU is almost always successful (I had one or two hangs). A 5.0 domU hangs when xorg-x11-fonts-Type1-7.1-2.1.el5.noarch is installed.

Comment 42 Chris Lumens 2010-04-23 13:42:09 UTC
When the hang happens, can you switch VTs?  If so, can you grab /tmp/syslog and /tmp/anaconda.log?  Can you also attach the output of ps -ef?

Comment 43 Miroslav Rezanina 2010-04-26 06:00:02 UTC
Created attachment 409071 [details]
Requested /tmp/anaconda.log

Comment 44 Miroslav Rezanina 2010-04-26 06:00:38 UTC
Created attachment 409073 [details]
Requested /tmp/syslog

Comment 45 Miroslav Rezanina 2010-04-26 06:01:21 UTC
Created attachment 409074 [details]
Requested ps -ef output

Requested files attached

Comment 46 Chris Lumens 2010-04-26 14:39:02 UTC
Looks like a packaging bug to me:

root      4789   602  0 00:35 tty1     00:00:00 /bin/sh /var/tmp/rpm-tmp.66059 1
root      4793  4789  0 00:35 tty1     00:00:00 fc-cache /usr/share/X11/fonts/Type1

anaconda's got to wait until the package's scriptlets are done before proceeding.  If the scriptlets never finish, you'll never get to the next package.
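
The reading of the `ps -ef` output above can be sketched as a small script that pairs each rpm scriptlet shell with the child commands it is waiting on. The function names are hypothetical, and the sketch assumes the standard eight-column `ps -ef` layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD) and that scriptlets run from `/var/tmp/rpm-tmp.*`, as in the two lines quoted in this comment. It only shows which scriptlets are running, not whether they are actually hung.

```python
import re
from typing import Dict, List, NamedTuple


class Proc(NamedTuple):
    uid: str
    pid: int
    ppid: int
    cmd: str


def parse_ps_ef(text: str) -> List[Proc]:
    """Parse `ps -ef` output into Proc records, skipping the header row."""
    procs = []
    for line in text.splitlines():
        # Split into at most 8 fields so the command keeps its spaces.
        fields = line.split(None, 7)
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        procs.append(Proc(fields[0], int(fields[1]), int(fields[2]), fields[7]))
    return procs


def scriptlet_children(procs: List[Proc]) -> Dict[str, List[str]]:
    """Map each rpm scriptlet shell to the commands of its direct children."""
    scriptlets = [p for p in procs if re.search(r"/var/tmp/rpm-tmp\.\d+", p.cmd)]
    return {s.cmd: [c.cmd for c in procs if c.ppid == s.pid] for s in scriptlets}
```

Run against the two lines quoted above, this maps the `/bin/sh /var/tmp/rpm-tmp.66059 1` scriptlet to its `fc-cache /usr/share/X11/fonts/Type1` child.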

Comment 48 RHEL Program Management 2010-08-09 18:55:01 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 50 RHEL Program Management 2014-03-07 13:52:48 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 51 RHEL Program Management 2014-06-02 13:21:35 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

Comment 52 Red Hat Bugzilla 2023-09-14 01:11:24 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days