Description of problem:
This has happened consistently to me and mjenner, so I am opening a bug about it. When installing a rhel5 GA smp guest on the latest release candidate 5.1 dom0, the guest installation freezes at some random point while installing packages. I can get no feedback from the installation; it just freezes there. Note that UP guests of the same distro installed just fine. There is no information in xend.log about this.

Version-Release number of selected component (if applicable):
On dom0:
# rpm -qa | egrep 'xen|libvirt'
libvirt-python-0.2.3-7.el5
kernel-xen-2.6.18-41.el5
xen-3.0.3-36.el5
xen-libs-3.0.3-36.el5
libvirt-0.2.3-7.el5

Use rhel5 ga i386 guest.

How reproducible:
Every time.

Steps to Reproduce:
1. virt-install -n rhel5ga_hvm -f /var/lib/xen/images/rhel5ga_hvm.img -c /var/lib/xen/images/isos/rhel5ga_i386.iso -s 10 -r 1024 --hvm --vnc --vcpus=2

Actual results:
Installation freezes while installing packages.

Expected results:
Should complete installation.
What hardware/CPU are you using? Thanks.
(In reply to comment #1)
> What hardware/CPU are you using? Thanks.

Sorry for not giving that information out in the bug report. This *only* happens with Intel i386 dom0, and i386 hvm smp guests. Even UP guest in the same setup is fine.
An update to this bug. If you install a UP guest and change the vcpus value in its config file and restart, then it does work.
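The edit described above can be sketched in a few lines of Python. This is only an illustration: it assumes the guest config is a plain-text file with one `vcpus = N` assignment per line, and the helper name `set_vcpus` is made up for this example. The thread does not state the exact file name or location, so the caller has to supply the file contents.

```python
import re


def set_vcpus(config_text: str, vcpus: int) -> str:
    """Return config_text with its vcpus assignment set to the given count.

    Assumes a "key = value" style config (an assumption, not confirmed
    in this bug). If no vcpus line exists, one is appended.
    """
    line = f"vcpus = {vcpus}"
    # Match a whole "vcpus = ..." line; [ \t]* avoids swallowing blank lines.
    pattern = re.compile(r"^[ \t]*vcpus[ \t]*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text, count=1)
    # No existing assignment: append one on its own line.
    sep = "" if config_text.endswith("\n") or not config_text else "\n"
    return config_text + sep + line + "\n"
```

The intended use would be to install the guest as UP, pass the config contents through `set_vcpus(text, 2)`, write the result back, and restart the guest.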
Proposed release note for the RHEL5.1 release notes updates:

<quote>
(x86) When installing Red Hat Enterprise Linux 5 on a fully virtualized SMP guest, the installation will freeze. This occurs when the host is running Red Hat Enterprise Linux 5.1.

To avoid this, set the guest to use UP by editing the corresponding vcpus value in the config file prior to installing Red Hat Enterprise Linux 5.
</quote>

Can you provide me with the exact vcpus value (e.g. "vcpus=blah") and the exact name/location of the config file that needs to be revised for the workaround? Thanks!
Has this been observed on more than one box? Can you please attach /proc/cpuinfo for the affected system[s]? Thanks.
Created attachment 172419 [details] /proc/cpuinfo output of the box
yes, mjenner has seen the same issue as well. I just attached /proc/cpuinfo attachment of the box.
My observations/testing:
- This issue does not happen on AMD systems at all (none that I have tested on).
- I have only seen this on my Intel test systems, which are:
  - lagrande
  - broadwater
Another update. I can't reproduce this on an SGI Woodcrest box with 4 gig of memory. I am attaching that particular box's cpuinfo. I think the Clovertown-1 box, which is where I have been seeing this problem, is a preproduction/prototype box. Martin, can you confirm this?
Created attachment 172455 [details] /proc/cpuinfo output on the box where this problem didn't occur.
Can you send the dmesg log of host and guest? Did you see it on all Intel boxes, or just few?
(In reply to comment #13)
> Another update. I can't reproduce this on an SGI Woodcrest box with 4 gig of
> memory. I am attaching that particular box' cpuinfo .

Sorry, that should've read 16 gigs of memory.

> I think Clovertown-1 box, which is where i have been seeing this problem, is a
> preproduction/prototype box. Martin, can you confirm this?
(In reply to comment #15)
> Can you send the dmesg log of host and guest?
>
> Did you see it on all Intel boxes, or just few?

We saw this on at least a couple of boxen but also got at least one other box working. The working one was a woodcrest box. I'll be attaching the dmesg logs shortly.
Created attachment 177001 [details] dmesg output of the Dom0 host
Created attachment 177021 [details] xm dmesg on dom0
Created attachment 177041 [details] dmesg from guest
I tried to reproduce this issue. It does not exist on the xen-unstable tree, and it does not exist on woodcrest with RHEL5.1 either. It can be reproduced on broadwater. There is a small difference from the reporter's description: the guest OS hangs when it tries to probe the video card, not at a random point. See the attachment.

RH disabled the console of the guest OS, so I cannot get any useful information. As the report said, a UP guest is OK.

Shaohui
Our developer thinks that this is a time virtualization issue, and there is no perfect solution at this point. We will introduce a simple solution in the future. Until then, please:
1. Do not make an LP too busy (e.g. by pinning multiple VPs on the same LP).
2. If that is required, use "clock=pit" as a workaround.

During my testing, I found that it works sometimes.

Shaohui
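To check a host against the first recommendation, something like the following Python sketch could flag logical processors that carry more than one pinned virtual processor. It reads LP as "logical processor" and VP as "virtual processor", which is an interpretation of the comment above. The function name `shared_logical_cpus` and the shape of the `pins` mapping are made up for this example; how the pinning data is collected from the host is left out.

```python
from collections import defaultdict


def shared_logical_cpus(pins):
    """Return {lp: [vp, ...]} for every LP with more than one VP pinned to it.

    pins maps a (domain_name, vcpu_index) tuple to the logical processor
    number it is pinned to. This layout is hypothetical, chosen only to
    keep the sketch self-contained.
    """
    by_lp = defaultdict(list)
    for vp, lp in pins.items():
        by_lp[lp].append(vp)
    # Keep only the LPs that are shared by two or more VPs.
    return {lp: sorted(vps) for lp, vps in by_lp.items() if len(vps) > 1}
```

An empty result means no LP is shared, which is the situation the first recommendation asks for.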
This request was previously evaluated by Red Hat Product Management for inclusion in the current Red Hat Enterprise Linux release, but Red Hat was unable to resolve it in time. This request will be reviewed for a future Red Hat Enterprise Linux release.
Hi, the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at which point no further additions or revisions will be entertained. a mockup of the RHEL5.2 release notes can be viewed at the following link: http://intranet.corp.redhat.com/ic/intranet/RHEL5u2relnotesmockup.html please use the aforementioned link to verify if your bugzilla is already in the release notes (if it needs to be). each item in the release notes contains a link to its original bug; as such, you can search through the release notes by bug number. Cheers, Don
I was unable to reproduce this one using a RHEL 5.4 dom0 and a RHEL 5 GA guest on i386 for both host and guest systems on an RHTS system. Could you please try to reproduce this one with a RHEL 5.4 host machine? Thanks

Michal
(In reply to comment #31)
> I was unable to reproduce this one using RHEL 5.4 dom0 and RHEL 5 GA guest on
> i386 for both host and guest systems on RHTS system. Could you please try to
> reproduce this one with RHEL 5.4 host machine ? Thanks
>
> Michal

It doesn't happen on each and every box; it only happens on certain Intel boxen.

The box that had this issue is now at RHTS and I can reproduce it with a 5.4 dom0. The hostname is intel-s3e8132-01.rhts.bos.redhat.com. I currently have it reserved, so feel free to jump in and poke around.
(In reply to comment #32)
> (In reply to comment #31)
> > I was unable to reproduce this one using RHEL 5.4 dom0 and RHEL 5 GA guest on
> > i386 for both host and guest systems on RHTS system. Could you please try to
> > reproduce this one with RHEL 5.4 host machine ? Thanks
> >
> > Michal
>
> It doesn't happen at each and every box, only happens in certain Intel boxen.
>
> The box that had this issue is now at RHTS and I can reproduce it with 5.4
> dom0. The hostname is intel-s3e8132-01.rhts.bos.redhat.com , currently it's
> reserved, so feel free to jump in and poke around.

OK, on which boxes does it happen exactly? Not just intel-s3e8132-01.rhts.bos.redhat.com: which of *all* the boxes tested have the issue? We need that to compare parameters and find out what the problem could be, e.g. a particular processor type, model, etc.
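For the comparison of parameters asked for above, the /proc/cpuinfo attachments from an affected and an unaffected box could be diffed with a small Python sketch like this one. The function names and the list of ignored keys are assumptions made for this example. Only the first processor block of each file is compared, on the assumption that all cores in one box report the same model and flags.

```python
def parse_cpuinfo(text):
    """Return the key/value pairs of the first processor block in cpuinfo text."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def diff_cpuinfo(text_a, text_b, ignore=("processor", "cpu MHz", "bogomips")):
    """Return {key: (value_a, value_b)} for every key whose values differ.

    Keys in `ignore` are skipped; the defaults are a guess at fields that
    vary between boots or cores rather than between CPU models.
    """
    a, b = parse_cpuinfo(text_a), parse_cpuinfo(text_b)
    diffs = {}
    for key in sorted(set(a) | set(b)):
        if key not in ignore and a.get(key) != b.get(key):
            diffs[key] = (a.get(key), b.get(key))
    return diffs
```

Running `diff_cpuinfo` on the contents of the two cpuinfo attachments in this bug would list fields such as the model name, stepping, or flags wherever they differ between the boxes.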
Hmm, sometimes it hangs during X server start, sometimes it doesn't hang at all. I'll check that with newer RHEL versions. I guess it could be a bug in cirrus_vga emulation in dom0 or something in the corresponding X driver in RHEL 5 GA. Nothing suspicious can be seen in dom0 logs.
(In reply to comment #33)
> Ok, on what boxes does it happen exactly? Not just
> intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> issue for comparison of parameters and to get to know what could be the problem
> - ie. if some processor type, model, etc...

I don't really know the pattern. It had only happened to me on that one box when I opened this bug, but mjenner had run into it on other boxes too. Also, in comment #11 he states that he only ran into this on lagrande and broadwater boxes.

I could schedule a lot of jobs in rhts and fish for more if you want me to.
(In reply to comment #35)
> (In reply to comment #33)
>
> > Ok, on what boxes does it happen exactly? Not just
> > intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> > issue for comparison of parameters and to get to know what could be the problem
> > - ie. if some processor type, model, etc...
>
> I don't know the pattern really, it had only happened to me on that one box
> when I opened this bug but mjenner had run into it on other boxes too. Also on
> comment #11, he states that he only ran into this on lagrande and broadwater
> boxes.
>
> I could schedule a lot of jobs in rhts and fish for more if you want to.

That would be great, Gurhan. I've never been able to reproduce it on the RHTS machine I reserved, and just one machine doesn't help a lot: if we know about more machines, we can compare the parameters and check where the problem is, but with just one machine we can't tell whether it's a hardware issue, as it seems to be, at least according to comment #32.

Jirka, in addition to your comment #34, I would like to ask whether you can find anything suspicious in the guest's X server log, if you think it may be a bug in cirrus_vga emulation. The emulation for domU runs in dom0, but the emulated environment is in domU, so I think the logs in domU should contain some evidence of errors.

Michal
(In reply to comment #35)
> (In reply to comment #33)
>
> > Ok, on what boxes does it happen exactly? Not just
> > intel-s3e8132-01.rhts.bos.redhat.com but what *all* boxes tested does have the
> > issue for comparison of parameters and to get to know what could be the problem
> > - ie. if some processor type, model, etc...
>
> I don't know the pattern really, it had only happened to me on that one box
> when I opened this bug but mjenner had run into it on other boxes too. Also on
> comment #11, he states that he only ran into this on lagrande and broadwater
> boxes.
>
> I could schedule a lot of jobs in rhts and fish for more if you want to.

Any update on this? Have you run into this problem more times?

Michal
I retested this on intel-s3e8132-01.rhts.eng.bos.redhat.com. Installation gets stuck during package copying even with a 5.5 dom0 and a 5.5 hvm domU. The guest itself is running, and after some (long) time copying continues.
Testing shows that the guest is working without problems: network and disk work, manual execution of commands is OK, and switching to different consoles works. The problem may be in anaconda or in the X server (I observed a freeze during monitor probing once, and the graphical console is not redrawn after it gets stuck). More testing also shows that installing a 5.5 domU is successful almost every time (I had one or two get stuck). A 5.0 domU gets stuck when xorg-x11-fonts-Type1-7.1-2.1.el5.noarch is installed.
When the hang happens, can you switch VTs? If so, can you grab /tmp/syslog and /tmp/anaconda.log? Can you also attach the output of ps -ef?
Created attachment 409071 [details] Requested /tmp/anaconda.log
Created attachment 409073 [details] Requested /tmp/syslog
Created attachment 409074 [details] Requested ps -ef output Requested files attached
Looks like a packaging bug to me:

root 4789 602 0 00:35 tty1 00:00:00 /bin/sh /var/tmp/rpm-tmp.66059 1
root 4793 4789 0 00:35 tty1 00:00:00 fc-cache /usr/share/X11/fonts/Type1

anaconda's got to wait until the package's scriptlets are done before proceeding. If the scriptlets never finish, you'll never get to the next package.
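The same check can be done mechanically on a saved `ps -ef` listing. The Python sketch below looks for shells running an rpm scriptlet (recognised here by the `/var/tmp/rpm-tmp.` path seen in the output above) and lists their child processes, which are what anaconda ends up waiting on. The function names are made up for this example, and the parser assumes the usual eight-column `ps -ef` layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
def parse_ps(ps_text):
    """Parse `ps -ef` output lines into dicts with pid, ppid and cmd."""
    procs = []
    for line in ps_text.splitlines():
        fields = line.split(None, 7)  # CMD is the 8th field and may contain spaces
        if len(fields) < 8 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header line or something that is not a process row
        procs.append({"pid": int(fields[1]), "ppid": int(fields[2]), "cmd": fields[7]})
    return procs


def stuck_scriptlets(ps_text):
    """Return (scriptlet_cmd, [child_cmd, ...]) for each running rpm scriptlet."""
    procs = parse_ps(ps_text)
    result = []
    for proc in procs:
        if "/var/tmp/rpm-tmp." in proc["cmd"]:
            children = [c["cmd"] for c in procs if c["ppid"] == proc["pid"]]
            result.append((proc["cmd"], children))
    return result
```

Fed the two lines quoted above, this pairs the `/bin/sh /var/tmp/rpm-tmp.66059 1` scriptlet with its `fc-cache /usr/share/X11/fonts/Type1` child, which points at fc-cache as the process that is not finishing.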
This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days