Bug 749779

Summary: Can not boot VM from PXE, there display 'connection timed out... No more network device'
Product: Red Hat Enterprise Linux 6 Reporter: Ying Cui <ycui>
Component: qemu-kvm Assignee: Alex Williamson <alex.williamson>
Status: CLOSED WORKSFORME QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2 CC: acathrow, apevec, areis, bsarathy, chayang, cshao, dyasny, gouyang, juzhang, knoel, leiwang, mburns, mkenneth, ovirt-maint, rananda, virt-maint
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-22 07:25:50 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description | Flags
connection timed out pic | none
vdsm log | none
attachment include "Screenshot of vm console" and "tcpdump.log" | none

Description Ying Cui 2011-10-28 10:43:06 UTC
Description:
After a clean, successful RHEV-H installation and connecting to RHEVM IC146, a VM cannot boot from PXE. The console displays 'connection timed out (0x4c126035) No more network devices'.

Test build:
rhevh 6.2-20111019.5

Reproduce
About 40% of attempts, not every time. On an HP CCISS machine with Broadcom 5785, the reproduction rate can reach 70%.

Packages:
gpxe-roms-qemu-0.9.7-6.9.el6.noarch
ovirt-node-2.0.2-0.13.2.gitb764606.el6.noarch
qemu-kvm-0.12.1.2-2.199.el6.x86_64
vdsm-4.9-110.el6.x86_64

Test steps:
1. clean install RHEV-H successful.
2. Configure network successful.
3. Configure RHEVM successful.
4. Approve RHEVH in RHEVM successful.
5. Connect NFS data domain successful.
6. Create a VM, boot from PXE.

Note: VMs with a 'virtio', 'e1000', or 'rtl8139' NIC type can all reproduce this issue.

Actual result:
The VM cannot boot from PXE. The console displays the following:

net0: 00:1a:4a:42:0b:06 on PCI00:03.0 (open)
 [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Waiting for link-up on net0...ok
DHCP(net0 00:1a:4a:42:0b:06 ).............. Connection timed out (0x4c106035)
No more network device


Expected result:
The VM boots from PXE successfully.

Additional info:
1. If the VM fails to boot, I retry several times; after repeated attempts the VM can boot from PXE, which is strange.
2. I am filing this bug just to record the issue; we cannot reproduce it every time.

Comment 1 Ying Cui 2011-10-28 10:44:08 UTC
Created attachment 530637 [details]
connection timed out pic

Comment 2 Ying Cui 2011-10-28 10:45:34 UTC
Created attachment 530638 [details]
vdsm log

Comment 4 Mike Burns 2011-10-28 11:22:34 UTC
I've pxe-booted 4 VMs including multiple at the same time from the same host with no issues.  

Is it possible there were intermittent networking issues in your environment?

Comment 5 Mike Burns 2011-10-28 15:36:45 UTC
Can you also try with 6.2-20111026.2?

Comment 6 RHEL Program Management 2011-11-01 05:47:12 UTC
Since RHEL 6.2 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 7 Ying Cui 2011-11-01 09:11:12 UTC
(In reply to comment #5)
> Can you also try with 6.2-20111026.2?

I tested with the 6.2-20111026.2 build and cannot reproduce this issue on a normal machine.

Note: I cannot test this build on the HP CCISS machine because of https://bugzilla.redhat.com/show_bug.cgi?id=747102#c31

This bug does occur occasionally; we are not sure whether there were intermittent networking issues in our environment.

Comment 14 Guohua Ouyang 2012-03-30 08:37:37 UTC
Tried to add RHEL6.2-20111117.0 to RHEVM IC155.1, but it failed with the error below:

Installation of 10.66.11.120. Recieved message: <BSTRAP component='VDS PACKAGES' status='OK' result='vdsm' message='vdsm-4.9.6-4.5.x86_64 '/>
<BSTRAP component='CreateConf' status='FAIL' message='Basic configuration failed to import default values'/>
<BSTRAP component='RHEV_INSTALL' status='FAIL'/>

This should be bug 804618.

So I will test this bug on a new RHEV-H 6.3 build when one is available, and on RHEL62 after 804618 is fixed.

Comment 15 Guohua Ouyang 2012-03-30 09:46:03 UTC
(In reply to comment #14)
> Try add RHEL6.2-20111117.0 to RHEVM IC155.1, but failed with below error:
> 
> Installation of 10.66.11.120. Recieved message: <BSTRAP component='VDS
> PACKAGES' status='OK' result='vdsm' message='vdsm-4.9.6-4.5.x86_64 '/>
> <BSTRAP component='CreateConf' status='FAIL' message='Basic configuration
> failed to import default values'/>
> <BSTRAP component='RHEV_INSTALL' status='FAIL'/>
> 

Tested again: the error did not appear, and the RHEL62 host was added to RHEVM IC 155.1 successfully. Added 4 VMs on the "rhevm" network covering the virtio, e1000, and rtl8139 types; all of them can boot from PXE.

Since this bug cannot always be reproduced, QE will test on the next RHEV-H 6.3 build; if it still cannot be reproduced, we will close it.

Comment 16 Mike Burns 2012-04-11 19:13:37 UTC
Please test with beta version and close if not valid anymore

Comment 17 Guohua Ouyang 2012-04-17 02:40:59 UTC
(In reply to comment #16)
> Please test with beta version and close if not valid anymore

QE will test this once the host can register to RHEVM successfully.

Comment 20 Guohua Ouyang 2012-04-23 04:02:56 UTC
Closing this bug since it can seldom be reproduced.

Comment 25 Mike Burns 2012-05-24 12:17:14 UTC
Can you provide tcpdump as well?

Comment 26 Guohua Ouyang 2012-06-13 06:18:40 UTC
(In reply to comment #25)
> Can you provide tcpdump as well?

The machine is currently being used for other testing, so it is not convenient to move it back to test this. If I get it back or can reproduce the issue on another machine, I will provide the tcpdump output ASAP.
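
For whoever picks this up next, a capture along the following lines should show whether the guest's DHCP requests leave the host and whether replies come back. This is only a sketch: the default interface name "rhevm" is taken from the network mentioned in comment #15, and the function name and output path are made up for illustration. Port 69 only catches the initial TFTP request, since the transfer itself moves to other ports.

```shell
# Capture DHCP/BOOTP (UDP 67/68) and the initial TFTP request (UDP 69)
# on the bridge while the VM attempts to PXE boot. Needs root.
# pxe_capture is a hypothetical helper name, not part of any package.
pxe_capture() {
    iface="${1:-rhevm}"            # bridge the VM's NIC is attached to (assumed)
    out="${2:-/tmp/pxe-boot.pcap}" # arbitrary output file
    # -n: no name resolution, -e: show MAC addresses, -s 0: full packets
    tcpdump -i "$iface" -n -e -s 0 -w "$out" \
        'udp port 67 or udp port 68 or udp port 69'
}
```

Start it with "pxe_capture rhevm" before powering on the VM, stop it with CTRL+C after the timeout appears, and attach the resulting file.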

Comment 27 haiyang,dong 2012-07-16 08:52:32 UTC
Created attachment 598395 [details]
attachment include  "Screenshot of vm console" and "tcpdump.log"

Reproduced this bug on the rhevh-6.3-20120710.0 build.
The attachment includes the "Screenshot of vm console" and "tcpdump.log".

Comment 28 Ramesh A 2012-08-07 14:31:14 UTC
I am able to reproduce this consistently on my Dell workstation, and I also found a workaround. Please refer to the reproduction steps and workaround below.

Reproduction Steps:
==================
1. Install Red Hat Enterprise Linux Workstation release 6.3 (Santiago) using RHEL6.3-20120613.2-Workstation-x86_64-DVD1.iso.
2. Configure the bridge network on this workstation as described below in Bridge Configuration and run "service network restart".
3. Try to create a new VM using Virtual Machine Manager.

Bridge Configuration:
=====================
# cat /etc/sysconfig/network-scripts/ifcfg-em1
DEVICE=em1
TYPE=Ethernet
ONBOOT="yes"
BOOTPROTO="dhcp"
HWADDR=<MAC address of the physical port>
BRIDGE=br0
#UUID=<UUID of the machine>


# cat /etc/sysconfig/network-scripts/ifcfg-br0 
DEVICE=br0
TYPE=Bridge
BOOTPROTO="dhcp"
ONBOOT="yes"
NM_CONTROLLED="no"


Expected Result:
=================
Should be able to connect and install via PXE

Actual Result:
==============
Throws Connection timed out error.


Work Around:
============
I found a workaround for this issue and was able to install the VMs successfully.

Procedure:
===========
During the initial stage of gPXE initialization, or when gPXE reports the timeout, press CTRL+B (when the prompt for this option is displayed). At the gPXE prompt, type the following two commands and don't worry about the usage error:
1. clear
2. exit

After doing this, the system connects to the PXE server automatically and completes the installation successfully.

Comment 29 Alex Williamson 2012-08-07 14:48:19 UTC
(In reply to comment #28)
> 
> # cat /etc/sysconfig/network-scripts/ifcfg-br0 
> DEVICE=br0
> TYPE=Bridge
> BOOTPROTO="dhcp"
> ONBOOT="yes"
> NM_CONTROLLED="no"

Your bridge is misconfigured, try adding:

DELAY=0

to set the forwarding delay to zero; otherwise the bridge is still learning the MAC addresses and won't forward DHCP replies to your VM.
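
To confirm that the setting took effect after "service network restart", the current forwarding delay can be read back from sysfs. The helper below is a rough sketch, not a supported tool: it assumes the bridge is named br0 as in comment #28 and that the kernel exposes the value, in hundredths of a second, in /sys/class/net/br0/bridge/forward_delay. The function name and the SYSFS_ROOT override are invented for illustration.

```shell
# Print a bridge's forwarding delay in seconds.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# against a fake directory tree; leave it unset on a real host.
bridge_forward_delay() {
    bridge="${1:-br0}"
    file="${SYSFS_ROOT:-/sys}/class/net/${bridge}/bridge/forward_delay"
    if [ ! -r "$file" ]; then
        echo "no bridge data for ${bridge}" >&2
        return 1
    fi
    # sysfs reports the delay in hundredths of a second (1500 = 15 s).
    centisecs=$(cat "$file")
    printf '%s: forward delay %d.%02d s\n' \
        "$bridge" $((centisecs / 100)) $((centisecs % 100))
}
```

Running "bridge_forward_delay br0" on a bridge left at the usual 15 second default reports 15.00 s, and 0.00 s once DELAY=0 is applied. During that delay a newly attached port, such as the VM's tap device, does not forward frames, which would be consistent with the DHCP timeout seen in gPXE.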

Comment 30 Ramesh A 2012-08-08 10:48:55 UTC
(In reply to comment #29)
> (In reply to comment #28)
> > 
> > # cat /etc/sysconfig/network-scripts/ifcfg-br0 
> > DEVICE=br0
> > TYPE=Bridge
> > BOOTPROTO="dhcp"
> > ONBOOT="yes"
> > NM_CONTROLLED="no"
> 
> Your bridge is misconfigured, try adding:
> 
> DELAY=0
> 
> to set the fowarding delay to zero, otherwise the bridge is still learning
> the mac addresses and won't forward dhcp replies do your VM.

Hi Alex,

Thanks for the update, things are working fine now.

Comment 31 Alex Williamson 2013-04-10 19:51:32 UTC
Is this problem with the NEC uPD720400 bridge and qla3xxx NIC still reproducible on RHEL6.4?  If so, can a system be made available to investigate?