Bug 1090237 - e1000 NIC of win2012r2 VM can't get IP from dhcp server after setup net debug and reboot
Summary: e1000 NIC of win2012r2 VM can't get IP from dhcp server after setup net debug ...
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: jason wang
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-04-23 02:33 UTC by lijin
Modified: 2015-06-26 02:53 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-06-26 02:53:47 UTC
Target Upstream Version:
Embargoed:


Attachments
win2012 debug capacity test hck log file (1.46 MB, application/zip)
2014-04-23 02:33 UTC, lijin
proposed patch (1.39 KB, patch)
2014-12-01 16:46 UTC, Michael S. Tsirkin
e1000 debug message with win2012r2-64 guest (net debug off) (210.95 KB, text/plain)
2014-12-09 20:32 UTC, Amos Kong
e1000 debug message with win2012r2-64 guest (net debug on) (14.60 MB, text/plain)
2014-12-09 20:40 UTC, Amos Kong
defer tx: on top of my previous patch (799 bytes, patch)
2014-12-15 11:14 UTC, Michael S. Tsirkin
Three debug log (netdug on/off, win2012/win2012r2) (980 bytes, application/x-gzip)
2014-12-19 03:06 UTC, Amos Kong


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1088815 0 urgent CLOSED [svvp] Debug Capability test (Logo) job should pass in order to achieve SVVP certification 2021-02-22 00:41:40 UTC

Internal Links: 1088815

Description lijin 2014-04-23 02:33:52 UTC
Created attachment 888708 [details]
win2012 debug capacity test hck log file

Description of problem:
Running the SVVP job "Debug Capability test (Logo)" with net transport always fails on a RHEL7 host; it passes on a RHEL6 host.

As https://bugzilla.redhat.com/show_bug.cgi?id=1088815#c21 suggests, I am opening a new bug to track this issue.

Version-Release number of selected component (if applicable):
3.10.0-121.el7.x86_64
qemu-kvm-rhev-1.5.3-60.el7ev.x86_64
virtio-win-1.7.0-1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot the win2012R2 SUT guest with one virtio-net-pci and one e1000:
/usr/libexec/qemu-kvm --nodefaults --nodefconfig -m 2G -smp 2 \
  -cpu Nehalem,+kvm_pv_unhalt,hv_spinlocks=0x1fff,hv_relaxed,hv_vapic,hv_time \
  -M pc-i440fx-rhel7.0.0 -usb -device usb-tablet,id=tablet0 \
  -drive file=2012r2.raw,if=none,id=drive-virtio0-0-0,format=raw,werror=stop,rerror=stop,cache=none,serial=number \
  -device virtio-blk-pci,drive=drive-virtio0-0-0,id=virti0-0-0,bootindex=1 \
  -uuid f6b25900-9fb7-4bed-b365-4d85b79efa20 \
  -monitor stdio -vnc :21 -vga cirrus -name win2012R2-INTEL-MAX-new \
  -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
  -cdrom /usr/share/virtio-win/virtio-win.iso -boot menu=on \
  -device usb-ehci,id=ehci0 \
  -drive file=usb-storage-intel-max.raw,if=none,id=drive-usb-2-0,media=disk,format=raw,cache=none,werror=stop,rerror=stop,aio=threads \
  -device usb-storage,bus=ehci0.0,drive=drive-usb-2-0,id=usb-2-0,removable=on \
  -rtc base=localtime,clock=host,driftfix=slew \
  -chardev socket,id=b111a,path=/tmp/monitor-win2012R2-intel-max,server,nowait \
  -mon chardev=b111a,mode=readline \
  -netdev tap,id=hostnet1,vhost=on,script=/etc/qemu-ifup1 \
  -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:52:54:2a:22:1f,addr=0x6 \
  -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup1 \
  -device e1000,netdev=hostnet0,id=net0,mac=00:52:54:2b:02:e4,addr=0x7

2. Boot the debug machine with one e1000:
/usr/libexec/qemu-kvm --nodefaults --nodefconfig -m 2G -smp 2 \
  -cpu Opteron_G3,+kvm_pv_unhalt,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time \
  -M pc-i440fx-rhel7.0.0 -usb -device usb-tablet,id=tablet0 \
  -drive file=debug-amd.raw,if=none,id=drive-virtio0-0-0,format=raw,werror=stop,rerror=stop,cache=none,serial=number \
  -device virtio-blk-pci,drive=drive-virtio0-0-0,id=virti0-0-0,bootindex=1 \
  -uuid d04887dd-6a55-4c5f-b9d3-dc76759edd37 \
  -monitor stdio -vnc :16 -vga cirrus -name amd-debug \
  -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 \
  -boot menu=on -rtc base=localtime,clock=host,driftfix=slew \
  -chardev socket,id=b111a,path=/tmp/monitor-win2012R2-amd-max,server,nowait \
  -mon chardev=b111a,mode=readline \
  -netdev tap,id=hostnet1,script=/etc/qemu-ifup1 \
  -device e1000,netdev=hostnet1,id=net1,mac=00:52:56:2a:42:2e,addr=0x07

3. Submit the job with net transport in HCK 2.1:
BusParams  0.7.0  
NetHostIP  192.168.1.195  ---> debug machine ip
NetKey  this.key.isnt.secure  
NetPort  50000
Transport  net

Actual results:
Job failed at child job "verify debug configuration";
error message:
System.Exception: Kernel debugging over the network fails due to InitializeNetwork failed to get the ethernet address of the host debugger. error code -1073741643.Please address the problem or run the test from another debug connection. at KernelDebugLogoTestSetup.DebugSetupVerifier.DebugSetupVerify()

The SUT's e1000 ***CANNOT*** get an IP after the first reboot during the job run. After the second reboot everything returns to normal, and both virtio-net-pci and e1000 can get an IP.

Mounting the same two images on a RHEL6 host and running the job there, the job passes easily.
e1000 can get an IP after the first reboot.

the detailed network info:
on RHEL7 host:
ipconfig of e1000:192.168.1.29/255.255.255.0/192.168.1.1(gw)
after first reboot during job running
ipconfig of e1000:169.254.241.190/255.255.0.0/null
job failed

on RHEL6 host:
ipconfig of e1000:192.168.1.29/255.255.255.0/192.168.1.1(gw)
after first reboot during job running
ipconfig of e1000:192.168.1.29/255.255.255.0/192.168.1.1
job passed
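
A note on reading the addresses above: 169.254.241.190/255.255.0.0 is in the IPv4 link-local range, which Windows falls back to when it gets no DHCP lease. The following is only a minimal sketch for classifying the reported addresses; the helper name `is_link_local` and the `observations` labels are made up for illustration and are not part of the test setup.

```python
import ipaddress


def is_link_local(addr: str) -> bool:
    """Return True if addr is in the link-local range (169.254.0.0/16 for IPv4)."""
    return ipaddress.ip_address(addr).is_link_local


# Addresses copied from the ipconfig output in this report.
observations = {
    "RHEL7 host, before reboot": "192.168.1.29",
    "RHEL7 host, after first reboot": "169.254.241.190",
    "RHEL6 host, after first reboot": "192.168.1.29",
}

for label, addr in observations.items():
    if is_link_local(addr):
        state = "link-local fallback (no DHCP lease)"
    else:
        state = "not link-local"
    print(f"{label}: {addr} -> {state}")
```

Only the RHEL7 "after first reboot" address is classified as link-local, which matches the failed job.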

Expected results:
job can pass on rhel7 host

Additional info:

Comment 2 Ronen Hod 2014-04-23 13:27:05 UTC
This bug was opened since in https://bugzilla.redhat.com/show_bug.cgi?id=1088815#c15 Nike Cao wrote:
Yan ,according to comment #13 ,it seems another bug that HCK can not detect emulated e1000 card device ID as 100E ,Do we need to open a new bug to track it ?
Note that this is a regression from RHEL6 to RHEL7

Comment 3 juzhang 2014-04-24 02:00:32 UTC
Hi Mike,

It's a regression? If it is, please add regression keyword and update priority as high/urgent.

Best Regards,
Junyi

Comment 4 Mike Cao 2014-04-24 02:52:15 UTC
(In reply to juzhang from comment #3)
> Hi Mike,
> 
> It's a regression? If it is, please add regression keyword and update
> priority as high/urgent.
> 
> Best Regards,
> Junyi

It is a regression compared with RHEL6.
lijin, please check whether it is also a regression within RHEL7.

Comment 5 Amos Kong 2014-04-26 04:05:24 UTC
Please tell me the network configuration, thanks.

static or DHCP?

private bridge or not?

How long it takes from the beginning to first reboot?

SUT server connect debug server by e1000 connection, and reboot debug server remotely, then fail to connect debug server (ip lost)?

Comment 6 Mike Cao 2014-04-27 01:20:54 UTC
(In reply to Amos Kong from comment #5)
> Please tell me the network configuration, thanks.
> 
> static or DHCP?
DHCP.
We also tried a static IP; it still failed.
> 
> private bridge or not?

bridge
> 
> How long it takes from the beginning to first reboot?
What does "the beginning" mean here?
> 
> SUT server connect debug server by e1000 connection, and reboot debug server
> remotely, then fail to connect debug server (ip lost)?
Yes, and it reports that there is no supported NIC for e1000, even though the same setup works on RHEL 6.

Comment 8 Amos Kong 2014-04-29 00:53:03 UTC
I found a simple way to reproduce this QEMU bug:


1) launch a single win2012 guest with only 1 e1000 nic

2) enable debug
  (guest)# bcdedit /debug on

3) setup net debug
  (guest)# bcdedit /dbgsettings net hostip:192.168.122.100 port:50000

  Note: a non-existent IP can be used here; it is only needed to get through
        the setup and is never actually contacted

4) reboot guest
  (guest)# shutdown /r /t 0


Result: the guest can't get an IP from the DHCP server and is assigned an internal (link-local) IP instead.

Comment 9 Amos Kong 2014-04-29 03:22:26 UTC
This regression was introduced in qemu-1.3 by the following commit:

commit 1c380f9460522f32c8dd2577b2a53d518ec91c6d
Author: Avi Kivity <avi>
Date:   Wed Oct 3 17:42:58 2012 +0200

    pci: honor PCI_COMMAND_MASTER
    
    Currently we ignore PCI_COMMAND_MASTER completely: DMA succeeds even when
    the bit is clear.
    
    Honor PCI_COMMAND_MASTER by inserting a memory region into the device's
    bus master address space, and tying its enable status to PCI_COMMAND_MASTER.
    
    Tested using
    
      setpci -s 03 COMMAND=3
    
    while a ping was running on a NIC in slot 3.  The kernel (Linux) detected
    the stall and recovered after the command
    
      setpci -s 03 COMMAND=7
    
    was issued.
    
    Signed-off-by: Avi Kivity <avi>

--------------------------------------------------------------

With DEBUG_RX enabled, this message shows up when the bug occurs:
  Null RX descriptor!!

Comment 10 Amos Kong 2014-04-29 07:08:06 UTC
I can only reproduce this bug with win2012r2; it works with win2012-64-virtio.qcow2. The NIC driver versions are the same.

debug adapter: 6/21/2006 Driver Version: 6.2.9200.16384
Intel PRO/1000: 3/23/2010 8.4.13.0

Avi's commit 1c380f946 just inserts a memory region into the device's bus master address space and ties its enable status to PCI_COMMAND_MASTER. The strict checking is necessary.

If I always set bus_master_enable_region to enabled, the problem doesn't occur.

Another possibility is that it's a win2012r2 (PCI) driver bug: the driver doesn't set the PCI_COMMAND_MASTER bit correctly.

Comment 12 Amos Kong 2014-04-29 10:18:49 UTC
I enabled e1000 debug messages and also logged the bus_master bit operations.

I found that with win2012r2 the bus_master bit of PCI_COMMAND is not enabled before RX is re-enabled (TX and RX may be disabled at times during the init/reset stage). As a result DMA is not allowed, so the guest network is down and the guest can't get an IP from the DHCP server.

With win2012, however, bus_master is enabled before RX is re-enabled, so it works well.

I also debugged with a RHEL7 guest; there bus_master is likewise enabled before RX is re-enabled.


Win2012
===============================================================
                                 1bit:IO   2bit:MEMORY  3bit:MASTER
pci_default_write_config: 259  
pci_default_write_config: 259  
pci_default_write_config: 259  
pci_default_write_config: 259  
pci_default_write_config: 259  
pci_default_write_config: 259  
> pci_default_write_config: 259    1         1           0
> pci_default_write_config: 263    1         1           1
> pci_default_write_config: 256    0         0           0
> pci_default_write_config: 263    1         1           1

                                   #IO/MEMORY/MASTER bits are enabled
...
> e1000: RCTL: 0, mac_reg[RCTL] = 0x0
> e1000: tx disabled
                                  # TX,RX are disabled (in init/reset?)
....
e1000: set_ics 0, ICR 2, IMR 0
e1000: set_ics 0, ICR 2, IMR 0
> e1000: RCTL: 191, mac_reg[RCTL] = 0x8018
> e1000: RCTL: 191, mac_reg[RCTL] = 0x801a

                                  # RX is re-enabled
e1000: set_ics 80, ICR 2, IMR 0
e1000: set_ics 80, ICR 82, IMR 0
e1000: set_ics 80, ICR 82, IMR 0
e1000: set_ics 80, ICR 82, IMR 0
e1000: set_ics 80, ICR 82, IMR 0
e1000: set_ics 80, ICR 82, IMR 0
e1000: index 0: 0x33c000 : b000135 0
e1000: set_ics 3, ICR 82, IMR 0
e1000: ICR read: 83
                                 # when RX is re-enable,
                                   bus MASTER bit is enabled
                                   DMA is allowed, so everything is fine


Win2012R2
===============================================================
                                 1bit:IO   2bit:MEMORY  3bit:MASTER
pci_default_write_config: 259 
pci_default_write_config: 259 
pci_default_write_config: 259 
pci_default_write_config: 259 
pci_default_write_config: 259 
pci_default_write_config: 259 
pci_default_write_config: 259 
> pci_default_write_config: 263    1         1           1
> pci_default_write_config: 256    0         0           0
> pci_default_write_config: 259    1         1           0

                                   #IO/MEMORY bits are enabled
                                   #MASTER bit is disabled
e1000: reading eeprom bit 0 (reading 0)
...
> e1000: RCTL: 0, mac_reg[RCTL] = 0x0
> e1000: tx disabled
                                  # TX,RX are disabled (in init/reset?)
...
e1000: set_ics 0, ICR 2, IMR 0
> e1000: RCTL: 191, mac_reg[RCTL] = 0x8018
> e1000: RCTL: 191, mac_reg[RCTL] = 0x801a

                                  # RX is re-enabled
> e1000: Null RX descriptor!!
> e1000: Null RX descriptor!!
> e1000: Null RX descriptor!!
                                  # bus MASTER isn't enabled,
                                    NIC can't become master of PCI bus
                                    DMA isn't allowed.
                                    when RX is re-enabled, problem occurs!
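
The numbers in the two logs above can be decoded mechanically. Below is a minimal sketch, assuming the value printed after "pci_default_write_config:" is the PCI_COMMAND register and the second RCTL number is the e1000 receive control register. The bit masks are the standard PCI command bits (I/O space 0x1, memory space 0x2, bus master 0x4) and the e1000 receiver-enable bit (0x2); the function names are made up for illustration.

```python
# Standard PCI command register bits.
PCI_COMMAND_IO = 0x001      # bit 0: I/O space enable
PCI_COMMAND_MEMORY = 0x002  # bit 1: memory space enable
PCI_COMMAND_MASTER = 0x004  # bit 2: bus master enable

# e1000 receive control register: receiver enable.
E1000_RCTL_EN = 0x00000002


def decode_pci_command(value: int) -> dict:
    """Split a PCI_COMMAND value into the three bits shown in the log tables."""
    return {
        "io": bool(value & PCI_COMMAND_IO),
        "memory": bool(value & PCI_COMMAND_MEMORY),
        "master": bool(value & PCI_COMMAND_MASTER),
    }


def rx_enabled(rctl: int) -> bool:
    """True if the receiver-enable bit is set in an RCTL value."""
    return bool(rctl & E1000_RCTL_EN)


# Values taken from the logs in this comment.
for value in (259, 263, 256):
    print(f"PCI_COMMAND {value} (0x{value:03x}): {decode_pci_command(value)}")

for rctl in (0x8018, 0x801A):
    print(f"RCTL 0x{rctl:04x}: rx_enabled={rx_enabled(rctl)}")
```

With these masks, 263 is the only logged value with the master bit set, and the step from 0x8018 to 0x801a is the point where RX is switched back on. In the Win2012R2 log the last command write before that point is 259, so RX comes up with bus mastering off.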

Comment 14 Ludek Smid 2014-06-26 10:55:00 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

Comment 15 Ludek Smid 2014-06-26 11:15:24 UTC
The comment above is incorrect. The correct version is below.
I'm sorry for any inconvenience.
---------------------------------------------------------------

This request was NOT resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you need
to escalate this bug.

Comment 16 Amos Kong 2014-07-06 12:34:29 UTC
Huawei engineer (lvqian1) told me they are using virtio-nic in this condition, Mike, why can't we use virtio-nic?

Comment 17 Mike Cao 2014-07-06 13:32:30 UTC
(In reply to Amos Kong from comment #16)
> Huawei engineer (lvqian1) told me they are using virtio-nic in
> this condition, Mike, why can't we use virtio-nic?
Impossible . Virtio-nic is not a protocol in MSFT support list for Kernel-Mode Debugging over a Network Cable

Mike

Comment 18 Mike Cao 2014-07-06 13:33:27 UTC
(In reply to Amos Kong from comment #16)
> Huawei engineer (lvqian1) told me they are using virtio-nic in
> this condition, Mike, why can't we use virtio-nic?

Here is the support list 
http://msdn.microsoft.com/zh-cn/windows/hardware/dn337010(v=vs.90)

Comment 19 Amos Kong 2014-07-06 13:37:37 UTC
(In reply to Mike Cao from comment #17)
> (In reply to Amos Kong from comment #16)
> > Huawei engineer (lvqian1) told me they are using virtio-nic in
> > this condition, Mike, why can't we use virtio-nic?
>
> Impossible . Virtio-nic is not a protocol in MSFT support list for
> Kernel-Mode Debugging over a Network Cable

I repeated the case 3 times, but they still insist on the answer..
Let's confirm by email.

> 
> Mike

Comment 20 Mike Cao 2014-07-06 13:42:20 UTC
(In reply to Amos Kong from comment #19)
> (In reply to Mike Cao from comment #17)
> > (In reply to Amos Kong from comment #16)
> > > Huawei engineer (lvqian1) told me they are using virtio-nic in
> > > this condition, Mike, why can't we use virtio-nic?
> >
> > Impossible . Virtio-nic is not a protocol in MSFT support list for
> > Kernel-Mode Debugging over a Network Cable
> 
> I repeated the case 3 times, but they still insist on the answer..
> Let's confirm by email.

Sure. Let's confirm; cc'ing Yan here as well.

Unless they used the serial protocol instead of the NET protocol.

BTW, are they testing SVVP as well?

Mike
> 
> > 
> > Mike

Comment 23 Ronen Hod 2014-10-27 09:39:44 UTC
Jason,
I think that you resolved some issues with E1000 recently.

Comment 24 Ronen Hod 2014-11-17 09:49:44 UTC
Mike,
There were several changes upstream. Can you please test the latest version.
Thanks.

Comment 25 Mike Cao 2014-11-17 15:09:27 UTC
(In reply to Ronen Hod from comment #24)
> Mike,
> There were several changes upstream. Can you please test the latest version.
> Thanks.

Retested this issue on 3.10.0-200.el7.x86_64
qemu-kvm-rhev-2.1.2-7.el7.x86_64

Steps are the same as in comment #0

Actual Results:
Test failed w/ the error in comment #0

I believe commit 1c380f9460522f32c8dd2577b2a53d518ec91c6d has not been reverted yet 

Based on above, this issue hasn't been fixed.

Mike

Comment 26 Ronen Hod 2014-12-01 09:50:02 UTC
MST, you probably remember Avi's good fix, but I think that E1000 is not fully compatible with it (or something similar). What is the right solution?

Comment 27 Michael S. Tsirkin 2014-12-01 16:46:18 UTC
Created attachment 963378 [details]
proposed patch

proposed qemu patch

Comment 28 Michael S. Tsirkin 2014-12-01 16:47:10 UTC
patch above simply defers all packets until bus master is enabled.
Amos, could you tell us whether it helps?

Comment 29 Amos Kong 2014-12-03 09:12:17 UTC
Bug can still be reproduced with fixed qemu, but the error "Null RX descriptor!!" disappeared.

Comment 30 Amos Kong 2014-12-09 20:32:43 UTC
Created attachment 966463 [details]
e1000 debug message with win2012r2-64 guest (net debug off)

Comment 31 Amos Kong 2014-12-09 20:40:06 UTC
Created attachment 966466 [details]
e1000 debug message with win2012r2-64 guest (net debug on)

Comment 32 Amos Kong 2014-12-10 01:06:03 UTC
(In reply to Amos Kong from comment #31)
> Created attachment 966466 [details]
> e1000 debug message with win2012r2-64 guest (net debug on)

e1000: set_ics 0, ICR 6, IMR 0
e1000: RCTL: 191, mac_reg[RCTL] = 0x8018
e1000: RCTL: 191, mac_reg[RCTL] = 0x801a
e1000: index 0: (nil) : 0 0                 <<<< desc.buffer_addr is nil
e1000: set_ics 2, ICR 6, IMR 0
e1000: index 1: (nil) : 0 0
e1000: set_ics 2, ICR 6, IMR 0
e1000: index 2: (nil) : 0 0
e1000: set_ics 2, ICR 6, IMR 0
e1000: index 3: (nil) : 0 0
e1000: set_ics 2, ICR 6, IMR 0
e1000: index 4: (nil) : 0 0
e1000: set_ics 2, ICR 6, IMR 0
e1000: ICR read: 6
e1000: set_ics 0, ICR 0, IMR 0
e1000: ICR read: 0


[PATCH] Disable xmit until BUSMaster is enabled. (it doesn't work)

+++ b/hw/net/e1000.c
@@ -792,7 +792,8 @@ start_xmit(E1000State *s)
     struct e1000_tx_desc desc;
     uint32_t tdh_start = s->mac_reg[TDH], cause = E1000_ICS_TXQE;
 
-    if (!(s->mac_reg[TCTL] & E1000_TCTL_EN)) {
+    if (!((s->mac_reg[TCTL] & E1000_TCTL_EN) &&
+          (s->parent_obj.config[PCI_COMMAND] & PCI_COMMAND_MASTER))) {
         DBGOUT(TX, "tx disabled\n");
         return;
     }
@@ -806,7 +806,12 @@ start_xmit(E1000State *s)
                (void *)(intptr_t)desc.buffer_addr, desc.lower.data,
                desc.upper.data);
 
-        process_tx_desc(s, &desc);
+        if (desc.buffer_addr) {
+            process_tx_desc(s, &desc);
+        } else {
+            DBGOUT(TX, "Null TX descriptor!!\n");
+        }
+
         cause |= txdesc_writeback(s, base, &desc);
 
         if (++s->mac_reg[TDH] * sizeof(desc) >= s->mac_reg[TDLEN])

Comment 33 Amos Kong 2014-12-10 01:35:43 UTC
(In reply to Amos Kong from comment #29)
> (In reply to Michael S. Tsirkin from comment #28)
> > patch above simply defers all packets until bus master is enabled.
> > Amos, could you tell us whether it helps?
>
> Bug can still be reproduced with fixed qemu, but the error "Null RX
> descriptor!!" disappeared.

Michael, your patch defers packets until BM is enabled, but the problem is that BM is never enabled, so the packets are deferred forever.


If we unconditionally enable BM in QEMU, win2012r2 guest (w/o your defer patch) can get IP.

+++ b/hw/pci/pci.c
@@ -1162,8 +1162,9 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
     if (range_covers_byte(addr, l, PCI_COMMAND)) {
         pci_update_irq_disabled(d, was_irq_disabled);
         memory_region_set_enabled(&d->bus_master_enable_region,
-                                  pci_get_word(d->config + PCI_COMMAND)
-                                    & PCI_COMMAND_MASTER);
+                                  true);
+        //pci_get_word(d->config + PCI_COMMAND)
+        //& PCI_COMMAND_MASTER);
     }
 
     msi_write_config(d, addr, val_in, l);
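
To make the difference between the two approaches concrete, here is a toy model, not QEMU code. `ToyNic`, `simulate`, and the `ignore_bus_master` flag are invented for illustration; the command-register write sequences are the ones logged in comment 12.

```python
from collections import deque

PCI_COMMAND_MASTER = 0x004  # bus master enable bit of PCI_COMMAND


class ToyNic:
    """Toy NIC: incoming packets are deferred until DMA is allowed."""

    def __init__(self, ignore_bus_master: bool = False):
        # ignore_bus_master mimics the "unconditionally enable BM" hack.
        self.ignore_bus_master = ignore_bus_master
        self.bus_master = False
        self.deferred = deque()
        self.delivered = []

    def dma_allowed(self) -> bool:
        return self.ignore_bus_master or self.bus_master

    def write_command(self, value: int) -> None:
        """Guest writes PCI_COMMAND; deferred packets flush if DMA is now allowed."""
        self.bus_master = bool(value & PCI_COMMAND_MASTER)
        self.flush()

    def receive(self, packet: str) -> None:
        self.deferred.append(packet)
        self.flush()

    def flush(self) -> None:
        while self.deferred and self.dma_allowed():
            self.delivered.append(self.deferred.popleft())


def simulate(command_writes, ignore_bus_master=False):
    """Replay the guest's command writes, then let one DHCP reply arrive."""
    nic = ToyNic(ignore_bus_master)
    for value in command_writes:
        nic.write_command(value)
    nic.receive("DHCP offer")
    return nic.delivered, list(nic.deferred)


print("win2012:            ", simulate([259, 263, 256, 263]))
print("win2012r2, defer:   ", simulate([263, 256, 259]))
print("win2012r2, ignore BM:", simulate([263, 256, 259], ignore_bus_master=True))
```

In this model the win2012r2 sequence ends with the master bit clear, so with deferral alone the DHCP reply stays queued forever, while ignoring the bit delivers it.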

Comment 34 Amos Kong 2014-12-10 01:49:34 UTC
The PCI bus master bit should be set by the guest PCI driver or NIC driver (I have seen some Linux NIC drivers set this bit themselves).

But why doesn't the win2012r2 guest set it? The win2012 guest sets it correctly.
We need to find out whether this is caused by the QEMU e1000 backend.

Comment 35 Michael S. Tsirkin 2014-12-15 11:14:22 UTC
Created attachment 968902 [details]
defer tx: on top of my previous patch

Comment 36 Amos Kong 2014-12-18 09:28:00 UTC
I found that bus mastering is _never_ enabled by the PCI driver or the e1000 driver if we enable network debug on e1000 in win2012r2-64.

So we have to ignore bus master for e1000 in QEMU to work around this issue.

Posted patches to upstream: http://marc.info/?l=qemu-devel&m=141889454120570&w=2
 [PATCH 1/2] e1000: defer packets until BM enabled
 [PATCH 2/2] e1000: unconditionally enable bus mastering

Comment 37 Amos Kong 2014-12-19 03:06:57 UTC
Created attachment 970967 [details]
Three debug log (netdug on/off, win2012/win2012r2)

We can see that BM is enabled in win2012r2-debugoff and win2012-debugon,
but BM isn't enabled in win2012r2-debugon.

The files contain the pci_write_config logs (address, value); I'm analyzing the write operations.


$ diff win2012-debugon win2012r2-debugon -up
+++ win2012r2-debugon   2014-12-19 10:40:46.640136843 +0800
@@ -42,44 +42,18 @@ e1000_write_config: addr: 4, len:2, val:
 pci_default_write_config: (e1000) BUS Master disabled
 e1000_write_config: addr: 16, len:4, val:4294967295
 e1000_write_config: addr: 16, len:4, val:4273733632
-e1000_write_config: addr: 16, len:4, val:4273733632 << different starts here
 e1000_write_config: addr: 20, len:4, val:4294967295
 e1000_write_config: addr: 20, len:4, val:49153
-e1000_write_config: addr: 20, len:4, val:49152
-e1000_write_config: addr: 24, len:4, val:4294967295
-e1000_write_config: addr: 24, len:4, val:0
-e1000_write_config: addr: 28, len:4, val:4294967295
-e1000_write_config: addr: 28, len:4, val:0
-e1000_write_config: addr: 32, len:4, val:4294967295
-e1000_write_config: addr: 32, len:4, val:0
-e1000_write_config: addr: 36, len:4, val:4294967295
-e1000_write_config: addr: 36, len:4, val:0
-e1000_write_config: addr: 4, len:2, val:263
-pci_default_write_config: (e1000) BUS Master enabled ..... bit: 4 .....
-e1000_write_config: addr: 6, len:2, val:260
-e1000_write_config: addr: 16, len:4, val:4294967295
-e1000_write_config: addr: 16, len:4, val:4273733632
-e1000_write_config: addr: 16, len:4, val:4273733632
-e1000_write_config: addr: 20, len:4, val:4294967295
-e1000_write_config: addr: 20, len:4, val:49153
-e1000_write_config: addr: 20, len:4, val:49152
-e1000_write_config: addr: 24, len:4, val:4294967295
-e1000_write_config: addr: 24, len:4, val:0
-e1000_write_config: addr: 28, len:4, val:4294967295
-e1000_write_config: addr: 28, len:4, val:0
-e1000_write_config: addr: 32, len:4, val:4294967295
-e1000_write_config: addr: 32, len:4, val:0
-e1000_write_config: addr: 36, len:4, val:4294967295
-e1000_write_config: addr: 36, len:4, val:0
-e1000_write_config: addr: 6, len:2, val:263
+e1000_write_config: addr: 4, len:2, val:259
+pci_default_write_config: (e1000) BUS Master disabled
 pci_default_write_config: (piix3-ide) BUS Master disabled
 pci_default_write_config: (piix3-ide) BUS Master disabled
 pci_default_write_config: (piix3-usb-uhci) BUS Master disabled
 .....

$ diff win2012r2-debugoff win2012r2-debugon -up
+++ win2012r2-debugon   2014-12-19 10:40:46.640136843 +0800
@@ -38,6 +38,14 @@ pci_default_write_config: (piix3-usb-uhc
 e1000_write_config: addr: 48, len:4, val:4273471489
 e1000_write_config: addr: 48, len:4, val:4273471488
+e1000_write_config: addr: 4, len:2, val:256
+pci_default_write_config: (e1000) BUS Master disabled
+e1000_write_config: addr: 16, len:4, val:4294967295
+e1000_write_config: addr: 16, len:4, val:4273733632
+e1000_write_config: addr: 20, len:4, val:4294967295
+e1000_write_config: addr: 20, len:4, val:49153
+e1000_write_config: addr: 4, len:2, val:259
+pci_default_write_config: (e1000) BUS Master disabled
 pci_default_write_config: (piix3-ide) BUS Master disabled
 pci_default_write_config: (piix3-ide) BUS Master disabled
 pci_default_write_config: (piix3-usb-uhci) BUS Master disabled
@@ -74,29 +82,3 @@ pci_default_write_config: (piix3-usb-uhc
 pci_default_write_config: (VGA) BUS Master disabled
 pci_default_write_config: (VGA) BUS Master disabled
 pci_default_write_config: (VGA) BUS Master enabled ..... bit: 4 .....
-e1000_write_config: addr: 60, len:1, val:11
-e1000_write_config: addr: 4, len:2, val:1280
-pci_default_write_config: (e1000) BUS Master disabled
-e1000_write_config: addr: 16, len:4, val:4273733632
-e1000_write_config: addr: 20, len:4, val:49152
-e1000_write_config: addr: 24, len:4, val:0
-e1000_write_config: addr: 28, len:4, val:0
-e1000_write_config: addr: 32, len:4, val:0
-e1000_write_config: addr: 36, len:4, val:0
-e1000_write_config: addr: 48, len:4, val:0
-e1000_write_config: addr: 60, len:1, val:11
-e1000_write_config: addr: 12, len:1, val:0
-e1000_write_config: addr: 13, len:1, val:0
-e1000_write_config: addr: 4, len:2, val:1280
-pci_default_write_config: (e1000) BUS Master disabled
-e1000_write_config: addr: 4, len:2, val:1287
-pci_default_write_config: (e1000) BUS Master enabled ..... bit: 4 .....
-e1000_write_config: addr: 6, len:2, val:63744
-e1000_write_config: addr: 4, len:2, val:1287
-pci_default_write_config: (e1000) BUS Master enabled ..... bit: 4 .....
 .....
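
The bus-master state can also be read straight from the "e1000_write_config" lines rather than from the separate "BUS Master enabled/disabled" messages. This is a minimal sketch that assumes the ad hoc log format shown in the diffs above ("addr: N, len:N, val:N") and that config offset 4 is PCI_COMMAND; the function name and the two sample lists are illustrative, with lines copied from the second diff.

```python
import re

PCI_COMMAND_OFFSET = 4      # config-space offset of the command register
PCI_COMMAND_MASTER = 0x004  # bus master enable bit

# Matches the debug lines in the attached logs; search() tolerates
# the leading '+', '-' or ' ' of diff output.
WRITE_RE = re.compile(r"e1000_write_config: addr: (\d+), len:(\d+), val:(\d+)")


def last_bus_master_state(lines):
    """Return the bus-master bit after the last PCI_COMMAND write, or None."""
    state = None
    for line in lines:
        match = WRITE_RE.search(line)
        if match and int(match.group(1)) == PCI_COMMAND_OFFSET:
            state = bool(int(match.group(3)) & PCI_COMMAND_MASTER)
    return state


# '+' lines: only present in win2012r2-debugon.
debug_on = [
    "+e1000_write_config: addr: 4, len:2, val:256",
    "+e1000_write_config: addr: 16, len:4, val:4294967295",
    "+e1000_write_config: addr: 4, len:2, val:259",
]
# '-' lines: only present in win2012r2-debugoff.
debug_off = [
    "-e1000_write_config: addr: 4, len:2, val:1280",
    "-e1000_write_config: addr: 4, len:2, val:1287",
    "-e1000_write_config: addr: 6, len:2, val:63744",
]

print("debug on :", last_bus_master_state(debug_on))
print("debug off:", last_bus_master_state(debug_off))
```

For these samples the last command write with net debug on (259) leaves the master bit clear, while the last one with net debug off (1287) sets it.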

Comment 38 Amos Kong 2014-12-19 03:11:30 UTC
Hi Mike,

Can you help to test and confirm if other emulated nics work after enabling net-debug?

Network devices:
name "e1000", bus PCI, desc "Intel Gigabit Ethernet"  (doesn't work)
name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)

name "i82550", bus PCI, desc "Intel i82550 Ethernet"
name "i82551", bus PCI, desc "Intel i82551 Ethernet"
name "i82557a", bus PCI, desc "Intel i82557A Ethernet"
name "i82557b", bus PCI, desc "Intel i82557B Ethernet"
name "i82557c", bus PCI, desc "Intel i82557C Ethernet"
name "i82558a", bus PCI, desc "Intel i82558A Ethernet"
name "i82558b", bus PCI, desc "Intel i82558B Ethernet"
name "i82559a", bus PCI, desc "Intel i82559A Ethernet"
name "i82559b", bus PCI, desc "Intel i82559B Ethernet"
name "i82559c", bus PCI, desc "Intel i82559C Ethernet"
name "i82559er", bus PCI, desc "Intel i82559ER Ethernet"
name "i82562", bus PCI, desc "Intel i82562 Ethernet"
name "i82801", bus PCI, desc "Intel i82801 Ethernet"
name "ne2k_isa", bus ISA
name "ne2k_pci", bus PCI
name "pcnet", bus PCI
name "rtl8139", bus PCI
name "usb-bt-dongle", bus usb-bus
name "usb-net", bus usb-bus
name "virtio-net-device", bus virtio-bus
name "virtio-net-pci", bus PCI, alias "virtio-net"
name "vmxnet3", bus PCI, desc "VMWare Paravirtualized Ethernet v3"

Comment 39 jason wang 2014-12-19 04:39:24 UTC
(In reply to Amos Kong from comment #38)
> Hi Mike,
> 
> Can you help to test and confirm if other emulated nics work after enabling
> net-debug?
> 
> Network devices:
> name "e1000", bus PCI, desc "Intel Gigabit Ethernet"  (doesn't work)
> name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> 
> name "i82550", bus PCI, desc "Intel i82550 Ethernet"
> name "i82551", bus PCI, desc "Intel i82551 Ethernet"
> name "i82557a", bus PCI, desc "Intel i82557A Ethernet"
> name "i82557b", bus PCI, desc "Intel i82557B Ethernet"
> name "i82557c", bus PCI, desc "Intel i82557C Ethernet"
> name "i82558a", bus PCI, desc "Intel i82558A Ethernet"
> name "i82558b", bus PCI, desc "Intel i82558B Ethernet"
> name "i82559a", bus PCI, desc "Intel i82559A Ethernet"
> name "i82559b", bus PCI, desc "Intel i82559B Ethernet"
> name "i82559c", bus PCI, desc "Intel i82559C Ethernet"
> name "i82559er", bus PCI, desc "Intel i82559ER Ethernet"
> name "i82562", bus PCI, desc "Intel i82562 Ethernet"
> name "i82801", bus PCI, desc "Intel i82801 Ethernet"
> name "ne2k_isa", bus ISA
> name "ne2k_pci", bus PCI
> name "pcnet", bus PCI
> name "rtl8139", bus PCI
> name "usb-bt-dongle", bus usb-bus
> name "usb-net", bus usb-bus
> name "virtio-net-device", bus virtio-bus
> name "virtio-net-pci", bus PCI, alias "virtio-net"
> name "vmxnet3", bus PCI, desc "VMWare Paravirtualized Ethernet v3"

Let's first try 8139 and virtio-net. Since they were officially supported.

Comment 40 Mike Cao 2014-12-19 05:04:19 UTC
(In reply to Amos Kong from comment #38)
> Hi Mike,
> 
> Can you help to test and confirm if other emulated nics work after enabling
> net-debug?
> 
> Network devices:
> name "e1000", bus PCI, desc "Intel Gigabit Ethernet"  (doesn't work)
> name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> 
> name "i82550", bus PCI, desc "Intel i82550 Ethernet"
> name "i82551", bus PCI, desc "Intel i82551 Ethernet"
> name "i82557a", bus PCI, desc "Intel i82557A Ethernet"
> name "i82557b", bus PCI, desc "Intel i82557B Ethernet"
> name "i82557c", bus PCI, desc "Intel i82557C Ethernet"
> name "i82558a", bus PCI, desc "Intel i82558A Ethernet"
> name "i82558b", bus PCI, desc "Intel i82558B Ethernet"
> name "i82559a", bus PCI, desc "Intel i82559A Ethernet"
> name "i82559b", bus PCI, desc "Intel i82559B Ethernet"
> name "i82559c", bus PCI, desc "Intel i82559C Ethernet"
> name "i82559er", bus PCI, desc "Intel i82559ER Ethernet"
> name "i82562", bus PCI, desc "Intel i82562 Ethernet"
> name "i82801", bus PCI, desc "Intel i82801 Ethernet"

According to http://msdn.microsoft.com/zh-cn/windows/hardware/dn337010(v=vs.90)
Can you provide a easy way to check the device id of above devices first?
> name "ne2k_isa", bus ISA
> name "ne2k_pci", bus PCI
> name "pcnet", bus PCI
> name "rtl8139", bus PCI
> name "usb-bt-dongle", bus usb-bus
> name "usb-net", bus usb-bus
> name "virtio-net-device", bus virtio-bus
> name "virtio-net-pci", bus PCI, alias "virtio-net"
> name "vmxnet3", bus PCI, desc "VMWare Paravirtualized Ethernet v3"

Comment 41 Mike Cao 2014-12-19 05:05:29 UTC
(In reply to jason wang from comment #39)
> (In reply to Amos Kong from comment #38)

> 
> Let's first try 8139 and virtio-net. Since they were officially supported.

RTL8139 and virtio-net are not in the MSFT support list, and according to QE's tests they do not work for remote debug.

Comment 42 Amos Kong 2014-12-19 05:16:12 UTC
(In reply to Mike Cao from comment #40)
> (In reply to Amos Kong from comment #38)
> > Hi Mike,
> > 
> > Can you help to test and confirm if other emulated nics work after enabling
> > net-debug?
> > 
> > Network devices:
> > name "e1000", bus PCI, desc "Intel Gigabit Ethernet"  (doesn't work)
> > name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> > name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> > name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> > 
> > name "i82550", bus PCI, desc "Intel i82550 Ethernet"
> > name "i82551", bus PCI, desc "Intel i82551 Ethernet"
> > name "i82557a", bus PCI, desc "Intel i82557A Ethernet"
> > name "i82557b", bus PCI, desc "Intel i82557B Ethernet"
> > name "i82557c", bus PCI, desc "Intel i82557C Ethernet"
> > name "i82558a", bus PCI, desc "Intel i82558A Ethernet"
> > name "i82558b", bus PCI, desc "Intel i82558B Ethernet"
> > name "i82559a", bus PCI, desc "Intel i82559A Ethernet"
> > name "i82559b", bus PCI, desc "Intel i82559B Ethernet"
> > name "i82559c", bus PCI, desc "Intel i82559C Ethernet"
> > name "i82559er", bus PCI, desc "Intel i82559ER Ethernet"
> > name "i82562", bus PCI, desc "Intel i82562 Ethernet"
> > name "i82801", bus PCI, desc "Intel i82801 Ethernet"
> 
> According to
> http://msdn.microsoft.com/zh-cn/windows/hardware/dn337010(v=vs.90)
> Can you provide a easy way to check the device id of above devices first?

The device IDs can be found in the QEMU code.

[amos@air qemu]$ git grep "device_id = PCI_DEVICE_ID" hw/net/
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82551IT,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82551IT,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82551IT,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82551IT,
hw/net/ne2000.c:    k->device_id = PCI_DEVICE_ID_REALTEK_8029;
hw/net/pcnet-pci.c:    k->device_id = PCI_DEVICE_ID_AMD_LANCE;
hw/net/rtl8139.c:    k->device_id = PCI_DEVICE_ID_REALTEK_8139;
hw/net/vmxnet3.c:    c->device_id = PCI_DEVICE_ID_VMWARE_VMXNET3;
[amos@air qemu]$ 

See include/hw/pci/pci_ids.h and include/hw/pci/pci.h:

PCI_DEVICE_ID_VMWARE_VMXNET3     0x07B0
PCI_DEVICE_ID_REALTEK_8139       0x8139
PCI_DEVICE_ID_INTEL_82551IT      0x1209
PCI_DEVICE_ID_INTEL_82557        0x1229
PCI_DEVICE_ID_REALTEK_8029       0x8029
PCI_DEVICE_ID_AMD_LANCE          0x2000
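
To make the lookup above repeatable, here is a minimal Python sketch that resolves the macros from the `git grep` output to their numeric values. The table contains only the six values listed above; the function name `device_ids_by_file` and the `SAMPLE` text are made up for illustration and are not part of QEMU.

```python
import re
from collections import defaultdict

# Values copied from the macro list in this comment; nothing else is assumed.
PCI_DEVICE_IDS = {
    "PCI_DEVICE_ID_VMWARE_VMXNET3": 0x07B0,
    "PCI_DEVICE_ID_REALTEK_8139": 0x8139,
    "PCI_DEVICE_ID_INTEL_82551IT": 0x1209,
    "PCI_DEVICE_ID_INTEL_82557": 0x1229,
    "PCI_DEVICE_ID_REALTEK_8029": 0x8029,
    "PCI_DEVICE_ID_AMD_LANCE": 0x2000,
}

# Matches lines such as:
#   hw/net/rtl8139.c:    k->device_id = PCI_DEVICE_ID_REALTEK_8139;
GREP_LINE = re.compile(
    r"^(?P<path>[^:]+):.*device_id = (?P<macro>PCI_DEVICE_ID_\w+)"
)


def device_ids_by_file(grep_output):
    """Return {source file: sorted list of distinct device IDs}.

    Macros that are missing from PCI_DEVICE_IDS are skipped silently.
    """
    result = defaultdict(set)
    for line in grep_output.splitlines():
        match = GREP_LINE.match(line.strip())
        if match is None:
            continue
        macro = match.group("macro")
        if macro in PCI_DEVICE_IDS:
            result[match.group("path")].add(PCI_DEVICE_IDS[macro])
    return {path: sorted(ids) for path, ids in result.items()}


# Hypothetical excerpt in the same format as the grep output above.
SAMPLE = """\
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82551IT,
hw/net/eepro100.c:        .device_id = PCI_DEVICE_ID_INTEL_82557,
hw/net/ne2000.c:    k->device_id = PCI_DEVICE_ID_REALTEK_8029;
hw/net/rtl8139.c:    k->device_id = PCI_DEVICE_ID_REALTEK_8139;
"""

if __name__ == "__main__":
    for path, ids in device_ids_by_file(SAMPLE).items():
        print(path, [f"0x{i:04X}" for i in ids])
```

The resulting IDs can then be compared by hand against the hardware IDs in the Microsoft list linked above.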

> > name "ne2k_isa", bus ISA
> > name "ne2k_pci", bus PCI
> > name "pcnet", bus PCI
> > name "rtl8139", bus PCI
> > name "usb-bt-dongle", bus usb-bus
> > name "usb-net", bus usb-bus
> > name "virtio-net-device", bus virtio-bus
> > name "virtio-net-pci", bus PCI, alias "virtio-net"
> > name "vmxnet3", bus PCI, desc "VMWare Paravirtualized Ethernet v3"

Comment 43 jason wang 2014-12-19 05:48:34 UTC
(In reply to Mike Cao from comment #41)
> (In reply to jason wang from comment #39)
> > (In reply to Amos Kong from comment #38)
> 
> > 
> > Let's first try 8139 and virtio-net. Since they were officially supported.
> 
> RTL8139 and virtio-net are not in MSFT support list and according to QE's
> test it does not work for remote debug.

Can you confirm e1000 (82540EM) is officially supported by MS?

According to https://downloadcenter.intel.com/SearchResult.aspx?lang=eng&ProdId=983, 82540EM is not supported by Intel for 2k12.

Thanks

Comment 44 Mike Cao 2014-12-19 06:09:58 UTC
(In reply to jason wang from comment #43)
> (In reply to Mike Cao from comment #41)
> > (In reply to jason wang from comment #39)
> > > (In reply to Amos Kong from comment #38)
> > 
> > > 
> > > Let's first try 8139 and virtio-net. Since they were officially supported.
> > 
> > RTL8139 and virtio-net are not in MSFT support list and according to QE's
> > test it does not work for remote debug.
> 
> Can you confirm e1000 (82540EM) is officially supported by MS?
> 
> According to
> https://downloadcenter.intel.com/SearchResult.aspx?lang=eng&ProdId=983
> 82540EM is not supported by Intel for 2k12.
> 
It is strange that I cannot find the submission info at http://www.windowsservercatalog.com/results.aspx?bCatID=1466&cpID=1046&avc=10&ava=0&avq=0&OR=1&PGS=100&ready=0&PG=1

But I remember the signed-driver check job passing when using e1000, which means the driver is digitally signed on win2012R2 and supported.

BTW, regarding the original bug, I ran exactly the same test on a RHEL6 host and could not reproduce the original issue.

Comment 45 jason wang 2014-12-19 06:25:53 UTC
(In reply to Mike Cao from comment #44)
> (In reply to jason wang from comment #43)
> > (In reply to Mike Cao from comment #41)
> > > (In reply to jason wang from comment #39)
> > > > (In reply to Amos Kong from comment #38)
> > > 
> > > > 
> > > > Let's first try 8139 and virtio-net. Since they were officially supported.
> > > 
> > > RTL8139 and virtio-net are not in MSFT support list and according to QE's
> > > test it does not work for remote debug.
> > 
> > Can you confirm e1000 (82540EM) is officially supported by MS?
> > 
> > According to
> > https://downloadcenter.intel.com/SearchResult.aspx?lang=eng&ProdId=983
> > 82540EM is not supported by Intel for 2k12.
> > 
> It is strange that I can not find submission info
> http://www.windowsservercatalog.com/results.
> aspx?bCatID=1466&cpID=1046&avc=10&ava=0&avq=0&OR=1&PGS=100&ready=0&PG=1 
> 
> But I remember signed driver check job pass when using e1000 which means the
> driver is digital signed on win2012R2 and supported .
> 
> BTW regarding to the original bug,I did exactly same test on RHEL6 host ,can
> not reproduce the original issue .

Yes, I can only see that the Intel(R) PRO/1000 Gigabit Server Adapter was certified for 2008, but not for R2 or 2k12.

The point is probably that Intel uses a single driver for all its wired cards. That does not mean e1000 (82540) is supported by them (see the Intel link above).

We probably need to run WHQL on a real e1000 (82540), but I doubt we could still find an 82540 card since it is really old.

So the driver used on 82540EM is not officially supported. We do need to consider switching to another card.

Comment 46 Amos Kong 2014-12-19 06:58:10 UTC
(In reply to Mike Cao from comment #44)

> BTW regarding to the original bug,I did exactly same test on RHEL6 host ,can
> not reproduce the original issue .

See https://bugzilla.redhat.com/show_bug.cgi?id=1090237#c9: the problem appeared after QEMU changed.

Comment 47 Michael S. Tsirkin 2014-12-19 13:12:19 UTC
Is it possible that this driver intentionally
does not enable bus master, and reads packets from PBM?
Do you see PBM accesses? I think they are in the range
10000h - 1FFFCh.
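
One way to check both questions against a register-access trace is sketched below in Python. The PBM bounds are taken from the range guessed in this comment and are not verified here. The bus master bit is bit 2 (0x4) of the PCI command register. The helper names and the idea of feeding in a list of MMIO offsets are assumptions for illustration, not an existing QEMU facility.

```python
# Bounds as suggested in the comment above ("10000h - 1FFFCh"), inclusive.
PBM_START = 0x10000
PBM_END = 0x1FFFC

# Bus Master Enable is bit 2 of the PCI command register.
PCI_COMMAND_MASTER = 0x4


def is_pbm_access(offset):
    """True if an MMIO offset falls inside the assumed PBM window."""
    return PBM_START <= offset <= PBM_END


def bus_master_enabled(command):
    """True if the Bus Master Enable bit is set in a PCI command value."""
    return bool(command & PCI_COMMAND_MASTER)


def count_pbm_accesses(offsets):
    """Count how many of the traced MMIO offsets land in the PBM window."""
    return sum(1 for offset in offsets if is_pbm_access(offset))


if __name__ == "__main__":
    # Hypothetical trace: two register accesses and two PBM accesses.
    trace = [0x0000, 0x0008, 0x10000, 0x1FFFC]
    print("PBM accesses:", count_pbm_accesses(trace))
    print("bus master on:", bus_master_enabled(0x0003))
```

If the trace shows accesses inside the window while the command register never has bit 2 set, that would be consistent with the driver reading packets from PBM instead of using DMA.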

Comment 48 Mike Cao 2015-01-19 07:33:36 UTC
(In reply to Amos Kong from comment #38)
> Hi Mike,
> 
> Can you help to test and confirm if other emulated nics work after enabling
> net-debug?
> 
> Network devices:
> name "e1000", bus PCI, desc "Intel Gigabit Ethernet"  (doesn't work)
> name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet" (doesn't work)
> 
> name "i82550", bus PCI, desc "Intel i82550 Ethernet"
> name "i82551", bus PCI, desc "Intel i82551 Ethernet"
> name "i82557a", bus PCI, desc "Intel i82557A Ethernet"
> name "i82557b", bus PCI, desc "Intel i82557B Ethernet"
> name "i82557c", bus PCI, desc "Intel i82557C Ethernet"
> name "i82558a", bus PCI, desc "Intel i82558A Ethernet"
> name "i82558b", bus PCI, desc "Intel i82558B Ethernet"
> name "i82559a", bus PCI, desc "Intel i82559A Ethernet"
> name "i82559b", bus PCI, desc "Intel i82559B Ethernet"
> name "i82559c", bus PCI, desc "Intel i82559C Ethernet"
> name "i82559er", bus PCI, desc "Intel i82559ER Ethernet"
> name "i82562", bus PCI, desc "Intel i82562 Ethernet"
> name "i82801", bus PCI, desc "Intel i82801 Ethernet"
> name "ne2k_isa", bus ISA
> name "ne2k_pci", bus PCI
> name "pcnet", bus PCI
> name "rtl8139", bus PCI
> name "usb-bt-dongle", bus usb-bus
> name "usb-net", bus usb-bus
> name "virtio-net-device", bus virtio-bus
> name "virtio-net-pci", bus PCI, alias "virtio-net"
> name "vmxnet3", bus PCI, desc "VMWare Paravirtualized Ethernet v3"

Hi, Amos

Which version of qemu-kvm-rhev are you using? I can only test with the following models:
name "e1000", bus PCI, desc "Intel Gigabit Ethernet"
name "e1000-82540em", bus PCI, desc "Intel Gigabit Ethernet"
name "e1000-82544gc", bus PCI, desc "Intel Gigabit Ethernet"
name "e1000-82545em", bus PCI, desc "Intel Gigabit Ethernet"

All hit the issue described in comment #0.
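
For building a test matrix from model lists like the ones in this comment, the lines can be parsed mechanically. The Python sketch below assumes only the `name "...", bus ..., desc "..."` layout seen in this thread; the function names are hypothetical, and which command produced the listing is not assumed.

```python
import re

# A quoted field such as: name "e1000"   or   desc "Intel Gigabit Ethernet"
QUOTED_FIELD = re.compile(r'(\w+) "([^"]*)"')
# The bus field is unquoted, e.g.: bus PCI   or   bus usb-bus
BUS_FIELD = re.compile(r"\bbus ([\w-]+)")


def parse_device_line(line):
    """Parse one model line into a dict, or return None if it is not one.

    Leading '>' quote markers and trailing notes are ignored.
    """
    line = line.lstrip("> ").strip()
    if not line.startswith('name "'):
        return None
    entry = dict(QUOTED_FIELD.findall(line))
    bus = BUS_FIELD.search(line)
    if bus is not None:
        entry["bus"] = bus.group(1)
    return entry


def models_on_bus(lines, bus):
    """Return the model names from the given lines that sit on one bus."""
    names = []
    for line in lines:
        entry = parse_device_line(line)
        if entry is not None and entry.get("bus") == bus:
            names.append(entry["name"])
    return names


if __name__ == "__main__":
    listing = [
        'name "e1000", bus PCI, desc "Intel Gigabit Ethernet"',
        '> name "ne2k_isa", bus ISA',
        '> name "virtio-net-pci", bus PCI, alias "virtio-net"',
    ]
    print(models_on_bus(listing, "PCI"))
```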

Comment 49 Mike Cao 2015-01-19 07:39:01 UTC
(In reply to jason wang from comment #43)
> (In reply to Mike Cao from comment #41)
> > (In reply to jason wang from comment #39)
> > > (In reply to Amos Kong from comment #38)
> > 
> > > 
> > > Let's first try 8139 and virtio-net. Since they were officially supported.
> > 
> > RTL8139 and virtio-net are not in MSFT support list and according to QE's
> > test it does not work for remote debug.
> 
> Can you confirm e1000 (82540EM) is officially supported by MS?
> 
> According to
> https://downloadcenter.intel.com/SearchResult.aspx?lang=eng&ProdId=983
> 82540EM is not supported by Intel for 2k12.
> 
> Thanks

On second thought, I wonder whether there is an agreement between Intel and Microsoft that lets them use some kind of fast-track process to certify a driver on a new OS. It would be a signature-only test, and the certification info would not be displayed on the HCL list, but the driver could still be included in the Microsoft OS tree.

I think it should be supported. Can we reach the Intel partner manager to confirm it?

Comment 52 jason wang 2015-06-26 02:53:47 UTC
I am inclined to close this bug as CANTFIX. 82540EM is not supported by 2k12 at all. There's no need to work around a buggy driver.

