Bug 1882157 - [Azure][RHEL-7]lshw command showing wrong memory information in azure m or mv2 series type of instances.
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: lshw
Version: 7.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: ltao
QA Contact: Jeff Bastian
URL:
Whiteboard:
Depends On:
Blocks: 1882619
Reported: 2020-09-23 22:51 UTC by rcheerla
Modified: 2021-03-17 06:20 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1882619
Environment:
Last Closed: 2021-03-17 06:20:54 UTC
Target Upstream Version:
Embargoed:


Attachments
The command which indicate the issue. (42.46 KB, application/xml)
2020-09-23 23:17 UTC, rcheerla
the command output in json format. (38.59 KB, text/plain)
2020-09-23 23:18 UTC, rcheerla
Showing correct result by disabling DMI as an option to the lshw. (87 bytes, text/plain)
2020-09-23 23:20 UTC, rcheerla

Description rcheerla 2020-09-23 22:51:40 UTC
Description of problem: The lshw command shows wrong memory information on Azure M-series and Mv2-series instances.


Version-Release number of selected component (if applicable): lshw-B.02.18-14.el7.x86_64


How reproducible: Always.


Steps to Reproduce:
1. Install RHEL 7.8 on an M-series or Mv2-series instance in Azure (e.g. Standard_M32ts).
2. Install the lshw package if it is not already installed.
3. Run # lshw -short -C memory

Actual results:

less 0070-lshw_json_C_memory | grep size
    "size" : 65536,
    "size" : 9790498482946048, < bytes >   <<<-----  about 9.79 PB (8.7 PiB)
        "size" : 1073741824
        "size" : 1958092821495808
        "size" : 1958092822544384
        "size" : 1958092823592960
        "size" : 1958092824641536
        "size" : 1958092825690112
        "size" : 33291239424



Expected results:

cat 0060-lshw_disableDMI_C_memory
  *-memory
       description: System memory
       physical id: 0
       size: 192GiB  <<--

Additional info:

Comment 3 rcheerla 2020-09-23 23:17:05 UTC
Created attachment 1716171 [details]
The command which indicate the issue.

Comment 4 rcheerla 2020-09-23 23:18:30 UTC
Created attachment 1716172 [details]
the command output in json format.

Comment 5 rcheerla 2020-09-23 23:20:02 UTC
Created attachment 1716173 [details]
Showing correct result by disabling DMI as an option to the lshw.

Comment 6 Yuxin Sun 2020-09-25 07:31:43 UTC
Reproducible on RHEL-7.9 and RHEL-8.3 when VM memory is greater than 32G, e.g. D16s_v3(64G), E8s_v3(64G), NV6(56G), M32ts(192G). Not reproducible on smaller sizes, e.g. E4_v3(32G), D8s_v3(32G), F1(2G).
I also tested on Hyper-V: the issue reproduces when VM memory is >= 36G.

**Hyper-V**:
If memory >= 36G, "lshw -short -C memory" shows 1780TiB of system memory (e.g. output from a 40G VM):
# lshw -short -C memory
H/W path      Device     Class          Description
===================================================
/0/0                     memory         64KiB BIOS
/0/51                    memory         1780TiB System Memory
/0/51/0                  memory         3968MiB 
/0/51/1                  memory         1780TiB 
/0/51/2                  memory         4225MiB

If 72G, it shows 3561TiB ~= 2*1780TiB:
# lshw -short -C memory
H/W path      Device     Class          Description
===================================================
/0/0                     memory         64KiB BIOS
/0/51                    memory         3561TiB System Memory
/0/51/0                  memory         3968MiB 
/0/51/1                  memory         1780TiB 
/0/51/2                  memory         1780TiB 
/0/51/3                  memory         4226MiB 


**Azure**:
If 128G(E16s_v3): it shows 3*1780TiB:
# lshw -short -C memory
H/W path          Device      Class          Description
========================================================
/0/0                          memory         64KiB BIOS
/0/51                         memory         5342TiB System Memory
/0/51/0                       memory         1GiB 
/0/51/1                       memory         1780TiB 
/0/51/2                       memory         1780TiB 
/0/51/3                       memory         1780TiB 
/0/51/4                       memory         31GiB 

If 192G(M32ts): it shows 8904TiB ~= 5*1780TiB:
H/W path          Device      Class          Description
========================================================
/0/0                          memory         64KiB BIOS
/0/51                         memory         8904TiB System Memory
/0/51/0                       memory         1GiB 
/0/51/1                       memory         1780TiB 
/0/51/2                       memory         1780TiB 
/0/51/3                       memory         1780TiB 
/0/51/4                       memory         1780TiB 
/0/51/5                       memory         1780TiB 
/0/51/6                       memory         31GiB 

lshw packages:
RHEL-8.3: lshw-B.02.19.2-2.el8.x86_64
RHEL-7.9: lshw-B.02.18-17.el7.x86_64
RHEL-7.8: lshw-B.02.18-14.el7.x86_64

Comment 7 Yuxin Sun 2020-09-25 08:30:13 UTC
Xiliang helped to test it in AWS VM and didn't see this issue.

m5.12xlarge: 72G
kernel-3.10.0-1158.el7.x86_64
# lshw -C memory|more
  *-firmware                          
       description: BIOS
       vendor: Amazon EC2
       physical id: 0
       version: 1.0
       date: 10/16/2017
       size: 64KiB
       capacity: 64KiB
       capabilities: pci edd acpi virtualmachine
  *-memory  
       description: System memory
       physical id: 1
       size: 69GiB

m5.12xlarge: 192G
kernel-3.10.0-1160.2.1.el7.x86_64
[root@ip-10-116-2-133 ec2-user]# lshw -C memory 
  *-firmware                          
       description: BIOS
       vendor: Amazon EC2
       physical id: 0
       version: 1.0
       date: 10/16/2017
       size: 64KiB
       capacity: 64KiB
       capabilities: pci edd acpi virtualmachine
  *-memory  
       description: System memory
       physical id: 1
       size: 189GiB

Comment 10 Jeff Bastian 2021-01-08 21:37:10 UTC
I think I see the problem: the SMBIOS tables provided by Azure do not follow the spec [0], and that confuses lshw.  Specifically, the SMBIOS identifies itself as following spec version 2.3, but it uses the "Extended Size" feature from spec version 2.7 to describe the size of the virtual DIMM in slot 1.  As a result, lshw reads the ASCII strings appended to the Type 17 record as the Extended Size value and computes a garbage value for the DIMM size.

[0] https://www.dmtf.org/standards/smbios


The raw SMBIOS data:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[root@wala79e8sv301081516-vm1 ~]# xxd /sys/firmware/dmi/entries/17-1/raw
0000000: 111b 5700 5100 5000 ffff ffff ff7f 0200  ..W.Q.P.........
0000010: 0102 0104 0000 0003 0202 024d 3100 4e6f  ...........M1.No
0000020: 6e65 004d 6963 726f 736f 6674 0000       ne.Microsoft..

[root@wala79e8sv301081516-vm1 ~]# dmidecode -H 0x57 -u
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.3 present.
338 structures occupying 17307 bytes.
Table at 0x000F93D0.

Handle 0x0057, DMI type 17, 27 bytes
	Header and Data:
		11 1B 57 00 51 00 50 00 FF FF FF FF FF 7F 02 00
		01 02 01 04 00 00 00 03 02 02 02
	Strings:
		4D 31 00
		"M1"
		4E 6F 6E 65 00
		"None"
		4D 69 63 72 6F 73 6F 66 74 00
		"Microsoft"
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Looking at the first row of raw bytes:

		11 1B 57 00 51 00 50 00 FF FF FF FF FF 7F 02 00
                   ^^                               ^^^^^
                   |                                |
                   |                                + size of memory device
                   |
                   +-- length of the structure, 0x1B is used for SMBIOS spec 2.3


On the Size field, the spec says:

  Size of the memory device

  If the value is 0, no memory device is installed in the
  socket; if the size is unknown, the field value is
  FFFFh. If the size is 32 GB-1 MB or greater, the
  field value is 7FFFh and the actual size is stored in
  the Extended Size field.

As you can see, the Size field here is 0x7FFF, which means the actual size should be read from the Extended Size field at offsets 0x1C through 0x1F.  However, the structure is only 0x1B bytes long (offsets 0x00 through 0x1A), so offset 0x1C is the second byte of the appended ASCII strings.  lshw is interpreting the bytes 31 00 4E 6F (the ASCII chars "1", NUL, "N", and "o" from the "M1" and "None" strings) as the Extended Size.
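
The mismatch between the structure length and the Extended Size offsets can be tested directly. The following is a minimal sketch, not lshw code: the helper name t17_has_extended_size and the enum names are made up for illustration, and it assumes data points at the first byte of a Type 17 structure.

```c
#include <stdint.h>

/* Offsets within an SMBIOS Type 17 (Memory Device) structure. */
enum {
    T17_LENGTH        = 0x01, /* length of the formatted area, in bytes */
    T17_SIZE          = 0x0C, /* 16-bit Size field, little-endian */
    T17_EXTENDED_SIZE = 0x1C  /* 32-bit Extended Size field, little-endian */
};

/* Hypothetical helper: returns 1 if the formatted area is long enough to
 * hold all four bytes of the Extended Size field (0x1C..0x1F), 0 otherwise. */
int t17_has_extended_size(const uint8_t *data)
{
    return data[T17_LENGTH] >= T17_EXTENDED_SIZE + 4;
}
```

For the record above, data[0x01] is 0x1B, which is less than 0x20, so the helper returns 0: whatever sits at offsets 0x1C through 0x1F belongs to the string area, not to the structure.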

Looking at the lshw source code [1], the size is calculated in this scenario in src/core/dmi.cc:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ nl -ba src/core/dmi.cc
...
  1568  // size
  1569            u = data[0x0D] << 8 | data[0x0C];
  1570            if(u == 0x7FFF) {
  1571               unsigned long long extendsize = (data[0x1F] << 24) | (data[0x1E] << 16) | (data[0x1D] << 8) | data[0x1C];
  1572               extendsize &= 0x7FFFFFFFUL;
  1573               size = extendsize * 1024ULL * 1024ULL;
  1574            }
...
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[1] https://github.com/lyonel/lshw/blob/master/src/core/dmi.cc#L1568


Simple test program with the raw data combined with the above chunk of code:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <stdio.h>
#include <stdint.h>

uint8_t raw[] = {
    0x11, 0x1B, 0x57, 0x00, 0x51, 0x00, 0x50, 0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x7F, 0x02, 0x00,
    0x01, 0x02, 0x01, 0x04, 0x00, 0x00, 0x00, 0x03, 0x02, 0x02, 0x02,
    0x4D, 0x31, 0x00,
    0x4E, 0x6F, 0x6E, 0x65, 0x00,
    0x4D, 0x69, 0x63, 0x72, 0x6F, 0x73, 0x6F, 0x66, 0x74, 0x00
};

int main(void)
{
    uint32_t u = 0;
    unsigned long long size = 0;
    uint8_t *data;

    data = raw;

// size
    u = data[0x0D] << 8 | data[0x0C];
    if(u == 0x7FFF) {
        unsigned long long extendsize = (data[0x1F] << 24) | (data[0x1E] << 16) | (data[0x1D] << 8) | data[0x1C];
        extendsize &= 0x7FFFFFFFUL;
        size = extendsize * 1024ULL * 1024ULL;
    }

    printf("size = %llu bytes\n", size);

    return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Running it indeed reports a gigantic size:

$ ./a.out
size = 1958092821495808 bytes
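
To relate this byte count to the lshw output earlier in this bug, it helps to convert it to binary units. This is only a sketch: the helper names bytes_to_mib and bytes_to_tib are hypothetical, and both round down, which is an assumption about how lshw prints sizes.

```c
#include <stdint.h>

/* Hypothetical helper: whole mebibytes (2^20 bytes) in a byte count, rounded down. */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes >> 20;
}

/* Hypothetical helper: whole tebibytes (2^40 bytes) in a byte count, rounded down. */
uint64_t bytes_to_tib(uint64_t bytes)
{
    return bytes >> 40;
}
```

bytes_to_tib(1958092821495808ULL) gives 1780, in line with the 1780TiB entries reported in comment 6. bytes_to_mib(1958092821495808ULL) gives 1867382833, which is 0x6F4E0031, i.e. the bytes 6F 4E 00 31 ("o", "N", NUL, "1") read as a little-endian 32-bit value.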


It seems this is not a bug in lshw, but rather Microsoft needs to update the SMBIOS table for Azure VMs to version 2.7 or newer of the spec and provide a proper Extended Size field.

Comment 13 Jeff Bastian 2021-01-08 23:16:20 UTC
I retract my comment 10: this is not a bug in the Azure SMBIOS tables after all.  Upon further reflection, I realized this chunk of code in lshw is not checking the SMBIOS version number.  If the version is less than 2.7, then it should treat the size value of 0x7FFF as a raw value and not a special code.

Here is a simple patch to add a version check:

diff --git a/src/core/dmi.cc b/src/core/dmi.cc
index 30b3ab3b995c..d33d4879bdca 100644
--- a/src/core/dmi.cc
+++ b/src/core/dmi.cc
@@ -1567,10 +1567,13 @@ int dmiversionrev)
 
 // size
           u = data[0x0D] << 8 | data[0x0C];
-          if(u == 0x7FFF) {
-             unsigned long long extendsize = (data[0x1F] << 24) | (data[0x1E] << 16) | (data[0x1D] << 8) | data[0x1C];
-             extendsize &= 0x7FFFFFFFUL;
-             size = extendsize * 1024ULL * 1024ULL;
+          if ((dmiversionmaj > 2)
+            || ((dmiversionmaj == 2) && (dmiversionmin >= 7))) {
+             if(u == 0x7FFF) {
+                unsigned long long extendsize = (data[0x1F] << 24) | (data[0x1E] << 16) | (data[0x1D] << 8) | data[0x1C];
+                extendsize &= 0x7FFFFFFFUL;
+                size = extendsize * 1024ULL * 1024ULL;
+             }
           }
 	  else
           if (u != 0xFFFF)



With this patch in place, lshw reports correct values:

[root@wala79e8sv301081516-vm1 ~]# rpm -q lshw
lshw-B.02.18-17.bz1882157.el7.x86_64

[root@wala79e8sv301081516-vm1 ~]# lshw -short -C memory
H/W path          Device      Class          Description
========================================================
/0/0                          memory         64KiB BIOS
/0/51                         memory         64GiB System Memory
/0/51/0                       memory         1GiB 
/0/51/1                       memory         31GiB 
/0/51/2                       memory         31GiB 
/0/51/3                       memory         [empty]
/0/51/4                       memory         [empty]
/0/51/5                       memory         [empty]
/0/51/6                       memory         [empty]
...
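
The version gate can also be sketched as a standalone C function. This is not the lshw code or the patch above: the function name t17_size_bytes and its parameters are made up for illustration, and the kilobyte/megabyte handling of bit 15 follows my reading of the Size field in the SMBIOS spec. It assumes data points at a Type 17 record with at least 0x20 readable bytes.

```c
#include <stdint.h>

/* Sketch: decode the Type 17 Size field into bytes.
 * Returns 0 when the size is unknown (0xFFFF) or no device is installed (0). */
uint64_t t17_size_bytes(const uint8_t *data, int ver_major, int ver_minor)
{
    uint16_t u = (uint16_t)(data[0x0D] << 8 | data[0x0C]);
    /* Extended Size only exists from SMBIOS 2.7 onwards. */
    int has_ext = ver_major > 2 || (ver_major == 2 && ver_minor >= 7);

    if (u == 0x7FFF && has_ext) {
        uint64_t ext = ((uint64_t)data[0x1F] << 24) | ((uint64_t)data[0x1E] << 16)
                     | ((uint64_t)data[0x1D] << 8) | data[0x1C];
        ext &= 0x7FFFFFFFULL;           /* bit 31 is reserved */
        return ext * 1024ULL * 1024ULL; /* Extended Size is in megabytes */
    }
    if (u == 0xFFFF)
        return 0;
    /* Bit 15 set: value is in kilobytes; clear: value is in megabytes. */
    return 1024ULL * (u & 0x7FFF) * ((u & 0x8000) ? 1 : 1024ULL);
}
```

With the raw record from comment 10 and version 2.3, this returns 34358689792 bytes (32767 MB, just under 32GiB), consistent with the 31GiB entries above. With version 2.7 on the same bytes it returns the garbage value 1958092821495808. Folding the version test into the 0x7FFF condition keeps the ordinary 16-bit decoding in effect for 2.7+ tables whose Size field is not 0x7FFF.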

Comment 15 Jeff Bastian 2021-01-08 23:26:26 UTC
Upstream pull request:

https://github.com/lyonel/lshw/pull/60

