Bug 149177

Summary: rhn_register - unable to send hardware profile in RHEL4
Product: Red Hat Enterprise Linux 4 Reporter: Mike Ferris <mferris>
Component: up2date Assignee: Bryan Kearney <bkearney>
Status: CLOSED ERRATA QA Contact: Brandon Perkins <bperkins>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.4 CC: acarter, bkearney, bob.gautier, cperry, dcurtis, dduval, dmateo, fong, hpocock, kkathapu, pednekar, pnikam, rhn-bugs, sjejani, sstclair, tao, tdeanton
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0250 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-05-01 22:52:29 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 191074, 191079, 218639    
Attachments:
Description Flags
Snapshot of failed rhn_registration
none
lspci -v output
none
/sbin/lspci -v output attachment
none
tdeanton info from PYTHON command
none
tdeanton Python Output Aug 29th 2005 none

Description Mike Ferris 2005-02-20 05:02:23 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041215 Firefox/1.0 Red Hat/1.0-12.EL4

Description of problem:
rhn_register fails when attempting to send hardware profile for freshly installed RHEL4 machine. 

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install clean RHEL4
2. Do not register with RHN on firstboot (did not attempt) 
3. run rhn_register (as root) 
4. select "use existing rhn-login"
rhn_register failed (see attached .png) when trying to send hardware profile.  
5. I then unchecked the hardware profile selection - and was able to send rpm database/complete registration.   I tried this several times (all from machine within Red Hat VPN btw) 
  

Actual Results:  Completed registration without sending Hardware profile on IBMT41 Thinkpad

Expected Results:  HW profile sent

Additional info:

Comment 1 Mike Ferris 2005-02-20 05:04:06 UTC
Created attachment 111236 [details]
Snapshot of failed rhn_registration

Comment 2 Dominic Duval 2005-02-20 20:30:27 UTC
I'm getting the exact same issue on my T41 as well. Probably a hardware-related
problem.

Comment 4 Dennixx 2005-04-18 13:41:42 UTC
Same problem here, I've seen this on different (types of) systems, so I don't
think it has something to do with the hardware.
Unchecking the 'send hardware' option works.
'up2date --hardware' afterwards also works without problems.
I tried installing the up2date errata (downloaded and copied from another RHEL4
system), but this didn't fix the problem.

Comment 5 David Tonhofer 2005-05-18 09:34:25 UTC
I have seen this on two completely different systems - a Fujitsu-Siemens L100 and
a Fujitsu-Siemens Scaleo 600. Exactly as described above. But note that this
was up2date 4.4.5-1, which is replaced once you update.

1. Install clean RHEL4
2. Do not register with RHN on firstboot (did not attempt) 
3. Run up2date --register
4. Select "use existing rhn-login", enter a "subscription number to activate"
   (I also had the problem where some numbers were deemed "invalid", but that
    is another problem). Tickboxes marked: "send package list", "send hardware
   information".
5. See the pop-up saying "Problem sending hardware profile".
6. Go to RHN - See the system has actually been added to your list of
   registered systems!! It's just the hardware profile that's not there.
   The hardware tab of the system says "Hardware not yet profiled."
   The overview tab of the system says "System is up to date" (which is wrong)
   "Hostname: unknown" "IP Address: unknown" "Kernel: unknown"
   "Registered: 2005-05-18 05:00:46 EDT" (correct; btw can we have that
   in *UTC*, please?) "Checked In: 2005-05-18 05:00:46 EDT" (correct)
   "Last Booted: unknown"

   The system is entitled for 'updates', which is nice.

   The *really* bad part is, if you don't check with RHN and assume that
   registration failed, and try "up2date --register" again, you will
   re-register the system - it will show up twice in your list of systems.
   With the *same* name but *different* system IDs. I thought the name should
   be unique :-( 

   So I delete one of those systems. I hope I haven't lost a subscription 8-o
 
   Anyway, the registration should IMAO be either a transaction - if it fails
   it fails completely - or else the user should be told 'registration succeeded
   but your hardware profile could not be transmitted because of: <error
   details useful for the 'zilla>'
7. BREAK OFF the up2date process by clicking 'cancel' after having closed
   the popup window bearing ill news.
8. Run up2date --list in the shell. This now works. On the RHN, you
   see that the "checked in" date of the system you did not delete
   changes to "2005-05-18 05:25:08 EDT". Good. The "base channel"
   automagically changes to "Red Hat Enterprise Linux ES (v. 4 for 32-bit x86)"
   Good.
9. Run up2date again. It will now update up2date. After that you can 
   run up2date normally, schedule a "hardware refresh" in the RHN which will
   effect a correct hardware profile update.

   
   

   


Comment 6 Mihai Ibanescu 2005-07-29 15:28:16 UTC
Can you please send the output of:

/sbin/lspci -v

Comment 7 Dominic Duval 2005-07-30 20:46:59 UTC
Created attachment 117314 [details]
lspci -v output

Comment 8 Dominic Duval 2005-07-30 20:54:08 UTC
Concerning the output of lspci -v on my T41: Note that the system board on this
laptop was replaced *after* I saw this problem. The new board is supposed to be
identical to the original, but there might be a BIOS update.



Comment 10 Mihai Ibanescu 2005-08-01 21:17:04 UTC
Ok, that didn't help. Trying something else. Can you please run:

PYTHONPATH=/usr/share/rhn python -c "from up2date_client import hardware; print hardware.Hardware()" > /tmp/hardware-info.txt

and attach /tmp/hardware-info.txt

Comment 11 Thomas "Shea" DeAntonio 2005-08-05 04:33:47 UTC
Created attachment 117485 [details]
/sbin/lspci -v  output attachment

I am attaching my output of this as well in the hope that I can assist in
getting this resolved. Of note, we get calls on this issue from users
with different systems.

Comment 12 Thomas "Shea" DeAntonio 2005-08-05 04:37:51 UTC
Created attachment 117486 [details]
tdeanton info from PYTHON command

Comment 13 Mihai Ibanescu 2005-08-26 18:11:59 UTC
Shea, can you please run the python command on a single line? Otherwise the
output doesn't seem to be of any use.

Comment 14 Thomas "Shea" DeAntonio 2005-08-29 00:52:02 UTC
Created attachment 118198 [details]
tdeanton Python Output Aug 29th 2005

Misa, 
I have attached a new file from the python output that has detail this time.
AFAIK the "cannot send hw profile" failure is reproducible on any number of systems
trying to register RHEL 4 to RHN.

Comment 15 David Tonhofer 2005-12-07 14:56:23 UTC
My prayer in May was of no use. I just noticed this problem basically cost me
two RH ES subscriptions... nice one... &/%!+!!

Comment 16 David Tonhofer 2005-12-07 16:12:26 UTC
SNAFU! Disregard the last mail. I was really confused by the subscription
unlocking process and the RHN maze, in the meantime I have found how to activate
the subscription in RHN, all my 'unused' subscriptions still seem to exist.
Sorry about the mess.


Comment 18 Fanny Augustin 2006-04-11 00:24:05 UTC
Blocking rhnupr4u4 and rhnupr3u8 to track the progress of the release

Comment 19 Fanny Augustin 2006-04-13 19:26:17 UTC
Moving bugs to the CanFix List

Comment 20 Fanny Augustin 2006-05-08 19:03:10 UTC
This bug did not make the code freeze and it will not be fixed during this
release cycle.  Re-aligning bug to the next release

Comment 21 Fanny Augustin 2006-05-08 20:03:01 UTC
This bug did not make the code freeze.  It will not be fixed in this release.
Realigning to the next one.

Comment 22 James Bowes 2006-07-26 21:39:46 UTC
*** Bug 155630 has been marked as a duplicate of this bug. ***

Comment 23 James Bowes 2006-07-26 21:43:22 UTC
*** Bug 196493 has been marked as a duplicate of this bug. ***

Comment 24 Bob Gautier 2006-07-27 07:56:27 UTC
Will this be fixed in the next update?

Comment 25 Kyle Powell 2006-07-27 13:24:11 UTC
My previous bug was closed as a duplicate so I'm adding my comments / proposed
patch from that bug to this one:

The key piece of info from hardware.py is:
"Reading DMI info failed"

Looking inside hardware.py we find:
   try:
       import dmi
   except:
       return {}
   # Now try to parse DMI
   try:
       d = dmi.DMI()
   except: # failed to read/parse the DMI information
       print "Reading DMI info failed"
       return {}

I ran this:

[root@foo src]# python -c "import dmi;d = dmi.DMI()"
Traceback (most recent call last):
 File "<string>", line 1, in ?
dmi.AccessError: Could not parse the DMI data
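
The bare except in the hardware.py excerpt above discards the reason for the
failure, which is why the direct call was needed to see the AccessError. A
minimal sketch of keeping that reason, written in Python 3 rather than the
Python 2 that up2date used. The names read_dmi_info, failing_loader and the
local AccessError class are hypothetical stand-ins, not part of up2date or the
dmi module:

```python
class AccessError(Exception):
    """Hypothetical stand-in for dmi.AccessError."""


def read_dmi_info(load_dmi):
    """Return (data, reason): data is {} and reason is set if load_dmi() fails."""
    try:
        return load_dmi(), None
    except Exception as exc:
        # Keep the exception type and message instead of discarding them.
        return {}, f"{type(exc).__name__}: {exc}"


def failing_loader():
    # Simulates the failure seen in the traceback above.
    raise AccessError("Could not parse the DMI data")


info, reason = read_dmi_info(failing_loader)
print(info, reason)
```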

dmi is provided by dmimodule.so. Examining dmimodule.c:
   if (ret != 0) {
       Py_DECREF(new_self);
       if (ret < 0)
           PyErr_SetString(AccessError, "Could not parse the DMI data");
       return NULL;
   }

So the only way to get that error is with a negative return value. Only one
place in dmimodule.c can produce a negative return value:

   while (fp < 0xFFFFF) {
       fp += 16;
       if (read(fd, buf, 16) != 16)
           return -1;

fd is /dev/mem. The error returned from read is -EPERM. The only place
/char/mem.c returns -EPERM is:

       if (!range_is_allowed(realp, realp+count))
               return -EPERM;

and range_is_allowed is:
static inline int range_is_allowed(unsigned long from, unsigned long to)
{
       unsigned long cursor;

       cursor = from >> PAGE_SHIFT;
       while ((cursor << PAGE_SHIFT) < to) {
               if (!devmem_is_allowed(cursor))
                       return 0;
               cursor++;
       }
       return 1;
}

and finally devmem_is_allowed() from arch/x86_64/mm/init.c:
/*
* devmem_is_allowed() checks to see if /dev/mem access to a certain address is
* valid. The argument is a physical page number.
*
*
* On x86-64, access has to be given to the first megabyte of ram because that area
* contains bios code and data regions used by X and dosemu and similar apps.
* Access has to be given to non-kernel-ram areas as well, these contain the PCI
* mmio resources as well as potential bios/acpi data regions.
*/
int devmem_is_allowed(unsigned long pagenr)
{
       if (pagenr <= 256)
               return 1;
       if (!page_is_ram(pagenr))
               return 1;
       return 0;
}

So the problem is the read in dmimodule is trying to access /dev/mem beyond 1M.
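
To make the two kernel checks easier to follow, here is a rough Python model of
them. It assumes 4 KiB pages (PAGE_SHIFT = 12), and toy_page_is_ram is a
made-up memory map for illustration only, not what any real machine reports:

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def devmem_is_allowed(pagenr, page_is_ram):
    """Model of the quoted check: low pages and non-RAM pages are allowed."""
    if pagenr <= 256:
        return True
    if not page_is_ram(pagenr):
        return True
    return False


def range_is_allowed(start, end, page_is_ram):
    """Return True only if every page overlapping [start, end) is allowed."""
    cursor = start >> PAGE_SHIFT
    while (cursor << PAGE_SHIFT) < end:
        if not devmem_is_allowed(cursor, page_is_ram):
            return False
        cursor += 1
    return True


def toy_page_is_ram(pagenr):
    # Hypothetical map: pages 257 up to the 1 GiB mark are ordinary RAM.
    return 257 <= pagenr < (1 << 30) >> PAGE_SHIFT


# A 16-byte read inside the first megabyte passes the check.
low_read = range_is_allowed(0xF0000, 0xF0010, toy_page_is_ram)
# A 16-byte read that lands in ordinary RAM above 1 MiB is refused (-EPERM).
high_read = range_is_allowed(0x3FF00000, 0x3FF00010, toy_page_is_ram)
print(low_read, high_read)
```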

Examining dmimodule more closely you find:

           ret = dmi_table(self, fd, base, len, num);
           if (ret != 0)
               return ret;

and inside dmi_table():

   if (lseek(fd, (long) base, 0) == -1) {
       PyErr_SetString(AccessError, "dmi: lseek");
       return 1;
   }
   if (read(fd, buf, len) != len) {
       PyErr_SetString(AccessError, "dmi: read");
       return 1;

So when dmi_table() is called, we're changing the file position inside /dev/mem
yet not setting it back to where it was when we return from the call.
Consequently, even though the module is trying to keep itself from reading
/dev/mem beyond 1M, the fp variable that it uses is out of sync with the actual
file position inside /dev/mem after the call to dmi_table() returns. My proposed
1 line patch to fix this:
--- dmimodule.c 2006-06-22 13:15:43.000000000 -0400
+++ dmimodule.c.fixed   2006-06-23 14:03:57.000000000 -0400
@@ -358,6 +358,7 @@
           PyDict_SetItemString(info, "DMI", Py_BuildValue("s", buffer));

           ret = dmi_table(self, fd, base, len, num);
+            lseek(fd, fp+16, 0);
           if (ret != 0)
               return ret;
       }
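
The desync can be reproduced without /dev/mem. The following Python 3 sketch
uses an ordinary temporary file; side_read is a hypothetical stand-in for
dmi_table(), and the offsets (64, 4096, 256) are arbitrary values chosen for the
demonstration. It only illustrates the idea behind the patch, not the exact
offset arithmetic in dmimodule.c:

```python
import os
import tempfile

CHUNK = 16


def side_read(fd, base, length):
    """Stand-in for dmi_table(): seeks elsewhere and reads, moving the offset."""
    os.lseek(fd, base, os.SEEK_SET)
    return os.read(fd, length)


def scan(fd, limit, table_at, restore):
    """Scan [0, limit) in 16-byte steps; return (fp, real file offset)."""
    fp = 0
    os.lseek(fd, 0, os.SEEK_SET)
    while fp < limit:
        os.read(fd, CHUNK)
        fp += CHUNK
        if fp == table_at:
            side_read(fd, base=4096, length=100)
            if restore:
                # Same idea as the proposed lseek() after dmi_table().
                os.lseek(fd, fp, os.SEEK_SET)
    return fp, os.lseek(fd, 0, os.SEEK_CUR)


with tempfile.TemporaryFile() as tmp:
    tmp.write(bytes(8192))
    tmp.flush()
    fd = tmp.fileno()
    unpatched = scan(fd, limit=256, table_at=64, restore=False)
    patched = scan(fd, limit=256, table_at=64, restore=True)

print(unpatched, patched)
```

Without the restore, the counter says the scan stopped at 256 while the file
offset has wandered past 4096; with it, the two agree.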

I built and tested a modified dmimodule.so with the above patch and got the
following output:
[root@foo src]# /usr/share/rhn/up2date_client/hardware.py
'pcibus' : '16'

<snip>

'hostname' : 'foo.foo.com'
'ipaddr' : '192.168.1.1'
'class' : 'NETINFO'

'product' : '0WC983'
'vendor' : 'Dell Computer Corporation'
'bios_vendor' : 'Dell Computer Corporation'
'system' : 'PowerEdge 6850 '
'bios_release' : '10/17/2005'
'board' : 'Dell Computer Corporation'
'bios_version' : 'A02'
'class' : 'DMI'
'asset' : '(chassis: 6LGBC91) (board: ..CN124605CK0303.) (system: 6LGBC91) '

So it appears to work...

Comment 33 James Bowes 2006-11-21 18:01:38 UTC
Fixed in up2date-4.5.1-1.

We cannot ensure that we will always send the hardware information (the network
could go down, etc). But since at the point when we send the hardware info  we
have already registered the system, not sending hardware info is not fatal. So
what happens now is the progress window reports what step it is performing
("Sending hardware information", etc.), and reports if one of the steps fails,
then continues on.
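
The behaviour described here (report each step, note a failure, carry on) could
look roughly like the sketch below. It is not the up2date code; the function
run_registration_steps, the step labels other than "Sending hardware
information", and the stub actions are all assumptions made for illustration:

```python
def run_registration_steps(steps, report=print):
    """Run each (label, action) step; report failures and keep going."""
    failed = []
    for label, action in steps:
        report(f"{label}...")
        try:
            action()
        except Exception as exc:
            # A failed step is reported but does not abort the remaining steps.
            report(f"Problem: {label} failed: {exc}")
            failed.append(label)
    return failed


def send_hardware():
    # Hypothetical stub that fails the way the DMI read did.
    raise IOError("Could not parse the DMI data")


def send_packages():
    # Hypothetical stub that succeeds.
    pass


messages = []
failed = run_registration_steps(
    [("Sending hardware information", send_hardware),
     ("Sending package information", send_packages)],
    report=messages.append,
)
print(failed)
```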

Comment 35 Daniel Riek 2007-01-09 23:42:31 UTC
Changing product to RHEL and component to rhn_register.

Comment 39 Red Hat Bugzilla 2007-04-12 00:12:03 UTC
User bnackash's account has been closed

Comment 40 Red Hat Bugzilla 2007-05-01 22:52:29 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0250.html