Bug 149177
| Field | Value |
|---|---|
| Summary | rhn_register - unable to send hardware profile in RHEL4 |
| Product | Red Hat Enterprise Linux 4 |
| Component | up2date |
| Version | 4.4 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Hardware | All |
| OS | Linux |
| Reporter | Mike Ferris <mferris> |
| Assignee | Bryan Kearney <bkearney> |
| QA Contact | Brandon Perkins <bperkins> |
| CC | acarter, bkearney, bob.gautier, cperry, dcurtis, dduval, dmateo, fong, hpocock, kkathapu, pednekar, pnikam, rhn-bugs, sjejani, sstclair, tao, tdeanton |
| Fixed In Version | RHBA-2007-0250 |
| Doc Type | Bug Fix |
| Last Closed | 2007-05-01 22:52:29 UTC |
| Bug Blocks | 191074, 191079, 218639 |

Description
Mike Ferris
2005-02-20 05:02:23 UTC
Created attachment 111236 [details]
Snapshot of failed rhn_registration
I'm getting the exact same issue on my T41 as well. Probably a hardware-related problem.

Same problem here. I've seen this on different (types of) systems, so I don't think it has anything to do with the hardware. Uncommenting the 'send hardware' option works. 'up2date --hardware' afterwards also works without problems.

I tried installing the up2date errata (downloaded and copied from another RHEL4 system), but this didn't fix the problem.

I have seen this on two completely different systems - a Fujitsu-Siemens L100 and a Fujitsu-Siemens Scaleo 600. Exactly as described above. But note that this was up2date 4.4.5-1, which is replaced once you update.

1. Install clean RHEL4
2. Do not register with RHN on firstboot (did not attempt)
3. Run up2date --register
4. Select "use existing rhn-login", enter a "subscription number to activate" (I also had the problem where some numbers were deemed "invalid", but that is another problem). Tickboxes marked: "send package list", "send hardware information".
5. See the pop-up saying "Problem sending hardware profile".
6. Go to RHN - See the system has actually been added to your list of registered systems!! It's just the hardware profile that's not there. The hardware tab of the system says "Hardware not yet profiled." The overview tab of the system says:
   - "System is up to date" (which is wrong)
   - "Hostname: unknown"
   - "IP Address: unknown"
   - "Kernel: unknown"
   - "Registered: 2005-05-18 05:00:46 EDT" (correct; btw can we have that in *UTC*, please?)
   - "Checked In: 2005-05-18 05:00:46 EDT" (correct)
   - "Last Booted: unknown"

   The system is entitled for 'updates', which is nice.

   The *really* bad part is, if you don't check with RHN and assume that registration failed, and try "up2date --register" again, you will re-register the system - it will show up twice in your list of systems. With the *same* name but *different* system IDs. I thought the name should be unique :-( So I delete one of those systems. I hope I haven't lost a subscription 8-o

   Anyway, the registration should IMAO be either a transaction - if it fails it fails completely - or else the user should be told 'registration succeeded but your hardware profile could not be transmitted because of: <error details useful for the 'zilla>'
7. BREAK OFF the up2date process by clicking 'cancel' after having closed the popup window bearing ill news.
8. Run up2date --list in the shell. This now works. On the RHN, you see that the "checked in" date of the system you did not delete changes to "2005-05-18 05:25:08 EDT". Good. The "base channel" automagically changes to "Red Hat Enterprise Linux ES (v. 4 for 32-bit x86)". Good.
9. Run up2date again. It will now update up2date. After that you can run up2date normally, schedule a "hardware refresh" in the RHN which will effect a correct hardware profile update.

Can you please send the output of:

    /sbin/lspci -v

Created attachment 117314 [details]
lspci -v output
Concerning the output of lspci -v on my T41: Note that the system board on this laptop was replaced *after* I saw this problem. The new board is supposed to be identical to the original, but there might be a BIOS update.

Ok, that didn't help. Trying something else. Can you please run:

    PYTHONPATH=/usr/share/rhn python -c "from up2date_client import hardware; print hardware.Hardware()" > /tmp/hardware-info.txt

and attach /tmp/hardware-info.txt

Created attachment 117485 [details]
/sbin/lspci -v output attachment
I am attaching my output of this as well in the hope that I can assist in
getting this resolved. NOTE of interest we get calls on this issue from users
with different systems.
Created attachment 117486 [details]
tdeanton info from PYTHON command
Shea, can you please run the python command on a single line? Otherwise the output doesn't seem to be of any use.

Created attachment 118198 [details]
tdeanton Python Output Aug 29th 2005
Misa,
I have attached a new file from the python output that has detail this time.
AFAIK the "cannot send hw profile" problem is reproducible on any number of systems
trying to register RHEL 4 to RHN.
My prayer in May was of no use. I just noticed this problem basically cost me two RH ES subscriptions... nice one... &/%!+!! SNAFU!

Disregard the last mail. I was really confused by the subscription unlocking process and the RHN maze. In the meantime I have found how to activate the subscription in RHN, and all my 'unused' subscriptions still seem to exist. Sorry about the mess.

Blocking rhnupr4u4 and rhnupr3u8 to track the progress of the release

Moving bugs to the CanFix List

This bug did not make the code freeze and it will not be fixed during this release cycle. Re-aligning bug to the next release

This bug did not make the code freeze. It will not be fixed in this release. Re-aligning to the next one.

*** Bug 155630 has been marked as a duplicate of this bug. ***

*** Bug 196493 has been marked as a duplicate of this bug. ***

Will this be fixed in the next update?

My previous bug was closed as a duplicate so I'm adding my comments / proposed patch from that bug to this one:

The key piece of info from hardware.py is: "Reading DMI info failed"

Looking inside hardware.py we find:

    try:
        import dmi
    except:
        return {}
    # Now try to parse DMI
    try:
        d = dmi.DMI()
    except:
        # failed to read/parse the DMI information
        print "Reading DMI info failed"
        return {}

I ran this:

    [root@foo src]# python -c "import dmi;d = dmi.DMI()"
    Traceback (most recent call last):
      File "<string>", line 1, in ?
    dmi.AccessError: Could not parse the DMI data

dmi is provided by dmimodule.so. Examining dmimodule.c:

    if (ret != 0) {
        Py_DECREF(new_self);
        if (ret < 0)
            PyErr_SetString(AccessError, "Could not parse the DMI data");
        return NULL;
    }

So the only way to get that error is with a negative return value. Only one place in dmimodule.c can produce a negative return value:

    while (fp < 0xFFFFF) {
        fp += 16;
        if (read(fd, buf, 16) != 16)
            return -1;

fd is /dev/mem. The error returned from read is -EPERM.
The only place /char/mem.c returns -EPERM is:

    if (!range_is_allowed(realp, realp+count))
        return -EPERM;

and range_is_allowed is:

    static inline int range_is_allowed(unsigned long from, unsigned long to)
    {
        unsigned long cursor;

        cursor = from >> PAGE_SHIFT;
        while ((cursor << PAGE_SHIFT) < to) {
            if (!devmem_is_allowed(cursor))
                return 0;
            cursor++;
        }
        return 1;
    }

and finally devmem_is_allowed() from arch/x86_64/mm/init.c:

    /*
     * devmem_is_allowed() checks to see if /dev/mem access to a certain address is
     * valid. The argument is a physical page number.
     *
     *
     * On x86-64, access has to be given to the first megabyte of ram because that area
     * contains bios code and data regions used by X and dosemu and similar apps.
     * Access has to be given to non-kernel-ram areas as well, these contain the PCI
     * mmio resources as well as potential bios/acpi data regions.
     */
    int devmem_is_allowed(unsigned long pagenr)
    {
        if (pagenr <= 256)
            return 1;
        if (!page_is_ram(pagenr))
            return 1;
        return 0;
    }

So the problem is that the read in dmimodule is trying to access /dev/mem beyond 1M. Examining dmimodule more closely you find:

    ret = dmi_table(self, fd, base, len, num);
    if (ret != 0)
        return ret;

and inside dmi_table():

    if (lseek(fd, (long) base, 0) == -1) {
        PyErr_SetString(AccessError, "dmi: lseek");
        return 1;
    }
    if (read(fd, buf, len) != len) {
        PyErr_SetString(AccessError, "dmi: read");
        return 1;

So when dmi_table() is called, we're changing the file position inside /dev/mem yet not setting it back to where it was when we return from the call. Consequently, even though the module is trying to keep itself from reading /dev/mem beyond 1M, the fp variable that it uses is out of sync with the actual file position inside /dev/mem after the call to dmi_table() returns.
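To make the position drift concrete, here is a minimal Python sketch of the scan loop, not the real dmimodule code. Everything in it is an assumption made for illustration: `FakeDevMem` stands in for /dev/mem, and the scan start address, anchor offset, table address, table length and RAM page range are hypothetical values chosen so that the table lives in a non-RAM page directly below kernel RAM. Only the loop bound, the 16-byte reads and the page check mirror the excerpts quoted above.

```python
import io

PAGE_SHIFT = 12
SCAN_START = 0xE0000             # assumed scan start; not shown in the quoted excerpt
SCAN_END = 0xFFFFF               # loop bound quoted from dmimodule.c
ANCHOR_AT = 0xF0000              # hypothetical offset where the DMI anchor is found
TABLE_BASE = 0x1FFFC0            # hypothetical table address in a non-RAM page
TABLE_LEN = 64                   # hypothetical table length
RAM_PAGES = range(0x200, 0x300)  # hypothetical kernel-RAM pages


def devmem_is_allowed(page):
    """Same rule as the quoted kernel check: first 1M plus non-RAM pages."""
    return page <= 256 or page not in RAM_PAGES


class FakeDevMem:
    """Stand-in for /dev/mem that refuses reads touching a forbidden page."""

    def __init__(self):
        self._f = io.BytesIO(bytes(0x300000))

    def seek(self, pos):
        self._f.seek(pos)

    def tell(self):
        return self._f.tell()

    def read(self, count):
        start = self._f.tell()
        first = start >> PAGE_SHIFT
        last = (start + count - 1) >> PAGE_SHIFT
        for page in range(first, last + 1):
            if not devmem_is_allowed(page):
                raise PermissionError("EPERM")
        return self._f.read(count)


def scan(restore_position):
    """Walk the BIOS area in 16-byte steps, like the loop in dmimodule.c."""
    mem = FakeDevMem()
    fp = SCAN_START - 16
    mem.seek(SCAN_START)
    try:
        while fp < SCAN_END:
            fp += 16
            mem.read(16)
            if fp == ANCHOR_AT:
                # dmi_table(): jump to the table and read it
                mem.seek(TABLE_BASE)
                mem.read(TABLE_LEN)
                if restore_position:
                    # equivalent of the proposed lseek(fd, fp+16, 0)
                    mem.seek(fp + 16)
    except PermissionError:
        return ("EPERM", fp, mem.tell())
    return ("ok", fp, mem.tell())
```

Under these assumptions, `scan(False)` stops with "EPERM" on the very next 16-byte read after the table jump, because the counter says 0xF0010 while the file position is already 0x200000. `scan(True)` finishes the scan with the counter and the file position in step.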
My proposed 1 line patch to fix this:

    --- dmimodule.c 2006-06-22 13:15:43.000000000 -0400
    +++ dmimodule.c.fixed 2006-06-23 14:03:57.000000000 -0400
    @@ -358,6 +358,7 @@
             PyDict_SetItemString(info, "DMI", Py_BuildValue("s", buffer));
             ret = dmi_table(self, fd, base, len, num);
    +        lseek(fd, fp+16, 0);
             if (ret != 0)
                 return ret;
         }

I built and tested a modified dmimodule.so with the above patch and got the following output:

    [root@foo src]# /usr/share/rhn/up2date_client/hardware.py
    'pcibus' : '16'
    <snip>
    'hostname' : 'foo.foo.com'
    'ipaddr' : '192.168.1.1'
    'class' : 'NETINFO'
    'product' : '0WC983'
    'vendor' : 'Dell Computer Corporation'
    'bios_vendor' : 'Dell Computer Corporation'
    'system' : 'PowerEdge 6850 '
    'bios_release' : '10/17/2005'
    'board' : 'Dell Computer Corporation'
    'bios_version' : 'A02'
    'class' : 'DMI'
    'asset' : '(chassis: 6LGBC91) (board: ..CN124605CK0303.) (system: 6LGBC91) '

So it appears to work...

Fixed in up2date-4.5.1-1. We cannot ensure that we will always send the hardware information (the network could go down, etc). But since at the point when we send the hardware info we have already registered the system, not sending hardware info is not fatal. So what happens now is the progress window reports what step it is performing ("Sending hardware information", etc.), and reports if one of the steps fails, then continues on.

Changing product to RHEL and component to rhn_register.

User bnackash's account has been closed

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0250.html
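
The non-fatal step handling described in the up2date-4.5.1-1 comment can be sketched roughly as follows. This is an illustration only, not the up2date code: `run_steps`, `send_hardware_profile` and the step labels other than "Sending hardware information" are hypothetical names, and the failure is simulated.

```python
def run_steps(steps, report=print):
    """Run each (label, action) pair; report a failure and keep going."""
    failed = []
    for label, action in steps:
        report(label)
        try:
            action()
        except Exception as exc:
            # A failed step is reported but does not abort the later steps.
            report("Problem during step '%s': %s" % (label, exc))
            failed.append(label)
    return failed


def send_hardware_profile():
    # Simulated failure, standing in for the DMI read error in this bug.
    raise IOError("Reading DMI info failed")


log = []
failed = run_steps(
    [
        ("Registering system", lambda: None),
        ("Sending hardware information", send_hardware_profile),
        ("Sending package list", lambda: None),
    ],
    report=log.append,
)
```

Collecting the failed labels instead of raising lets the caller tell the user exactly which step did not complete, which is the behaviour the reporter asked for in the thread.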