Bug 463915
Summary: [5.3] SCP - dd: /dev/mem: Bad address
Product: Red Hat Enterprise Linux 5
Component: kexec-tools
Version: 5.3
Hardware: ia64
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: low
Target Milestone: rc
Reporter: Qian Cai <qcai>
Assignee: Neil Horman <nhorman>
QA Contact: Martin Jenner <mjenner>
CC: duck, riek, syeghiay
Doc Type: Bug Fix
Last Closed: 2009-01-20 21:00:37 UTC

Description
Qian Cai
2008-09-25 12:14:46 UTC
Hmm, where the "unknown operand" error comes from will depend on the init script in your initramfs; if you could attach that, it would be helpful. As for the /dev/mem error, my guess is that you're testing on a system in which there is a memory hole early in RAM. I'll attach a patch for that shortly.

Created attachment 317712 [details]
patch to interrogate /proc/iomem to find a place to dd from dev/mem

Could you try this patch please and let me know if it fixes the /dev/mem access problem? Thanks!

I am afraid it won't work,

- config file:

    net root.bos.redhat.com

- mkdumprd snip:

    ...
    + mkdir -p /tmp/initrd.d10855/root
    + cp -a /root/.ssh /tmp/initrd.d10855/root/
    + cp -a /etc/ssh /tmp/initrd.d10855/etc
    + mknod /tmp/initrd.d10855/dev/urandom c 1 9
    + emit 'START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`'
    + NONL=
    + '[' 'START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`' == -n ']'
    + echo 'START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`'
    + emit 'SKIP_COUNT=`dc -e"$START_ADDR 512 / 1 +`'
    + NONL=
    + '[' 'SKIP_COUNT=`dc -e"$START_ADDR 512 / 1 +`' == -n ']'
    + echo 'SKIP_COUNT=`dc -e"$START_ADDR 512 / 1 +`'
    + emit 'dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT'
    + NONL=
    + '[' 'dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT' == -n ']'
    + echo 'dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT'
    + emit 'ssh -q -o BatchMode=yes -o StrictHostKeyChecking=no root.65.108 mkdir /var/crash/10.16.64.220-$DATE'
    + NONL=
    + '[' 'ssh -q -o BatchMode=yes -o StrictHostKeyChecking=no root.65.108 mkdir /var/crash/10.16.64.220-$DATE' == -n ']'
    ...

- run SysRq-C

    ...
    Activating logical volumes
    2 logical volume(s) in volume group "VolGroup00" now active
    hwclock: Could not access RTC: No such file or directory
    mapping eth0 to eth0
    udhcpc (v1.2.0) started
    udhcpc[1104]: udhcpc (v1.2.0) started
    Sending discover...
    udhcpc[1104]: Sending discover...
    Sending discover...
    udhcpc[1104]: Sending discover...
    tg3: eth0: Link is up at 1000 Mbps, full duplex.
    tg3: eth0: Flow control is off for TX and off for RX.
    Sending discover...
    udhcpc[1104]: Sending discover...
    Sending select for 10.16.64.220...
    udhcpc[1104]: Sending select for 10.16.64.220...
    Lease of 10.16.64.220 obtained, lease time 86400
    udhcpc[1104]: Lease of 10.16.64.220 obtained, lease time 86400
    deleting routers
    route: SIOC[ADD|DEL]RT: No such process
    adding dns 10.16.255.2
    adding dns 10.16.255.3
    Saving to remote location root.bos.redhat.com
    dd: invalid number `'
    [: 1147: unknown operand
    [: 1147: unknown operand.85 MB
    [: 1147: unknown operand.85 MB
    [: 1147: unknown operand.85 MB
    [: 1147: unknown operand.85 MB
    ...

- init from the initramfs snip:

    ...
    echo Saving to remote location root.bos.redhat.com
    START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`
    SKIP_COUNT=`dc -e"$START_ADDR 512 / 1 +`
    dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT
    ssh -q -o BatchMode=yes -o StrictHostKeyChecking=no root.65.108 mkdir /var/crash/10.16.64.220-$DATE
    VMCORE=/var/crash/10.16.64.220-$DATE/vmcore
    export VMCORE
    monitor_scp_progress root.65.108 /var/crash/10.16.64.220-$DATE/vmcore-incomplete &
    scp -q -o BatchMode=yes -o StrictHostKeyChecking=no /proc/vmcore root.65.108:$VMCORE-incomplete
    exitcode=$?
    if [ $exitcode == 0 ]
    ...

Created attachment 317804 [details]
new patch to fix dd errors

Sorry, this one should take care of it. I tested it myself and it corrected both errors that you were seeing. Please test and confirm, and I'll check it in ASAP.

Neil, those "unknown operand" errors are gone, but the dd part still fails here.

- run SysRq-C

    ...
    Saving to remote location root.bos.redhat.com
    dd: invalid number `5.85953e+06'
    Copied 317.062 MB / 7323.19 MB
    ...

- init from the initramfs snip:

    ...
    echo Saving to remote location root.bos.redhat.com
    START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`
    SKIP_COUNT=`echo "$START_ADDR 512 / 1 + p" | dc`
    dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT
    ...

If I run those manually, I get the following.

    # START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`
    # echo $START_ADDR
    3000080000
    SKIP_COUNT=`echo "$START_ADDR 512 / 1 + p" | dc`
    # echo $SKIP_COUNT
    5859532

- If I use the busybox version of "dc", it reproduces the original error.

    SKIP_COUNT=`echo "$START_ADDR 512 / 1 + p" | busybox dc`
    # echo $SKIP_COUNT
    5.85953e+06

- Even with the seemingly correct offset, it still fails here.

    # dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=5859532
    dd: reading `/dev/mem': Bad address
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 0.00036695 seconds, 0.0 kB/s

Ugh, why can't anything ever be easy with ia64? :) Do me a favor and attach the /proc/iomem file from the ia64 system in question. Thanks!

Created attachment 317878 [details]
/proc/iomem from altix4.rhts.bos.redhat.com

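
The SKIP_COUNT arithmetic above could also be done without dc, which sidesteps the difference in output formatting between GNU dc and busybox dc. The following is a minimal sketch only, not the patch attached to this bug. It assumes a shell with 64-bit integer arithmetic (which the busybox shell in the kdump initramfs may not provide), and the function names `first_ram_start` and `skip_count` are hypothetical. Note that /proc/iomem prints addresses in hexadecimal, so the sketch prefixes the value with `0x`; the manual run above treated 3000080000 as a decimal number.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from mkdumprd.

# Print the start address (hex, no 0x prefix) of the first "System RAM"
# range in an iomem-format file. $1 is the file path, e.g. /proc/iomem.
first_ram_start() {
    grep "System RAM" "$1" | head -n 1 | cut -d"-" -f1 | tr -d ' '
}

# Convert a hex start address into a dd skip count in 512-byte blocks,
# keeping the "+ 1" used by the generated init script.
# Assumption: the shell's $(( )) arithmetic is 64 bits wide.
skip_count() {
    echo $(( 0x$1 / 512 + 1 ))
}

# Example usage (commented out so the sketch has no side effects):
#   START_ADDR=`first_ram_start /proc/iomem`
#   SKIP_COUNT=`skip_count "$START_ADDR"`
#   dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT
```

Because the result comes from integer arithmetic, it is always printed as a plain integer and never in scientific notation. Whether /dev/mem can actually be read at that offset on ia64 is a separate question.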
Cai, I just noticed that it looks like you've been using several different machines to test this. The /proc/iomem file is from a different system than the one you got the last failure on. While that's not a big deal, we probably need to be consistent in the machine we test on to make sure that we don't get confused by the data that we're looking at. I think the test results that you got on ndnc-1.lab.bos.redhat.com are indicative of a minor problem in my use of the dc command. I need to tweak it so that it doesn't output numbers in scientific notation. I'll attach a corrected patch shortly.

Created attachment 317971 [details]
new version of patch

Here's a new version of the patch. It sets the output precision of dc so that it should not use scientific notation for anything, enabling this to work properly.

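
Independently of how dc formats its output, the generated script could check that the computed value is a plain decimal integer before handing it to dd, since both an empty string and a value like `5.85953e+06` make dd fail with "invalid number". This is a hedged sketch under that assumption; `is_plain_integer` is a hypothetical helper and is not part of mkdumprd or of any patch in this bug.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a non-empty string of
# decimal digits, i.e. something dd accepts as skip=N.
is_plain_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

# Example usage (commented out so the sketch has no side effects):
#   if is_plain_integer "$SKIP_COUNT"; then
#       dd if=/dev/mem of=/dev/urandom count=1 bs=512 skip=$SKIP_COUNT
#   fi
```

The check uses only `case` pattern matching, so it does not depend on any external tool being present in the initramfs.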
Neil, the "dd: reading `/dev/mem': Bad address" error can be reproduced on both of the IA64 systems I have tested so far. One of them is altix4.rhts.bos.redhat.com; ndnc-1.lab.bos.redhat.com is just a Kdump scp target. Anyway, with the patched version, I got this on altix4.rhts.bos.redhat.com:

    ...
    deleting routers
    route: SIOC[ADD|DEL]RT: No such process
    adding dns 10.16.255.2
    adding dns 10.16.255.3
    Saving to remote location root.bos.redhat.com
    dc: k: syntax error.
    dd: invalid number `'
    Copied 56.2344 MB / 7323.19 MB
    ...

Running the generated code from the Kdump initramfs manually:

    # START_ADDR=`grep "System RAM" /proc/iomem | head -n 1 | cut -d"-" -f1`
    # SKIP_COUNT=`echo "100 k $START_ADDR 512 / 1 + p" | busybox dc`
    dc: k: syntax error.

Can I get on that system to poke around with this? It appears you (or someone else) are actively working on it at the moment.

Sure, it is running RHEL 4.7 at the moment. Do you need to install RHEL 5.3? If so, I will cancel my reservation and reserve it for you.

I am signing off for today now. I have made a reservation of this machine for you, so it should be ready for you to have a look soon.

Thank you Cai, I'll have this working by the time you get back.

OK, bad news/good news. The bad news is that /dev/mem on ia64 seems to have a problem. No matter where I seek to, it seems to reply with an EFAULT return code. I'll need to dig into that further. The good news is that, after testing, the system in question doesn't actually need the additional entropy in /dev/urandom for ssh to work properly. As such, I can work around the problem by simply suppressing the error, so that's how we'll be handling this for now. I'll dig into why we keep getting EFAULT ASAP.

This is fixed in -44.el5. Thanks!

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0105.html
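
The workaround described above, suppressing the dd error because the extra entropy is not actually required, could look roughly like the following. This is only an illustration of the idea, not the change shipped in -44.el5, whose contents are not shown in this report; `seed_entropy_best_effort` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: try to copy one 512-byte block from a memory
# device into an entropy device, but never report or propagate failure.
#   $1  source device (e.g. /dev/mem)
#   $2  destination device (e.g. /dev/urandom)
#   $3  skip count in 512-byte blocks
seed_entropy_best_effort() {
    # Discard dd's messages and force a zero exit status so a read
    # failure such as EFAULT does not clutter or abort the dump.
    dd if="$1" of="$2" count=1 bs=512 skip="$3" >/dev/null 2>&1 || true
}

# Example usage (commented out so the sketch has no side effects):
#   seed_entropy_best_effort /dev/mem /dev/urandom "$SKIP_COUNT"
```

The trade-off is that a genuine failure to seed /dev/urandom goes unnoticed, which is acceptable here only because ssh was observed to work without the additional entropy.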