Bug 147710 - new dump-0.4b39/rmt-0.4b39 fails
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: dump
Version: 2
Hardware: x86_64 Linux
Priority: medium
Severity: medium
Assigned To: Jindrich Novy
Reported: 2005-02-10 14:16 EST by Philip Goisman
Modified: 2013-07-02 19:06 EDT

Doc Type: Bug Fix
Last Closed: 2005-03-02 02:45:23 EST


Attachments
Patch fixing the -s/-d issue (826 bytes, patch)
2005-03-01 17:17 EST, Stelian Pop

Description Philip Goisman 2005-02-10 14:16:28 EST
Description of problem:
dump-0.4b39 asks for another tape rather than performing the dump on x86_64

Version-Release number of selected component (if applicable):
version 0.4b39 of January 21, 2005

How reproducible:
Below is an example initiated from the prompt:

[root@m3 bin]# uname -a
Linux m3 2.6.10-1.12_FC2 #1 Wed Feb 2 01:10:26 EST 2005 x86_64 x86_64 x86_64 GNU/Linux

[root@m3 bin]# /sbin/rdump 1usbdf 160000 64 81633 m5:/dev/nst0 /
  DUMP: Connection to m5 established.
  DUMP: Date of this level 1 dump: Thu Feb 10 10:41:33 2005
  DUMP: Date of last level 0 dump: Sun Jan 30 02:02:55 2005
  DUMP: Dumping /dev/hda2 (/) to /dev/nst0 on host m5
  DUMP: Label: /
  DUMP: Writing 64 Kilobyte records
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 834182 blocks on 0.01 tape(s).
  DUMP: Volume 1 started with block 1 at: Thu Feb 10 10:43:15 2005
  DUMP: Closing /dev/nst0
  DUMP: Volume 1 completed at: Thu Feb 10 10:43:30 2005
  DUMP: Volume 1 64 blocks (0.06MB)
  DUMP: Volume 1 took 0:00:15
  DUMP: Volume 1 transfer rate: 4 kB/s
  DUMP: Change Volumes: Mount volume #2
  DUMP: Is the new volume mounted and ready to go?: ("yes" or "no") yes
  DUMP: Volume 2 started with block 65 at: Thu Feb 10 10:44:23 2005
  DUMP: Volume 2 begins with blocks from inode 4751982
  DUMP: Closing /dev/nst0
  DUMP: Volume 2 completed at: Thu Feb 10 10:44:24 2005
  DUMP: Volume 2 64 blocks (0.06MB)
  DUMP: Volume 2 took 0:00:01
  DUMP: Volume 2 transfer rate: 64 kB/s
  DUMP: Change Volumes: Mount volume #3
  DUMP: Is the new volume mounted and ready to go?: ("yes" or "no") yes
  DUMP: Volume 3 started with block 129 at: Thu Feb 10 10:44:36 2005
  DUMP: Volume 3 begins with blocks from inode 4751982
  DUMP: dumping (Pass III) [directories]
  DUMP: Closing /dev/nst0
  DUMP: Volume 3 completed at: Thu Feb 10 10:44:37 2005
  DUMP: Volume 3 64 blocks (0.06MB)
  DUMP: Volume 3 took 0:00:01
  DUMP: Volume 3 transfer rate: 64 kB/s
  DUMP: Change Volumes: Mount volume #4
  DUMP: Is the new volume mounted and ready to go?: ("yes" or "no") no
  DUMP: Do you want to abort?: ("yes" or "no") yes
  DUMP: The ENTIRE dump is aborted.

Steps to Reproduce:
1. Enter the same commands as shown above.

Actual results:
No dump is produced, whether run from a script or from the prompt.

Expected results:
dump the named file

Additional info:
Comment 1 Stelian Pop 2005-02-14 04:41:00 EST
I have an x86_64 machine and I'm unable to reproduce your problem:

# cp /root/.ssh/id_dsa.pub /root/.ssh/authorized_keys
# export RSH=ssh
# dump 0f localhost:/tmp/test /etc/
  DUMP: Connection to localhost established.
  DUMP: Date of this level 0 dump: Mon Feb 14 10:34:32 2005
  DUMP: Dumping /dev/hdb1 (/ (dir etc)) to /tmp/test on host localhost
  DUMP: Label: /1
  DUMP: Writing 10 Kilobyte records
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 42043 blocks.
  DUMP: Volume 1 started with block 1 at: Mon Feb 14 10:34:36 2005
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: Closing /tmp/test
  DUMP: Volume 1 completed at: Mon Feb 14 10:35:22 2005
  DUMP: Volume 1 49670 blocks (48.51MB)
  DUMP: Volume 1 took 0:00:46
  DUMP: Volume 1 transfer rate: 1079 kB/s
  DUMP: 49670 blocks (48.51MB) on 1 volume(s)
  DUMP: finished in 45 seconds, throughput 1103 kBytes/sec
  DUMP: Date of this level 0 dump: Mon Feb 14 10:34:32 2005
  DUMP: Date this dump completed:  Mon Feb 14 10:35:22 2005
  DUMP: Average transfer rate: 1079 kB/s
  DUMP: DUMP IS DONE
# 

While there could be a real bug involving tape drive access, I suspect
this is not the case. Please first verify that you're able to access
the tape drive locally, then check the version of the 'rmt' server you
have on this machine, then verify that you're able to send data to the
tape drive remotely (using tar remote:/dev/nst0 for example).

Stelian.
Comment 2 Philip Goisman 2005-02-14 12:47:09 EST
I have three dual-processor x86_64 platforms and one single-processor
x86_64 platform.  Since I didn't receive feedback immediately, I
reinstalled dump-0.4b33-3.x86_64.rpm, which works.  I left rmt at version
0.4b39.  These are all AMD systems, as follows:

Updating m3:cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 4
model name      : AMD Athlon(tm) 64 Processor 3200+
stepping        : 10
cpu MHz         : 2010.015
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3956.73
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

DONE
Updating turandot:cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 244
stepping        : 1
cpu MHz         : 1792.693
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3522.56
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 244
stepping        : 1
cpu MHz         : 1792.693
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3604.48
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

DONE
Updating space:cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 242
stepping        : 10
cpu MHz         : 1592.980
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3129.34
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 242
stepping        : 10
cpu MHz         : 1592.980
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3178.49
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

DONE
Updating kramers:cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 242
stepping        : 10
cpu MHz         : 1594.053
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3129.34
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 242
stepping        : 10
cpu MHz         : 1594.053
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm
3dnowext 3dnow
bogomips        : 3186.68
TLB size        : 1088 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp

DONE

For now I've turned off automatic yum updates for dump and rmt
as my dumps and restores again work.

Thanks,

      Phil
Comment 3 Stelian Pop 2005-02-14 14:44:09 EST
Could you give a little more information about the conditions in which
the problem occurs?

If I understand correctly, dump-0.4b39 from 'm3' (which is an x86_64
machine, but not listed above?) to 'm5' (x86_64 UP) fails when
rmt-0.4b39 is installed on 'm5'.

But dump-0.4b33 from 'm3' to 'm5' + rmt-0.4b39 works OK?

Am I right?

Could you also tell me:
* does dump-0.4b39 work when dumping to a local file ?
* does tar work when dumping to 'm5' + rmt-0.4b39 ?
Comment 4 Philip Goisman 2005-02-15 08:22:53 EST
m3 is in the list I gave you.  It's the first system.

You understand correctly that dump-0.4b33 + rmt-0.4b39 from 'm3' to
'm5' works ok as that configuration does on all four systems in that
list.  By the way, m5 is not an x86_64 system.  It's an AMD Athlon(tm)
XP 2400+ running RH9 with kernel 2.4.20-31.9.

The configuration with dump-0.4b39 + rmt-0.4b39 doesn't work on those
AMD systems in the list above.  The dump-0.4b39 + rmt-0.4b39
configuration does work on Pentium and other Intel systems.

I won't be able to do the tests you request until at least next week.
Comment 5 Stelian Pop 2005-02-15 09:17:55 EST
Sorry, I wanted to say that 'm5' was not in your list. 

Also, I think you're a bit confused about what 'rmt' is. The dump
package contains the 'dump' and 'restore' binaries that can access a
backup either locally (directly) or remotely (by connecting to a server
*which has a /etc/rmt binary*). This binary is provided by the 'rmt'
package.

In other words, you need to install the 'dump' package only on the
systems you want to backup, and 'rmt' package only on the tape server.

I suppose then that the combination which failed is dump-0.4b39 on the
client (x86_64) and rmt-0.4b28 on the tape server (i386). Is this
correct ?

I tried downgrading the rmt server on my test machine and I didn't
find any problem.

You will have to do the tests I requested:
* try on 'm3' a dump to a local file using dump-0.4b39
* try on 'm3' a tar cf m5:/dev/nst0 
* also try upgrading rmt on m5 to rmt-0.4b39 and see if that fixes the
  problem

You can also generate some debug output from rmt simply by replacing
the '/etc/rmt' symlink with a shell script which calls /sbin/rmt with
an extra parameter which is the debug file:

# rm /etc/rmt
# echo '#!/bin/sh' > /etc/rmt
# echo 'exec /sbin/rmt /tmp/rmtlog' >> /etc/rmt
# chmod 755 /etc/rmt

Then relaunch your remote dump and post the logs in /tmp/rmtlog.

Stelian. 
Comment 6 Philip Goisman 2005-02-21 18:52:08 EST
dump to a local file using dump-0.4b39 looks like it works as follows:

[root@m3 goisman]# dump -0f - / > backup_4
  DUMP: Date of this level 0 dump: Mon Feb 21 16:17:41 2005
  DUMP: Dumping /dev/hda2 (/) to standard output
  DUMP: Label: /
  DUMP: Writing 10 Kilobyte records
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 19603188 blocks.
  DUMP: Volume 1 started with block 1 at: Mon Feb 21 16:18:31 2005
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 9.41% done at 6150 kB/s, finished in 0:48
  DUMP: Interrupt received.
  DUMP: Interrupt received.
  DUMP: Interrupt received.
  DUMP: Do you want to abort dump?: ("yes" or "no") yes
  DUMP: The ENTIRE dump is aborted.
[root@m3 goisman]#

I killed the dump as I didn't want it filling my disk.  But it
appears it would've completed.


The tar test also worked fine:

[root@m3 www_stuff]# tar zcf root@m5:/dev/nst0 .
   Made a test dir and then restored with
[root@m3 test]# tar zxf root@m5:/dev/nst0

Which leaves "upgrading rmt on m5 to rmt-0.4b39" to see if that
fixes the problem.  However, when I look for rmt-0.4b39 in redhat 9,
I don't see that version available.

Do you have any suggestions?

Thanks,

Phil


Comment 7 Stelian Pop 2005-02-25 05:42:02 EST
Just retrieve the dump-0.4b39 source rpm and rebuild it on a redhat 9,
it should build just fine.

Stelian.
Comment 8 Philip Goisman 2005-02-25 15:31:36 EST
Thanks for the suggestion.  I did build and install dump-0.4b39 from
source.  However, the problem reported above remains, and only when
dumping from the x86_64 systems.

Below is the most recent log m3 -> m5:

LABEL  m3 
START: 1316
Backup host m3 0
/sbin/rdump 0usbdf 160000 64 81633 m5:/dev/nst0 /dev/hda2
  DUMP:   DUMP: Date of this level 0 dump: Fri Feb 25 13:16:05 2005
  DUMP: Dumping /dev/hda2 (/) to /dev/nst0 on host m5
Connection to m5 established.
  DUMP: Label: /
  DUMP: Writing 64 Kilobyte records
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 22189946 blocks on 0.27 tape(s).
  DUMP: Volume 1 started with block 1 at: Fri Feb 25 13:16:56 2005
  DUMP: Closing /dev/nst0
  DUMP: Volume 1 completed at: Fri Feb 25 13:17:10 2005
  DUMP: Volume 1 64 blocks (0.06MB)
  DUMP: Volume 1 took 0:00:14
  DUMP: Volume 1 transfer rate: 4 kB/s
  DUMP: Change Volumes: Mount volume #2
  DUMP: fopen on /dev/tty fails: No such device or address
  DUMP: The ENTIRE dump is aborted.
/sbin/rdump 0usbdf 160000 64 81633 m5:/dev/nst0 /dev/hda1
  DUMP:   DUMP: Connection to m5 established.
Date of this level 0 dump: Fri Feb 25 13:17:11 2005
  DUMP: Dumping /dev/hda1 (/boot) to /dev/nst0 on host m5
  DUMP: Label: /boot
  DUMP: Writing 64 Kilobyte records
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 24644 blocks on 0.00 tape(s).
  DUMP: Volume 1 started with block 1 at: Fri Feb 25 13:17:11 2005
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: Closing /dev/nst0
  DUMP: Volume 1 completed at: Fri Feb 25 13:17:16 2005
  DUMP: Volume 1 24640 blocks (24.06MB)
  DUMP: Volume 1 took 0:00:05
  DUMP: Volume 1 transfer rate: 4928 kB/s
  DUMP: 24640 blocks (24.06MB) on 1 volume(s)
  DUMP: finished in 3 seconds, throughput 8213 kBytes/sec
  DUMP: Date of this level 0 dump: Fri Feb 25 13:17:11 2005
  DUMP: Date this dump completed:  Fri Feb 25 13:17:16 2005
  DUMP: Average transfer rate: 4928 kB/s
  DUMP: DUMP IS DONE
FINISH: 1317

I've removed dump-0.4b39 on m3 and reinstalled dump-0.4b33-3.x86_64.
Except for a few lines of "DUMP: ACLs in inode #xxxxxxx won't be
dumped," the dump works.

Philip
Comment 9 Stelian Pop 2005-02-25 16:03:17 EST
Could you generate the rmt logs as I showed in a previous comment?

Best would be if you generate the logs twice: once with the working
(b33) dump, and once with the b39.

Thanks.

Stelian.
Comment 10 Philip Goisman 2005-02-25 16:52:12 EST
Since my full backups are presently running, I'll have to
generate rmt logs Monday.  My sincere apologies for this further delay.

Philip
Comment 11 Philip Goisman 2005-02-28 13:25:12 EST
Hi,

I have generated the rmt logs.  However, both seem too large for this 
format.  I can send them as attachments, or otherwise as you suggest.

Philip
Comment 12 Stelian Pop 2005-02-28 14:14:37 EST
Sure, send them privately to me by mail after you compress them with
bzip2 -9.

Stelian.
Comment 13 Philip Goisman 2005-03-01 17:13:18 EST
Hi,

Thanks to a patch from Stelian, the problem described above is
resolved.  Great job, Stelian.  I appreciate all the work you
did to resolve this issue very much.

Best regards,

Philip
Comment 14 Stelian Pop 2005-03-01 17:16:46 EST
Ok, for the record I have been able to find out what was wrong.

The problem was not in dump/rmt communication but in the arguments
Philip was using for dump: -s/-d. Dump (for quite some time and
versions now) contained a bug when calculating the tape size based on
these parameters.

In Philip's case, dump calculated a negative size, which in turn
caused dump to think enough data was already written to the current
tape and ask the operator to change tapes.
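
To illustrate the effect described above, here is a minimal sketch of an
end-of-volume check. It is not dump's actual code: the function name
volume_full and the units are hypothetical, chosen only to show why a
negative capacity makes every volume look full after the first record.

```shell
#!/bin/sh
# Hypothetical end-of-volume check, NOT taken from dump's sources.
# Both arguments are in the same arbitrary units.
volume_full() {
    used=$1
    capacity=$2
    # The volume counts as full once the amount written reaches capacity.
    [ "$used" -ge "$capacity" ]
}

# One 64-block record against a sane, positive capacity: keep writing.
if volume_full 64 19200000; then echo "change volume"; else echo "keep writing"; fi

# The same record against a negative capacity: "full" right away.
if volume_full 64 -1; then echo "change volume"; else echo "keep writing"; fi
```

This matches the symptom in the logs above, where every volume is closed
after 64 blocks and the operator is asked to mount the next one.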

Those arguments are rarely used today (almost everybody uses -B now to
specify the size in KB of a tape instead of giving the density and the
tape length), and for whatever dark reason the bug seems to occur only
on 64-bit systems.
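
For anyone wanting to move from -s/-d to -B, a rough conversion could
look like the sketch below. It assumes -s is a length in feet and -d a
density in bytes per inch, and it ignores inter-record gaps, so the
result is an upper bound rather than the usable capacity. The helper
name tape_kb is made up for this example, and the shell is assumed to
do 64-bit arithmetic.

```shell
#!/bin/sh
# Naive -s/-d to -B conversion (illustrative only).
# Assumes: $1 = tape length in feet, $2 = density in bytes per inch.
# Inter-record gaps are ignored, so real capacity will be lower.
tape_kb() {
    feet=$1
    bpi=$2
    # feet -> inches -> bytes -> kilobytes
    echo $(( feet * 12 * bpi / 1024 ))
}

# Values from the failing command in this report: -s 160000 -d 81633
kb=$(tape_kb 160000 81633)

# Print (do not run) an equivalent command using -B.
echo "dump -0u -b 64 -B $kb -f m5:/dev/nst0 /"
```

The command is only printed here; pick a -B value somewhat below the
computed figure before using it on a real tape.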

I'll attach a patch fixing the issue to this bug. This patch will be
in the next upstream version.

Stelian.
Comment 15 Stelian Pop 2005-03-01 17:17:53 EST
Created attachment 111557
Patch fixing the -s/-d issue
Comment 16 Jindrich Novy 2005-03-02 02:45:23 EST
dump-0.4b39-3 with this patch applied is now built, thanks.
