Bug 1157463
Description
Battista Vaccaro
2014-10-27 10:00:23 UTC
Created attachment 950928 [details]
zip file with error logs for amrecover tool
We tried the workaround documentend in this forum without success: http://forums.zmanda.com/showthread.php?3691-amrecover-segmentation-fault The same bug was on Debian/Ubuntu also and was fixed there three years ago. Here are the bug reports for reference: https://bugs.launchpad.net/ubuntu/+source/amanda/+bug/1074574 https://bugs.launchpad.net/ubuntu/+source/amanda/+bug/1077105 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=655025 (In reply to Fabio Pedretti from comment #4) > The same bug was on Debian/Ubuntu also and was fixed there three years ago. > Here are the bug reports for reference: > https://bugs.launchpad.net/ubuntu/+source/amanda/+bug/1074574 > https://bugs.launchpad.net/ubuntu/+source/amanda/+bug/1077105 > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=655025 I have checked amanda version and those corrections are already applied. You are right indeed, that was a similar but different issue. Is there some work being done for this? I see this bug is blocking another one. As a customer we also opened a support ticket for this but there is no news on that one, also. Thanks. I have informed upstream about this problem and sent them all relevant links with fixes mentioned in this bugzilla. I am waiting for response from upstream, though. Hi Battista, could you please attach logs from amidxtaped daemon? It seems that the daemon crashes. Thank you in advance. Greetings Petr Link to upstream mail: https://groups.yahoo.com/neo/groups/amanda-hackers/conversations/messages/7003 Created attachment 1010046 [details]
amidxtaped daemon log
I attach the log you asked me for the amidxtaped daemon
Regards
I repeated the amrecover operation with the amidxtaped file patched as told in the forum post I wrote you about before... I attach in a bundle zip file the amidxtaped file patched and the amidxtaped log related to this new amrecover operation The error shown is different: Extracting files using tape drive changer on host localhost. Load tape catdb_backup-12 now Continue [?/Y/n/s/d]? y Got no header and data from server, check in amidxtaped.*.debug and amandad.*.debug files on server amrecover> quit Created attachment 1010060 [details]
amidxtaped patched with related log
amidxtaped patched with related log
Created attachment 1010089 [details]
amandad debug file of last amrecover command execution
I forgot to attach the amandad debug file for the last amrecover
Created attachment 1010534 [details]
Patch for chop problem.
Patch for chop troubles.
Upstream asks whether you applied the patch for amidxtaped before logs or after. But it seems that patch which you applied is different from upstream. See discussion https://groups.yahoo.com/neo/groups/amanda-hackers/conversations/topics/7003 The last logs I gave you are after the file was patched.. Created attachment 1010568 [details] amanda output with patch from comment 14 Hi, we tried the patch in comment 14, and we get the same behavior of the patch get from the forum. The console output is attached. This means that the patch probably fixes the original issue and should probably be committed upstream and redhat package updated. However, as you can see from amanda output, we are still unable to recover our files, probably due to a different issue. We'll attach the debug files as suggested by console output. Created attachment 1010570 [details] amidxtaped debug file amidxtaped debug file with patch from comment 14 still crash, a backtrace is shown. We tried again without patch (original redhat) and with patch from comment 14, this is a summary: - console output without patch is what we reported in comment 0 - console output with patch from comment 14 is different, and is attached in comment 17 - amidxtaped crashes however in the same way (according to amidxtaped debug file) with or without the patch Let me know if you need some other information to further debug this. Thanks. The console output doesn't tell us much -- just "amidxtaped broke", without any information as to how. So it's not terribly useful here. The amidxtaped logs are useful, though. The amidxtaped logs are where you see what went wrong (it'd be great if there was more, like a perl traceback, but SWIG doesn't give us that). So: no patch => attachment 1010046 [details] comment 14 / attachment 1010534 [details] => attachment 1010570 [details] Note that the two results are *NOT* the same; in the second, amidxtaped has done a good deal more stuff before failing. The last few bits involve Amanda::Recovery::Scan running find_volume. It first tries to use the changer's `inventory`, but Amanda::Changer::compat doesn't support inventory (incidentally, the compat changer has been removed in later versions of Amanda.. it's not 1992 anymore!), so https://github.com/zmanda/amanda/blob/99d83be9e9e69d903fb0b3b36c613dddd1cd1707/perl/Amanda/Recovery/Scan.pm#L267 step get_first_inventory => sub { Amanda::Debug::debug("find_volume labeled '$label'"); $scan_running = 1; $self->{'chg'}->inventory(inventory_cb => $steps->{'got_first_inventory'}); }; step got_first_inventory => sub { (my $err, $inventory) = @_; if ($err && $err->notimpl) { #inventory not implemented return $self->_find_volume_no_inventory(%params); gets an $err because it's not implemented, and calls _find_volume_no_inventory: https://github.com/zmanda/amanda/blob/99d83be9e9e69d903fb0b3b36c613dddd1cd1707/perl/Amanda/Recovery/Scan.pm#L653 which eventually tries to load the "current" slot step load_label => sub { return $self->{'chg'}->load(relative_slot => "current", res_cb => $steps->{'load_label_cb'}); }; invoking compat's `load`: https://github.com/zmanda/amanda/blob/99d83be9e9e69d903fb0b3b36c613dddd1cd1707/perl/Amanda/Changer/compat.pm#L97 which calls the script with -info, as we see logged https://github.com/zmanda/amanda/blob/99d83be9e9e69d903fb0b3b36c613dddd1cd1707/perl/Amanda/Changer/compat.pm#L113 ..and then it fails. So the best way to narrow this down is to insert some more debug prints and see what the last one is. Created attachment 1010651 [details] rh1157463.patch Please apply this patch along with that in comment 14, re-run, and post the amidxtaped log file. Created attachment 1010652 [details] amidxtaped debug file with patch from comment 14 + debug patch from comment 21 Oh, darn, my Perl is *very* rusty! All of those debug(..) should have a trailing semicolon. Sorry about that :( Created attachment 1010663 [details] amidxtaped debug file with patch from comment 14 + debug patch (with semicolons) from comment 21 Here it is! Thanks! Well, that's certainly disappointing -- it only printed one of them (the one right before the debug(..) we already saw). Sorry to put you through it, but I'll put up another patch. Created attachment 1010687 [details]
rh1157463-debug-2.patch
Created attachment 1010691 [details] amidxtaped debug file with patch from comment 14 + debug patch from comment 26 Well, that puts us squarely at + debug('WHERE MF 3'); debug("Amanda::Changer::compat: Got response '$child_output' with exit status $exitval"); + debug('WHERE MF 4'); which means that somehow interpolation into that string is producing a non-string. So let's try one more thing, with http://perldoc.perl.org/Data/Dumper.html Add use Data::Dumper; at the top of the file with the other use declarations, and then replace the middle line above with debug(Dumper("Amanda::Changer::compat: Got response '$child_output' with exit status $exitval")); and let's see what the result is. Also, what version of Perl is this? I wonder if Perl has changed its behavior somehow, as interpolating undefs and integers produces a string on my version. Created attachment 1010692 [details] amidxtaped debug file with patch from comment 14 + debug patch as below I did this: --- /usr/lib64/perl5/vendor_perl/Amanda/Changer/compat.pm.originale 2015-04-03 19:54:38.230525839 +0200 +++ /usr/lib64/perl5/vendor_perl/Amanda/Changer/compat.pm 2015-04-03 20:29:14.152506042 +0200 @@ -19,6 +19,8 @@ package Amanda::Changer::compat; +use Data::Dumper; + use strict; use warnings; use vars qw( @ISA ); @@ -579,7 +581,9 @@ my @child_output = split '\n', $child_output; my $exitval = POSIX::WEXITSTATUS($child_exit_status); - debug("Amanda::Changer::compat: Got response '$child_output' with exit status $exitval"); + debug('x'); + debug(Dumper("Amanda::Changer::compat: Got response '$child_output' with exit status $exitval")); + debug('y'); if (@child_output < 1) { $run_cb->(2, undef, "Malformed output from changer script -- no output"); return; and got what attached. Also: # perl -v ~ This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi (with 28 registered patches, see perl -V for more detail) ... # rpm -qa | grep perl-5 perl-5.16.3-285.el7.x86_64 Well, that's just weird. Dumper() should always return a string, but from what you've shown it's returning a non-string which is being handed back to debug(..). I haven't worked with perl since 5.12 -- is there a different string type now, perhaps? Unicode? In other words, I'm kind of stuck here. Commenting that debug call let the file properly recover. So here is again an update and TODO for Red Hat: - patch in comment 14 should be added to redhat package, or the package should be updated to a newer upstream that already include the patch; - investigate why the debug call in comment 28 and comment 29 fails, in the meantime as a workaround add a patch to comment out that call to be able to properly recover files. As discussed with Ravi on support portal, I can confirm the status of this bug is still the same as explained in comment 31. I'm here to confirm this problem still exists with amanda-3.3.3-13. here's the debug log from amidxtaped: Thu Mar 10 23:12:30 2016: thd-0xa88200: amidxtaped: pid 4087 ruid 33 euid 33 version 3.3.3: start at Thu Mar 10 23:12:30 2016 Thu Mar 10 23:12:30 2016: thd-0xa88200: amidxtaped: critical (fatal): TypeError in method 'debug', a rgument 1 of type 'char *' amidxtaped: TypeError in method 'debug', argument 1 of type 'char *' /lib64/libamanda-3.3.3.so(+0x2b707)[0x7fb92af81707] /lib64/libglib-2.0.so.0(g_logv+0x1b1)[0x7fb929fca7a1] /lib64/libglib-2.0.so.0(g_log+0x8f)[0x7fb929fcaa4f] /usr/lib64/perl5/vendor_perl/auto/Amanda/MainLoop/libMainLoop.so(+0x44f1)[0x7fb925b484f1] /lib64/libglib-2.0.so.0(+0x4a1c3)[0x7fb929fc41c3] /lib64/libglib-2.0.so.0(g_main_context_dispatch+0x15a)[0x7fb929fc37aa] /lib64/libglib-2.0.so.0(+0x49af8)[0x7fb929fc3af8] /lib64/libglib-2.0.so.0(g_main_loop_run+0x6a)[0x7fb929fc3dca] /usr/lib64/perl5/vendor_perl/auto/Amanda/MainLoop/libMainLoop.so(_wrap_run_c+0x50)[0x7fb925b48710] /usr/lib64/perl5/CORE/libperl.so(Perl_pp_entersub+0x58f)[0x7fb92ce3b39f] /usr/lib64/perl5/CORE/libperl.so(Perl_runops_standard+0x16)[0x7fb92ce33b16] /usr/lib64/perl5/CORE/libperl.so(perl_run+0x355)[0x7fb92cdd0925] /usr/bin/perl[0x400d99] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fb92ba50b15] /usr/bin/perl[0x400dd1] This appears to be a very old bug from 2011, as discussed here: http://forums.zmanda.com/showthread.php?3691-amrecover-segmentation-fault More details are discussed in that thread, but the fix recommended at posst #14: https://forums.zmanda.com/showthread.php?3691-amrecover-segmentation-fault&p=15431#post15431 indeed fixes the problem for me. Please note that the recommended fix per the above thread is *different* than the patch posted in comment 14. The occurrence of the problematic $chopped is also found in function 'sub sendctlline'. The patch in comment 14 omits 1 of the 3 occurrences. Please consider applying this fix to the next update release as this is now 5 years old and a show stopper. (a backup tool is pointless if you can't do basic file recovery) I am planning to apply a patch from comment 14. And of course patch from https://forums.zmanda.com/showthread.php?3691-amrecover-segmentation-fault&p=15431#post15431 The whole patch looks like: diff --git a/server-src/amidxtaped.pl b/server-src/amidxtaped.pl index 79ea978..5a4106e 100644 --- a/server-src/amidxtaped.pl +++ b/server-src/amidxtaped.pl @@ -890,7 +890,8 @@ sub getline { last if $c eq "\n"; } - my $chopped = $line; + $line =~ /^(.*)$/; + my $chopped = $1; $chopped =~ s/[\r\n]*$//g; debug("CTL << $chopped"); @@ -914,8 +915,8 @@ sub getline_async { return $async_read_cb->($err, undef) if $err; $buf .= $data; - if ($buf =~ /\r\n$/) { - my $chopped = $buf; + if ($buf =~ /^(.*\r\n)$/) { + my $chopped = $1; $chopped =~ s/[\r\n]*$//g; debug("CTL << $chopped"); @@ -951,8 +952,8 @@ sub sendctlline { my $self = shift; my ($msg, $async_write_cb) = @_; - my $chopped = $msg; - $chopped =~ s/[\r\n]*$//g; + $msg =~ /^(.*)$/; + my $chopped = $1; if ($self->{'ctl_stream'}) { debug("CTL >> $chopped"); Can you please let me know if patch in #c37 works fine? (In reply to Petr Hracek from comment #38) > Can you please let me know if patch in #c37 works fine? Hi, thanks for the support, the patch however fails to apply on current amidxtaped of pakcage amanda-libs-3.3.3-13.el7.x86_64: File to patch: amidxtaped-redhat-patched patching file amidxtaped-redhat-patched Hunk #1 succeeded at 890 with fuzz 1. Hunk #2 FAILED at 915. Hunk #3 succeeded at 952 with fuzz 1. (In reply to Fabio Pedretti from comment #39) > (In reply to Petr Hracek from comment #38) > > Can you please let me know if patch in #c37 works fine? > > Hi, thanks for the support, the patch however fails to apply on current > amidxtaped of pakcage amanda-libs-3.3.3-13.el7.x86_64: > > File to patch: amidxtaped-redhat-patched > patching file amidxtaped-redhat-patched > Hunk #1 succeeded at 890 with fuzz 1. > Hunk #2 FAILED at 915. > Hunk #3 succeeded at 952 with fuzz 1. Correction: the patch is OK, just some tabs fixup was needed when copying from the comment. Will try to recover and report soon. I've tried just now the amrecover tool with the patched version. File recover worked fine, so the issue seems to be solved. Regards amanda-3.3.3-16.el7.src.rpm-x86_64-rebuild.log.gz Patch applied according to build log. => Verified. SanityOnly Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2016-2209.html |