Bug 2089298

Summary: tar silently is padding with zeros (for Read error at byte...), should say it is Padding with zeros [rhel-9.1.0]
Product: Red Hat Enterprise Linux 9 Reporter: Lukas Javorsky <ljavorsk>
Component: tarAssignee: Lukas Javorsky <ljavorsk>
Status: CLOSED ERRATA QA Contact: Vaclav Danek <vdanek>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.1CC: databases-maint, jheger, ljavorsk, pkubat, vdanek, zmiklank
Target Milestone: rcKeywords: Triaged
Target Release: ---Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-11-15 11:21:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Lukas Javorsky 2022-05-23 11:30:50 UTC
This bug was initially created as a copy of Bug #1913566

I am copying this bug because: 
The same bug appears in the RHEL-9 version of tar package.


Description of problem:

One of the customer faced I/O errors while archiving a huge file 11 TB and observed that after Tar had hit read I/O error due to xfs filesystem, it still continue writing 0's to the file using strace. However there was no indication for tar that it was writing 0's when the error occurred.

Later it was found that it is expected behavior to write 0's as the file header is already written. Hence, it need to be padded with 0's.


Using the reproducing steps provided by customer, we can see this behavior.

Padding 0's is expected behavior however it does so silently (for Read error at byte...), it should say it is Padding with zeros similar to how it reports "File Shrank , padding with zeroes"

Version-Release number of selected component (if applicable):
# rpm -q tar libtar
tar-1.30-5.el8.x86_64
libtar-1.2.20-15.el8.x86_64


How reproducible:
100%
The below script may not give you read error on every run however when Read error occurs, we do see that tar is writing 0's in strace.


Steps to Reproduce:
==================
The following steps provided by customer I can reproduce the issue. It setup a device with dmsetup using error target.

#!/bin/bash
# Reproducer "tardust"
#
# When "tar create" reads a file there are several shortcomings when it hits read error
#
# 1) When read() returns 0 bytes due to read error, then this happens
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# read(4, 0x563adef7b000, 3584) = -1 EIO (Input/output error)
# write(2, "tar: ", 5tar: ) = 5
# write(2, "/mntx/testfile: Read error at by"..., 70/mntx/testfile: Read error at byte 260653056, while reading 3584 bytes) = 70
# write(2, ": Input/output error", 20: Input/output error) = 20
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# Actual behaviour: it prints a message about "Read error", but it conceals the information it will pad the output with zeros
# Expected behaviour: it should also print the information "padding with zero"
# 2) There is a 2nd shortcoming about tar not differentiate between "read error" and "file shrinkage"
# That means when it sees a short read due to read error, it does not report read error.
# It looks like this:
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 2560
# write(2, "tar: ", 5tar: ) = 5
# write(2, "/mntx/testfile: File shrank by 5"..., 65/mntx/testfile: File shrank by 53927936 bytes; padding with zeros) = 65
# write(2, "\n", 1
# ) = 1
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# Summary: A read error is not reported here. At least it now says "padding with zeros"
# Expected behaviour: it should report a read error, so the user knows what it going on.
#
# 3) Side-Note:
# The blocking factor is applied to the output. When reading a file, all reads are misaligned by 512 bytes.
# This is because it writes a 512 header for every archived file.
# That means the first read from file is 512bytes too short:
# Running with tar-blocking-factor=7
# fstat(1, {st_mode=S_IFREG|0644, st_size=17827, ...}) = 0
# write(1, "/mntx/testfile\n", 15/mntx/testfile
# ) = 15
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3072) = 3072 #1st read 512bytes too short
# write(3, "mntx/testfile\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
# write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3584) = 3584
#
# 4) Reproducer overview:
# - Create a 500MB testimage, then create a testfile1 in the image
# - Use losetup/dmsetup with the "dust" target type
# - you can inject IO errors at specified block number in "dust"
# - You must hit a 4K boundary to see EIO, so use tar-blocking-factor=7 and
# - vary the bad blocknumber to find the case (1)
echo Step 1 Create disk image
dd if=/dev/zero of=/tmp/testimage bs=1M count=500 || exit
echo Step 2 Create XFS in image
mkfs.xfs /tmp/testimage || exit
echo Step 3 Use losetup so the file can be used a block device
losetup /dev/loop1 /tmp/testimage || exit
losetup
echo Step 6 Now create the testfile, this will have read error injected later
mkdir /mntx
mount /dev/loop1 /mntx || exit
dd if=/dev/zero of=/mntx/testfile bs=1M count=300 || exit
umount /mntx
echo Step7 Now iterating through bad blocks
echo As result, there are strace output file a1000 ... a1040
for i in `seq 1000 1 1040`
do
echo
echo Badblock $i
let ERR=i
let ERR1=i+1
let NUMSECTOR2=1024000-ERR1
#echo ERR1 is $ERR1
#echo NUMSECTOR2 is $NUMSECTOR2
dmsetup create tardust <<EOF
0 $ERR linear /dev/loop1 0
$ERR 1 error 
$ERR1 $NUMSECTOR2 linear /dev/loop1 $ERR1
EOF
#dmsetup ls
#dmsetup status
#dmsetup table
mount /dev/mapper/tardust /mntx || exit
strace tar cvbf 7 /tmp/tardust.tar /mntx/testfile >&/tmp/a$i
umount /mntx 
dmsetup remove tardust
grep -e error -e shrank /tmp/a$i
done
echo "Done: inspect the strace output file for error behaviour (grep error ; Look at last read()-call )"
losetup -d /dev/loop1
==================

Actual results:
- tar can report a disk read error as "Read error at byte", that is correct when it happens
- then it writes zeros (aka padding) according to initial file size (but does not print that message)
- the padding itself it correct again


Expected results:
- In addition it SHALL print "Padding with zeros". This is missing currently.
By correctly recognizing what the root cause is, the admin can take the right actions immediately.

Additional info:

Comment 2 Lukas Javorsky 2022-07-01 13:18:44 UTC
PR merged, and the package was built.

C9S build: https://kojihub.stream.rdu2.redhat.com/koji/buildinfo?buildID=23443
RHEL-9 build: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2068532

Comment 10 errata-xmlrpc 2022-11-15 11:21:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (tar bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8369