RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 806570 - EXT4 Data Validation error
Summary: EXT4 Data Validation error
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: rc
: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2012-03-24 21:34 UTC by jack
Modified: 2012-07-10 15:22 UTC (History)
5 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-10 15:22:14 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description jack 2012-03-24 21:34:12 UTC
Description of problem:

Vdbench detects data validation error with ext4 filesystem.

Version-Release number of selected component (if applicable):

# uname -a
Linux VBIT-R415-60 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.0 (Santiago)

# mount | grep sdb1
/dev/sdb1 on /mnt/sdb1 type ext4 (rw)

# df -h | grep sdb1
/dev/sdb1             197G   30G  158G  16% /mnt/sdb1

How reproducible:


Steps to Reproduce:
1. Create a clean 197G ext4 filesystem on /dev/sdb1.  Mount to /mnt/sdb1.
   # mkfs.ext4 /dev/sdb1
   # mount /dev/sdb1 /mnt/sdb1
2. Create a 10G file /mnt/sdb1/2 using dd.
   # dd if=/dev/zero of=/mnt/sdb1/2 bs=1M count=1000; sync
3. Run 'vdbench' with the following parameter file and the validation option "-vr".
   # vdbench -f ext4-conf -vr

[ext4-conf]
sd=sd1,lun=/mnt/sdb1/2,openflags=o_direct,hitarea=40M,align=4096
wd=wd1,sd=sd*,rdpct=50,rhpct=20,xfersize=9k,seekpct=20 
rd=run1,wd=wd1,iorate=8000,elapsed=100000,interval=1,forthreads=(30) 

4. Withing minutes, vdbench complains data validation errors.
  
Actual results:

<pre>
16:16:00.059 input argument scanned: '-fext4-conf'
16:16:00.061 input argument scanned: '-vr'
16:16:00.061 Vdbench distribution: vdbench502
16:16:00.061
16:16:00.064 java.vendor                   Oracle Corporation
16:16:00.064 java.home                     /usr/java/jre1.7.0_02
16:16:00.064 java.vm.specification.version 1.7
16:16:00.064 java.vm.version               22.0-b10
16:16:00.064 java.vm.vendor                Oracle Corporation
16:16:00.064 java.specification.version    1.7
16:16:00.065 java.class.version            51.0
16:16:00.065 user.name                     root
16:16:00.065 user.dir                      /root/vdbench
16:16:00.065 java.class.path               /home/jackh/svn/phast/utils/vdbench/:/home/jackh/svn/phast/utils/vdbench/classes:/home/jackh/svn/phast/utils/vdbench/vdbench.jar
16:16:00.065 os.name                       Linux
16:16:00.065 os.arch                       amd64
16:16:00.065 sun.arch.data.model           64
16:16:00.065 os.version                    2.6.32-71.el6.x86_64
16:16:00.113
16:16:00.113
16:16:00.146 Starting slave: /home/jackh/svn/phast/utils/vdbench/vdbench SlaveJvm -m localhost -n localhost-10-120324-16.16.00.036 -l localhost-0 -p 5570
16:16:00.147 Setting shared library to: /home/jackh/svn/phast/utils/vdbench/linux/linux64.so
16:16:00.454 Slave localhost-0 connected
16:16:00.500 All slaves are now connected
16:16:00.596 sd=sd1,lun=/mnt/sdb1/2 lun size: 10485760000 bytes; 9.7656 GB (1024**3); 10.4858 GB (1000**3)

Link to Run Definitions:         <A HREF="#_2028358271">run1 For loops: threads=30.0</A>

16:16:00.640 Sending to localhost-0  wd=wd1,sd=sd1,lun=/mnt/sdb1/2
16:16:00.692 SlaveList.printWorkForSlaves()
16:16:00.692 slv=localhost-0 rd=run1   wd=wd1    sd=sd1    rd= 50 sk= 20 sw=100.00 xf=   9216 rh= 20 hi=41943040 th=30
16:16:00.692
16:16:00.693 Waiting for synchronization of all slaves
16:16:01.094 Synchronization of all slaves complete
16:16:02.002 Starting RD=run1; I/O rate: 8000; elapsed=100000; For loops: threads=30.0

<a name="_2028358271"></a><i><b>16:16:02.002 Starting RD=run1; I/O rate: 8000; elapsed=100000; For loops: threads=30.0</b></i>


Mar 24, 2012  interval        i/o   MB/sec   bytes   read     resp     resp     resp    cpu%  cpu%
                             rate  1024**2     i/o    pct     time      max   stddev sys+usr   sys
16:16:03.047         1     798.00     7.01    9216  64.66   33.585  268.157   52.092     3.9   0.7
16:16:04.051         2     646.00     5.68    9216  66.10   47.907 1023.302   92.697     2.0   0.5
16:16:05.050         3     695.00     6.11    9216  67.05   42.071  658.773   86.608     1.2   0.2
16:16:06.049         4     661.00     5.81    9216  67.17   45.698  589.837   91.253     3.5   0.6
16:16:07.049         5     633.00     5.56    9216  66.19   47.130  826.167   92.532     0.9   0.3
16:16:08.048         6     325.00     2.86    9216  68.31   93.471  790.425  203.780     1.3   0.4
16:16:09.048         7     761.00     6.69    9216  66.10   37.443  576.405   81.603     0.8   0.2
16:16:10.048         8     673.00     5.92    9216  65.23   44.593  543.782   92.037     1.7   0.5
16:16:11.049         9     707.00     6.21    9216  66.20   45.951  665.933   94.492     0.6   0.2
16:16:12.049        10     653.00     5.74    9216  65.08   41.806  545.097   76.255     1.1   0.5
16:16:13.047        11     375.00     3.30    9216  65.33   83.685  838.302  205.073     3.0   0.3
16:16:14.047        12     808.00     7.10    9216  67.08   35.722  647.399   72.863     1.3   0.5
16:16:15.047        13     681.00     5.99    9216  65.93   48.958  698.005  114.482     1.0   0.5
16:16:16.047        14     623.00     5.48    9216  65.65   38.505  445.840   75.733     1.8   0.3
16:16:17.046        15     689.00     6.06    9216  65.75   47.685  774.575  107.320     0.5   0.2
16:16:18.046        16     358.00     3.15    9216  69.55   88.520  793.665  202.905     1.9   0.5
16:16:19.046        17     714.00     6.28    9216  68.63   39.860  477.947   75.923     1.8   0.3
16:16:20.046        18     686.00     6.03    9216  67.06   42.047  444.021   75.931     0.8   0.4
16:16:21.045        19     637.00     5.60    9216  66.72   50.795  866.817  108.397     1.2   0.3
16:16:22.045        20     667.00     5.86    9216  66.72   44.259  841.352   95.014     0.6   0.4
16:16:23.045        21     478.00     4.20    9216  66.11   64.765  732.467  154.815     0.4   0.2
16:16:23.740 localhost-0:
16:16:23.740 localhost-0: At least one Data Validation error detected.
16:16:23.740 localhost-0: All data in the disk block (assumed to be 512 bytes long) will
16:16:23.740 localhost-0: be printed in the errorlog; lines that do not compare will be
16:16:23.740 localhost-0: marked with '*'.
16:16:23.740 localhost-0:
16:16:23.740 localhost-0: Byte 0x00 -  0x07: Byte offset of this disk block
16:16:23.740 localhost-0: Byte 0x08 -  0x0f: Timestamp: number of microseconds since 1/1/1970
16:16:23.740 localhost-0: Byte 0x10        : Data Validation key from 1 - 126
16:16:23.740 localhost-0: Byte 0x11        : Checksum of timestamp
16:16:23.740 localhost-0: Byte 0x12 -  0x13: Reserved
16:16:23.740 localhost-0: Byte 0x14 -  0x1b: SD or FSD name in ASCII hexadecimal
16:16:23.740 localhost-0: Byte 0x1c -  0x1f: Reserved
16:16:23.740 localhost-0: Byte 0x20 - 0x1ff: 480 bytes of data
16:16:23.741 localhost-0:
16:16:23.741 localhost-0: On the left: the data that was expected ('.' marks unknown value).
16:16:23.741 localhost-0: On the right: the data that was found.
16:16:23.741 localhost-0: (When the data compare of the 480 data bytes results in a mismatch ONLY because of a wrong
16:16:23.741 localhost-0:  Data Validation key and/or wrong lba the reporting of that data will be suppressed.)
16:16:23.741 localhost-0:
16:16:23.741 localhost-0: dvpost: /mnt/sdb1/2 sd1 garbage  0x00000000 0x107edf000 9216 0xa 0x0 0x1d 0x1 0x0 0x0
16:16:23.741 localhost-0:
16:16:23.741 localhost-0:         Data Validation error for sd=sd1,lun=/mnt/sdb1/2; block lba: 0x107ee0400; xfersize: 9216; relative sector in block: 0x0a (10)
16:16:23.741 localhost-0:         ===> Data Validation Key miscompare.
16:16:23.741 localhost-0:         ===> Logical byte address miscompare.
16:16:23.741 localhost-0:         ===> SD or FSD name miscompare.
16:16:23.741 localhost-0:         ===> Data miscompare.
16:16:23.741 localhost-0:         The sector below was written Wednesday, December 31, 1969 19:00:00.000
16:16:23.742 localhost-0: 0x000*  00000001 07ee0400 ........ ........   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x010*  01..0000 73643120 20202020 00000000   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x020*  033b3780 019d9bc0 00cecde0 006766f0   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x030*  0033b378 0019d9bc 000cecde 0006766f   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x040*  8b0155ef ce82c42f ec430ccf fd23e8bf   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x050*  f5939a87 f1cba39b f3e7bf15 f2f1b152   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x060*  7978d8a9 b7be028c 5bdf0146 2def80a3   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x070*  9df5ae89 c5f8b99c 62fc5cce 317e2e67   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x080*  93bd79eb c2dcd22d ea6c07ce 753603e7   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x090*  b1996f2b d3ced94d e2e5027e 7172813f   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x0a0*  b3bb2e47 d2dff9fb e26d9225 fa34a7ca   00000000 00000000 00000000 00000000
16:16:23.742 localhost-0: 0x0b0*  7d1a53e5 b58f472a 5ac7a395 a661bf12   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x0c0*  5330df89 a29a011c 514d008e 28a68047   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x0d0*  9f512efb c4aaf9a5 e957120a 74ab8905   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x0e0*  b157aa5a 58abd52d a757844e 53abc227   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x0f0*  a2d78fcb da69a93d e636ba46 731b5d23   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x100*  b28fc049 d2458efc 6922c77e 349163bf   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x110*  914adf07 c3a7015b ead1ee75 fe6a99e2   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x120*  7f354cf1 b498c8a0 5a4c6450 2d263228   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x130*  16931914 0b498c8a 05a4c645 89d00dfa   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x140*  44e806fd a9766da6 54bb36d3 a15ff5b1   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x150*  dbad9400 6dd6ca00 36eb6500 1b75b280   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x160*  0dbad940 06dd6ca0 036eb650 01b75b28   00000000 00000000 00000000 00000000
16:16:23.743 localhost-0: 0x170*  00dbad94 006dd6ca 0036eb65 8b191b6a   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x180*  458c8db5 a9c42802 54e21401 a17364d8   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x190*  50b9b26c 285cd936 142e6c9b 81155895   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1a0*  cb88c292 65c46149 b9e05e7c 5cf02f3e   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1b0*  2e78179f 9c3e6517 c51d5c53 e98cc0f1   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1c0*  ffc40ea0 7fe20750 3ff103a8 1ff881d4   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1d0*  0ffc40ea 07fe2075 88fd7ee2 447ebf71   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1e0*  a93d3160 549e98b0 2a4f4c58 1527a62c   00000000 00000000 00000000 00000000
16:16:23.744 localhost-0: 0x1f0*  0a93d316 0549e98b 89a69a1d cfd123d6   00000000 00000000 00000000 00000000
. . . (errors repeat) . . .

Expected results:

No data validation error.

Additional info:

Key factors seem to be : 
    sd=sd1,...,openflags=o_direct,...,align=4096
    wd=wd1,...,xfersize=9k,...

Comment 2 jack 2012-03-25 01:49:35 UTC
How reproducible:

100%

Comment 3 Ondrej Vasik 2012-03-25 06:08:26 UTC
Thanks for report, but it has nothing to do with filesystem component - which just creates and owns set of the basic system directories. Ext4 filesystem issues in most cases belongs to kernel - reassigning.

Comment 4 Ric Wheeler 2012-03-27 19:25:52 UTC
Hi Jack,

Please work issues through your Red Hat support team first. They help open the BZ and gather any needed data for us.

If you are not a Red Hat customer, you should report the issues to the appropriate upstream mailing list (linux-ext4 for this one).

Thanks!


Note You need to log in before you can comment on or make changes to this bug.