Bug 1369891
| Summary: | mdadm fails to create >128 mds or long custom device names | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Robert LeBlanc <robert> |
| Component: | mdadm | Assignee: | Jes Sorensen <Jes.Sorensen> |
| Status: | CLOSED ERRATA | QA Contact: | Zhang Yi <yizhan> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.2 | CC: | dledford, Jes.Sorensen, robert, xiaotzha, xni, yizhan |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | mdadm-3.4-12.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-04 00:09:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1274397 | | |
| Attachments: | Patch to RPM source that includes relevant upstream and additional patches (attachment 1193682) | | |
Hi, I'll add the two devname-related patches. Applying a multi-patch like that with everything bundled together obviously doesn't work. I believe the two relevant upstream patches should solve this problem.

Jes

This should be addressed by mdadm-3.4-11.el7.

Applied an additional upstream fix to handle the buffer overrun case. mdadm-3.4-12.el7 should do the trick now.

Hi Robert, I ran the commands above and reproduced the first issue, but not the second. Can you give some more details?

```
# uname -r
3.10.0-327.el7.x86_64
# rpm -qa | grep mdadm
mdadm-3.3.2-7.el7_2.1.x86_64
# truncate -s 10G /root/junk
# losetup -f --show /root/junk
/dev/loop0
# mdadm --create /dev/md1048575 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Mon Aug 29 03:26:53 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: unexpected failure opening /dev/md1048575
# mdadm --create /dev/md/my_hip_awesome_cool_md_name1 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Mon Aug 29 03:26:53 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/my_hip_awesome_cool_md_name1 started.
```

I don't understand what additional information is needed. Please expound.

Hi Robert, from your description the command below should fail, but it passed on my side. Could you help check whether I missed something?

```
# mdadm --create /dev/md/my_hip_awesome_cool_md_name1 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Mon Aug 29 03:26:53 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/my_hip_awesome_cool_md_name1 started.
```

Thanks
xiaotzha

I tried both package [1] and [2]; step [3] failed with both [1] and [2], and step [4] passed with both [1] and [2].

[1] mdadm-3.3.2-7.el7_2.1.x86_64
[2] mdadm-3.4-13.el7

[3]
```
# mdadm --create /dev/md1048575 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Mon Aug 29 03:26:53 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: unexpected failure opening /dev/md1048575
```

[4]
```
# mdadm --create /dev/md/my_hip_awesome_cool_md_name1 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Mon Aug 29 03:26:53 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/my_hip_awesome_cool_md_name1 started.
```

But from your "Actual results" description, [3] and [4] should both fail with package [1] and pass with package [2]. Now I'm confused by your steps and not sure what the bug actually fixes. Could you give more info about it?

Thanks
xiaotzha
OK, I see what is going on here. Before 2e466cce45ac2397ea426a765c829c621901664b, once mdadm had decremented from 127 to 0 it would start decrementing from 1048575, but the kernel would not open such an md device, so I added lines to kill mdadm after creating the device node to make sure it was created correctly even though the kernel would not open it. That made it really easy to test, because we could specify an md number that would trigger the bug. Now that things are fixed, we have to do it the hard way.

To test 2e466cce45ac2397ea426a765c829c621901664b, we need to make sure that the numbering wraps to md511 after reaching md0, which we can do with [1]. I believe udev has a race here and it may take two or more tries to actually delete the md devices, hence the while loop.

[1]
```
for i in {10..138}; do truncate -s 10M /tmp/junk_${i} && \
    losetup /dev/loop${i} /tmp/junk_${i} && \
    mdadm --create /dev/md/junk${i} --run --level 1 --raid-devices=2 \
    /dev/loop${i} missing; done
[ -b /dev/md511 ]
GIT_2E466CC=$?
JUNK_MDS=$(for i in $(ls -l /dev/md/junk*); do \
    echo ${i} | sed -rn 's|^.*/(md[[:digit:]]+)$|\1|p'; done)
CNT=$(echo $JUNK_MDS | wc -w); while :; do for i in $JUNK_MDS; \
    do if [ -b /dev/${i} ]; then mdadm --stop /dev/${i}; ((CNT++)); fi; done; \
    if (( $CNT == 0 )); then break; else sleep 1; CNT=0; fi; done
for i in {10..138}; do losetup -d /dev/loop${i} && rm -rf /tmp/junk_${i}; done
unset JUNK_MDS
if (( $GIT_2E466CC == 0 )); then echo PASS; else echo FAIL; fi; \
unset GIT_2E466CC
```

I'm still working out how to test 13db17bd1fcd68b5e5618fcd051ff4137f1ea413; it is taking a long time to trigger the issue.

Reproduced the "mdadm fails to create >128 mds" issue with [1]: after creating more than 128 mds, it started decrementing from 1048575 and failed.

[1] mdadm-3.3.2-7.el7_2.1.x86_64, 3.10.0-327.41.1.el7.x86_64

Verified with [2]: after creating more than 128 mds, it started decrementing from 511 and passed. The mdadm regression test also passes with [2].

[2] mdadm-3.4-13.el7.x86_64, 3.10.0-509.el7.x86_64

Changing to VERIFIED.

Thanks
Yi

To test 13db17bd1fcd68b5e5618fcd051ff4137f1ea413, we have to get the device minor number above (1<<19) - 1. Using the "CREATE names=yes" option in /etc/mdadm.conf, we start at minor 512, which means we have to create 523,775 md devices on a freshly booted box. Luckily, md doesn't reuse minor numbers over 511, so we just have to create and stop a single md repeatedly until we reach minor 524287. This will take a few days to run the first time, but will be fast on subsequent tests. (Because this took so long to run, I haven't been able to verify that the script runs without issues, so you may have to adjust it as needed.)

```
grep -q "CREATE names=yes" /etc/mdadm.conf
REMOVEMDCREATE=$?
if (( $REMOVEMDCREATE == 1 )); then \
    echo "CREATE names=yes" >> /etc/mdadm.conf; fi
truncate -s 10M /tmp/junk
losetup /dev/loop10 /tmp/junk
mdadm --create /dev/md/junk --run --level 1 --raid-devices=2 /dev/loop10 missing
while (( $(ls -lh /dev/md_junk | cut -d' ' -f6) < 524287 )); do \
    MINOR=$(ls -lh /dev/md_junk | cut -d' ' -f6); \
    echo "minor: ${MINOR}, $((524287-MINOR)) left to go"; \
    mdadm --stop /dev/md_junk 2>/dev/null; mdadm --create /dev/md/junk --run --level 1 \
    --raid-devices=2 /dev/loop10 missing 2>/dev/null; done
mdadm --create /dev/md/junk --run --level 1 --raid-devices=2 /dev/loop10 missing
GIT_13DB17B=$?
if (( GIT_13DB17B == 0 )); then echo "PASS"; else echo "FAIL"; fi; \
unset GIT_13DB17B
if (( $REMOVEMDCREATE == 1 )); then \
    sed -i '/CREATE names=yes/d' /etc/mdadm.conf; fi
unset REMOVEMDCREATE
```
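As a side note, the minor-number check in the script above parses `ls -lh` output with `cut`, which depends on column spacing. A minimal, hedged alternative (the `md_minor` helper name is made up for illustration, not part of the original script) reads the minor number directly with stat(1), which prints it in hexadecimal via `%T`:

```
# Hypothetical helper, not part of the original test script: return the
# decimal minor number of /dev/md_junk using stat(1) instead of parsing ls.
md_minor() {
    # %T prints the minor device number in hexadecimal for device nodes,
    # so convert it to decimal with printf.
    printf '%d\n' "0x$(stat -c '%T' /dev/md_junk)"
}

# Possible use inside the loop above:
#   while (( $(md_minor) < 524287 )); do ...; done
```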
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2182.html
Created attachment 1193682 [details]
Patch to RPM source that includes relevant upstream and additional patches.

Description of problem:
Trying to create more than 128 mds causes mdadm to fail. After allocating /dev/md0, mdadm tries to allocate /dev/md1048575 and fails for two reasons. First, there is an overflow of the int type, causing negative major and minor values for the device. Second, the kernel module does not automatically register md devices over 511 when an appropriate device node is created, in order to help prevent race conditions with udev. The suggested way forward is to add "CREATE names=yes" to /etc/mdadm.conf and then specify md names (mdadm --create /dev/md/<name>), which are created as /dev/md_<name>. However, for long names this also fails.

Version-Release number of selected component (if applicable):
mdadm-3.3.2-7.el7_2.1.x86_64

How reproducible:
Always

Steps to Reproduce:
1. truncate -s 10G /root/junk
2. losetup -f --show /root/junk
3. mdadm --create /dev/md1048575 --level=1 --raid-devices=2 /dev/loop0 missing
4. mdadm --create /dev/md/my_hip_awesome_cool_md_name --level=1 --raid-devices=2 /dev/loop0 missing

Actual results:

```
# mdadm --create /dev/md1048575 --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Tue Aug 23 17:15:10 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
mdadm: unexpected failure opening /dev/md1048575
```
```
# mdadm --create /dev/md/my_hip_awesome_cool_md_name --level=1 --raid-devices=2 /dev/loop0 missing
mdadm: /dev/loop0 appears to be part of a raid array:
       level=raid1 devices=2 ctime=Tue Aug 23 17:15:10 2016
mdadm: Note: this array has metadata at the start and
       may not be suitable as a boot device.  If you plan to
       store '/boot' on this device please ensure that
       your boot-loader understands md/v1.x metadata, or use
       --metadata=0.90
Continue creating array? y
*** buffer overflow detected ***: mdadm terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7fdc07ebd597]
/lib64/libc.so.6(+0x10c750)[0x7fdc07ebb750]
/lib64/libc.so.6(+0x10bc59)[0x7fdc07ebac59]
/lib64/libc.so.6(_IO_default_xsputn+0xbc)[0x7fdc07e27a2c]
/lib64/libc.so.6(_IO_vfprintf+0x151d)[0x7fdc07df7a6d]
/lib64/libc.so.6(__vsprintf_chk+0x88)[0x7fdc07ebace8]
/lib64/libc.so.6(__sprintf_chk+0x7d)[0x7fdc07ebac3d]
mdadm[0x42fd12]
mdadm[0x41b38b]
mdadm[0x404905]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fdc07dd0b15]
mdadm[0x40789d]
======= Memory map: ========
00400000-0047a000 r-xp 00000000 103:01 102605400   /usr/sbin/mdadm
00679000-0067a000 r--p 00079000 103:01 102605400   /usr/sbin/mdadm
0067a000-00681000 rw-p 0007a000 103:01 102605400   /usr/sbin/mdadm
00681000-00695000 rw-p 00000000 00:00 0
01c74000-01c95000 rw-p 00000000 00:00 0             [heap]
7fdc07b99000-7fdc07bae000 r-xp 00000000 103:01 102606943   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fdc07bae000-7fdc07dad000 ---p 00015000 103:01 102606943   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fdc07dad000-7fdc07dae000 r--p 00014000 103:01 102606943   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fdc07dae000-7fdc07daf000 rw-p 00015000 103:01 102606943   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7fdc07daf000-7fdc07f66000 r-xp 00000000 103:01 101579594   /usr/lib64/libc-2.17.so
7fdc07f66000-7fdc08166000 ---p 001b7000 103:01 101579594   /usr/lib64/libc-2.17.so
7fdc08166000-7fdc0816a000 r--p 001b7000 103:01 101579594   /usr/lib64/libc-2.17.so
7fdc0816a000-7fdc0816c000 rw-p 001bb000 103:01 101579594   /usr/lib64/libc-2.17.so
7fdc0816c000-7fdc08171000 rw-p 00000000 00:00 0
7fdc08171000-7fdc08192000 r-xp 00000000 103:01 104974756   /usr/lib64/ld-2.17.so
7fdc08377000-7fdc0837a000 rw-p 00000000 00:00 0
7fdc0838f000-7fdc08392000 rw-p 00000000 00:00 0
7fdc08392000-7fdc08393000 r--p 00021000 103:01 104974756   /usr/lib64/ld-2.17.so
7fdc08393000-7fdc08394000 rw-p 00022000 103:01 104974756   /usr/lib64/ld-2.17.so
7fdc08394000-7fdc08395000 rw-p 00000000 00:00 0
7ffef35ff000-7ffef3620000 rw-p 00000000 00:00 0             [stack]
7ffef37a0000-7ffef37a2000 r--p 00000000 00:00 0             [vvar]
7ffef37a2000-7ffef37a4000 r-xp 00000000 00:00 0             [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0     [vsyscall]
Aborted
```

Expected results:
md RAID devices created and started successfully.

Additional info:
I'm attaching a patch for the RPM build. It includes three patches already in upstream: 8554b77e300eb16d345176a1d41aaffe98bbf06a, 6e6e98746dba7e900f23e92bbb0da01fe7a169da, and 46149dc0a71bc2f7061c0192540b44b43c1a450c. It also includes a new patch (764b4282553559a99c3f115fb463f4d14f8222df) that I have submitted upstream but that has not been merged yet; it fixes a buffer overrun error still present after the first three patches are applied.
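For reference, here is a minimal, hedged sketch of the "CREATE names=yes" workaround described in the problem description. Only the config line, the loop-device setup, and the fact that /dev/md/<name> appears as /dev/md_<name> come from the report; the exact commands below (including --run to skip the confirmation prompt) are just one way to drive it and may need adjusting:

```
# Sketch of the "CREATE names=yes" workaround from the description above.
# Assumes a scratch file and a free loop device; adjust paths as needed.
grep -q '^CREATE names=yes' /etc/mdadm.conf || echo 'CREATE names=yes' >> /etc/mdadm.conf

truncate -s 10G /root/junk
LOOPDEV=$(losetup -f --show /root/junk)

# With names=yes, /dev/md/<name> is created as /dev/md_<name> instead of
# /dev/mdNNN; with mdadm-3.4-12.el7 or later the long name below should no
# longer trigger the buffer overflow shown in "Actual results".
mdadm --create /dev/md/my_hip_awesome_cool_md_name --run --level=1 \
      --raid-devices=2 "$LOOPDEV" missing
ls -l /dev/md_my_hip_awesome_cool_md_name
mdadm --detail /dev/md_my_hip_awesome_cool_md_name
```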