Bug 1348925
Summary: mdadm -IRs returns 1 (0 expected)

Product: Red Hat Enterprise Linux 6
Component: mdadm
Version: 6.7
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Reporter: nikhil kshirsagar <nkshirsa>
Assignee: XiaoNi <xni>
QA Contact: guazhang <guazhang>
Docs Contact: Milan Navratil <mnavrati>
CC: bhu, dledford, Jes.Sorensen, jjose, mnavrati, nkshirsa, xni, yizhan
Target Milestone: rc
Target Release: ---
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Fixed In Version: mdadm-3.3.4-7.el6
Doc Type: Bug Fix
Doc Text:
An incorrect exit status of "mdadm -IRs" no longer causes error messages at boot time
Previously, the load_container() function was incorrectly trying to load a container from the member array. As a consequence, the "mdadm -IRs" command incorrectly returned the `1` exit status, which led to error messages at boot time. The load_container() function has been modified to prevent loading a container from a member array. As a result, error messages at boot time no longer occur.
Last Closed: 2017-03-21 08:58:38 UTC
Type: Bug
Bug Blocks: 1269194, 1346447
Attachments:
Created attachment 1170679 [details]
raid snapshot
Created attachment 1170680 [details]
boot logs image
Reading this bug report it is completely unclear what the problem is here. Per /proc/mdstat it looks like the arrays are assembled correctly. Why do they have a manually mangled /etc/mdadm.conf? Where is the full dmesg boot log? md0 is the IMSM container - normally that would be md127.

Please find the update from customer
===
-- We would need more steps in detail, that is the exact steps followed.

Ans: As mentioned in the previous mail, those were the exact steps followed: a straightforward installation after doing HW RAID 1 configuration on the Intel SSDs. After the OS drive selection from the RAID tab, we created the volume partitions, and fdisk -l shows as below:

[root@node1 ~]# fdisk -l | grep -i dev
Disk /dev/sda: 120.0 GB, 120034123776 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
/dev/sda2              64        1109     8388608   82  Linux swap / Solaris
/dev/sda3            1109       13864   102456320   8e  Linux LVM
Disk /dev/sdb: 120.0 GB, 120034123776 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          64      512000   83  Linux
/dev/sdb2              64        1109     8388608   82  Linux swap / Solaris
/dev/sdb3            1109       13864   102456320   8e  Linux LVM
Disk /dev/sdc: 4000.8 GB, 4000787030016 bytes
Disk /dev/sdd: 4000.8 GB, 4000787030016 bytes
Disk /dev/sde: 4000.8 GB, 4000787030016 bytes
Disk /dev/sdf: 4000.8 GB, 4000787030016 bytes
Disk /dev/md126: 114.0 GB, 114030542848 bytes
      Device Boot      Start         End      Blocks   Id  System
/dev/md126p1   *         257      128256      512000   83  Linux
/dev/md126p2          128257     2225408     8388608   82  Linux swap / Solaris
/dev/md126p3         2225409    27839488   102456320   8e  Linux LVM
Disk /dev/mapper/sys-root: 11.0 GB, 10997465088 bytes
Disk /dev/mapper/sys-tmp: 11.0 GB, 10997465088 bytes
Disk /dev/mapper/sys-backup: 5498 MB, 5498732544 bytes
Disk /dev/mapper/sys-opt: 42.9 GB, 42949672960 bytes
Disk /dev/mapper/sys-usr: 11.0 GB, 10997465088 bytes
Disk /dev/mapper/sys-var: 11.0 GB, 10997465088 bytes
Disk /dev/mapper/sys-home: 11.0 GB, 10997465088 bytes

-- Also, in which step did you run mdadm -IRs?
Ans: We didn't run mdadm -IRs after installation; it was requested during the support case, which is why we ran the command as advised and captured the output.

-- Did you run this command due to installation failure?

Ans: The installation was successful. For iostat we get the results below.

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               6.33         0.00        56.00          0        168
sdb               6.00         0.00        56.00          0        168
sdc               0.00         0.00         0.00          0          0
sdd               0.00         0.00         0.00          0          0
sde               0.00         0.00         0.00          0          0
sdf               0.00         0.00         0.00          0          0
md126             8.00         0.00        56.00          0        168
dm-0              0.00         0.00         0.00          0          0
dm-1              2.67         0.00        21.33          0         64
dm-2              0.00         0.00         0.00          0          0
dm-3              4.33         0.00        34.67          0        104
dm-4              0.00         0.00         0.00          0          0
dm-5              0.00         0.00         0.00          0          0
dm-6              0.00         0.00         0.00          0          0
^C
[root@node1 ~]#

Let me know if you need further details. Not sure if the HW RAID array is any issue here; we are using this SuperMicro HW for MapR. Details were sent in a previous mail.
====

Development Management has reviewed and declined this request. You may appeal this decision by reopening this request.

Hello,
I have tested mdadm-3.3.4-1.el6 and reproduced this bug, then installed mdadm-3.3.4-7.el6 and could not reproduce it, so this bug was fixed.
Moved to "verified".

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0569.html
Created attachment 1170678 [details]
strace output

Description of problem:
mdadm -IRs returns 1 (and hence fails in rc.sysinit)

Version-Release number of selected component (if applicable):
mdadm-3.3.2-5.el6.x86_64

How reproducible:
I am still getting the exact repro scenario from the customer but will paste all the details and how they managed to get this error on their system. The customer says, "What we did is, configured the HW RAID 1 in the 2 SSDs (for OS) from the Intel RAID controller and then installed the OS. During OS installation we did select the advanced drivers and selected the partition that came under the RAID tab. Maybe that's the cause."

The RAID hardware is:
RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)
RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)

Actual results:
mdadm -IRs returns 1. The exact error seen is "mdadm: array /dev/md/OSVOLUME0 now has 2 devices (0 new) FAILED" during bootup in the rc.sysinit script. The exact command that fails is

[ -r /proc/mdstat -a -r /dev/md/md-device-map ] && action $"Run MD devices: " /sbin/mdadm -IRs

(see attached image of boot logs)

Expected results:
mdadm -IRs should return 0 to the shell/caller.

Additional info:
strace output attached.

# cat proc/mdstat
Personalities : [raid1]
md126 : active raid1 sda[1] sdb[0]
      111357952 blocks super external:/md0/0 [2/2] [UU]

md0 : inactive sda[1](S) sdb[0](S)
      6306 blocks super external:imsm

unused devices: <none>

We see mdadm.conf is:

MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md0 UUID=823faff3:7d076a53:ea497222:7734106d
#ARRAY /dev/md127 UUID=501d37d2:5a0ce9b0:71148990:32a9d147

The md127 device was something we asked the customer to comment out, since we don't see it in the /proc/mdstat output, but even with it commented out we have the same problem.
# cat proc/cmdline
ro root=/dev/mapper/sys-root rd_MD_UUID=a52bcfbe:68cfe56b:27a6195d:6581c824 rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=sys/root SYSFONT=latarcyrheb-sun16 crashkernel=128M rd_MD_UUID=3acb860a:5da1daa7:9c522152:fc97b5af KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet

# grep "RAID bus" lspci
00:1f.2 RAID bus controller: Intel Corporation C600/X79 series chipset SATA RAID Controller (rev 05)
02:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)

Analysis so far:
I'm looking at mdadm-3.3.2 code. Looking at the strace output, exit_group(1) has happened without any additional logs (pr_err()) thrown on screen. So whatever happened returned 1 to the OS/shell without printing any pr_err logs prior to the exit(1). We use this to focus our search, and we also note the stat calls at the end of the strace output. Let's start by searching for those in the known code path we have.

assemble_container_content() is where we see the "array /dev/md/OSVOLUME0 now has 2 devices (0 new)" message, and after that we see no more messages; the strace shows exit(1). The strace ends this way:

rename("/dev/md/md-device-map.new", "/dev/md/md-device-map") = 0
open("/sys/block/md126/md//array_state", O_WRONLY) = 7
write(7, "readonly", 8)                 = -1 EBUSY (Device or resource busy)
close(7)                                = 0
write(2, "mdadm: array /dev/md/OSVOLUME0 n"..., 48) = 48   <-- printing the output seen
write(2, " (0 new)", 8)                 = 8
write(2, "\n", 1)                       = 1
close(6)                                = 0
stat("/sys/dev/block/8:16/partition", 0x7ffce60cd2d0) = -1 ENOENT (No such file or directory)
stat("/sys/dev/block/8:0/partition", 0x7ffce60cd2d0) = -1 ENOENT (No such file or directory)
unlink("/dev/md/md-device-map.lock")    = 0
close(3)                                = 0
munmap(0x7fd80e2ba000, 4096)            = 0
exit_group(1)                           = ?
+++ exited with 1 +++

So let's examine the code for whatever might be causing the stat calls seen among the system calls.
We see the code in assemble_container_content(); the function itself doesn't do much more after printing the output we see. From Assemble.c line 1986, assemble_container_content() ends this way:

	} else if (c->verbose >= 0) {
		if (err)                     <--- err is 1, the code path goes into this if case
			pr_err("array %s now has %d device%s",
			       chosen_name, working + preexist,
			       working + preexist == 1 ? "":"s");
		else
			pr_err("Started %s with %d device%s",
			       chosen_name, working + preexist,
			       working + preexist == 1 ? "":"s");
		if (preexist)
			fprintf(stderr, " (%d new)", working);
		if (expansion)
			fprintf(stderr, " ( + %d for expansion)", expansion);
		fprintf(stderr, "\n");       <--- the output we see ends here
	}
	if (!err)
		wait_for(chosen_name, mdfd); <--- won't go into this function since err is not 0
	return err;
	/* FIXME should have an O_EXCL and wait for read-auto */
}
#endif

The call to assemble_container_content() will be in Incremental.c (since the -I parameter was passed to mdadm). mdadm 3.3.2, Incremental.c line 1602:

		assemble_container_content(st, mdfd, ra, c,
					   chosen_name, &result);
		close(mdfd);                 <--- seen in the strace output before the stat calls
	}
	if (c->export && result) {
		char sep = '=';
		printf("MD_STARTED");
		if (result & INCR_NO) {
			printf("%cno", sep);
			sep = ',';
		}
		if (result & INCR_UNSAFE) {
			printf("%cunsafe", sep);
			sep = ',';
		}
		if (result & INCR_ALREADY) {
			printf("%calready", sep);
			sep = ',';
		}
		if (result & INCR_YES) {
			printf("%cyes", sep);
			sep = ',';
		}
		printf("\n");
	}
	/* don't move spares to container with volume being activated
	   when all volumes are blocked */
	if (ra_all == ra_blocked)
		return 0;
	/* Now move all suitable spares from spare container */
	domains = domain_from_array(list, st->ss->name);
	memcpy(suuid, uuid_zero, sizeof(int[4]));
	if (domains &&
	    (smp = map_by_uuid(&map, suuid)) != NULL &&

If we trace the source from here looking for stat calls, to get a clue of what might have happened, let's look at domain_from_array().
domains = domain_from_array(list, st->ss->name);

[root@nkshirsa mdadm-3.3.2]# grep domain_from_array *
Incremental.c:	dl = domain_from_array(sra, st2->ss->name);
Incremental.c:	domains = domain_from_array(list, st->ss->name);
mdadm.h:extern struct domainlist *domain_from_array(struct mdinfo *mdi,
grep: misc: Is a directory
policy.c:struct domainlist *domain_from_array(struct mdinfo *mdi, const char *metadata)   <--- definition here
grep: systemd: Is a directory
grep: tests: Is a directory
[root@nkshirsa mdadm-3.3.2]#

Let's see policy.c:

struct domainlist *domain_from_array(struct mdinfo *mdi, const char *metadata)
{
	struct domainlist *domlist = NULL;

	if (!mdi)
		return NULL;
	for (mdi = mdi->devs ; mdi ; mdi = mdi->next)
		domainlist_add_dev(&domlist, makedev(mdi->disk.major,
						     mdi->disk.minor),
				   metadata);
	return domlist;
}

Let's see domainlist_add_dev():

void domainlist_add_dev(struct domainlist **dom, int devid, const char *metadata)
{
	struct dev_policy *pol = devid_policy(devid);
	domain_merge(dom, pol, metadata);
	dev_policy_free(pol);
}

Let's see devid_policy():

struct dev_policy *devid_policy(int dev)
{
	struct mdinfo disk;
	disk.disk.major = major(dev);
	disk.disk.minor = minor(dev);
	return disk_policy(&disk);
}

Let's see disk_policy():

/*
 * disk_policy() gathers policy information for the
 * disk described in the given mdinfo (disk.{major,minor}).
 */
struct dev_policy *disk_policy(struct mdinfo *disk)
{
	char *path = NULL;
	char *type = disk_type(disk);        <--- contains a stat system call
	struct dev_policy *pol = NULL;

	if (config_rules_has_path)
		path = disk_path(disk);      <--- contains stat system calls
	pol = path_policy(path, type);
	free(path);
	return pol;
}

static char *disk_type(struct mdinfo *disk)
{
	char buf[30+20+20];
	struct stat stb;
	sprintf(buf, "/sys/dev/block/%d:%d/partition",
		disk->disk.major, disk->disk.minor);
	if (stat(buf, &stb) == 0)            <---------------------------
		return type_part;
	else
		return type_disk;
}

static char *disk_path(struct mdinfo *disk)
{
	struct stat stb;
	int prefix_len;
	DIR *by_path;
	char symlink[PATH_MAX] = "/dev/disk/by-path/";
	char nm[PATH_MAX];
	struct dirent *ent;
	int rv;

	by_path = opendir(symlink);
	if (by_path) {
		prefix_len = strlen(symlink);
		while ((ent = readdir(by_path)) != NULL) {
			if (ent->d_type != DT_LNK)
				continue;
			strncpy(symlink + prefix_len,
				ent->d_name,
				sizeof(symlink) - prefix_len);
			if (stat(symlink, &stb) < 0)   <-------------------------------------
				continue;
			if ((stb.st_mode & S_IFMT) != S_IFBLK)
				continue;
			if (stb.st_rdev != makedev(disk->disk.major, disk->disk.minor))
				continue;
			closedir(by_path);
			return xstrdup(ent->d_name);
		}
		closedir(by_path);
	}
	/* A NULL path isn't really acceptable - use the devname.. */
	sprintf(symlink, "/sys/dev/block/%d:%d", disk->disk.major, disk->disk.minor);
	rv = readlink(symlink, nm, sizeof(nm)-1);
	if (rv > 0) {
		char *dname;
		nm[rv] = 0;
		dname = strrchr(nm, '/');
		if (dname)
			return xstrdup(dname + 1);
	}
	return xstrdup("unknown");
}

Are these the two stat system calls we see at the end of the strace?

close(6)                                = 0
stat("/sys/dev/block/8:16/partition", 0x7ffce60cd2d0) = -1 ENOENT (No such file or directory)
stat("/sys/dev/block/8:0/partition", 0x7ffce60cd2d0) = -1 ENOENT (No such file or directory)
unlink("/dev/md/md-device-map.lock")    = 0
close(3)                                = 0
munmap(0x7fd80e2ba000, 4096)            = 0
exit_group(1)                           = ?
+++ exited with 1 +++

Anyway, going forward, assuming so: after this stat we see a sudden exit_group without any pr_err message. The only way this seems possible is if disk_path() exits somewhere without printing anything:

static char *disk_path(struct mdinfo *disk)
{
	struct stat stb;
	int prefix_len;
	DIR *by_path;
	char symlink[PATH_MAX] = "/dev/disk/by-path/";
	char nm[PATH_MAX];
	struct dirent *ent;
	int rv;

	by_path = opendir(symlink);
	if (by_path) {
		prefix_len = strlen(symlink);
		while ((ent = readdir(by_path)) != NULL) {
			if (ent->d_type != DT_LNK)
				continue;
			strncpy(symlink + prefix_len,
				ent->d_name,
				sizeof(symlink) - prefix_len);
			if (stat(symlink, &stb) < 0)
				continue;
			if ((stb.st_mode & S_IFMT) != S_IFBLK)
				continue;
			if (stb.st_rdev != makedev(disk->disk.major, disk->disk.minor))
				continue;
			closedir(by_path);
			return xstrdup(ent->d_name);   <--- can exit without any pr_err message
		}
		closedir(by_path);
	}
	/* A NULL path isn't really acceptable - use the devname.. */
	sprintf(symlink, "/sys/dev/block/%d:%d", disk->disk.major, disk->disk.minor);
	rv = readlink(symlink, nm, sizeof(nm)-1);
	if (rv > 0) {
		char *dname;
		nm[rv] = 0;
		dname = strrchr(nm, '/');
		if (dname)
			return xstrdup(dname + 1);
	}
	return xstrdup("unknown");                     <--- can exit without any pr_err message
}

char *xstrdup(const char *str)
{
	char *rv = strdup(str);
	char *msg;
	if (rv)
		return rv;
	msg = Name ": memory allocation failure - aborting\n";
	exit(4+!!write(2, msg, strlen(msg)));          <--- what is this? Could this result in exit(1)?
}

When I ran this code in a test program, it exited with 5, not 1. It's an unlikely case anyway; why would we get a memory allocation failure at bootup? But what else could cause an exit(1) from mdadm without any errors, warnings, or info messages logged? It is supposed to return 0 to the shell when called, right? I tried to grep for exit(1) to see if that would give any clue.
[root@nkshirsa mdadm-3.3.2]# grep -ri "exit(1)"
swap_super.c:	exit(1);
swap_super.c:	exit(1);
swap_super.c:	exit(1);
swap_super.c:	exit(1);
swap_super.c:	exit(1);
swap_super.c:	exit(1);
swap_super.c:	exit(1);
util.c.ddf:	exit(1);
util.c.ddf:	exit(1);
util.c:	exit(1);
util.c:	exit(1);
Grow.c:	exit(1);
restripe.c:	exit(1);
restripe.c:	exit(1);
restripe.c:	exit(1);
restripe.c:	exit(1);
Incremental.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdadm.c:	exit(1);
mdmon.c:	exit(1);

No case in mdadm.c has an exit(1) without a prior pr_err call, except this one, and I don't think we hit this case, since we passed -IRs to mdadm:

	if (mode == MANAGE || mode == BUILD || mode == CREATE || mode == GROW ||
	    (mode == ASSEMBLE && ! c.scan)) {
		if (devs_found < 1) {
			pr_err("an md device must be given in this mode\n");
			exit(2);
		}
		if ((int)ident.super_minor == -2 && c.autof) {
			pr_err("--super-minor=dev is incompatible with --auto\n");
			exit(2);
		}
		if (mode == MANAGE || mode == GROW) {
			mdfd = open_mddev(devlist->devname, 1);
			if (mdfd < 0)
				exit(1);   <--- only possibility, but unlikely to be executed in our case, right?
		} else

Both the exit(1)'s in util.c are for mdmon, not mdadm.

The one in Incremental.c is:

static void run_udisks(char *arg1, char *arg2)
{
	int pid = fork();
	int status;
	if (pid == 0) {
		execl("/usr/bin/udisks", "udisks", arg1, arg2, NULL);   <--- we don't see this call reflected in the strace
		execl("/bin/udisks", "udisks", arg1, arg2, NULL);
		exit(1);
	}

I don't think we execute the code behind the exit(1)'s in Grow.c or restripe.c in this case, and all the exit(1)'s in swap_super.c have a prior perror.

So what is causing the mdadm binary to exit with a return code of 1?