Bug 969182
Summary: | DeviceCreateError: ('Could not commit to disk /dev/mapper/mpatha, (py_ped_disk_commit)', 'mpatha3') | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Mark Hamzy <hamzy> | ||||||||||
Component: | python-blivet | Assignee: | David Lehman <dlehman> | ||||||||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||||||
Severity: | unspecified | Docs Contact: | |||||||||||
Priority: | unspecified | ||||||||||||
Version: | 19 | CC: | anaconda-maint-list, awilliam, bcl, bmarzins, dlehman, dshea, g.kaviyarasu, jonathan, mkolman, robatino, sbueno, vanmeeuwen+fedora | ||||||||||
Target Milestone: | --- | ||||||||||||
Target Release: | --- | ||||||||||||
Hardware: | ppc64 | ||||||||||||
OS: | Linux | ||||||||||||
Whiteboard: | AcceptedBlocker | ||||||||||||
Fixed In Version: | pykickstart-1.99.32-1.fc19 | Doc Type: | Bug Fix | ||||||||||
Doc Text: | Story Points: | --- | |||||||||||
Clone Of: | Environment: | ||||||||||||
Last Closed: | 2013-06-18 06:18:09 UTC | Type: | Bug | ||||||||||
Regression: | --- | Mount Type: | --- | ||||||||||
Documentation: | --- | CRM: | |||||||||||
Verified Versions: | Category: | --- | |||||||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||||||
Embargoed: | |||||||||||||
Bug Depends On: | |||||||||||||
Bug Blocks: | 834090, 920770 | ||||||||||||
Attachments: |
|
Description
Mark Hamzy
2013-05-30 20:21:52 UTC
Created attachment 755036 [details]
anaconda traceback
Created attachment 755038 [details]
anaconda.log
Created attachment 755039 [details]
storage.log
Created attachment 755040 [details]
program.log
I see the following in the console window: device-mapper: remove ioctl on mpatha1 failed: Device or resource busy device-mapper: create ioctl on mpatha1 failed: Device or resource busy device-mapper: remove ioctl on mpatha1 failed: Device or resource busy Proposing as a FinalBlocker: The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel) Ben, Do you know what might be causing this? [anaconda root@riley6 ~]# pvs [anaconda root@riley6 ~]# lvscan No volume groups found [anaconda root@riley6 ~]# lsof | grep mpath kmpathd 84 root cwd DIR 253,3 1024 2 / kmpathd 84 root rtd DIR 253,3 1024 2 / kmpathd 84 root txt unknown /proc/84/exe kmpath_ha 85 root cwd DIR 253,3 1024 2 / kmpath_ha 85 root rtd DIR 253,3 1024 2 / kmpath_ha 85 root txt unknown /proc/85/exe kmpath_rd 227 root cwd DIR 253,3 1024 2 / kmpath_rd 227 root rtd DIR 253,3 1024 2 / kmpath_rd 227 root txt unknown /proc/227/exe multipath 769 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 772 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 774 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 775 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 822 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 823 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 multipath 769 824 root mem REG 253,3 69984 59354 /usr/lib64/libmpathpersist.so.0 [anaconda root@riley6 ~]# ls -l /dev/mapper/ total 0 crw-------. 1 root root 10, 236 May 31 19:30 control lrwxrwxrwx. 1 root root 7 May 31 19:30 live-rw -> ../dm-3 lrwxrwxrwx. 1 root root 7 May 31 19:40 mpatha -> ../dm-0 lrwxrwxrwx. 1 root root 7 May 31 19:40 mpatha1 -> ../dm-1 lrwxrwxrwx. 1 root root 7 May 31 19:40 mpatha2 -> ../dm-4 lrwxrwxrwx. 1 root root 7 May 31 19:53 mpatha3 -> ../dm-5 lrwxrwxrwx. 1 root root 7 May 31 19:40 mpatha4 -> ../dm-2 [anaconda root@riley6 ~]# ls -l /dev/dm-* brw-rw----. 1 root disk 253, 0 May 31 19:40 /dev/dm-0 brw-rw----. 1 root disk 253, 1 May 31 19:40 /dev/dm-1 brw-rw----. 1 root disk 253, 2 May 31 19:40 /dev/dm-2 brw-rw----. 1 root disk 253, 3 May 31 19:30 /dev/dm-3 brw-rw----. 1 root disk 253, 4 May 31 19:40 /dev/dm-4 brw-rw----. 1 root disk 253, 5 May 31 19:53 /dev/dm-5 [anaconda root@riley6 ~]# lsof | grep '253,[01452]' [anaconda root@riley6 ~]# Discussed at 2013-06-03 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-03/f19final-blocker-review-2.2013-06-03-16.00.log.txt . We're waiting for further information on exactly what's going wrong here (and whether it may be user error) before making a decision on blocker status. In regards to comment #5 and #7, it looks like something has mpatha1 mounted. I've just checked on my f18 box, and this doesn't cause the device to show up in lsof [root@ask-08 multipath-tools-130222]# df Filesystem 1K-blocks Used Available Use% Mounted on ... /dev/mapper/3600d0230000000000e13955cc3757802p1 277183928 60444 263043320 1% /mnt/test [root@ask-08 multipath-tools-130222]# ls -l /dev/mapper/3600d0230000000000e13955cc3757802p1 lrwxrwxrwx. 1 root root 8 Jun 3 07:48 /dev/mapper/3600d0230000000000e13955cc3757802p1 -> ../dm-11 [root@ask-08 multipath-tools-130222]# ls -l /dev/dm-11 brw-rw----. 1 root disk 253, 11 Jun 3 07:48 /dev/dm-11 [root@ask-08 multipath-tools-130222]# lsof | grep 253,11 [root@ask-08 multipath-tools-130222]# If it is still mounted, a df would show that. In general, I'm not sure what I'm looking at here. It's pretty obvious that mpatha3 does exist, and for at least part of the install is working $ grep mpatha3 program.log 20:01:54,105 INFO program: Running... dumpe2fs -h /dev/mapper/mpatha3 20:01:54,138 INFO program: Running... resize2fs -P /dev/mapper/mpatha3 20:01:55,963 INFO program: Running... mount -t ext4 -o defaults,ro /dev/mapper/mpatha3 /mnt/sysimage 15:03:31,737 INFO program: Running... wipefs -f -a /dev/mapper/mpatha3 15:03:31,777 INFO program: /dev/mapper/mpatha3: 2 bytes were erased at offset 0x00000438 (ext4): 53 ef It does look like the device was resized. From storage.log 15:03:25,807 DEBUG blivet: adding 29401025 (14355MB) to 21 (mpatha3) 15:03:25,807 DEBUG blivet: new grow amount for request 21 (mpatha3) is 29401025 units, or 14355MB 15:03:25,807 DEBUG blivet: request 23 (mpatha1) growth: 0 (0MB) size: 4MB 15:03:25,808 DEBUG blivet: request 22 (mpatha2) growth: 0 (0MB) size: 500MB 15:03:25,808 DEBUG blivet: request 21 (mpatha3) growth: 29401025 (14355MB) size: 14855MB 15:03:25,808 DEBUG blivet: disk /dev/mapper/mpatha growth: 29401025 (14355MB) I assume that this was done with parted. Do you know if the device worked after this. Can you read and write it? Does it appear to be updated? I'm not sure what commands were run to hit the original error: DeviceCreateError: ('Could not commit to disk /dev/mapper/mpatha, (py_ped_disk_commit)', 'mpatha3') Is this something that can be run on an already installed system, so that we can try to recreate it an see where it's going wrong? [anaconda root@riley6 ~]# df Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/live-rw 999112 891188 107924 90% / devtmpfs 708224 0 708224 0% /dev tmpfs 760128 0 760128 0% /dev/shm tmpfs 760128 86272 673856 12% /run tmpfs 760128 0 760128 0% /sys/fs/cgroup /dev/sr0 4833900 4833900 0 100% /run/install/repo tmpfs 760128 35648 724480 5% /tmp [anaconda root@riley6 ~]# mkdir /mnt/tmp [anaconda root@riley6 ~]# mount /dev/mapper/mpatha3 /mnt/tmp mount: wrong fs type, bad option, bad superblock on /dev/mapper/mpatha3, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so. Ben, This system is still running (after the point of failure) and I am available to run commands on it if you join #fedora-ppc on freenode. After this happens can you run parted on mpatha and create a new partition? The py_ped_disk_commit is from pyparted/parted and seems to indicate that it couldn't actually change the disk. [anaconda root@riley6 ~]# parted /dev/mapper/mpatha print Model: Linux device-mapper (multipath) (dm) Disk /dev/mapper/mpatha: 16.1GB Sector size (logical/physical): 512B/512B Partition Table: msdos Disk Flags: Number Start End Size Type File system Flags 1 1049kB 5243kB 4194kB primary prep 2 5243kB 530MB 524MB primary 3 530MB 1054MB 524MB primary ext2 4 1054MB 16.1GB 15.1GB extended 5 1055MB 1579MB 524MB logical ext4 6 1580MB 16.1GB 14.5GB logical [anaconda root@riley6 ~]# parted /dev/mapper/mpatha GNU Parted 3.1 Using /dev/mapper/mpatha Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) rm 6 (parted) mkpart logical ext2 1580MB 16.1GB (parted) quit Information: You may need to update /etc/fstab. Ben, I tried http://ppc.koji.fedoraproject.org/stage/f19-20130604-GA-TC1/ppc64/iso/Fedora-19-ppc64-netinst.iso with: anaconda-19.25-1.fc19 python-blivet-0.13-1.fc19 instead and had the same failure: Traceback (most recent call first): File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 789, in create raise DeviceCreateError(str(e), self.name) File "/usr/lib/python2.7/site-packages/blivet/deviceaction.py", line 272, in execute self.device.create() File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 237, in processActions action.execute() File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 310, in doIt self.devicetree.processActions() File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 169, in turnOnFilesystems storage.doIt() File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 126, in doInstall turnOnFilesystems(storage, mountOnly=flags.flags.dirInstall) File "/usr/lib64/python2.7/threading.py", line 764, in run self.__target(*self.__args, **self.__kwargs) File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 168, in run threading.Thread.run(self, *args, **kwargs) DeviceCreateError: ('Could not commit to disk /dev/mapper/mpatha, (py_ped_disk_commit)', 'mpatha3') Here's a possibility. Looking in the parted source ped_disk_commit -> ped_disk_commit_to_os -> linux_disk_commit -> _dm_reread_part_table -> _dm_remove_parts This removes all the existing partitions so that _dm_reread_part_table can recreate them. However you can't delete a dm-partition device that is already mounted. So does this: 15:03:34,879 INFO blivet: executing action: [19] Create Format ext4 filesystem m ounted at /boot on partition mpatha2 (id 22) actually mount the device on /boot? It's hard to be sure from the python exception that it fails in _dm_remove_parts, but it looks possible. If _dm_remove_parts fails, it throws a warning exception saying ped_exception_throw (PED_EXCEPTION_WARNING, PED_EXCEPTION_IGNORE, _("parted was unable to re-read the partition " "table on %s (%s). This means Linux won't know " "anything about the modifications you made. "), dev->path, strerror (errno)); However, pyparted's partedExnHandler doesn't raise python exceptions for PED_EXCEPTION_WARNING, so you would just get the boilerplate "Could not commit to disk" message. The only thing that I can't explain is why this worked in the past, if this is the actual problem. Manually trying to create a new partition with parted while I have another partition mounted gives me: (parted) mkpart primary ext2 31.0GB -1 device-mapper: remove ioctl on 3600d0230000000000e13955cc3757803p1 failed: Device or resource busy Warning: parted was unable to re-read the partition table on /dev/mapper/3600d0230000000000e13955cc3757803 (Device or resource busy). This means Linux won't know anything about the modifications you made. device-mapper: create ioctl on 3600d0230000000000e13955cc3757803p1 failed: Device or resource busy device-mapper: remove ioctl on 3600d0230000000000e13955cc3757803p1 failed: Device or resource busy So, is /boot mounted when mpatha3 is being created? Also, could we try running with a augmented version of pyparted that printed the PED_EXCEPTION_WARNING messages (assuming I'm missing someplace where they do get printed in the existing code)? This seems like it would make it easier to track down just where the parted code is failing. I tried the following patch: <mock-chroot>[root@bluebill ~]# (cd ~/build/BUILD/pyparted-3.10/; git diff) diff --git a/src/_pedmodule.c b/src/_pedmodule.c index efdf831..1b3d9a6 100644 --- a/src/_pedmodule.c +++ b/src/_pedmodule.c @@ -372,7 +372,9 @@ static PedExceptionOption partedExnHandler(PedException *e) { * otherwise ignore */ case PED_EXCEPTION_INFORMATION: case PED_EXCEPTION_WARNING: +#if 0 if (e->options == PED_EXCEPTION_YES_NO) { +#endif partedExnRaised = 1; partedExnMessage = strdup(e->message); @@ -385,10 +387,12 @@ static PedExceptionOption partedExnHandler(PedException *e) { * exception to our caller */ return PED_EXCEPTION_NO; +#if 0 } else { partedExnRaised = 0; return PED_EXCEPTION_IGNORE; } +#endif /* Set global flags so parted module methods can raise specific * exceptions with the message. and this changed the exception reported by anaconda to: anaconda 19.30.3-1 exception report Traceback (most recent call first): File "/usr/lib/python2.7/site-packages/blivet/devices.py", line 789, in create raise DeviceCreateError(str(e), self.name) File "/usr/lib/python2.7/site-packages/blivet/deviceaction.py", line 272, in execute self.device.create() File "/usr/lib/python2.7/site-packages/blivet/devicetree.py", line 237, in processActions action.execute() File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 310, in doIt self.devicetree.processActions() File "/usr/lib/python2.7/site-packages/blivet/__init__.py", line 169, in turnOnFilesystems storage.doIt() File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 138, in doInstall turnOnFilesystems(storage, mountOnly=flags.flags.dirInstall) File "/usr/lib64/python2.7/threading.py", line 764, in run self.__target(*self.__args, **self.__kwargs) File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 168, in run threading.Thread.run(self, *args, **kwargs) DeviceCreateError: ("parted was unable to re-read the partition table on /dev/mapper/mpatha (Device or resource busy). This means Linux won't know anything about the modifications you made. ", u'mpatha4') however, I don't see /boot mounted with: [anaconda root@riley6 ~]# mount proc on /proc type proc (rw,nosuid,nodev,noexec,relatime) sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel) devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=708096k,nr_inodes=11064,mode=755) securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime) selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime) tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel) devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,seclabel,gid=5,mode=620,ptmxmode=000) tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755) tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer) cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event) /dev/sr0 on /run/install/repo type iso9660 (ro,relatime) /dev/mapper/live-rw on / type ext4 (rw,relatime,seclabel,data=ordered) rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime) systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=28,pgrp=1,timeout=300,minproto=5,maxproto=5,direct) debugfs on /sys/kernel/debug type debugfs (rw,relatime) configfs on /sys/kernel/config type configfs (rw,relatime) mqueue on /dev/mqueue type mqueue (rw,relatime,seclabel) hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,seclabel) tmpfs on /tmp type tmpfs (rw,seclabel) The new filesystem on mpatha2 is not mounted at the time of the failure. If it were, there would be a mount command below the mke2fs command in program.log. I wonder if the dd to mpatha (to remove stale metadata prior to adding the new partition) is triggering a "change" uevent which in turn triggers some udev helper program(s) which in turn make the device busy. I see 4ms between the return from dd and the attempted parted commit. (In reply to David Lehman from comment #16) > I wonder if the dd to mpatha (to remove stale metadata prior to adding the > new partition) is triggering a "change" uevent which in turn triggers some > udev helper program(s) which in turn make the device busy. I see 4ms between > the return from dd and the attempted parted commit. That looks very possible. Looking at 'man udev', if a udev devices is created with OPTIONS+="watch" watch Watch the device node with inotify; when the node is closed after being opened for writing, a change uevent is synthesized. And, it looks like the 13-dm-disk.rules should set this for multipath devices. So dd clearly opens the file for writing and then closes it. This seems like it would trigger the watch rule, causing a change event and kicking off blkid. So the answer may be yet another udevsettle in blivet in devices.PartitionDevice._wipe() the following patch to python-blivet diff --git a/blivet/devices.py b/blivet/devices.py index 1587b91..4f502b5 100644 --- a/blivet/devices.py +++ b/blivet/devices.py @@ -1480,6 +1480,11 @@ class PartitionDevice(StorageDevice): util.run_program(cmd) except OSError as e: log.error(str(e)) + else: + # If a udev devices is created with the watch option, then + # a change uevent is synthesized and we need to wait for + # things to settle. + udev_settle() def _create(self): """ Create the device. """ successfully installed and rebooted into the installed system! Discussed at 2013-06-10 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-06-10/f19final-blocker-review-4.2013-06-10-16.01.log.txt . Accepted as a blocker per criterion "The installer must be able to complete an installation using any network-attached storage devices (e.g. iSCSI, FCoE, Fibre Channel)" (the fix indicates this is not PPC-specific). FWIW, I think Lennart considers udev_settle to be basically the spawn of the devil and that anything that needs it ought to be Fixed Properly, but you can probably discuss that with him post-f19 release when we have more time... https://lists.fedorahosted.org/pipermail/anaconda-patches/2013-June/004510.html "Looks okay to me. I'll push this tomorrow." pykickstart-1.99.32-1.fc19,python-blivet-0.16-1.fc19,anaconda-19.30.6-1.fc19 has been submitted as an update for Fedora 19. https://admin.fedoraproject.org/updates/pykickstart-1.99.32-1.fc19,python-blivet-0.16-1.fc19,anaconda-19.30.6-1.fc19 Package anaconda-19.30.7-1.fc19, pykickstart-1.99.32-1.fc19, python-blivet-0.16-1.fc19: * should fix your issue, * was pushed to the Fedora 19 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing anaconda-19.30.7-1.fc19 pykickstart-1.99.32-1.fc19 python-blivet-0.16-1.fc19' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-10913/pykickstart-1.99.32-1.fc19,python-blivet-0.16-1.fc19,anaconda-19.30.7-1.fc19 then log in and leave karma (feedback). pykickstart-1.99.32-1.fc19, python-blivet-0.16-1.fc19, anaconda-19.30.8-1.fc19 has been pushed to the Fedora 19 stable repository. If problems still persist, please make note of it in this bug report. |