Description of problem:
Occurred during installation to an iSCSI target from F22 Final TC3 Workstation x86_64 live. Preparation was OK except for an AVC - that's #1220948 - but when actual installation started, it crashed before package installation (I'm guessing after creating the target partitions, when trying to mount them).

Version-Release number of selected component:
anaconda-core-22.20.12-1.fc22.x86_64

The following was filed automatically by anaconda:
anaconda 22.20.12-1 exception report

Traceback (most recent call first):
  File "/usr/lib/python2.7/site-packages/blivet/formats/fs.py", line 656, in mount
    raise FSError("mount failed: %s" % rc)
  File "/usr/lib/python2.7/site-packages/blivet/formats/fs.py", line 893, in setup
    return self.mount(**kwargs)
  File "/usr/lib/python2.7/site-packages/blivet/osinstall.py", line 595, in mountFilesystems
    chroot=rootPath)
  File "/usr/lib/python2.7/site-packages/blivet/blivet.py", line 1407, in mountFilesystems
    readOnly=readOnly, skipRoot=skipRoot)
  File "/usr/lib/python2.7/site-packages/blivet/osinstall.py", line 1066, in turnOnFilesystems
    storage.mountFilesystems()
  File "/usr/lib64/python2.7/site-packages/pyanaconda/install.py", line 196, in doInstall
    turnOnFilesystems(storage, mountOnly=flags.flags.dirInstall, callbacks=callbacks_reg)
  File "/usr/lib64/python2.7/threading.py", line 766, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib64/python2.7/site-packages/pyanaconda/threads.py", line 244, in run
    threading.Thread.run(self, *args, **kwargs)
FSError: mount failed: 32

Additional info:
cmdline:        /usr/bin/python2 /sbin/anaconda --liveinst --method=livecd:///dev/mapper/live-base
cmdline_file:   BOOT_IMAGE=vmlinuz0 initrd=initrd0.img root=live:CDLABEL=Fedora-Live-WS-x86_64-22-T3 rootfstype=auto ro rd.live.image quiet rhgb rd.luks=0 rd.md=0 rd.dm=0
executable:     /sbin/anaconda
hashmarkername: anaconda
kernel:         4.0.1-300.fc22.x86_64
other involved packages: python-libs-2.7.9-6.fc22.x86_64, python-blivet-1.0.9-1.fc22.noarch
product:        Fedora
release:        Fedora release 22 (Twenty Two)
type:           anaconda
version:        22
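For anyone reading the traceback: the 32 in "FSError: mount failed: 32" is most likely the exit status of the mount command that blivet ran (an assumption based on the traceback, not checked against the blivet source). mount(8) documents its return code as a bitmask, so a small helper can spell it out. The function name decode_mount_rc is made up for this illustration.

```python
# Return-code bits as documented in the mount(8) man page.
# The exit status is a bitmask, so several bits may be set at once.
MOUNT_RC_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_rc(rc):
    """Return the list of mount(8) conditions encoded in exit status rc.

    Hypothetical helper; assumes rc is the raw exit status of mount.
    """
    if rc == 0:
        return ["success"]
    return [text for bit, text in sorted(MOUNT_RC_BITS.items()) if rc & bit]


if __name__ == "__main__":
    # The value from the anaconda traceback.
    print(decode_mount_rc(32))
```

So 32 only says that the mount itself failed; the actual reason has to come from the kernel messages in the journal.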
Created attachment 1024796 [details] File: anaconda-tb
Created attachment 1024797 [details] File: anaconda.log
Created attachment 1024798 [details] File: environ
Created attachment 1024799 [details] File: journalctl
Created attachment 1024800 [details] File: lsblk_output
Created attachment 1024801 [details] File: nmcli_dev_list
Created attachment 1024802 [details] File: os_info
Created attachment 1024803 [details] File: program.log
Created attachment 1024804 [details] File: storage.log
Created attachment 1024805 [details] File: ifcfg.log
Proposing as a Final blocker, criterion "The installer must be able to detect (if possible) and install to supported network-attached storage devices...Supported network-attached storage types include iSCSI, Fibre Channel and Fibre Channel over Ethernet (FCoE)." - https://fedoraproject.org/wiki/Fedora_22_Final_Release_Criteria#Network_attached_storage . I'll check if this occurs on netinst as well.
Yeah, same error with non-live (TC3 Server netinst).
So, with Server netinst it fails slightly differently in fact. Mounting /dev/mapper/fedora00-root to /mnt/sysimage succeeds - note that's an xfs partition. Mounting /dev/sda1 to /mnt/sysimage/boot fails - that's an ext4 partition. After trying the mount manually, journal shows "JBD2: no valid journal superblock found" and "EXT4-fs (sda1): error loading journal". It's possible my test hardware is screwing this up, I guess...
Note from the journal:

May 12 18:42:48 localhost kernel: sd 4:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 12 18:42:48 localhost kernel: sd 4:0:0:0: [sda] Sense Key : Not Ready [current]
May 12 18:42:48 localhost kernel: sd 4:0:0:0: [sda] Add. Sense: Logical unit communication failure
May 12 18:42:48 localhost kernel: sd 4:0:0:0: [sda] CDB: Write(10) 2a 00 00 04 80 00 00 40 00 00
May 12 18:42:48 localhost kernel: blk_update_request: I/O error, dev sda, sector 294912
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34816, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34817, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34818, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34819, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34820, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34821, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34822, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34823, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34824, lost async page write
May 12 18:42:48 localhost kernel: Buffer I/O error on dev sda1, logical block 34825, lost async page write
May 12 18:42:48 localhost kernel: ------------[ cut here ]------------
May 12 18:42:48 localhost kernel: WARNING: CPU: 0 PID: 2474 at fs/block_dev.c:57 __blkdev_put+0xc1/0x220()
May 12 18:42:48 localhost kernel: Modules linked in: be2iscsi bnx2i cnic uio cxgb3i cxgb3 mdio libcxgbi btrfs fcoe libfcoe libfc scsi_transport_fc xfs libcrc32c iscsi_ibft iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sha256_ssse3 dm_crypt vfat fat dm_round_robin dm_multipath raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 uinput bnep bluetooth rfkill fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
May 12 18:42:48 localhost kernel: iptable_raw snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec ppdev snd_hwdep snd_seq snd_seq_device snd_pcm parport_pc serio_raw parport snd_timer acpi_cpufreq i2c_piix4 snd soundcore nfsd auth_rpcgss nfs_acl lockd grace isofs squashfs 8021q garp stp llc mrp virtio_net virtio_balloon virtio_scsi virtio_console virtio_blk qxl drm_kms_helper ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm sym53c8xx ghash_clmulni_intel scsi_transport_spi ata_generic virtio_pci virtio_ring virtio pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua sunrpc loop
May 12 18:42:48 localhost kernel: CPU: 0 PID: 2474 Comm: systemd-udevd Not tainted 4.0.1-300.fc22.x86_64 #1
May 12 18:42:48 localhost kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
May 12 18:42:48 localhost kernel:  0000000000000000 00000000eaaf914f ffff88000ce5bdb8 ffffffff817819a8
May 12 18:42:48 localhost kernel:  0000000000000000 0000000000000000 ffff88000ce5bdf8 ffffffff8109c60a
May 12 18:42:48 localhost kernel:  0000000000000000 ffff88007a826bb8 ffff88007a826a40 ffff88007a826b30
May 12 18:42:48 localhost kernel: Call Trace:
May 12 18:42:48 localhost kernel:  [<ffffffff817819a8>] dump_stack+0x45/0x57
May 12 18:42:48 localhost kernel:  [<ffffffff8109c60a>] warn_slowpath_common+0x8a/0xc0
May 12 18:42:48 localhost kernel:  [<ffffffff8109c73a>] warn_slowpath_null+0x1a/0x20
May 12 18:42:48 localhost kernel:  [<ffffffff81257c21>] __blkdev_put+0xc1/0x220
May 12 18:42:48 localhost kernel:  [<ffffffff81258210>] blkdev_put+0x50/0x130
May 12 18:42:48 localhost kernel:  [<ffffffff81258315>] blkdev_close+0x25/0x30
May 12 18:42:48 localhost kernel:  [<ffffffff8121edaf>] __fput+0xdf/0x1f0
May 12 18:42:48 localhost kernel:  [<ffffffff8121ef0e>] ____fput+0xe/0x10
May 12 18:42:48 localhost kernel:  [<ffffffff810b9927>] task_work_run+0xa7/0xe0
May 12 18:42:48 localhost kernel:  [<ffffffff81013d0d>] do_notify_resume+0x9d/0xa0
May 12 18:42:48 localhost kernel:  [<ffffffff817882e3>] int_signal+0x12/0x17
May 12 18:42:48 localhost kernel: ---[ end trace dfe9715d9b6f0912 ]---

My netinst test has a similar block.
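As a cross-check on the journal excerpt, the failed Write(10) CDB can be decoded by hand. The sketch below assumes the standard 10-byte WRITE(10) layout (opcode 0x2A, a flags byte, a 4-byte big-endian LBA, a group byte, a 2-byte transfer length, a control byte); decode_write10 is a name invented for this example.

```python
import struct


def decode_write10(cdb_hex):
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    Hypothetical helper for reading kernel "CDB: Write(10) ..." log lines.
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    # Big-endian, no padding: opcode, flags, LBA, group, length, control.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}


if __name__ == "__main__":
    # CDB bytes copied from the journal line for sda.
    print(decode_write10("2a 00 00 04 80 00 00 40 00 00"))
```

The decoded LBA is 294912, which agrees with the "blk_update_request: I/O error, dev sda, sector 294912" line, so the I/O error and the failed SCSI command refer to the same write.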
yeah, those kernel errors seem reproducible every time I try this. Not sure if it's a hardware/NAS firmware issue or a kernel bug. Kernel folks, any thoughts? kernel is 4.0.1-300.fc22.
I don't have anything off the top of my head that would cause this. If it's always the same range of blocks, I'd start to suspect a disk going bad. I've CC'd the block guys just in case. It would be excellent if we could get this tested on some different hardware. Additionally, if you could try an F21 install to the same hardware that might be helpful as well I guess.
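One way to check the "always the same range of blocks" idea is to pull the failing block numbers out of each attempt's journal and compare them between runs. This is only a rough sketch: the regular expression is based on the "Buffer I/O error" lines quoted in this bug, and failing_blocks is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Buffer I/O error on dev sda1, logical block 34816, lost async page write
PATTERN = re.compile(r"Buffer I/O error on dev ([^,\s]+), logical block (\d+)")


def failing_blocks(lines):
    """Summarise failing logical blocks per device as (lowest, highest, count)."""
    blocks = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match:
            blocks[match.group(1)].add(int(match.group(2)))
    return {dev: (min(b), max(b), len(b)) for dev, b in blocks.items()}


if __name__ == "__main__":
    # Two lines taken from the journal excerpt in this bug, plus an unrelated one.
    sample = [
        "kernel: Buffer I/O error on dev sda1, logical block 34816, lost async page write",
        "kernel: Buffer I/O error on dev sda1, logical block 34825, lost async page write",
        "kernel: blk_update_request: I/O error, dev sda, sector 294912",
    ]
    print(failing_blocks(sample))
```

Running this over the kernel messages from several install attempts and comparing the summaries would show whether the errors cluster in one region or move around.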
(In reply to Adam Williamson from comment #14) I've seen the blk_update_request: I/O error, Buffer I/O error stuff before, bug 1204569. There may be a commonality. In virt-manager, if you click on the disk that presents as /dev/sda in the guest, can you change Advanced options>Performance options>Cache mode to unsafe and retest? Does the problem happen?
In bug 1220970 I've reproduced the identical kernel call trace as in comment 14, but that bug follows the sequence in bug 1204569 and is ultimately a dup of it. I think there's a common cause.
This is not a virtual disk. It's an actual iSCSI target on real disks on a real machine. I'm using a NAS box with a RAID-6 array, I created a small iSCSI target within the array for this form of testing. The array is clean.
And there is no btrfs involved.
(In reply to Chris Murphy from comment #17)
> (In reply to Adam Williamson from comment #14)
> I've seen the blk_update_request: I/O error, Buffer I/O error stuff before,
> bug 1204569. There may be a commonality.

That message just means that in both cases there was a write error. It doesn't say anything about the root cause of that write error, so even without looking at the context, it's probably something different if you see the message in a different bug report. And with the context of this specific bug, as Adam already said, it's clear that it must be something entirely different.
Adam, I'd try what Josh suggested: do an f21 install and see if that works. If it does, then I would start debugging at the scsi layer, given the reported 'communication failure' additional sense code.
I just attempted to reproduce this bug with F22 Server Final TC3 as well. I created a LUN on a Synology DS214 NAS (no authentication) and booted a VM and attached to it via iSCSI and completed a successful installation. (I doubt it makes a difference, but the implementation of the LUN was done with a thinly-provisioned LVM partition on physical hardware on the NAS box).
Sorry, forgot to note: I'm -1 blocker on this, since I can't reproduce it and there's a chance it might be a hardware problem on Adam's side.
Good news that sgallagh can't reproduce, but there seems to be a real problem with my case, as it doesn't occur on F21...so there is something screwy going on with F22 kernel, I guess? I'll try and debug a bit.
> FSError: mount failed: 32

Ok, what was the kernel error when that happened? I tried to sift through the tens of thousands of lines in a dozen or so attached log files but not quite sure where to look. ;)

The block-level / ATA type errors do seem to indicate that it's unlikely to be a filesystem problem as a root cause...

-Eric
(In reply to Adam Williamson from comment #25)
> Good news that sgallagh can't reproduce, but there seems to be a real
> problem with my case, as it doesn't occur on F21...so there is something
> screwy going on with F22 kernel, I guess? I'll try and debug a bit.

Adam, can you try booting the f22 kernel on your f21 install?
I've tried quite a few different partitioning structures to reproduce this (With iSCSI / on XFS, on EXT4, with / shared between iSCSI and a local disk, etc.). I can't reproduce this issue, and not from lack of trying.
I see some buffer I/O errors showing up on the console while installing the 4.0.1 kernel on the F21 install; it does boot successfully, though. tflink says an iSCSI install works for him, too, so we can probably chalk this up to flakiness of some sort on my NAS's part, I guess...
I did installs of F22 TC3 using both the server x86_64 DVD and the workstation x86_64 live, using an iscsi target for / in both installs. Both of them went off with no problems and I was not able to reproduce this issue.
Discussed at the 2015-05-14 blocker review meeting.[1] Voted as RejectedBlocker - so far this appears to be some kind of configuration-specific issue and may be caused by the NAS; other iSCSI tests have been successful.

[1] http://meetbot.fedoraproject.org/fedora-blocker-review/2015-05-14
*********** MASS BUG UPDATE ************** We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 22 kernel bugs. Fedora 22 has now been rebased to 4.2.3-200.fc22. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel. If you have moved on to Fedora 23, and are still experiencing this issue, please change the version to Fedora 23. If you experience different issues, please open a new bug report for those.
*********** MASS BUG UPDATE ************** This bug is being closed with INSUFFICIENT_DATA as there has not been a response in over 4 weeks. If you are still experiencing this issue, please reopen and attach the relevant data from the latest kernel you are running and any data that might have been requested previously.