Created attachment 960630 [details]
failed to set install bootloader.png

Description of problem:
An exception occurs when installing RHEV-H: "Failed to set install bootloader"

Version-Release number of selected component (if applicable):
rhev-hypervisor6-6.6-20141119.0
ovirt-node-3.0.1-19.el6.24.noarch

How reproducible:
10%

Steps to Reproduce:
1. Boot from PXE.
2. Install RHEV-H through the TUI.
3. Finish the installation with the correct steps.
4. Watch the "Install RHEV-H" page.

Actual results:
An exception occurs when installing RHEV-H: "Failed to set install bootloader"

Expected results:
The installation succeeds without an exception.

Additional info:
Created attachment 960631 [details] failed log.png
Hi fabiand, I can't provide more logs for you to debug at present. 1. I have tried many times but still can't reproduce it. 2. I am considering testing it again after the patches below are merged. http://gerrit.ovirt.org/#/c/35490/ http://gerrit.ovirt.org/#/c/35491/ I will provide more info or a test environment for you ASAP if I can reproduce it again. Thanks!
Chen, Ryan provided the new rhevh 6.6 3.4.z build in bug 1158044#c40. Please try this build and check whether this issue is gone. Thanks. https://bugzilla.redhat.com/show_bug.cgi?id=1158044#c40
(In reply to Ying Cui from comment #4) > Chen, Ryan provided the new rhevh 6.6 3.4.z build in bug 1158044#c40, please > give a try on this build whether this issue is gone or not. Thanks. > > https://bugzilla.redhat.com/show_bug.cgi?id=1158044#c40 Hi ycui, It seems the 2 patches above are for 3.4, but this build is actually for 3.5, not 3.4, so I guess I don't need to run any tests with this build. Test version: Red Hat Enterprise Virtualization Hypervisor release 6.6 (20141125.0.el6ev) ovirt-node-3.1.0-0.24.20141104git70ba2b0.el6.noarch Thanks!
Chen, yes, I have updated the bug to ask Ryan to confirm that. Let's wait for Ryan's response, thanks. https://bugzilla.redhat.com/show_bug.cgi?id=1158044#c41
need info for comment 10
Created attachment 963560 [details] bootloader failed
Created attachment 963561 [details] ovirt.log
Created attachment 963562 [details] ovirt-node.log
Two things:
1. multipath is claiming the device - even if just one path is used
2. a udev race appears when partprobe is run

[root@hp-bl465cg5-02 ~]# multipath -ll
Dec 10 11:10:31 | multipath.conf line 3, invalid keyword: find_multipath
3600508b1001036303020202020200005 dm-2 HP,LOGICAL VOLUME
size=68G features='0' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  `- 0:0:0:0 cciss!c0d0 104:0 active ready running
[root@hp-bl465cg5-02 ~]# blkid -L RootNew
/dev/mapper/3600508b1001036303020202020200005p3

Rediscover partitions:

[root@hp-bl465cg5-02 ~]# partprobe
device-mapper: remove ioctl on 3600508b1001036303020202020200005p3 failed: Device or resource busy
…
Warning: parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made.
…
[root@hp-bl465cg5-02 ~]# blkid -L RootNew
/dev/mapper/3600508b1001036303020202020200005p3
[root@hp-bl465cg5-02 ~]# partprobe
device-mapper: remove ioctl on 3600508b1001036303020202020200005p3 failed: Device or resource busy
…
Warning: parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made.
…
[root@hp-bl465cg5-02 ~]# blkid -L RootNew
/dev/cciss/c0d0p3

The device changed, raw device is now used, the p3 symlink did not appear:

[root@hp-bl465cg5-02 ~]# ls /dev/mapper/
3600508b1001036303020202020200005 3600508b1001036303020202020200005p2 HostVG-Config HostVG-Logging control live-rw
3600508b1001036303020202020200005p1 3600508b1001036303020202020200005p4 HostVG-Data HostVG-Swap live-osimg-min

Adding udevadm settle does not help. Easiest is to prevent multipath from claiming it. The other question is why the p3 symlink was not created.
The file relevant for those rules is: /lib/udev/rules.d/10-dm.rules:ENV{DM_UDEV_DISABLE_DM_RULES_FLAG}!="1", ENV{DM_NAME}=="?*", SYMLINK+="mapper/$env{DM_NAME}" Owned by: # rpm -qf /lib/udev/rules.d/10-dm.rules device-mapper-1.02.90-2.el6_6.1.x86_64 Peter, can you tell us why some symlinks do not appear?
(In reply to Fabian Deutsch from comment #28) > Two things: > > 1. multipath is claiming the device - even if just one path is used > 2. a udev race appears when partprobe is run > > > [root@hp-bl465cg5-02 ~]# multipath -ll > Dec 10 11:10:31 | multipath.conf line 3, invalid keyword: find_multipath ...that should be "find_multipaths" (the "s" is missing at the end of the keyword). > Rediscover partitions: > > [root@hp-bl465cg5-02 ~]# partprobe > device-mapper: remove ioctl on 3600508b1001036303020202020200005p3 failed: > Device or resource busy > … ...if you comment out all OPTIONS+="watch" rules in /lib/udev/rules.d (grep for them), does that change the situation in any way? (...I'm not sure now, but I think you need to restart the udev daemon for it to pick up the new rules, so also call systemctl restart systemd-udevd.service after modifying the rules). So let's try this for starters.
(In reply to Peter Rajnoha from comment #30) > (In reply to Fabian Deutsch from comment #28) > > Two things: > > > > 1. multipath is claiming the device - even if just one path is used > > 2. a udev race appears when partprobe is run > > > > > > [root@hp-bl465cg5-02 ~]# multipath -ll > > Dec 10 11:10:31 | multipath.conf line 3, invalid keyword: find_multipath > > ...that should be "find_multipaths" (the "s" is missing at the end of the > keyword). Yep, I fixed that right after pasting the snippet. > > Rediscover partitions: > > > > [root@hp-bl465cg5-02 ~]# partprobe > > device-mapper: remove ioctl on 3600508b1001036303020202020200005p3 failed: > > Device or resource busy > > … > > ...if you comment out all OPTIONS+="watch" rules in /lib/udev/rules.d (grep > for them), does that change the situation in any way? (...I'm not sure now, > but I think you need to restart udev daemon to take the new rules so also > call systemctl restart systemd-udevd.service after modifying the rules). > > So let's try this for starters.
No:
[root@hp-bl465cg5-02 udev]# diff -ur rules.d.orig/ rules.d
diff -ur rules.d.orig/60-persistent-storage.rules rules.d/60-persistent-storage.rules
--- rules.d.orig/60-persistent-storage.rules 2014-07-24 13:48:42.000000000 +0000
+++ rules.d/60-persistent-storage.rules 2014-12-10 15:08:37.000000000 +0000
@@ -86,9 +86,9 @@
 KERNEL=="xvd*", ENV{DEVTYPE}=="partition", IMPORT{program}="/sbin/blkid -o udev -p $tempnode"
 # watch for future changes
-KERNEL!="xvd*|sr*", OPTIONS+="watch"
-KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
-KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
+#KERNEL!="xvd*|sr*", OPTIONS+="watch"
+#KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
+#KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
 # by-label/by-uuid links (filesystem metadata)
 ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
[root@hp-bl465cg5-02 udev]# udevadm control --reload-rules
[root@hp-bl465cg5-02 udev]# partprobe
device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy
Warning: parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made.
device-mapper: create ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy
device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy
Ah, sorry, I've noticed the link to the test machine. So I looked at it a bit... (In reply to Fabian Deutsch from comment #28) > 1. multipath is claiming the device - even if just one path is used I've noticed there's /etc/multipath/wwids file present which contains 3600508b1001036303020202020200005 as valid wwid for the path (which is the wwid of the cciss!c0d0). Once this file is present, we don't need to wait for another path to appear for the mpath to claim the device (because mpath already knows the wwid so it can compare with that). The only question now is when and under what circumstances the wwid file got written first - was it during installation (anaconda?) or was it just copied from somewhere else or just the first multipath -c call in udev rules did not recognize this properly (and it wrote incorrect wwid file)? So we need to find out... > 2. a udev race appears when partprobe is run > > > [root@hp-bl465cg5-02 ~]# multipath -ll > Dec 10 11:10:31 | multipath.conf line 3, invalid keyword: find_multipath > 3600508b1001036303020202020200005 dm-2 HP,LOGICAL VOLUME > size=68G features='0' hwhandler='0' wp=rw > `-+- policy='round-robin 0' prio=1 status=active > `- 0:0:0:0 cciss!c0d0 104:0 active ready running > > > > [root@hp-bl465cg5-02 ~]# blkid -L RootNew > /dev/mapper/3600508b1001036303020202020200005p3 > > Rediscover partitions: > > [root@hp-bl465cg5-02 ~]# partprobe > device-mapper: remove ioctl on 3600508b1001036303020202020200005p3 failed: > Device or resource busy During my test I got (just p4 partition instead of p3 compared to Fabian's log): [root@hp-bl465cg5-02 ~]# partprobe device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy Warning: parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made. 
device-mapper: create ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy
device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy

So 3600508b1001036303020202020200005p4 can't be removed because it's still open - it's used as a PV:

# lsblk /dev/mapper/3600508b1001036303020202020200005p4
3600508b1001036303020202020200005p4 (dm-6) 253:6 0 67.6G 0 part
├─HostVG-Swap (dm-7) 253:7 0 7.9G 0 lvm
├─HostVG-Config (dm-8) 253:8 0 8M 0 lvm /config
├─HostVG-Logging (dm-9) 253:9 0 2G 0 lvm /var/log
└─HostVG-Data (dm-10) 253:10 0 5.8G 0 lvm /data

# pvs
PV VG Fmt Attr PSize PFree
/dev/mapper/3600508b1001036303020202020200005p4 HostVG lvm2 a-- 67.62g 51.98g

Also, parted shows 4 partitions on /dev/cciss/c0d0:

(parted) p
Model: Compaq Smart Array (cpqarray)
Disk /dev/cciss/c0d0: 143305920s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start     End         Size        File system  Name     Flags
 1      2048s     499711s     497664s                  primary  bios_grub
 2      499712s   999423s     499712s     ext2         primary  boot
 3      999424s   1499135s    499712s     ext2         primary
 4      1499136s  143304703s  141805568s               primary  lvm

Which seems to correspond with the device-mapper tables:

# dmsetup table 3600508b1001036303020202020200005p1
3600508b1001036303020202020200005p1: 0 497664 linear 253:2 2048
3600508b1001036303020202020200005p2: 0 499712 linear 253:2 499712
3600508b1001036303020202020200005p3: 0 499712 linear 253:2 999424
3600508b1001036303020202020200005p4: 0 141805568 linear 253:2 1499136

So actually partprobe doesn't need to remove these mappings and recreate them - why does it do so? But the original source of the problem is that the device is claimed by mpath because the wwid is written in /etc/multipath/wwids file - we need to resolve this first.
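For illustration, the correspondence between the parted listing and the device-mapper tables can be checked mechanically. The following is a minimal Python sketch, not part of the installer; the helper names (parse_parted, parse_dmsetup, mappings_match) are hypothetical, and it assumes the whitespace-separated layouts seen in the output quoted in this comment. Each linear target should start at the partition's first sector and span the partition's size.

```python
# Numeric columns copied from the parted listing quoted in this comment:
# partition number, start, end, size (all in sectors).
PARTED_ROWS = """\
1 2048s 499711s 497664s
2 499712s 999423s 499712s
3 999424s 1499135s 499712s
4 1499136s 143304703s 141805568s
"""

# Lines copied from the dmsetup table output quoted in this comment.
DMSETUP_ROWS = """\
3600508b1001036303020202020200005p1: 0 497664 linear 253:2 2048
3600508b1001036303020202020200005p2: 0 499712 linear 253:2 499712
3600508b1001036303020202020200005p3: 0 499712 linear 253:2 999424
3600508b1001036303020202020200005p4: 0 141805568 linear 253:2 1499136
"""


def parse_parted(text):
    """Map partition number -> (start_sector, size_in_sectors)."""
    parts = {}
    for line in text.splitlines():
        number, start, _end, size = line.split()[:4]
        parts[int(number)] = (int(start.rstrip("s")), int(size.rstrip("s")))
    return parts


def parse_dmsetup(text):
    """Map partition number -> (offset_on_parent, length) for linear targets."""
    parts = {}
    for line in text.splitlines():
        name, _logical_start, length, target, _parent, offset = line.split()
        if target != "linear":
            continue
        # The partition number is whatever follows the last "p" in the name.
        number = int(name.rstrip(":").rsplit("p", 1)[1])
        parts[number] = (int(offset), int(length))
    return parts


def mappings_match(parted_text, dmsetup_text):
    """True if every partition has a linear mapping with the same start and size."""
    return parse_parted(parted_text) == parse_dmsetup(dmsetup_text)


print(mappings_match(PARTED_ROWS, DMSETUP_ROWS))
```

With the values quoted above, the two views agree for all four partitions, which is consistent with the observation that the existing mappings are already correct.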
So there are two things we need to answer: - how did the wwid get into /etc/multipath/wwids even though that wwid is not a path but just an ordinary device? - why does partprobe try to recreate the partition mappings if they're already correct?
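For the first question, it may help to check quickly whether a given wwid is recorded in the wwids file on a host. Below is a minimal Python sketch under stated assumptions: the function name wwid_is_listed is hypothetical, and the file layout (comment lines starting with "#", one wwid per line wrapped in slashes) is an assumption about the wwids file, not something confirmed in this bug. The sample text is illustrative and was not copied from the test machine.

```python
def wwid_is_listed(wwids_text, wwid):
    """Return True if `wwid` appears as an entry in the given wwids-file text.

    Assumed layout: '#' starts a comment line, and each entry is a wwid
    wrapped in slashes, for example /3600508b.../ (an assumption).
    """
    for line in wwids_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.strip("/") == wwid:
            return True
    return False


# Illustrative content only; on a real host this would be read from
# /etc/multipath/wwids.
SAMPLE_WWIDS = """\
# Multipath wwids, Version : 1.0
/3600508b1001036303020202020200005/
"""

print(wwid_is_listed(SAMPLE_WWIDS, "3600508b1001036303020202020200005"))
```

If the wwid is listed, multipath would not wait for a second path before claiming the device, as described in the previous comment.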
(In reply to Peter Rajnoha from comment #33) > So two things we need to answer: > > - how did the wwid got into /etc/multipath/wwids even if that wwid is not > a path but just an ordinary device? My current idea is to use find_multipaths yes as a default, like we'll be doing on RHEV-H 7.0. I believe that a multipath call wrote the wwids file, because we do not write that file ourselves. Anaconda is not used by us. > - why partprobe tries to recreate the partition mappings if they're > already correct? Yes. That is the question I'd like to discuss. AFAIU partprobe is not recreating the partitions, but rather sending udev events, udev is then responsible for updating the symlinks. And there are two questions (at least!): 1. Should partprobe send udev events if nothing changed? 2. Should udev recreate the symlinks if nothing changed? 3. What kind of events does partprobe send? And what would the correct behavior be?
(In reply to Fabian Deutsch from comment #34) > Yes. That is the question I'd like to discuss. AFAIU partprobe is not > recreating the partitions, but rather sending udev events, udev is then > responsible for updating the symlinks. > > And there are two questions (at least!): > 1. Should partprobe send udev events if nothing changed? > 2. Should udev recreate the symlinks if nothing changed? > > 3. What kind of events does partprobe send? And what would the correct > behavior be? Partprobe itself does not send the events; it's the action performed on the dm device that triggers the event generation in the kernel: - device creation - device removal - device rename - device suspend+resume - or artificial events based on the watch rule (but we ruled them out in comment #31)
Adding Brian to CC for the partprobe part: - we're seeing partprobe trying to recreate the partition mappings even if those mappings exist and they are correct already (see comment #32).
I think we can rule out interaction with udev since if I kill udev daemon, I still get the same errors: [root@hp-bl465cg5-02 ~]# killall udevd [root@hp-bl465cg5-02 ~]# ps aux | grep udevd root 28996 0.0 0.0 7908 812 pts/2 S+ 13:01 0:00 grep udevd [root@hp-bl465cg5-02 ~]# partprobe device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy Warning: parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made. device-mapper: create ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy I'd say it's OK that the remove on 3600508b1001036303020202020200005p4 fails since it's open (and used as PV), but partprobe probably shouldn't try to create this device again (create ioctl on 3600508b1001036303020202020200005p4 failed) when it already exists (and it failed to remove before).
This looks like bug 1136966 where dm-multipath udev rule is always running kpartx.
(In reply to bcl from comment #38) > This looks like bug 1136966 where dm-multipath udev rule is always running > kpartx. I verified several things: According to the 6.6 z-stream of the 6.7 bug 1136966: bug 1162265, the build device-mapper-multipath-0.4.9-80.el6_6.1.x86_64 fixes this issue. The build on the host is: [root@hp-bl465cg5-02 ~]# rpm -q device-mapper-multipath device-mapper-multipath-0.4.9-80.el6_6.1.x86_64 means the patch is in. To be on the safe side, I also manually inspected the 40-multipath.rules file, and can confirm that the rule is in. Then the question remains: What is the problem here?
Is partprobe really part of the problem? Even if it fails, as long as the device nodes are there it shouldn't matter. It looks like the real issue is that bootloader failure. It may be that whatever is causing that problem is also causing the partprobe issue. partprobe not being able to notify the kernel should not be causing the bootloader problem.
The message comes from our custom installer, which calls partprobe behind the scenes. So here is what is happening in the background: 1. partition (several calls to partprobe) 2. write image to disk 3. install bootloader Looking at the logs, we see that it actually fails in step 1.
I have no idea what's going on then. Those ioctl errors look exactly like bug 1136966, parted hasn't changed much in 6.6 -- It slowed down rereading the part table (bug 1074069) and it fixed an assumption it was making about major:minor being sequential (bug 1018075). Maybe some other udev rule is interfering?
(In reply to bcl from comment #42) > I have no idea what's going on then. Those ioctl errors look exactly like > bug 1136966, parted hasn't changed much in 6.6 -- It slowed down rereading > the part table (bug 1074069) and it fixed an assumption it was making about > major:minor being sequential (bug 1018075). > > Maybe some other udev rule is interfering? I've switched off udev completely and it's still reproducible.
(In reply to Peter Rajnoha from comment #43) > I've switched off udev completely and it's still reproducible. (I mean not the problem in bug #1136966 - that was a race with kpartx run from udev. But with udev switched off, we can rule out udev interference...)
Brian, running strace kpartx shows:
…
ioctl(4, DM_DEV_REMOVE, 0x12823c0) = -1 EBUSY (Device or resource busy)
write(2, "device-mapper: remove ioctl on 3"..., 98device-mapper: remove ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy) = 98
…
write(2, "Warning: ", 9Warning: ) = 9
write(2, "parted was unable to re-read the"..., 198parted was unable to re-read the partition table on /dev/mapper/3600508b1001036303020202020200005 (Device or resource busy). This means Linux won't know anything about the modifications you made.
) = 198
I interpret this as partprobe being the component that interacts with DM, not udev. Do you have an idea why partprobe above tries to remove that partition?
Also, further down it tries to create a partition:
…
semctl(29753355, 0, GETVAL, 0xffffffffffffffff) = 2
ioctl(4, DM_DEV_CREATE, 0x127e3b0) = -1 EBUSY (Device or resource busy)
write(2, "device-mapper: create ioctl on 3"..., 98device-mapper: create ioctl on 3600508b1001036303020202020200005p4 failed: Device or resource busy) = 98
write(2, "\n", 1
…
But why? The partitioning did not change.
With the assumption that partprobe is calling dm directly, I dug a bit more. It turns out that libparted/arch/linux.c is the place where the DM calls come from. parted-2.1 with some patches is used in RHEL 6.6. Looking at the changes made to that file after 2.1 revealed:

commit c605b2cea04a6c7478f5ed1254c74e02d943fb58
Author: Hans de Goede <hdegoede>
Date: Fri Apr 23 13:08:43 2010 +0200

    linux: detect dm_task_run failure

    We were checking for a return value of < 0 for dm_task_run errors, but dm_task_run returns 0 on error (and 1 on success). Thanks to Joe Jin for spotting this, see Red Hat bug 582907.

    * libparted/arch/linux.c(_dm_remove_map_name, _dm_is_part, _dm_remove_parts, _dm_add_partition): dm_task_run returns 0 on error.

I am not familiar with the parted code, but the patch matches the observations.
We observe: Partitions are created even if it is not necessary.
The patch fixes: Incorrect return code handling (basically negating return codes, IIUC)

The following patches might also be relevant here, because they looked like follow-up patches, but I am not sure if they affect our codepath:

commit 76f8e829e43773778b69915b5cfff9f643701074
Author: Jim Meyering <meyering>
Date: Mon Apr 12 12:08:16 2010 +0200

    libparted: linux_disk_commit: don't ignore _disk_sync_part_table failure

    * libparted/arch/linux.c (linux_disk_commit): When calling _disk_sync_part_table, always return its result.

commit 81ed7fc413375a8b8ed5bd792e7385dacaf8a3e1
Author: Jim Meyering <meyering>
Date: Mon Apr 12 12:06:30 2010 +0200

    libparted: _disk_sync_part_table: always return 0 upon failure

    * libparted/arch/linux.c (_disk_sync_part_table): Return 0 (not 1) upon failure.

Brian, what do you think?
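To make the return-code issue in the first commit message concrete, here is a toy Python model of it. This is not parted or libdevmapper code: fake_dm_task_run is a hypothetical stand-in that only mimics the convention stated in the commit message (0 on error, 1 on success), and the two check functions are illustrative names. Whether this code path matters for the parted build in RHEL 6.6 is not established by this sketch.

```python
def fake_dm_task_run(succeeds):
    """Hypothetical stand-in for dm_task_run: 1 on success, 0 on error."""
    return 1 if succeeds else 0


def failed_per_old_check(rc):
    """The check described as wrong in the commit message: error if rc < 0."""
    return rc < 0


def failed_per_fixed_check(rc):
    """The check the commit message describes: error if rc == 0."""
    return rc == 0


rc = fake_dm_task_run(succeeds=False)
# With a 0/1 convention the "< 0" test can never fire, so a failed
# ioctl would be treated as a success by the caller.
print(failed_per_old_check(rc), failed_per_fixed_check(rc))
```

The point of the sketch is only that a "< 0" test cannot detect a failure reported as 0, so any caller using it would carry on as if the remove or create had worked.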
Ok, I think I see what's going on now. Sorry for not realizing it sooner. I think this boils down to the fact that parted 2.1 and partprobe cannot operate on busy devices. When you call partprobe without specifying a device it probes all the devices on the system and attempts to tell the kernel about the partitioning. For non-dm devices this is done using BLKRRPART calls. For dm it removes and recreates the mappings. As for the patches in comment 47, they aren't related, and are from newer code than we have in 2.1. The return values in 2.1 are working correctly. I think the real problem here is your use of partprobe. Have you changed how you call it recently? Or added steps? In reproducer script you made for bug 1173698 you are calling partprobe later than you should, and not specifying the device that was partitioned. Is this how it is being called in RHEV and how is this different than in previous releases?
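As an illustration of the suggestion above (pass the partitioned device instead of calling partprobe bare), here is a minimal Python sketch. It is not the installer's code: the helper names partprobe_argv and reread_partition_table are hypothetical, and the only thing assumed about partprobe is that it accepts device paths as arguments. The runner is injectable so that the command construction can be exercised without touching any disk.

```python
import subprocess


def partprobe_argv(devices):
    """Build a partprobe command line that always names its target devices."""
    devices = list(devices)
    if not devices:
        # A bare "partprobe" probes every device on the system, including
        # busy ones, which is what this sketch is meant to avoid.
        raise ValueError("refusing to build a bare partprobe call")
    return ["partprobe"] + devices


def reread_partition_table(devices, run=subprocess.call):
    """Run partprobe on the given devices and return its exit status."""
    return run(partprobe_argv(devices))


# Example (not executed here): re-read only the disk that was just partitioned.
# reread_partition_table(["/dev/mapper/3600508b1001036303020202020200005"])
```

Restricting the call to the device that was just partitioned keeps partprobe away from unrelated mappings that are open, such as the PV backing HostVG.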
We have had calls like "partprobe /dev/mapper/*" since 2010 and "partprobe" since 2012, so this is not really a recent change, and everything works fine on 6.5. I know that there weren't many changes to parted since 6.5, but maybe kernel timing changes are uncovering the problems now. That said, we did add a plain partprobe call to a function lately. The reason we added those calls was often that partitions did not appear after we created them. Regarding the patches: yes, they are not part of 2.1, I just wondered if they needed to be backported. We could try ripping out the plain partprobe calls, because we know there are improvements on the multipath and udev fronts. But that still does not explain why we are facing the problems now on 6.6.
*** Bug 1171892 has been marked as a duplicate of this bug. ***
If partitions aren't showing up after creation, that may be because old metadata on the device is getting activated by something else (mdadm, lvm, ?). We've been seeing problems like that with Fedora and Anaconda: if you repeat an exact layout, the metadata gets detected and something else interferes with parted notifying the kernel. A bare partprobe shouldn't ever work if there are partitions mounted, so those calls should be changed to specify the devices. You may also need to do more to wipe out previous metadata on the device.
Hi fabiand, Can we request cciss machine (10.66.73.4) back for our new build(1212) testing? Thanks!
Yes.
Created attachment 969921 [details] 1212-failed.png
Created attachment 969922 [details] 1212.tar.gz
After reviewing all the comments above and confirming with shaochen, here is a summary of how often QE hit this bug on each test machine:
server:
Dell R210 - tested about 5 times, no such issue.
Dell pet105 -01 - tested about 5 times, encountered this bug 1167240 2 times.
HP-bl465cg5-02 - tested about 8 times, encountered this bug 1167240 4 times.
workstation:
hp-xw4550-02 - tested about 6 times, encountered this bug 1167240 2 times.
desktop:
Dell 9010 - tested about 10 times, encountered this bug 1167240 3 times.
dell 790 - tested about 10 times, encountered this bug 1167240 4 times.
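The counts in this summary can be turned into rough failure rates. The Python sketch below only does that arithmetic; the run counts are the approximate ("about N times") figures from this comment, so the resulting percentages are indicative at best, and the function names are hypothetical.

```python
# (machine, approximate number of installs, installs that hit the bug),
# taken from the summary in this comment.
RESULTS = [
    ("Dell R210", 5, 0),
    ("Dell pet105 -01", 5, 2),
    ("HP-bl465cg5-02", 8, 4),
    ("hp-xw4550-02", 6, 2),
    ("Dell 9010", 10, 3),
    ("dell 790", 10, 4),
]


def failure_rate(failed, runs):
    """Fraction of installs that hit the bug."""
    return failed / float(runs)


def summarize(results):
    """Return (total_runs, total_failed, overall_rate) across all machines."""
    total_runs = sum(runs for _name, runs, _failed in results)
    total_failed = sum(failed for _name, _runs, failed in results)
    return total_runs, total_failed, failure_rate(total_failed, total_runs)


for name, runs, failed in RESULTS:
    print("%-16s %2d/%-2d  %3.0f%%" % (name, failed, runs, 100 * failure_rate(failed, runs)))

total_runs, total_failed, overall = summarize(RESULTS)
print("overall          %2d/%-2d  %3.0f%%" % (total_failed, total_runs, 100 * overall))
```

Across these machines that is 15 failures in roughly 44 installs, or about 34%.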
The hp customer encountered the failed bootloader issue in bug 1171892 with rhev 3.5 beta5 builds, so it needs to be highlighted here.
From the QE side, I highlight and summarize this bug here: 1. this bug exists in rhevh 6.6 for the 3.4.z build, see the bug description and comment 17. 2. this bug exists in rhevh 6.6 for the 3.5 build, see bug comment 55. 3. this bug exists in rhevh 7.0 for the 3.5 build, see bug 1171892. Do we need to split this bug into 3.4.z and 3.5 bugs to track them separately?
(In reply to Ying Cui from comment #60) > From QE side, I highlight and summarize this bug here: > > 1. this bug exist in rhevh 6.6 for 3.4.z build, see bug description and > comment 17 > 2. this bug exist in rhevh 6.6 for 3.5 build, see bug comment 55. > 3. this bug exist in rhevh 7.0 for 3.5 build, see bug 1171892. > > Whether we need to separate this bug to 3.4.z and 3.5 to separately track? We can track it here, and then do a z-stream clone to track. The attached patch addresses this issue by reducing the number of partprobe calls, which can lead to this problem.
Test version: rhev-hypervisor6-6.6-20141218.0.el6ev ovirt-node-3.1.0-0.37.20141218gitcf277e1.el6.noarch I noticed that the patch http://gerrit.ovirt.org/#/c/36254/ has been merged, but I still encountered the failed bootloader issue with the above build. Dell pet105 -01 - tested about 2 times, encountered this bug 1167240 1 time. Please see the new attachment "1218.tar.gz" for more details. Because a new build (1218) is coming, I can't keep the test env available here. All log info has been uploaded: /var/log/*.* /tmp/ovirt.log Thanks for understanding.
Created attachment 970989 [details] 1218.tar.gz
Created attachment 970990 [details] 1218-bootloader.png
Change status to ASSIGNED according to #c62
Hi fabiand, I just reproduced this bug on hp-z800-02, so the bug does not only appear on pet105-1. I still can't provide remote access because new build testing is in progress. Thanks!
Chen, please provide logs for every failed installation. We need more data to solve this.
(In reply to Fabian Deutsch from comment #67) > Chen, please provide logs for every failed installation. We need more data > to solve this. Hi fabiand, Actually, I have uploaded all the log info as attachment "1218.tar.gz", please see #c63. Thanks!
(In reply to shaochen from comment #68) > (In reply to Fabian Deutsch from comment #67) > > Chen, please provide logs for every failed installation. We need more data > > to solve this. > > Hi fabiand, > > Actually I have uploaded all log info as attachment"1218.tar.gz", please > #c63. Please add the logs for the failed installation from comment 66. We need all the data, to get a picture of what is going wrong.
(In reply to shaochen from comment #63) > Created attachment 970989 [details] > 1218.tar.gz It looks like a partition labeled Root exists twice; this should not be the case. Let's see what the other failure log says.
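A duplicate label like this can be spotted mechanically in blkid output. The following Python sketch is illustrative only: the function name duplicate_labels is hypothetical, it assumes blkid's default one-line-per-device output of the form device: KEY="value" ..., and the sample text is made up for the example rather than taken from the attached logs.

```python
import re
from collections import defaultdict

# Matches the device path and its LABEL="..." field on one blkid output line.
# The word boundary keeps other keys that merely end in LABEL from matching.
LABEL_RE = re.compile(r'^(?P<dev>[^:]+):.*?\bLABEL="(?P<label>[^"]*)"')


def duplicate_labels(blkid_output):
    """Return {label: [devices]} for every label carried by more than one device."""
    by_label = defaultdict(list)
    for line in blkid_output.splitlines():
        match = LABEL_RE.match(line)
        if match:
            by_label[match.group("label")].append(match.group("dev"))
    return {label: devs for label, devs in by_label.items() if len(devs) > 1}


# Hypothetical sample, not copied from the attached logs.
SAMPLE_BLKID = """\
/dev/sda2: LABEL="Root" UUID="aaaa" TYPE="ext2"
/dev/sda3: LABEL="RootBackup" UUID="bbbb" TYPE="ext2"
/dev/sdb2: LABEL="Root" UUID="cccc" TYPE="ext2"
/dev/sdb4: UUID="dddd" TYPE="LVM2_member"
"""

print(duplicate_labels(SAMPLE_BLKID))
```

When two devices carry the same label, a lookup by label such as blkid -L can return either one, which would make the installer's choice of root partition unpredictable.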
> Please add the logs for the failed installation from comment 66. > > We need all the data, to get a picture of what is going wrong. To be clear, comment 66 was only meant to point out that the bug does not only appear on pet105-1, as we discussed on IRC, so I didn't attach the log info. Today I tested 5 times on hp-z800-02 and didn't hit the issue, so I can't obtain a log, but I will upload the log info as soon as I can. Thanks!
Remove the bug subject prefix "[3.4_6.6]" because this bug occurs on 6.6_3.4.z/6.6_3.5/7.0_3.5, see comment 60 and comment 61.
Test version: ovirt-node-iso-3.5-0.999.201412310932.el6.iso ovirt-node-3.1.999-0.0.master.el6.noarch Test machines: Dell pet105 -02 - tested about 8 times, did not encounter this bug. dell-per210-01 - tested about 5 times, did not encounter this bug. Dell 790 - tested about 2 times, did not encounter this bug. Test result: I can't reproduce this issue; it seems the bug is gone. I will verify this bug after I get the official build from the brew page. If I can reproduce this issue again, I will keep the env available and let you know. Thanks!
Test version: ovirt-node-iso-3.5-0.999.201412310932.el6.iso ovirt-node-3.1.999-0.0.master.el6.noarch Test machines: hp-z800-02 - tested about 5 times, did not encounter this bug. hp-5850 - tested about 5 times, did not encounter this bug.
Test version: rhev-hypervisor6-6.6-20150105.0 ovirt-node-3.1.0-0.39.20150105gitb784105.el6.noarch rhev-hypervisor7-7.0-20150105.0 ovirt-node-3.1.0-0.39.20150105gitb784105.el7.noarch Test machines: Dell pet105 -02 - tested about 5 times, did not encounter this bug. hp-5850 - tested about 5 times, did not encounter this bug. hp-bl465cg5-01 - tested about 5 times, did not encounter this bug. Test result: I can't reproduce this issue; it seems the bug is gone. I will verify this bug after the status changes to ON_QA.
Test version: rhev-hypervisor7-7.0-20150106.0 ovirt-node-3.1.0-0.40.20150105git69f34a6.el7.noarch Test machines: Dell 790 - tested about 10 times, did not encounter this bug. hp-5850 - tested about 5 times, did not encounter this bug. dell-pet105-02 - tested about 5 times, did not encounter this bug. hp-z800-02 - tested about 5 times, did not encounter this bug. dell r510 - tested about 5 times, did not encounter this bug.
(In reply to shaochen from comment #84) > Test version: > rhev-hypervisor7-7.0-20150106.0 > ovirt-node-3.1.0-0.40.20150105git69f34a6.el7.noarch > > Test machines: > Dell 790 - tested about 10 times, didn't met this bug > hp-5850 - tested about 5 times, didn't met this bug > dell-pet105-02 - tested about 5 times, didn't met this bug > hp-z800-02 - tested about 5 times, didn't met this bug > dell r510 - tested about 5 times, didn't met this bug Test version: rhev-hypervisor7-7.0-20150114.0 ovirt-node-3.2.1-4.el7.noarch Test machines: Dell 9010 - Tested 7 times, did not encounter this bug HP-z600-03 - Tested 5 times, did not encounter this bug HP-z800-02 - Tested 6 times, did not encounter this bug Dell-pet105-02 - Tested 5 times, did not encounter this bug HP-5850 - Tested 3 times, did not encounter this bug Test result: According to the above test results and #c84, the bug has been fixed; changing bug status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2015-0160.html