Bug 835023 - win7.64 keep on rebooting at the end of installation
Status: CLOSED WONTFIX
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.9
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Virtualization Maintenance
QA Contact: Virtualization Bugs
Depends On:
Blocks:
 
Reported: 2012-06-25 05:47 EDT by Suqin Huang
Modified: 2013-01-09 20:02 EST

Doc Type: Bug Fix
Last Closed: 2012-08-15 02:19:52 EDT
Type: Bug

Attachments
bsod (14.82 KB, image/png)
2012-06-25 05:47 EDT, Suqin Huang
keep on rebooting at this step (55.00 KB, image/jpeg)
2012-06-25 05:50 EDT, Suqin Huang
application log (12.98 KB, text/plain)
2012-08-04 23:40 EDT, Xiaoqing Wei
chkdsk log (10.98 KB, text/plain)
2012-08-04 23:41 EDT, Xiaoqing Wei
system log (41.34 KB, text/plain)
2012-08-04 23:41 EDT, Xiaoqing Wei
trying 835023 but 804888 happens (390.00 KB, application/octet-stream)
2012-08-07 04:28 EDT, Xiaoqing Wei
comment 30 (25.44 KB, image/png)
2012-08-09 08:52 EDT, Xiaoqing Wei

Description Suqin Huang 2012-06-25 05:47:23 EDT
Created attachment 594145 [details]
bsod

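Description of problem:
A Windows 7 64-bit guest with a virtio disk keeps rebooting at the end of installation; see the attached bsod screenshot.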

Version-Release number of selected component (if applicable):
kvm-83-254.el5
virtio-win-1.0.3-52454.iso

How reproducible:
1/6

Steps to Reproduce:
1. Run the guest with this command line:
/usr/libexec/qemu-kvm -monitor stdio \
  -drive file='/home/Win7-64-virtio.qcow2',if=virtio,media=disk,cache=none,boot=on,snapshot=off,format=qcow2 \
  -net nic,vlan=0,model=virtio,macaddr='9a:a2:3b:f5:41:d5' \
  -net tap,vlan=0,script=/home/scripts/qemu-ifup-switch \
  -m 4096 -smp 2,cores=1,threads=1,sockets=2 -cpu 'qemu64' \
  -drive file='/home/images/win7-64-virtio-error.qcow2',media=cdrom \
  -drive file='/home/windows/winutils.iso',media=cdrom \
  -drive file='/home/windows/virtio-win.iso.el5',media=cdrom \
  -soundhw ac97 \
  -fda '/home/autotest-devel/client/tests/kvm/images/win7-64/answer.vfd' \
  -redir tcp:5000::10023 -vnc :0 -vga std -rtc-td-hack -M rhel5.6.0 -boot c \
  -usbdevice tablet

  
Actual results:


Expected results:


Additional info:
Comment 1 Suqin Huang 2012-06-25 05:50:00 EDT
Created attachment 594146 [details]
keep on rebooting at this step
Comment 3 Vadim Rozenfeld 2012-06-25 06:31:25 EDT
Is it different from
https://bugzilla.redhat.com/show_bug.cgi?id=804888 ?
Comment 4 Xiaoqing Wei 2012-06-25 07:50:52 EDT
(In reply to comment #3)
> Is it different from
> https://bugzilla.redhat.com/show_bug.cgi?id=804888 ?

It looks different: bz804888 is a missing-file error, but I can't see anything like 'file missing' in this one.
Comment 5 Miya Chen 2012-06-28 08:18:55 EDT
shuang, can we try with an IDE drive?
Comment 6 Suqin Huang 2012-07-06 03:47:10 EDT
(In reply to comment #5)
> shuang, can we try with an IDE drive?

Repeated 150 times with IDE; could not reproduce it.
Comment 7 Dor Laor 2012-07-08 05:25:17 EDT
Is it worth testing with the RHEL6 virtio-win blk driver?
Comment 8 Vadim Rozenfeld 2012-07-08 05:45:35 EDT
(In reply to comment #7)
> Is it worth testing with the RHEL6 virtio-win blk driver?

Absolutely. 
It's always good to test with the latest drivers.
Comment 9 Ronen Hod 2012-07-17 06:02:12 EDT
Please check the latest drivers. We want to use them in 5.9 anyhow.
Comment 12 Suqin Huang 2012-07-19 02:49:55 EDT
Tried 25 times with a raw image and virtio-win-1.5.3-1.el6; did not reproduce it.
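
For a sense of how much weight these clean runs carry, here is a rough calculation. It assumes the 1/6 rate from the description applies to every install independently, which nobody in this bug has established; the function name is only illustrative.

```python
def prob_no_failure(per_run_failure_rate, runs):
    """Chance of seeing zero failures in `runs` independent attempts."""
    return (1.0 - per_run_failure_rate) ** runs


# Assumption: the "1/6" from the description, treated as a per-install
# probability that is the same and independent for every run.
RATE = 1 / 6

# Run counts reported in comment 12 (raw image, newer driver) and comment 6 (IDE).
for label, runs in (("raw + virtio-win-1.5.3-1.el6", 25), ("IDE", 150)):
    p = prob_no_failure(RATE, runs)
    print(f"{label}: {runs} clean runs by chance with probability {p:.3g}")
```

Under that assumption, 25 clean runs would happen by chance roughly 1% of the time, and 150 clean runs essentially never. Note that the 25-run test changed two things at once (raw image and newer driver), so it cannot say which one mattered.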
Comment 13 Xiaoqing Wei 2012-08-04 23:40:41 EDT
Created attachment 602324 [details]
application log
Comment 14 Xiaoqing Wei 2012-08-04 23:41:18 EDT
Created attachment 602325 [details]
chkdsk log
Comment 15 Xiaoqing Wei 2012-08-04 23:41:49 EDT
Created attachment 602326 [details]
system log
Comment 16 Gleb Natapov 2012-08-05 05:02:40 EDT
1. Add the werror=stop parameter.
2. If you exit qemu after it enters the reboot loop and start it again with the same qcow image, can it boot, or does it enter the reboot loop again?
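
To make item 1 concrete, here is a small sketch of where the parameter would go. It assumes werror=stop is meant as a suboption of the system disk's -drive option from the description; the helper name is hypothetical, and the comma split is naive (it would break on file names that contain commas).

```python
def add_drive_option(drive_opts, key, value):
    """Return drive_opts with key=value appended, unless key is already set."""
    existing_keys = [item.split("=", 1)[0] for item in drive_opts.split(",")]
    if key in existing_keys:
        return drive_opts
    return f"{drive_opts},{key}={value}"


# System disk options copied from the command line in the description.
SYSTEM_DRIVE = (
    "file='/home/Win7-64-virtio.qcow2',if=virtio,media=disk,"
    "cache=none,boot=on,snapshot=off,format=qcow2"
)

# Assumption: werror=stop is applied to the virtio system disk only.
print("-drive", add_drive_option(SYSTEM_DRIVE, "werror", "stop"))
```

The rest of the command line stays as in the description.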
Comment 17 Suqin Huang 2012-08-06 03:40:11 EDT
With werror=stop added, it enters the reboot loop again.
Comment 18 Suqin Huang 2012-08-06 03:54:32 EDT
(In reply to comment #17)
> With werror=stop added, it enters the reboot loop again.

I made a mistake; I will reproduce it with werror=stop.
Comment 22 Xiaoqing Wei 2012-08-07 02:15:37 EDT
With werror=stop, the guest now hangs on the 'Windows Error Recovery' screen.

I checked 'info blockstats' in the monitor many times; the output stays constant:

(qemu) info blockstats 
virtio0: rd_bytes=1334828032 wr_bytes=8491436544 rd_operations=43199 wr_operations=198296
ide0-cd0: rd_bytes=3409486336 wr_bytes=0 rd_operations=1664786 wr_operations=0
ide0-cd1: rd_bytes=195072 wr_bytes=0 rd_operations=70 wr_operations=0
ide1-cd0: rd_bytes=66048 wr_bytes=0 rd_operations=32 wr_operations=0
floppy0: rd_bytes=252416 wr_bytes=7680 rd_operations=493 wr_operations=15
sd0: rd_bytes=0 wr_bytes=0 rd_operations=0 wr_operations=0


and here's the top info:

# top -Hp 19197 -n 1
Tasks:   7 total,   1 running,   6 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,  2.5%sy,  0.0%ni, 94.2%id,  2.9%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  12159136k total, 12042996k used,   116140k free,   153608k buffers
Swap: 16777208k total,      232k used, 16776976k free,  7407704k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                           
19207 root      25   0 4364m 4.0g 2536 R 100.0 34.8 216:52.39 qemu-kvm                                                         
19197 root      15   0 4364m 4.0g 2536 S  0.0 34.8   0:29.28 qemu-kvm                                                          
19206 root      15   0 4364m 4.0g 2536 S  0.0 34.8   0:01.94 qemu-kvm                                                          
19208 root      15   0 4364m 4.0g 2536 S  0.0 34.8   2:24.53 qemu-kvm                                                          
19209 root      15   0 4364m 4.0g 2536 S  0.0 34.8   3:13.35 qemu-kvm                                                          
19210 root      15   0 4364m 4.0g 2536 S  0.0 34.8   3:11.66 qemu-kvm                                                          
19211 root      15   0 4364m 4.0g 2536 S  0.0 34.8   0:00.00 qemu-kvm
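
The "constant output" check above can be automated by parsing two samples of the monitor output and comparing them. This is only a sketch: the function names are made up, and the sample is an excerpt of the output pasted in this comment.

```python
import re


def parse_blockstats(text):
    """Parse 'info blockstats' output into {device: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        device, sep, rest = line.strip().partition(":")
        if not sep:
            continue  # skips the "(qemu) info blockstats" prompt line
        counters = {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", rest)}
        if counters:
            stats[device.strip()] = counters
    return stats


def changed_devices(before, after):
    """Devices whose counters differ between two parsed samples."""
    return sorted(dev for dev in after if after[dev] != before.get(dev))


SAMPLE = """\
(qemu) info blockstats
virtio0: rd_bytes=1334828032 wr_bytes=8491436544 rd_operations=43199 wr_operations=198296
ide0-cd0: rd_bytes=3409486336 wr_bytes=0 rd_operations=1664786 wr_operations=0
floppy0: rd_bytes=252416 wr_bytes=7680 rd_operations=493 wr_operations=15
"""

first = parse_blockstats(SAMPLE)
second = parse_blockstats(SAMPLE)  # a second, identical sample
print(changed_devices(first, second))  # empty list when nothing moved
```

An empty result means no block counters advanced between the two samples, which is what was observed here while one qemu-kvm thread sat at 100% CPU.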
Comment 23 Xiaoqing Wei 2012-08-07 02:16:58 EDT
(In reply to comment #22)

# kvm_stat -1
efer_reload             10414764         0
exits                 3031402419    225296
fpu_reload               9913755        51
halt_exits               1583689         0
halt_wakeup              1219003         0
host_state_reload       41045646        51
hypercalls                     0         0
insn_emulation        2571736213    196444
insn_emulation_fail            0         0
invlpg                         0         0
io_exits                 8695918        19
irq_exits               16310318       758
irq_injections           2642275        19
irq_window                580946        11
kvm_request_irq                0         0
largepages                     0         0
mmio_exits                103618         0
mmu_cache_miss             10284         0
mmu_flooded                    0         0
mmu_pde_zapped                 0         0
mmu_pte_updated                0         0
mmu_pte_write            4856685         0
mmu_recycled                   0         0
mmu_shadow_zapped           8471         0
mmu_unsync                     0         0
mmu_unsync_global              0         0
nmi_injections                 7         0
nmi_window                     0         0
pf_fixed                 3625172         0
pf_guest                       0         0
remote_tlb_flush            2772         0
request_nmi                    0         0
signal_exits                 305         0
tlb_flush             1110364617     84190
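
One way to read these numbers is to look at what share of all exits is instruction emulation. The sketch below assumes the first numeric column is the cumulative count and the second is the change over the last sampling interval; the function name is illustrative and only a few rows are copied in.

```python
def parse_kvm_stat(text):
    """Parse 'name total delta' rows into {name: (total, delta)}."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            name, total, delta = parts
            rows[name] = (int(total), int(delta))
    return rows


# Excerpt of the kvm_stat -1 output above.
KVM_STAT = """\
exits                 3031402419    225296
insn_emulation        2571736213    196444
io_exits                 8695918        19
irq_exits               16310318       758
tlb_flush             1110364617     84190
"""

rows = parse_kvm_stat(KVM_STAT)
exits_total, exits_delta = rows["exits"]
emul_total, emul_delta = rows["insn_emulation"]

print(f"insn_emulation share of exits, cumulative: {emul_total / exits_total:.1%}")
print(f"insn_emulation share of exits, last interval: {emul_delta / exits_delta:.1%}")
```

Both shares come out well above 80%, while io_exits are almost idle in the last interval. That may be consistent with the guest spinning in emulated code rather than waiting on disk I/O, but it does not identify the cause.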
Comment 25 Xiaoqing Wei 2012-08-07 04:28:23 EDT
Created attachment 602674 [details]
trying 835023 but 804888 happens
Comment 26 Xiaoqing Wei 2012-08-07 05:01:59 EDT
(In reply to comment #22)
> With werror=stop, the guest now hangs on the 'Windows Error Recovery'
> screen.

Correction:
The guest is not completely dead; it responds to my keyboard events a few minutes later.

I hit the Enter key via VNC, and the guest then reboots a few minutes later (the screen does not change before it reboots).
Comment 27 Ronen Hod 2012-08-07 07:18:15 EDT
IRC message following comment 25
Thanks, Ronen.

<xwei/#kvm> rhod, ping
<xwei/#kvm> rhod,   about Bug 835023 - win7.64 keep on rebooting at the end of installation   :   till now , it didn't reproduced for many rounds of attempts, but bz804888 happens, I submitted some logs to bz835023.
<xwei/#kvm> rhod, and, with the 'werror=stop', the guest hangs after rebooting.
Comment 29 Kevin Wolf 2012-08-09 05:22:24 EDT
It sounds unlikely that this has never been tested before, so is this a regression?
Comment 31 Xiaoqing Wei 2012-08-09 08:52:23 EDT
Created attachment 603253 [details]
comment 30
Comment 32 Ronen Hod 2012-08-12 05:09:37 EDT
On 2012-7-30 Xiaoqing wrote in a mail regarding regression test of this bug and Bug 804888

Hi,
I tried [1] RHEL.5.6 host with virtio-win-1.0.3-52454.iso  and
       [2] RHEL.5.9 host with virtio-win-1.0.0-45801.iso

Both combinations can reproduce the above two bugs, but the reproduction
rate looks higher for [2] than for [1] so far (autotest is set to repeat
200 rounds, and about 150 have run). I will send another email summarizing
the rates when all rounds have finished.


Thanks and Best Regards,
Xiaoqing.
Comment 34 Avi Kivity 2012-08-12 07:24:06 EDT
Please try qcow2 metadata preallocation: add "-o preallocation=metadata" to the qemu-img command line which creates the image.  If it still reproduces, that points the finger at virtio rather than qcow2 (though it does not rule it out completely).
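
For reference, the image creation command suggested above might look like the sketch below. The image path is taken from the description; the 20G size and the helper name are assumptions, since the original image size is not stated in this bug. The script only prints the command and does not run it.

```python
import shlex


def qemu_img_create_argv(path, size, preallocation="metadata"):
    """Build a qemu-img command that creates a qcow2 image with preallocation."""
    return [
        "qemu-img", "create",
        "-f", "qcow2",
        "-o", f"preallocation={preallocation}",
        path, size,
    ]


# Assumption: 20G is a placeholder size, not the size used in the original test.
argv = qemu_img_create_argv("/home/Win7-64-virtio.qcow2", "20G")
print(" ".join(shlex.quote(arg) for arg in argv))
```

The guest would then be installed onto this image with the same qemu-kvm command line as before.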
Comment 37 Ronen Hod 2012-08-15 02:19:28 EDT
Regarding bug 835023, bug 840715, and bug 804888

We are late in RHEL5, and we also do not want to put too many resources into RHEL5 bugs, so I am closing these bugs.
- These bugs are difficult to reproduce; both Vadim and Kevin tried for days and failed.
- These bugs are related to the initial installation, so no real harm is done.
- Reproducing them requires a full installation cycle, which takes at least half an hour, and even longer if we want to try it under high load.
- This is not a regression.
- They were not reported by customers, which probably means that customers rarely install Windows guests on RHEL5.9. They probably install once and use a template (if at all). Since we are talking about a RHEL5 host, this probably will not change.
- A simple workaround is usually to retry. Other workarounds are to use a raw file (instead of QCOW), or to add the virtio-stor driver only after the initial installation.

Once any of these bugs is reported by a customer, we will reconsider. In most cases the customer should be able to retry and forget about it.

With uncomfortable feelings, Ronen.
Comment 38 Ronen Hod 2012-08-15 03:56:15 EDT
An afterthought:
Did we try rebooting an installed system repeatedly (without quitting QEMU)?
Is there a difference between the first boot (during the installation) and the following boots?
Comment 39 Xiaoqing Wei 2012-08-15 04:19:18 EDT
(In reply to comment #38)
> An afterthought:
> Did we try rebooting an installed system repeatedly (without quitting QEMU)?

Yes, we have this test case automated (reboot the guest 25 times) and run it as a regular test for every build; we have not hit bz#804888 or bz#835023 so far.


> Is there a difference between the first boot (during the installation) and
> the following boots?

I don't know exactly, but I think there is at least one difference [though it looks irrelevant to this bug :) ]:
  During installation, the OS has less RAM available, as it needs to keep all of its files in RAM (because it doesn't have a 'swap' partition/file yet).

I just want to say there could be more differences between booting an installed OS and booting one that is still installing.


Regards,
Xiaoqing Wei.
Comment 40 Ronen Hod 2012-08-15 10:48:13 EDT
(In reply to comment #39)
> (In reply to comment #38)
> > An afterthought:
> > Did we try rebooting an installed system repeatedly (without quitting QEMU)?
> 
> Yes, we have this test case automated (reboot the guest 25 times) and run it
> as a regular test for every build; we have not hit bz#804888 or bz#835023 so far.
> 

Thanks, this is good. So we stay with the current decision for now.

