Bug 1879396
| Summary: | red hat virtio scsi disk device 6.3.9600.18758 |
|---|---|
| Product: | [Community] Virtualization Tools |
| Component: | virtio-win |
| Status: | CLOSED CURRENTRELEASE |
| Severity: | urgent |
| Priority: | unspecified |
| Version: | unspecified |
| Hardware: | x86_64 |
| OS: | Windows |
| Reporter: | Evgen Puzanov <e.puzanov> |
| Assignee: | Vadim Rozenfeld <vrozenfe> |
| QA Contact: | menli <menli> |
| CC: | ghammer, haoliu, jinzhao, juzhang, lijin, mdean, virt-maint, vladimir, vrozenfe, yvugenfi |
| Target Milestone: | --- |
| Target Release: | --- |
| Doc Type: | If docs needed, set a value |
| Story Points: | --- |
| Last Closed: | 2020-12-09 10:48:58 UTC |
| Type: | Bug |
| Regression: | --- |
| Mount Type: | --- |
| Documentation: | --- |
| Category: | --- |
| oVirt Team: | --- |
| Cloudforms Team: | --- |
Description
Evgen Puzanov, 2020-09-16 07:59:23 UTC

Vadim Rozenfeld:
Hi Evgen,

Thank you for reporting the problem. Does it happen on some specific Windows version, or on all of them? Can you upload a couple of system log files from different VMs for further investigation? It would also be quite useful to see the qemu command line and to know the qemu version. There were not too many critical changes in the vioscsi code between build 160 and the most recent officially released build 184, but updating drivers to the latest version is always a good idea.

Best,
Vadim

Evgen Puzanov (comment #2):
Vadim, at the moment we see that the current driver version is 0.1.185, not 0.1.184. Should we upgrade to 0.1.185, or is it 0.1.184?

Evgen Puzanov (comment #3):
Vadim, the logs of the operating system itself contain only entries like this: "The driver detected a controller error on \Device\Harddisk0\DR." Unfortunately, we learned about the problem too late, and the system log entries had already been overwritten by that time. As for the libvirt logs, please tell me how to send you the file?

Vadim Rozenfeld (in reply to Evgen Puzanov from comment #2):
My bad. 185 is what you need.

Vadim Rozenfeld (in reply to Evgen Puzanov from comment #3):
You can add it to this bug as an attachment.

Thanks,
Vadim

Evgen Puzanov:
Created attachment 1715069 [details]
log file
Evgen Puzanov (in reply to Vadim Rozenfeld from comment #5):
Added, please check.

Vadim Rozenfeld:
Any particular reason for using the rhel6.6.0 machine type? This one is extremely old and missing a lot of new stuff. I also see that you are using a virtio-blk-pci device; in that case, yes, please update the viostor driver.

Evgen Puzanov:
Hello. virtio-blk-pci is what we use to emulate a CD-ROM, but the hard disk is emulated as virtio-serial-pci, and the problem we had was with virtio-serial-pci.

Vadim Rozenfeld:
Sorry, that cannot be true. virtio-serial-pci is a virtio device designed to establish bi-directional communication channels between host and guest. It is not a storage device and cannot be placed into the Windows storage stack. Consider the following, taken from the log file you provided as the attachment to this bug (https://bugzilla.redhat.com/attachment.cgi?id=1715069):

1. This is a qcow2 image attached to a virtio-blk-pci device; I guess it is the system disk:

    -drive file=/var/lib/libvirt/images/cb668a18-ec37-44d6-a8ef-ebef647afa3f,if=none,id=drive-virtio-disk0,format=qcow2,serial=cb668a18ec3744d6a8ef,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2

2. This is an iso image attached to the emulated IDE controller:

    -drive file=/mnt/743a2a5c-d1a6-3c0b-8805-d98548a166ad/453-2-f3a4d3ba-3b91-3480-a2ac-9a39f6d8994c.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1

3. While this one, according to its name, is the qemu guest agent communication channel, created by virtio-serial-pci:

    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0

Cheers,
Vadim

Volodymyr Melnyk (comment #12):
Hello,

I'm sorry, the fault was mine: I misled my colleague Evgen. Our orchestration system (Apache CloudStack) uses virtio-blk-device when it starts a VM. Should we update the driver because there was a bug that could have led to this issue with virtio-blk-device?

Thanks.

Vadim Rozenfeld (in reply to Volodymyr Melnyk from comment #12):
Hi Volodymyr,

Never mind, I just needed to see the virtual machine configuration settings. In any case, it is always a good idea to update the virtio-win drivers to the most recent ones. Apart from that, according to the log file that Evgen shared with us, it looks as if you are still using the rhel6.6.0 machine type. May I ask which qemu version it is? It might be important, since when it comes to reporting the volume size, the driver itself fully depends on the information provided by qemu.

Best regards,
Vadim

Volodymyr Melnyk (comment #14):
Hello,

There are 6 quite old hosts in our cloud; they are running qemu-kvm 0.12.1 and CentOS 6. Of course, it might be a qemu-related issue, but there are 2 aspects that made us consider this issue driver-related:

1. There were at least 3 occurrences during the past couple of years; all the affected guests were running Windows Server (2008R2 and 2012R2), but it never happened to Linux guests.
2. Even if qemu-kvm reported the wrong disk size, the guest operating system could simply have left the extra space unused, but in our cases the guest operating systems "thought" that the partition size was also bigger than it should be.

Taking all of the above into account, what do you think: is it more likely to be caused by the guest's driver than by the host's virtualization software?

Thanks.

Vadim Rozenfeld (in reply to Volodymyr Melnyk from comment #14):
Thanks. Can you give me the exact qemu-kvm package name installed on the CentOS system(s) where the problem happens? Another question, just to confirm: does the problem happen on 2008R2 and 2012R2 systems only? We definitely need to know how to reproduce the problem; maybe you noticed a common pattern of events before the problem happened.

As I said before, the driver itself is quite passive in determining the size of the volume it is attached to. Basically, it reads the volume size on every boot (or driver load, to be more precise) and passes it up on request. Theoretically, if there is some "glitch" and the driver reported the wrong number of blocks, there is a good chance that it will recover on the next load. (Honestly, I have never seen such a problem in my life.) The driver cannot change the volume size (the qcow2 file) by itself, with only one exception, when it issues a "TRIM", but that is a different story.

Next time the problem happens, I would suggest reading the volume size at run time by checking the disk geometry with "info qtree", and later on by checking the qcow2 file size with "qemu-img info".

Best,
Vadim

Last comment:
Reproduced, fixed and verified downstream: https://bugzilla.redhat.com/show_bug.cgi?id=1890810