Bug 470678 - No Drives Found if run inside QEMU
Status: CLOSED WORKSFORME
Product: Fedora
Classification: Fedora
Component: hal
Version: 10
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: Richard Hughes
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-11-08 15:06 EST by Paul Bolle
Modified: 2013-01-09 23:54 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-11-18 07:30:40 EST

Attachments:
screenshot of qemu cow in anaconda (462.77 KB, image/png)
2008-11-10 14:31 EST, Tom "spot" Callaway
log of the (stderr!) output of hald-generate-fdi-cache --verbose (9.89 MB, text/plain)
2008-11-11 04:13 EST, Paul Bolle

Description Paul Bolle 2008-11-08 15:06:34 EST
Description of problem:
Anaconda doesn't seem to like the hard drive QEMU presents it.

Version-Release number of selected component (if applicable):
anaconda-11.4.1.40-1

How reproducible:
Always

Steps to Reproduce:
1. Launch F10-Beta-i686-Live.iso in QEMU with some (newly made) cow file as a hard disk
2. Do a liveinst --text
3. Notice no drive available in the "Partitioning Type" screen.
  
Actual results:
The few things I tried at the "Partitioning Type" screen all led to the "No Drives Found" error screen. See partedUtils.py:

    def checkNoDisks(self):
        """Check that there are valid disk devices."""
        if len(self.disks.keys()) == 0:
            self.anaconda.intf.messageWindow(_("No Drives Found"),
                               _("An error has occurred - no valid devices were "
                                 "found on which to create new file systems. "
                                 "Please check your hardware for the cause "
                                 "of this problem."))
            return True 
        return False

Expected results:
liveinst (i.e. anaconda) allows me to install F10 to the drive QEMU presents it (an empty cow file).

Additional info:
I'm not at all sure what triggers this. I did notice that sfdisk is quite happy with what QEMU presents it as a hard disk. But even after having sfdisk partition the (previously) empty hard disk, anaconda still didn't accept it.
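
The quoted check depends only on whether self.disks is empty, so partitioning the image with sfdisk would not change the outcome if the installer never enumerated the drive in the first place. A minimal sketch of that reasoning follows. DiskSetSketch and probed_devices are hypothetical stand-ins, not anaconda names, and the sketch assumes the mapping is filled from a device probe:

```python
class DiskSetSketch:
    """Simplified, hypothetical stand-in for the object that owns checkNoDisks."""

    def __init__(self, probed_devices):
        # Assumption: the mapping is built from whatever the device probe
        # reported, so an empty probe result gives an empty mapping.
        self.disks = {name: object() for name in probed_devices}

    def check_no_disks(self):
        """Same condition as checkNoDisks above: True means 'No Drives Found'."""
        return len(self.disks) == 0


# An empty probe result triggers the error no matter what is on the disk image.
print(DiskSetSketch([]).check_no_disks())
print(DiskSetSketch(["sda"]).check_no_disks())
```

Under that assumption, the place to look is the device probe rather than the contents of the disk image.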
Comment 1 Jeremy Katz 2008-11-08 16:21:44 EST
Can you run parted against the drive and see what it shows?  Also, what's the output of lshal?
Comment 2 Paul Bolle 2008-11-08 17:43:55 EST
0) Typos mine (need to work on my QEMU skills):
$ su -c "parted -l"
Model: ATA QEMU HARDDISK (scsi)
Disk /dev/sda: 6037MB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system  Flags
 1      512B    51.2MB  51.2MB  primary
 2      51.2MB  6037MB  5986MB  primary               lvm

[...]
$ lshal
Could not initialise connection to hald.
Normally this means the HAL daemon (hald) is not running or not ready.
$ su -c "/etc/init.d/haldaemon status"
hald is stopped

1) This is of course after I ran sfdisk on this disk image on an earlier QEMU run (on the first run the disk image should have been empty: no mbr, no partitions, etc.).

2) I'll have a further look to see if hald can be started by hand and/or to find out why it did not start at boot.
Comment 3 Paul Bolle 2008-11-08 18:37:03 EST
0) Starting hald by hand failed.

1) For what it's worth, hald log messages (after adding --verbose=yes --use-syslog to [...]/haldaemon):
[...] mmap_cache.c:126: Regenerating fdi cache..
[...] mmap_cache.c:104: In regen_cache_cb exit_type=1, return_code=0
[...] mmap_cache.c:153: fdi cache regeneration failed!
[...] mmap_cache.c:156: fdi cache generation done
[...] mmap_cache.c:274: cache mtime is [...]

Might be more suitable for bugzilla.freedesktop.org.
Comment 4 Richard Hughes 2008-11-10 10:24:24 EST
Can you get the output of "sudo /usr/libexec/hald-generate-fdi-cache --verbose" please.
Comment 5 Paul Bolle 2008-11-10 10:43:15 EST
0) Some info I gathered before I noticed comment #4 follows first.

1) "[...] mmap_cache.c:104: In regen_cache_cb exit_type=1, return_code=0" means a timeout occurred (#define  HALD_RUN_TIMEOUT 0x1), _possibly_ because the call of "hald_runner_run_sync (NULL, "hald-generate-fdi-cache", extra_env, 60000, regen_cache_cb, NULL, NULL)" took more than 60 seconds.

2) Played with different setups (current F9 kernel, current rawhide kernel, current rawhide qemu, current svn qemu, another "host" machine): this seems host machine related.

3) An uninteresting Compaq Desktop host runs into this issue. cpuinfo:
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 6
model		: 8
model name	: AMD Athlon(tm) XP 2600+
stepping	: 1
cpu MHz		: 2131.584
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up
bogomips	: 4265.69
clflush size	: 32
power management: ts

4) An IBM ThinkPad X41 host works (current F9 kernel, current F9 qemu). cpuinfo:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 13
model name	: Intel(R) Pentium(R) M processor 1.60GHz
stepping	: 8
cpu MHz		: 1600.000
cache size	: 2048 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx up bts est tm2
bogomips	: 3195.02
clflush size	: 64
power management:

5) Sadly I am not really able to interpret cpuinfo in a meaningful way, but my _guess_ is that it's one of:
- don't run qemu on an underpowered CPU such as 3) above;
- qemu shouldn't perform badly on the AMD cpu of 3) above.
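
The reading in item 1) above (the helper is given a fixed time budget, and a run that exceeds it is reported as a timeout with return_code=0) can be sketched in Python. This is only an illustration of the pattern, not hald's code: run_sync and cache_regenerated are hypothetical names, EXIT_TIMEOUT takes the value quoted in item 1) without checking it against hald's headers, and EXIT_OK = 0 is an assumption.

```python
import subprocess
import sys

EXIT_OK = 0x0        # assumed for this sketch
EXIT_TIMEOUT = 0x1   # value as read in item 1); not verified against hald's headers


def run_sync(argv, timeout_ms):
    """Run argv and return (exit_type, return_code).

    Hypothetical helper loosely modelled on the exit_type/return_code pair
    seen in the hald log; it is not hald's API.
    """
    try:
        proc = subprocess.run(argv, timeout=timeout_ms / 1000.0,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # The child was killed before finishing, so no real exit code exists.
        return EXIT_TIMEOUT, 0
    return EXIT_OK, proc.returncode


def cache_regenerated(exit_type, return_code):
    """Success only if the helper finished within the budget and exited 0."""
    return exit_type == EXIT_OK and return_code == 0


if __name__ == "__main__":
    # A child that sleeps longer than its budget is reported as a timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_sync(slow, 200))
```

With this pattern, a helper that is merely slow is indistinguishable from one that hangs, which matches the "regeneration failed" message in comment #3.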
Comment 6 Jesse Keating 2008-11-10 13:47:33 EST
Are any of the above tests done with anything later than Beta?  We've released numerous snapshots and a Preview release since then, with much much newer code.  Have you tried with one of those?
Comment 7 Paul Bolle 2008-11-10 14:24:27 EST
(In reply to comment #6)
> Are any of the above tests done with anything later than Beta?  We've released
> numerous snapshots and a Preview release since then, with much much newer code.
>  Have you tried with one of those?

I didn't mention that, but yes, I tried F10-PR-i686-Live.iso (which I suppose is current) too.

I managed to (virtually, i.e. under qemu) install with that release on an IBM ThinkPad X41 but still ran into the same problems on the Compaq Desktop.
Comment 8 Tom "spot" Callaway 2008-11-10 14:29:45 EST
On a virt enabled F10 x86_64 host system:

qemu-img create -f cow f10-i686.img 6G
sudo qemu-kvm -m 512M -cdrom /home/spot/F10-PR-i686-Live.iso -hda /home/spot/f10-i686.img -boot d

Sees the cow disk fine when I call liveinst --text. Are you sure you have your qemu command line correct?
Comment 9 Tom "spot" Callaway 2008-11-10 14:31:32 EST
Created attachment 323106 [details]
screenshot of qemu cow in anaconda
Comment 10 Paul Bolle 2008-11-10 14:51:53 EST
(In reply to comment #8)
>  Are you sure you have your qemu command line correct?

As sure as I could be.

Just the last ten command lines on this machine (running rawhide on the Compaq Desktop) were:

$ grep qemu ~/.bash_history | grep -- -hda | tail -10
~/.sandbox/qemu/i386-softmmu/qemu -L ~/.sandbox/qemu/pc-bios/ -net none -hda F10.cow -cdrom /dev/scd1 -boot d -m 448 -curses
~/.sandbox/qemu/i386-softmmu/qemu -L ~/.sandbox/qemu/pc-bios/ -net none -hda F10.cow -cdrom /dev/scd1 -boot d -m 448 
~/.sandbox/qemu/i386-softmmu/qemu -L ~/.sandbox/qemu/pc-bios/ -net none -hda F10.cow -cdrom /dev/scd1 -boot d -m 448 -curses
qemu  -hda F10.cow -cdrom /dev/scd1 -boot d -m 448 -hdb hdb.img 
qemu  -hda F10.cow -cdrom /dev/scd1 -boot d -m 448 -hdb hdb.img 
qemu -net none -hda F10.cow -cdrom Download/F10-PR-i686-Live.iso -boot d -m 448
qemu -net none -hda F10.cow -cdrom Download/F10-Beta-i686-Live.iso -boot d -m 448
qemu -net none -hda F10.cow -cdrom Download/F10-PR-i686-Live.iso -boot d -m 448
~/.sandbox/qemu/i386-softmmu/qemu -L ~/.sandbox/qemu/pc-bios/ -hda F10.cow -cdrom Download/F10-PR-i686-Live.iso -boot d -m 448
/usr/bin/qemu -hda F10.cow -cdrom Download/F10-PR-i686-Live.iso -boot d -m 448

$ qemu-img info F10.cow 
image: F10.cow
file format: qcow2
virtual size: 5.6G (6037355520 bytes)
disk size: 44K
cluster_size: 4096
Comment 11 Paul Bolle 2008-11-10 14:52:45 EST
(In reply to comment #9)
> Created an attachment (id=323106) [details]
> screenshot of qemu cow in anaconda

Looks just like what I saw on the (more powerful?) IBM ThinkPad X41.
Comment 12 Tom "spot" Callaway 2008-11-10 15:01:11 EST
It looks like you're using a custom build of qemu... have you tried the Fedora provided one? What's the base OS on that old Compaq?
Comment 13 Paul Bolle 2008-11-10 15:36:11 EST
(In reply to comment #12)
> It looks like you're using a custom build of qemu... have you tried the Fedora
> provided one?

I tried both qemu from its svn and current qemu from rawhide (i.e. qemu-0.9.1-10.fc10.i386). Same result.

> What's the base OS on that old Compaq?

$ cat /etc/redhat-release 
Fedora release 9.93 (Rawhide)

Note that I tried F9 (and its latest qemu) too on that old Compaq (also didn't work). However, F9 and its latest qemu worked flawlessly on the IBM ThinkPad X41. All of which makes me believe this is a machine-related issue (i.e. qemu runs too slowly on the old Compaq, leading to a haldaemon/dbus timeout in the live CD).
Comment 14 Paul Bolle 2008-11-10 16:35:01 EST
(In reply to comment #4)
> Can you get the output of "sudo /usr/libexec/hald-generate-fdi-cache --verbose"
> please.

This generated 10+ MB of output (well over 100k lines). Are you sure?
Comment 15 Paul Bolle 2008-11-10 16:48:09 EST
Looking at the timestamps at the start of each line, that fdi-cache (whatever that is) took over 10 minutes to generate (which does support my reading of the code - a timeout after 60 seconds - in comment #5).
Comment 16 Richard Hughes 2008-11-11 03:57:58 EST
I just wanted to see the timestamps. It very much looks like the cache generation takes more than 60 seconds. On my system this takes 4 seconds! I'm not sure that increasing the timeout past 60 seconds is a good idea, I would prefer to find out why this command is taking so long on your system.
Comment 17 Paul Bolle 2008-11-11 04:13:28 EST
Created attachment 323157 [details]
log of the (stderr!) output of hald-generate-fdi-cache --verbose

(In reply to comment #16)
> I just wanted to see the timestamps. It very much looks like the cache
> generation takes more than 60 seconds. On my system this takes 4 seconds! I'm
> not sure that increasing the timeout past 60 seconds is a good idea, I would
> prefer to find out why this command is taking so long on your system.

4 seconds while running in qemu? Anyway here's the log.
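
For scale, the three figures mentioned so far can be put side by side: the 60000 ms budget from comment #5, the 4 seconds from comment #16, and the "over 10 minutes" from comment #15. The 10 minute figure is a lower bound, so both ratios below are lower bounds too. The variable names are made up for this calculation.

```python
TIMEOUT_S = 60           # 60000 ms passed to hald_runner_run_sync (comment #5)
FAST_HOST_S = 4          # Richard Hughes's system (comment #16)
SLOW_GUEST_S = 10 * 60   # "over 10 minutes" under qemu (comment #15), lower bound

slowdown = SLOW_GUEST_S / FAST_HOST_S   # slow guest relative to the fast host
overrun = SLOW_GUEST_S / TIMEOUT_S      # slow guest relative to the timeout

print(f"slowdown vs. fast host: at least {slowdown:.0f}x")   # at least 150x
print(f"timeout exceeded by:    at least {overrun:.0f}x")    # at least 10x
```

So a modest increase of the timeout would not have helped here; the run overshot the budget by at least an order of magnitude.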
Comment 18 Paul Bolle 2008-11-11 05:02:49 EST
0) liveinst --text succeeded after:
  - running /usr/libexec/hald-generate-fdi-cache
  - running /etc/init.d/haldaemon start

lshal then showed what (I guess) it's supposed to show.

1) Not sure what, if anything, needs to be changed to resolve corner cases like this (except that it would be nice if hald-generate-fdi-cache wouldn't take about 100 times as long under qemu).

It might be an idea to have liveinst check at the start whether haldaemon is running or lshal returns something useful, etc. instead of noticing halfway through that the installation can't continue.
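
A rough sketch of such an early check, in Python. hal_is_usable and preflight are hypothetical names, not part of liveinst or anaconda. The sketch assumes that lshal exits non-zero or prints nothing on stdout when hald cannot be reached, which has not been verified here.

```python
import subprocess


def hal_is_usable(run=subprocess.run):
    """Return True if `lshal` exits successfully and prints something."""
    try:
        result = run(["lshal"], stdout=subprocess.PIPE,
                     stderr=subprocess.DEVNULL, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        # lshal is missing, not executable, or hanging: treat HAL as unavailable.
        return False
    return result.returncode == 0 and bool(result.stdout.strip())


def preflight(run=subprocess.run):
    """Hypothetical early installer check; returns an error string or None."""
    if not hal_is_usable(run):
        return ("HAL does not appear to be running, so drives cannot be "
                "detected. Try starting haldaemon before installing.")
    return None


if __name__ == "__main__":
    problem = preflight()
    if problem:
        print(problem)
```

The run parameter exists only so the check can be exercised with a fake runner on machines without HAL.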
Comment 19 Bug Zapper 2008-11-26 00:03:38 EST
This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 20 Bug Zapper 2009-11-18 03:48:07 EST
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 21 Paul Bolle 2009-11-18 07:30:40 EST
0) I don't own the PC I originally reported this for anymore.

1) This can't be reproduced (with qemu-system-x86-0.11.0-11.fc13.i686, i.e. current Rawhide, booting Fedora-12-i686-Live.iso) on the least powerful PC I currently have running Fedora. That's an IBM ThinkPad T41, cpuinfo:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 9
model name	: Intel(R) Pentium(R) M processor 1700MHz
stepping	: 5
cpu MHz		: 1700.000
cache size	: 1024 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr mce cx8 mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 tm pbe up bts est tm2
bogomips	: 3388.60
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:

2) Did anyone ever discover why /usr/libexec/hald-generate-fdi-cache took so long on that (virtual) machine (see comment #16)?

3) Anyway, marking as CLOSED (WORKSFORME). WORKSFORME is used as an alias for DONTKNOWABETTERRESOLUTIONFORTHISREPORT.
