Bug 216243
Summary: | xm create fails after 8 VM's are running | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | George Toft <george> |
Component: | xen | Assignee: | Xen Maintainance List <xen-maint> |
Status: | CLOSED NOTABUG | QA Contact: | |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 5.0 | CC: | ddomingo |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Last Closed: | 2007-02-26 19:04:36 UTC | Type: | --- |
Bug Blocks: | 197865 |
Description
George Toft
2006-11-17 22:43:38 UTC
Please confirm the arch (currently labeled ia64 only). We are using RHEL5 on Dell 2950 server. It may NOT be IA64 specific as the following losetup commands fail in x86 as well.

Further troubleshooting shows losetup (as found in the xen scripts) fails. Once all 8 VM's are up, issuing losetup -f yields:

# losetup -f
losetup: could not find any free loop device
# losetup /dev/loop8
loop: can't open device /dev/loop8: No such device or address
[root@xenhvip xen]# ls -l /dev/loop*
brw-r----- 1 root disk 7, 0 Nov 20 10:38 /dev/loop0
brw-r----- 1 root disk 7, 1 Nov 20 10:38 /dev/loop1
brw-r----- 1 root disk 7, 2 Nov 20 10:38 /dev/loop2
brw-r----- 1 root disk 7, 3 Nov 20 10:38 /dev/loop3
brw-r----- 1 root disk 7, 4 Nov 20 10:38 /dev/loop4
brw-r----- 1 root disk 7, 5 Nov 20 10:38 /dev/loop5
brw-r----- 1 root disk 7, 6 Nov 20 10:38 /dev/loop6
brw-r----- 1 root disk 7, 7 Nov 20 10:38 /dev/loop7
brw-r----- 1 root disk 7, 8 Nov 20 10:38 /dev/loop8
brw-r----- 1 root disk 7, 9 Nov 20 10:38 /dev/loop9
#

The device node is there, but the system does not recognize it.

/etc/udev/makedev.d/50-udev.nodes was modified to include loop8 and loop9 at line 16. /boot/grub/grub.conf was modified to include max_loop=64 and the system was rebooted. Grepping dmesg for "loop" shows the kernel command line and a comment that the kernel is limited to a max of 8 devices:

# dmesg | grep loop
Kernel command line: ro root=/dev/VolGroup00/LogVol00 max_loop=64
loop: loaded (max 8 devices)
#

File /etc/modules.conf was created with the following contents:

options loop max_loop=64

and system was rebooted. Problem still exists.

I have confirmed this behaviour - it appears each virtual disk configured for an HVM guest uses a single loopback device. 1 disk per guest x 8 guests & you'll hit the loopback driver limit.

Now, the interesting question is *why* are these loopback devices getting created at all.
The qemu-dm device model for HVM guests is perfectly happy accessing the raw files directly - it has no need for the loopback device. Looking at a running guest shows no process actually using the loop device:

# grep disk /etc/xen/demo
disk = [ "file:/xen/demo.img,hda,w", "file:/root/boot.iso,hdc:cdrom,r" ]
[root@localhost ~]# ps -axuwf | grep loop
root 18631 0.0 0.0 0 0 ? S< 13:32 0:00 [loop0]
root 18673 0.0 0.0 0 0 ? S< 13:32 0:00 [loop1]
[root@localhost ~]# lsof /dev/loop0
[root@localhost ~]# lsof /dev/loop1
[root@localhost ~]# lsof /root/boot.iso
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
qemu-dm 18551 root 6u REG 253,0 6711296 1933665 /root/boot.iso
[root@localhost ~]# lsof /xen/demo.img
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
qemu-dm 18551 root 5u REG 253,0 4294967297 1277954 /xen/demo.img

Here is output on target machine showing all 8 loopback devices allocated:

# ps -ef | egrep "loop|xen"
root 21 19 0 10:38 ? 00:00:00 [xenwatch]
root 22 19 0 10:38 ? 00:00:00 [xenbus]
avahi 2696 1 0 10:39 ? 00:00:00 avahi-daemon: running [xenhvip.local]
root 2963 1 0 10:39 ? 00:00:00 xenstored --pid-file /var/run/xenstore.pid
root 2967 1 0 10:39 ? 00:00:00 python /usr/sbin/xend start
root 2969 2967 0 10:39 ? 00:00:02 python /usr/sbin/xend start
root 2971 1 0 10:39 ? 00:00:00 xenconsoled
root 3679 2969 0 10:40 ? 00:00:09 /usr/lib64/xen/bin/qemu-dm -d 1 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen9 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:44,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 3861 1 0 10:40 ? 00:00:00 [loop0]
root 3913 2969 0 10:40 ? 00:00:08 /usr/lib64/xen/bin/qemu-dm -d 2 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen8 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:42,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4038 1 0 10:40 ? 00:00:00 [loop1]
root 4059 2969 0 10:40 ? 00:00:10 /usr/lib64/xen/bin/qemu-dm -d 3 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen7 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:38,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4194 1 0 10:40 ? 00:00:00 [loop2]
root 4273 2969 0 10:41 ? 00:00:11 /usr/lib64/xen/bin/qemu-dm -d 4 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen6 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:36,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4399 1 0 10:41 ? 00:00:00 [loop3]
root 4426 2969 0 10:41 ? 00:00:11 /usr/lib64/xen/bin/qemu-dm -d 5 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen5 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:36,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4554 1 0 10:41 ? 00:00:00 [loop4]
root 4593 2969 0 10:41 ? 00:00:11 /usr/lib64/xen/bin/qemu-dm -d 6 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen4 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:1d,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4742 1 0 10:41 ? 00:00:00 [loop5]
root 4768 2969 0 10:41 ? 00:00:09 /usr/lib64/xen/bin/qemu-dm -d 7 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen3 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:32,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 4923 1 0 10:41 ? 00:00:00 [loop6]
root 4948 2969 0 10:41 ? 00:00:09 /usr/lib64/xen/bin/qemu-dm -d 8 -m 500 -boot c -serial pty -vcpus 1 -acpi -domain-name win2k3xen2 -net nic,vlan=1,macaddr=00:16:3e:4b:f8:30,model=rtl8139 -net tap,vlan=1,bridge=xenbr0 -vncunused -k en-us -vnclisten 0.0.0.0
root 5110 1 0 10:41 ? 00:00:00 [loop7]
root 6547 4207 0 12:41 pts/5 00:00:00 egrep loop|xen
#
# lsof | grep loop
loop0 3861 root cwd DIR 253,0 4096 2 /
loop0 3861 root rtd DIR 253,0 4096 2 /
loop0 3861 root txt unknown /proc/3861/exe
loop1 4038 root cwd DIR 253,0 4096 2 /
loop1 4038 root rtd DIR 253,0 4096 2 /
loop1 4038 root txt unknown /proc/4038/exe
loop2 4194 root cwd DIR 253,0 4096 2 /
loop2 4194 root rtd DIR 253,0 4096 2 /
loop2 4194 root txt unknown /proc/4194/exe
loop3 4399 root cwd DIR 253,0 4096 2 /
loop3 4399 root rtd DIR 253,0 4096 2 /
loop3 4399 root txt unknown /proc/4399/exe
loop4 4554 root cwd DIR 253,0 4096 2 /
loop4 4554 root rtd DIR 253,0 4096 2 /
loop4 4554 root txt unknown /proc/4554/exe
loop5 4742 root cwd DIR 253,0 4096 2 /
loop5 4742 root rtd DIR 253,0 4096 2 /
loop5 4742 root txt unknown /proc/4742/exe
loop6 4923 root cwd DIR 253,0 4096 2 /
loop6 4923 root rtd DIR 253,0 4096 2 /
loop6 4923 root txt unknown /proc/4923/exe
loop7 5110 root cwd DIR 253,0 4096 2 /
loop7 5110 root rtd DIR 253,0 4096 2 /
loop7 5110 root txt unknown /proc/5110/exe
#

It has thus far proved to be impractical to stop HVM guests using loop devices,
thus we need to make sure max_loop is working as advertised.
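
The arithmetic behind the limit can be sketched in shell. This is only an illustration, assuming (as observed in this report) that each "file:" disk entry in a guest config consumes one loop device. The function names count_file_disks, loop_limit_from_dmesg and check_loop_budget are made up for this sketch and are not part of xen or util-linux; grep -o is assumed to be available (GNU or BSD grep).

```shell
#!/bin/sh
# Sketch only: estimate loop device demand from Xen guest configs.
# Assumption: one loop device per "file:" disk entry, as seen in this report.

# Count quoted "file:" disk entries in the config files given as arguments,
# skipping lines that are entirely comments.
count_file_disks() {
    cat "$@" | grep -v '^[[:space:]]*#' | grep -o "[\"']file:" | wc -l | tr -d ' '
}

# Read dmesg-style text on stdin and print N from the last
# "loop: loaded (max N devices)" line, if any.
loop_limit_from_dmesg() {
    sed -n 's/^.*loop: loaded (max \([0-9][0-9]*\) devices).*$/\1/p' | tail -n 1
}

# Compare the number of loop devices needed ($1) with the driver limit ($2).
# Returns non-zero when the limit would be exceeded.
check_loop_budget() {
    if [ "$1" -gt "$2" ]; then
        echo "need $1 loop devices but the driver allows $2"
        return 1
    fi
    echo "ok: $1 of $2 loop devices needed"
}

# Example use (config paths are placeholders):
#   needed=$(count_file_disks /etc/xen/guest1 /etc/xen/guest2)
#   limit=$(dmesg | loop_limit_from_dmesg)
#   check_loop_budget "$needed" "$limit"
```

Note that the demo guest shown above has two "file:" entries (disk image and ISO) and two loop threads, so guests with a CD image attached would reach the limit of 8 after fewer than 8 guests.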
With respect to the comment:
> File /etc/modules.conf was created with the following contents:
> options loop max_loop=64
> and system was rebooted. Problem still exists.
I cannot reproduce this behaviour. I added 'max_loop=256' to modprobe.conf,
rebooted & it configured 256 loop devices without problems:
# grep loop /etc/modprobe.conf
options loop max_loop=256
# dmesg | grep loop
loop: loaded (max 256 devices)
# ls /dev/loop* | wc -l
256
Please try just adding the modprobe.conf setting, without changing udev, or grub
configs.
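
To double-check that the option really is in the file being edited, something like the following could be used. It is a rough sketch: configured_max_loop is a hypothetical helper name, and the file path is passed in explicitly so the same check can be pointed at any candidate config file.

```shell
#!/bin/sh
# Sketch only: print the max_loop value set for the loop module in a
# modprobe-style config file ($1). Prints nothing if no matching
# "options loop ... max_loop=N" line exists. If several lines match,
# the last one is printed.
configured_max_loop() {
    sed -n 's/^[[:space:]]*options[[:space:]][[:space:]]*loop[[:space:]].*max_loop=\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1
}

# Example use:
#   configured_max_loop /etc/modprobe.conf
```

An empty result for /etc/modprobe.conf would suggest the option line is missing or was written to a different file.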
Re-reading, it appears you used the wrong config file for specifying the loop device parameters. 'modules.conf' is obsolete & hasn't been used for many years now (for compatibility it was a symlink to modprobe.conf for a while too, but even that's gone now). Please re-test with 'max_loop=256' in modprobe.conf instead.

As requested . . .
50-udev.nodes restored to original setting
grub.conf restored to original setting
modprobe.conf line added
modules.conf deleted. (Yes, I used the wrong config file - sorry)

Last login: Mon Nov 20 14:51:10 2006 from 192.168.111.1
[root@rhel5 ~]# grep loop /etc/modprobe.conf
options loop max_loop=256
[root@rhel5 ~]# dmesg | grep loop
[root@rhel5 ~]# ls /dev/loop* | wc -l
8
[root@rhel5 ~]# modprobe loop
[root@rhel5 ~]# ls /dev/loop* | wc -l
256
[root@rhel5 ~]#

This command now works (it did not previously):

[root@rhel5 ~]# for I in `seq 0 255`; do dd if=/dev/zero of=/tmp/file$I bs=1k count=10; losetup /dev/loop$I /tmp/file$I; done

Using lsof to validate:

[root@rhel5 ~]# lsof | grep loop | wc -l
768
[root@rhel5 ~]#

Rebooted and ran following commands:

[root@rhel5 ~]# lsof | grep loop | wc -l
0
[root@rhel5 ~]# for I in `seq 0 255`; do dd if=/dev/zero of=/tmp/file$I bs=1k count=10; losetup /dev/loop$I /tmp/file$I; done
[root@rhel5 ~]# lsof | grep loop | wc -l
768
[root@rhel5 ~]#

Problem resolved. Thank you very much.

George

Ok, great. So sounds like we just need to add documentation about adding 'max_loop=64' (or larger) to modprobe.conf if you want more than 8 file backed disks for HVM guests.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion.

Considering our initial error in editing /etc/modules.conf, would it make sense to create that file with contents similar to this:

# NOTE *** NOTE *** NOTE *** NOTE
#
# Use of modules.conf is obsolete. THE CONTENTS OF THIS FILE IS IGNORED.
# Please edit modprobe.conf to pass parameters to modules.
# For more information, please view the modprobe.conf manpage.
#
# NOTE *** NOTE *** NOTE *** NOTE
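
On the verification step earlier in this thread: the figure of 768 from lsof is consistent with three lsof lines (cwd, rtd, txt) per loop kernel thread, as in the 8-guest listing, so 256 x 3 = 768. A rough sketch for counting distinct loop threads directly, and for detaching the test devices afterwards, might look like this. The function names count_active_loops and detach_test_loops are hypothetical; the /tmp/file$I paths follow the test loop used above, and the detach step is assumed to run as root.

```shell
#!/bin/sh
# Sketch only: helpers for the 256-device test in this thread.

# Read lsof output on stdin and print the number of distinct loopN
# entries in the first column (one per loop kernel thread).
count_active_loops() {
    awk '$1 ~ /^loop[0-9]+$/ { seen[$1] = 1 }
         END { n = 0; for (k in seen) n++; print n }'
}

# Detach /dev/loop0 .. /dev/loop(N-1) and remove the matching backing
# files created by the test loop. $1 is N. Needs root; not run here.
detach_test_loops() {
    i=0
    while [ "$i" -lt "$1" ]; do
        losetup -d "/dev/loop$i" && rm -f "/tmp/file$i"
        i=$((i + 1))
    done
}

# Example use:
#   lsof | count_active_loops
#   detach_test_loops 256
```

Detaching is only relevant if the test devices should be released without a reboot; in the session above a reboot cleared them.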