Bug 1669266
| Summary: | LVM2 PV scan hangs 50-70% of boots | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Orion Poplawski <orion> |
| Component: | lvm2 | Assignee: | Peter Rajnoha <prajnoha> |
| lvm2 sub component: | Activating existing Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED INSUFFICIENT_DATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | unspecified | CC: | agk, heinzm, jbrassow, msnitzer, nsoffer, orion, prajnoha, teigland, zkabelac |
| Version: | 7.6 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-11-03 12:56:34 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | /etc/lvm/lvm.conf (attachment 1526888) | | |
Description
Orion Poplawski, 2019-01-24 19:29:41 UTC
systemd makes it very difficult to get any lvm debugging information, but Peter knows some tricks to get some info.

The warning:

"WARNING: Device /dev/sda1 not initialized in udev database even after waiting 10000000 microseconds"

...is issued by LVM when it tries to retrieve a record about a device through libudev from the udev database, but the database record is not yet initialized - either it hasn't been processed at all yet, or it is being processed right at that moment and the processing hasn't finished even after waiting some time. (I've already noticed that people are seeing the same on Debian right now: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=918590.)

It looks like this comes from the LVM filter code, where we read the udev db to check whether a certain device is an MD or mpath component (we've probably added a line of code where we look at the udev db for certain info about MD to make our filtering more correct, but I'd have to check the changes in the code to be sure). I can't say right now why the record is not initialized even after waiting 10 seconds (the 10 million microseconds). It's possible the boot is taking a long time (too many devices, or devices with slow access) and, when we try to read the udev db records, the udev processing in the background hasn't finished yet, even after waiting. I'll try to simulate this problem and see if that's the issue here.

Could you please add "debug" to the kernel command line at boot? This adds more debug logging to both systemd and udev. Then you can collect the log with "journalctl -b". If it hangs forever and you can't collect the logs, please try rebooting until the boot is successful, then call "journalctl -b -1" to get the logs for the previous, failed boot. (A command sketch for this follows the attachment note below.)

(In reply to Peter Rajnoha from comment #3)
> The warning:
>
> "WARNING: Device /dev/sda1 not initialized in udev database even after
> waiting 10000000 microseconds"
>
> ...is issued by LVM if it tries to retrieve record about a device through
> libudev from udev database but at the same time, the database record is not
> yet initialized - either it hasn't been processed at all yet or it's being
> processed right at that moment and the processing hasn't finished even after
> waiting some time.

I've tried to simulate this problem, and the only way I could get into this situation is when we don't have any database record about the device yet. This may happen in two situations:

- this is the first time we're processing the uevent for this device after the udevd daemon was started (this happens at boot);
- the udev database has been made unavailable (e.g. by making /run/udev unavailable or by using a different /run) - this might be the case in chroot-ed or container environments where the /run with the existing info from the host hasn't been remounted into the "guest".

(A quick shell check covering both situations is sketched after the lvdisplay output below.)

Anyway, please collect the logs after adding "debug" to the kernel command line; that should help us here (comment #4).

Thanks for all the help and suggestions. I'll give them a try as soon as the machine becomes free again, which may be a few days.

Unfortunately, I've not been able to get to a shell on the hung boots, and journalctl -b -1 has not yielded proper logs either. I've cleaned out the old journal files, though, so perhaps that will help.

I'm afraid the keyboard is completely unresponsive when it locks up, and journalctl -b -1 is worthless - just a few random sshd entries:

Journal file /var/log/journal/a1b20bef010c40939e8e5d84dd88da0f/system~ is truncated, ignoring file.
```
-- Logs begin at Thu 2019-01-24 16:20:03 MST, end at Mon 2019-02-04 09:49:39 MST. --
Jan 24 16:20:03 smmic1.cora.nwra.com sshd[22942]: Received disconnect from 10.10.30.10 port 40356:11: disconnected by user
Jan 24 16:20:03 smmic1.cora.nwra.com sshd[22942]: Disconnected from 10.10.30.10 port 40356
...
Feb 02 22:22:57 smmic1.cora.nwra.com sshd[10225]: Received disconnect from 10.11.1.6 port 34218:11: disconnected by user
Feb 02 22:22:57 smmic1.cora.nwra.com sshd[10225]: Disconnected from 10.11.1.6 port 34218
```

Well, I set up a serial console to try to capture the output, but since enabling it I've been unable to get the machine to hang at boot. I suppose that's an acceptable workaround, since it's only the one machine. Not sure if there is anything else I can provide at this point.

I just heard about the same udev timeout error, but it was happening in a container (which does not properly handle udev). Does this bug have any relation to containers? (Sorry if the answer is obvious from previous comments - I've not used containers, so I'm not sure what it would look like.)

This is a physical machine.

Can you please attach lvm.conf from the problematic machine, and basic 'lvdisplay' output?

Created attachment 1526888 [details]: /etc/lvm/lvm.conf
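Referring back to the request in comment #4: a minimal sketch of collecting the requested logs on RHEL 7. grubby is the stock RHEL 7 tool for editing the kernel command line; the output file name is illustrative, and journalctl -b -1 assumes a persistent journal.

```sh
# Add "debug" to the kernel command line for all installed kernels
# (remove it again with --remove-args="debug" once the logs are collected):
grubby --update-kernel=ALL --args="debug"
reboot

# After the next *successful* boot, dump the log of the previous
# (failed) boot for attachment to this bug:
journalctl -b -1 > /tmp/failed-boot.log
```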
```
# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg_root/scratch
  LV Name                scratch
  VG Name                vg_root
  LV UUID                AftvWn-up8G-VcjP-9562-iX2g-5vQX-REsgXP
  LV Write Access        read/write
  LV Creation host, time smmic1.cora.nwra.com, 2015-05-20 16:01:29 -0600
  LV Status              available
  # open                 1
  LV Size                1.29 TiB
  Current LE             338617
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     4096
  Block device           253:2

  --- Logical volume ---
  LV Path                /dev/vg_root/swap
  LV Name                swap
  VG Name                vg_root
  LV UUID                RcDeRE-VjUZ-vFv1-4Byf-pHOo-31s8-9onVeD
  LV Write Access        read/write
  LV Creation host, time smmic1.cora.nwra.com, 2015-05-20 16:01:36 -0600
  LV Status              available
  # open                 2
  LV Size                <48.83 GiB
  Current LE             12500
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     4096
  Block device           253:0

  --- Logical volume ---
  LV Path                /dev/vg_root/root
  LV Name                root
  VG Name                vg_root
  LV UUID                wj3BPF-xEaR-tP0x-rAZL-IE0A-U6j2-8baW3R
  LV Write Access        read/write
  LV Creation host, time smmic1.cora.nwra.com, 2015-05-20 16:01:37 -0600
  LV Status              available
  # open                 1
  LV Size                <24.77 GiB
  Current LE             6340
  Segments               3
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     4096
  Block device           253:1
```
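As referenced in Peter's explanation above (comments #3 and #5): a quick way to see from a shell whether udev considers a device initialized. This is a sketch; /dev/sda1 is the device from the original warning, and the property and paths are standard udev, not anything LVM-specific.

```sh
# USEC_INITIALIZED appears among a device's properties only once udevd has
# finished writing its database record -- its absence is what makes LVM wait:
udevadm info --query=property --name=/dev/sda1 | grep USEC_INITIALIZED

# The database itself lives under /run/udev/data (block devices are keyed
# b<major>:<minor>). In a chroot/container where the host's /run/udev is not
# visible, this directory is empty and every libudev lookup times out:
ls /run/udev/data

# Wait for the udev event queue to drain (roughly what LVM's 10 s wait does):
udevadm settle --timeout=10
```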
(In reply to Peter Rajnoha from comment #5)
> - the udev database has been made unavailable (e.g. making /run/udev
> unavailable or using a different /run), this might be the case of chroot-ed
> or container environments where the /run with existing info from host hasn't
> been remounted to "guest"

Peter, we seem to be hitting the same issue in oVirt CI, and it reproduces easily (80% of runs). Our environment is a CI slave VM running CentOS 7.6 (I don't know the exact version). Inside the slave, we run mock:

```
../jenkins/mock_configs/mock_runner.sh \
    --execute-script automation/check-patch.tests-py27.sh \
    --mock-confs-dir ../jenkins/mock_configs \
    --secrets-file /home/jenkins/workspace/vdsm_standard-check-patch/std_ci_secrets.yaml \
    --try-proxy --timeout-duration 3h \
    --try-mirrors http://mirrors.phx.ovirt.org/repos/yum/all_latest.json \
    'el7.*x86_64'
```

Mock uses a cache, so we may be running a version up to two days old. Inside mock, we run our lvm tests. In the tests we do things like:

```
losetup -f backing-file --show   # /dev/loop2
pvcreate /dev/loop2
pvs
...
```

We use this lvmlocal.conf in this environment:

```
activation {
    udev_sync = 0
    udev_rules = 0
}
```

This solved an issue when activating lvs inside mock, or when running in a container on Travis.

These tests used to work fine 2-3 weeks ago, but when I ran them today they broke because pvs hung. Looking at the stack trace of the hanging test, we see that it is waiting for pvs to exit. This is the pvs command we run (this is the args list passed to subprocess.Popen()):

```
['/usr/bin/taskset', '--cpu-list', '0-15',
 '/usr/sbin/lvm', 'pvs',
 '--config',
 'devices { preferred_names=["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter=["a|^/dev/loop2$|^/dev/loop3$|^/dev/loop4$|", "r|.*|"] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min=50 retain_days=0 }',
 '--noheadings', '--units', 'b', '--nosuffix', '--separator', '|',
 '--ignoreskippedcluster',
 '-o', 'uuid,name,size,vg_name,vg_uuid,pe_start,pe_count,pe_alloc_count,mda_count,dev_size,mda_used_count',
 u'/dev/loop3']
```

We see this command hang, logging these warnings to stderr:

```
WARNING: Device /dev/loop3 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/loop2 not initialized in udev da...
```

The command hung for about 10 minutes until our test runner killed the hanging test. You can see the hanging build log here: https://jenkins.ovirt.org/job/vdsm_standard-check-patch/2554//artifact/check-patch.tests-py27.el7.x86_64/mock_logs/script/stdout_stderr.log

This is the hanging test: https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=tests/storage/lvm_test.py;h=8b19c8f07bb06c83a9948d081ad0c574dcb21bfc;hb=2903e279c70a4edbb061dd9a9c0a0dac3bb17f49#l218

I tried your suggestion and mounted /run/udev inside the chroot: https://gerrit.ovirt.org/c/97555 This seems to fix the issue - all tests succeed again. About 20 runs were failing before and pass now. I hope it helps to debug this issue. (A minimal bind-mount sketch follows this comment.)

Another interesting detail: we run the same tests in Fedora 28 and 29 containers on Travis CI.

- Fedora 28: all lvm tests pass.
- Fedora 29: hangs in the lvm tests on every build.
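A minimal sketch of the fix Nir describes above - making the host's udev database visible inside the mock chroot before the LVM tests run. The mock root path is illustrative; the actual change is in the linked gerrit patch.

```sh
# Bind-mount the host's udev database into the chroot so libudev lookups
# (and therefore LVM's udev checks) can succeed inside it:
CHROOT=/var/lib/mock/epel-7-x86_64/root   # illustrative mock root
mkdir -p "$CHROOT/run/udev"
mount --bind /run/udev "$CHROOT/run/udev"
```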
Here is a Fedora 29 hanging build from Travis: https://travis-ci.org/nirs/vdsm/jobs/480107313

In this build the lvchange command hung, in this test: https://gerrit.ovirt.org/gitweb?p=vdsm.git;a=blob;f=tests/storage/lvm_test.py;h=3289ae8ca8228fea24bb30efcaef5edda1e75315;hb=81093b351e3cefcde4058012ee8234e69bedc14b#l490

If you want to reproduce this in the same container:

```
docker pull ovirtorg/vdsm-test-fedora-29
```

See how we run docker here: https://travis-ci.org/nirs/vdsm/jobs/480107313/config

(In reply to Orion Poplawski from comment #8)
> Well, I setup a serial console to try to capture the output, but since
> enabling it I've been unable to get the machine to hang at boot. I suppose
> that's an acceptable workaround since it's only the one machine. Not sure
> if there is anything else I can provide at this point.

I have to close this report, then, due to insufficient data/logs.

Note: we have had other reports where we saw the message "not initialized in udev database even after waiting N microseconds". Usually, this was caused by delayed udev processing, which in turn could be caused by a high number of devices, or by some device that is not accessible or not responding correctly and so delays further udev processing (we'd probably see this in the debug log, but unfortunately we don't have it). If we're not able to identify what actually causes these delays, the only thing I can recommend to keep the system booting without timing out is to increase the default timeouts in the systemd configuration (a sketch follows below).

(The other issue that Nir mentioned concerns container/chroot environments where udev was not correctly set up - attaching /run/udev and the other parts udev needs - so it is a bit different from what was originally reported in this bug.)
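A sketch of the timeout workaround recommended above, as a systemd drop-in. The 300 s value is arbitrary - pick something longer than the slowest observed device scan - and the available directives vary by systemd version; see systemd-system.conf(5).

```sh
# Raise systemd's default unit start timeout so slow udev/LVM processing
# does not abort the boot:
mkdir -p /etc/systemd/system.conf.d
cat > /etc/systemd/system.conf.d/10-timeouts.conf <<'EOF'
[Manager]
DefaultTimeoutStartSec=300s
EOF
systemctl daemon-reexec   # or reboot, for the setting to take effect
```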