Bug 1836828 - @lymouthd: ply-event-loop.c:913: ply_event_loop_watch_for_timeout: Assertion `seconds > 0.0' failed.
Summary: @lymouthd: ply-event-loop.c:913: ply_event_loop_watch_for_timeout: Assertion ...
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: plymouth
Version: 8.1
Hardware: x86_64
OS: Linux
unspecified
medium
Target Milestone: rc
: 8.0
Assignee: Ray Strode [halfline]
QA Contact: Desktop QE
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-05-18 09:26 UTC by Olivier LAHAYE
Modified: 2020-05-23 19:49 UTC (History)
1 user (show)

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:


Attachments (Terms of Use)
the tracelog of the crash (68.85 KB, text/plain)
2020-05-18 09:26 UTC, Olivier LAHAYE
no flags Details
full dmesg (33.41 KB, text/plain)
2020-05-18 11:10 UTC, Olivier LAHAYE
no flags Details

Description Olivier LAHAYE 2020-05-18 09:26:55 UTC
Created attachment 1689553 [details]
the tracelog of the crash

Description of problem:

Plymouth crash in initramfs with error:
@lymouthd: ply-event-loop.c:913: ply_event_loop_watch_for_timeout: Assertion `seconds > 0.0' failed.

Version-Release number of selected component (if applicable):
plymouth-0.9.3-15.el8.x86_64


How reproducible:
100%


Steps to Reproduce:
1.systemctl start Plymouth-start
2.
3.

Actual results:
Failure


Expected results:
Theme displayed

Additional info:
This log is with system imager script theme, but if I replace the theme with default theme, the promlem seems to remain.
If log is not sufficient I can provide the specific initrd and how to test it.
Note that this occurs when running in qemu (v4.5 and 5.0.0).
The initrd is generated from CentOS-8.1.1911 (Core)
using plymouth-0.9.3-15.el8.x86_64

Also note that /sys/devices/virtual/drm/ttm/uevent contains:
DEVTYPE=ttm
Plymouth tries to access /run/udev/data/+drm:ttm which doesn't exists and fails ter after.

Also a strange thing is that Plymouth tries to find theme files in /var/run/plymouth/themes/<themename> while it is located in /usr/share/plymouth/themes/<themename>

Comment 1 Olivier LAHAYE 2020-05-18 11:09:58 UTC
The dmesg.log where you can see the gfx driver initialization:
[    5.514413] fb: switching to bochsdrmfb from EFI VGA
[    5.535535] Console: switching to colour dummy device 80x25
[    5.535706] bochs-drm 0000:00:01.0: vgaarb: deactivate vga console
[    5.535815] [drm] Found bochs VGA, ID 0xb0c5.
[    5.535818] [drm] Framebuffer size 16384 kB @ 0xc0000000, mmio @ 0xc1042000.
[    5.542528] [TTM] Zone  kernel: Available graphics memory: 500358 kiB
[    5.542532] [TTM] Initializing pool allocator
[    5.542536] [TTM] Initializing DMA pool allocator
[    5.545355] [drm] Found EDID data blob.
[    5.545740] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:01.0 on minor 0
[    5.569541] fbcon: DRM emulated (fb0) is primary device
[    5.570853] Console: switching to colour frame buffer device 128x48
[    5.576661] bochs-drm 0000:00:01.0: fb0: DRM emulated frame buffer device

Comment 2 Olivier LAHAYE 2020-05-18 11:10:56 UTC
Created attachment 1689569 [details]
full dmesg

The full dmesg

Comment 3 Olivier LAHAYE 2020-05-23 13:08:44 UTC
Bug also reported here: https://gitlab.freedesktop.org/plymouth/plymouth/-/issues/112
I4ve given full test cases with material to reproduce the problem.

Comment 4 Olivier LAHAYE 2020-05-23 15:17:19 UTC
When doing: plymouth --debug
The Plymouth log is:
00:16:52.991 main.c:854:plymouth_should_show_default_splash                : checking if plymouth should show default splash
00:16:52.992 main.c:882:plymouth_should_show_default_splash                : using default splash because kernel command line has option "splash"
00:16:52.994 ply-device-manager.c:886:create_devices_from_terminals        : checking for consoles
00:16:52.994 ply-device-manager.c:664:add_consoles_from_file               : opening /sys/class/tty/console/active
00:16:52.995 ply-device-manager.c:672:add_consoles_from_file               : reading file
00:16:52.996 ply-device-manager.c:710:add_consoles_from_file               : console /dev/tty7 found!
00:16:52.997 ply-device-manager.c:486:watch_for_udev_events                : watching for udev graphics device add and remove events
00:16:52.998 ply-device-manager.c:346:create_devices_for_subsystem         : creating objects for drm devices

Then:
@lymouthd: ply-event-loop.c:913: ply_event_loop_watch_for_timeout: Assertion `seconds > 0.0' failed.

When looking at dmesg:
access("/sys/devices/virtual/drm/ttm/uevent", F_OK) = 0
openat(AT_FDCWD, "/sys/devices/virtual/drm/ttm/uevent", O_RDONLY|O_CLOEXEC) = 10
fstat(10, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
fstat(10, {st_mode=S_IFREG|0644, st_size=4096, ...}) = 0
read(10, "DEVTYPE=ttm\n", 4096)         = 12
read(10, "", 4096)                      = 0
close(10)                               = 0
readlinkat(AT_FDCWD, "/sys/devices/virtual/drm/ttm/subsystem", "../../../../class/drm", 99) = 21
openat(AT_FDCWD, "/run/udev/data/+drm:ttm", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)

Can't it be a missing dev rule?

Comment 5 Olivier LAHAYE 2020-05-23 19:49:46 UTC
BUG Tracked down; please fix.
Looking at where the problem seems to occur, I suspected that the problem was here:
in src/libply-splash-core/ply-device-manager.c at line 974 ply_event_loop_watch_for_timeout (in ply_device_manager_watch_devices() function))
Indeed, the console is correctly setup and this occurs at line 954 of that file, and there after, there is a specific ifdef HAVE_UDEV
According to the ABRT error, the device_timout variable which is a double is incorrect (not > 0.0). So either it is equal 0.0 (not initialized) or NaN or corrupted and seen as negative.

ply_device_manager_watch_devices() is called from main.c:1093 in load_devices(). The timeout value comes from state->device_timeout.
BUT at line 2241 in main(), state.device_timeout is set to NAN and at line 2278, the load_devices() is called with &state parameter!!!!
So, between line 2241 where state.device_timeout is set to NAN and line 2278 where it is used, the state.device_timout MUST be set to something > 0.0...

At line 2257, find_distribution_default_splash(&state) is called
in find_distribution_default_splash() line 428, load_settings is called
In my settings file initrd:/usr/share/plymouth/plymouthd.defaults I have the following content:
# SystemImager is the default theme
[Daemon]
Theme=systemimager
ShowDelay=0
# DeviceTimeout=5

in main.c:310, we have: timeout_string = ply_key_file_get_value (key_file, "Daemon", "DeviceTimeout"); 
Below, timeout_string is checked against NULL.
As the trace "Device timout is set to ...." is not in the log, This means that timeout _string is NULL.
And IT IS as ply_key_file_get_value() returns NULL if ply_key_file_get_raw_value() returns NULL, and ply_key_file_get_raw_value() returns NULL if group ("Daemon" here) is not found. (and it is not found as it is commented out)

THUS, state->device_timeout is still NAN value!!! BUG #1 (no default value AFAICT)

IMHO, there should be a else statement at line 317 that would set state->device_timeout to a default value (maybe 5.0) ?

BUG #2: at line 313 the result of atof() is not checked against value of 0.0 while the  ply_event_loop_watch_for_timeout() require NON-NULL double.

WARNING #3: at line 313: strtod() should be used instead of atof() according to atof manual. More over, you don't take care of locale LC_NU MERIC! (use ply_key_file_get_double() which does instead)

BUG #4: at line 310: you should use state->device_timeout = ply_key_file_get_double (key_file, "Daemon", "DeviceTimeout", 5.0);
 By that way, a default value would exists. Be careful, if strtod() fails in ply_key_file_get_double(), a value of 0.0 is returned. A test against this invalid value for DeviceTimeout MUST exists!
 (also you can have NaN in config file that atof() know how to deal with. So MAKE TESTS!

BUG #5: Same problem for ShowDelay (though, I assume value of 0 is acceptable)

BUG #6: Same problem for DeviceScale (but there is no ply_key_file_get_unsigned_long() and surtout() provides 0 as default value)

Updating settings with this content allows Plymouthd to start:
# SystemImager is the default theme
[Daemon]
Theme=systemimager
ShowDelay=0.1
DeviceTimeout=2.0

IMHO, in order to fix things:

1 - replace lines 296 to 304 with:
    state->splash_delay = ply_key_file_get_double (key_file, "Daemon", "DeviceTimeout", 5.0); /* replace 5.0 with define PLYMOUTH_DEFAULT_SPLASH_DELAY (hardcoded value is BAD) */
    if (!(state->splash_delay > 0.0)) {
        ply_trace ("Splash delay is invalid. defaulting to %d", PLYMOUTH_DEFAULT_SPLASH_DELAY);
	state->splash_delay= PLYMOUTH_DEFAULT_SPLASH_DELAY;
    }

2 - same operation for lines 308 to 317 regarding DeviceTimeout test must use ">= instead of > if zero is acceptable)

3 - Maybe check that what surtout() line 323 returns something that Plymouth can handle.

Please fix.


Note You need to log in before you can comment on or make changes to this bug.