Bug 2317127

Summary: make_image hangs forever for larger images
Product: [Fedora] Fedora Reporter: Doug Magee <djmagee>
Component: lorax Assignee: Brian Lane <bcl>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 40 CC: anaconda-maint, bcl, reallylongword
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Last Closed: 2025-04-15 21:12:27 UTC Type: ---

Description Doug Magee 2024-10-08 00:46:50 UTC
Following the instructions at https://fedoraproject.org/wiki/Livemedia-creator-_How_to_create_and_use_a_Live_CD, livemedia-creator hangs forever on the line after "Processing logs from...". In the example output on that page, this would be the line "Starting automated". Interrupting it with Ctrl-C gives:
^C^CTraceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/pylorax/installer.py", line 461, in novirt_install
    for line in execReadlines("unshare", unshare_args, reset_lang=False,
  File "/usr/lib/python3.12/site-packages/pylorax/executils.py", line 344, in __next__
    time.sleep(0.5)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/sbin/livemedia-creator", line 229, in <module>
    main()
  File "/usr/sbin/livemedia-creator", line 212, in main
    (result_dir, disk_img) = run_creator(opts)
                             ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/pylorax/creator.py", line 680, in run_creator
    disk_img = make_image(opts, ks, cancel_func=cancel_func)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/pylorax/creator.py", line 482, in make_image
    novirt_install(opts, disk_img, disk_size, cancel_func=cancel_func, tar_img=tar_img)
  File "/usr/lib/python3.12/site-packages/pylorax/installer.py", line 501, in novirt_install
    log_monitor.shutdown()
  File "/usr/lib/python3.12/site-packages/pylorax/monitor.py", line 199, in shutdown
    self.server_thread.join()
  File "/usr/lib64/python3.12/threading.py", line 1149, in join
    self._wait_for_tstate_lock()
  File "/usr/lib64/python3.12/threading.py", line 1169, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt

As you can see, it's stuck in make_image. The loop devices exist and that part seems fine. After a bit of fiddling I discovered that ulimit -l is too small for my intended image size. After running 'ulimit -l unlimited', the next call to livemedia-creator with the same ks file works as expected.
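
For reference, the same workaround can be approximated from inside a Python process. This is a minimal sketch, not part of lorax or livemedia-creator: it raises the soft locked-memory limit (RLIMIT_MEMLOCK, the limit that 'ulimit -l' controls) up to the hard limit, which only amounts to "unlimited" if the hard limit is itself unlimited. The function names are hypothetical.

```python
import resource


def raise_memlock_to_hard_limit():
    """Raise the soft RLIMIT_MEMLOCK to the hard limit for this process.

    Returns the (soft, hard) pair in effect afterwards. Child processes
    started later inherit the new soft limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft != hard:
        # A process may always raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_MEMLOCK, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)


def describe_limit(value):
    """Format a byte limit in KiB, the unit bash's 'ulimit -l' reports."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value // 1024} KiB"


if __name__ == "__main__":
    soft, hard = raise_memlock_to_hard_limit()
    print("soft:", describe_limit(soft), "hard:", describe_limit(hard))
```

Because limits are inherited, this would have to run in the process that later launches the image build, before the build starts.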

Reproducible: Always

Steps to Reproduce:
1. Follow the steps at https://fedoraproject.org/wiki/Livemedia-creator-_How_to_create_and_use_a_Live_CD, using up-to-date Fedora 40 on x86_64 as host and target
2. Use a custom kickstart file with a root partition >= 11520M
3. Run livemedia-creator as shown in the HOWTO
Actual Results:  
hangs forever

Expected Results:  
friendly failure with a message pointing to the need for more resources than the current ulimit -l allows for the desired disk image size

Comment 1 Brian Lane 2024-10-08 18:23:08 UTC
I'm not sure what lmc could do here. It's not getting an error from the system so it's really hard to tell the difference between running slow and stuck. I'm open to suggestions though.
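
One possible direction, sketched here only as an illustration: track how long it has been since the monitored process last produced output and warn once a quiet period is exceeded. This cannot distinguish a slow install from a stuck one, so it could only justify a warning, not a failure. The StallDetector class and its quiet_limit parameter are hypothetical and not part of pylorax.

```python
import time


class StallDetector:
    """Track how long it has been since the monitored process showed progress."""

    def __init__(self, quiet_limit, clock=time.monotonic):
        # quiet_limit: seconds of silence before warning (hypothetical value)
        self.quiet_limit = quiet_limit
        self._clock = clock
        self._last_progress = clock()

    def note_progress(self):
        """Call whenever a new output or log line arrives."""
        self._last_progress = self._clock()

    def quiet_for(self):
        """Seconds elapsed since the last recorded progress."""
        return self._clock() - self._last_progress

    def stalled(self):
        """True once the quiet period has reached quiet_limit."""
        return self.quiet_for() >= self.quiet_limit
```

A polling loop would call note_progress() for every line it reads and check stalled() each time it wakes up with nothing new. The clock is injectable so the logic can be tested without sleeping.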

Comment 2 Doug Magee 2024-10-09 05:19:07 UTC
It does seem odd that there isn't an OOM or some such error from the system. Absent an error, the only option would be to include this in a pre-call sanity check. But I don't know enough about how this is implemented to say what values are needed for what conditions.
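
A pre-call sanity check along these lines might look like the following sketch. The required_bytes argument is a hypothetical input: nothing in this report establishes how much locked memory a given image size needs, so the threshold would have to come from someone who knows the implementation. The function name is also hypothetical.

```python
import resource


def memlock_warning(required_bytes, soft_limit=None):
    """Return a warning string if the soft RLIMIT_MEMLOCK is too low, else None.

    required_bytes is an assumed threshold supplied by the caller; how to
    derive it from the image size is an open question in this report.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if soft_limit == resource.RLIM_INFINITY or soft_limit >= required_bytes:
        return None
    return (
        f"locked-memory limit is {soft_limit // 1024} KiB but about "
        f"{required_bytes // 1024} KiB may be needed; "
        "try 'ulimit -l unlimited' before running"
    )
```

Returning a message instead of raising leaves the caller free to treat it as a warning or a hard failure.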

The image is created in /var/lmc, and I don't see that being mounted as tmpfs anywhere. So I'd assume the make_image process is using mmap, and that's why the locked memory limit has any effect at all?

Comment 3 Brian Lane 2024-10-09 17:19:16 UTC
You really don't want to use tmpfs for image building like this, unless you have a lot of spare RAM :) It may be related to the logs returned from anaconda. That's the bit of code where it seems to be stuck, but even with lots of packages it shouldn't generate enough to exhaust memory. Can you look at the raw anaconda logs in /tmp/ and see if there is anything suspicious near the end of them? Or check the system's journalctl -e output for clues.
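
To inspect the end of a log file without opening it in an editor, a small helper such as the one below can be used. It is a generic sketch: the path is supplied by the caller, since the exact anaconda log file names under /tmp/ are not given in this report, and tail_lines is a hypothetical name.

```python
from collections import deque


def tail_lines(path, count=20):
    """Return the last `count` lines of a text file, keeping only those in memory."""
    with open(path, "r", errors="replace") as handle:
        # deque with maxlen discards older lines as newer ones are read.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Calling tail_lines(path, 50) on each log in turn gives a quick view of what was logged just before the hang.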