Bug 1116651 - systemd-219 breaks livecd-tools with /etc/resolv.conf handling (also breaks traditional installer images, probably other things)
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: livecd-tools
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Brian Lane
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1117020
Depends On:
Blocks: systemd-other-tracker
Reported: 2014-07-06 19:36 UTC by Kevin Fenzi
Modified: 2016-02-29 23:01 UTC
CC List: 30 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-03-17 20:55:34 UTC
Type: Bug
Embargoed:


Links
System: Red Hat Bugzilla
ID: 1313085
Private: 0
Priority: unspecified
Status: CLOSED
Summary: Name resolution fails (resolv.conf is a broken systemd symlink) on livemedia-creator live image
Last Updated: 2021-02-22 00:41:40 UTC

Internal Links: 1313085

Description Kevin Fenzi 2014-07-06 19:36:56 UTC
This may well need to be fixed in livecd-tools, but with systemd-215 all live image creation is broken:

DEBUG util.py:281:  Traceback (most recent call last):
DEBUG util.py:281:    File "/usr/bin/livecd-creator", line 228, in <module>
DEBUG util.py:281:      sys.exit(main())
DEBUG util.py:281:    File "/usr/bin/livecd-creator", line 212, in main
DEBUG util.py:281:      creator.configure()
DEBUG util.py:281:    File "/usr/lib/python2.7/site-packages/imgcreate/creator.py", line 735, in configure
DEBUG util.py:281:      kickstart.SelinuxConfig(self._instroot).apply(ksh.selinux)
DEBUG util.py:281:    File "/usr/lib/python2.7/site-packages/imgcreate/kickstart.py", line 478, in apply
DEBUG util.py:281:      self.relabel(ksselinux)
DEBUG util.py:281:    File "/usr/lib/python2.7/site-packages/imgcreate/kickstart.py", line 443, in relabel
DEBUG util.py:281:      f = file(path, "w+")
DEBUG util.py:281:  IOError: [Errno 2] No such file or directory: '/var/tmp/imgcreate-c2FcCX/install_root/etc/resolv.conf'
DEBUG util.py:371:  Child return code was: 1
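
For illustration, the failure in the traceback above can be reproduced outside livecd-tools. This is only a minimal sketch under assumptions: the symlink target is the one lorax reports in comment 20, the directory layout is a throwaway temporary tree, and `open_through_dangling_symlink` is a hypothetical name, not part of imgcreate. Opening a path for writing follows a symlink, so when the link target's directory does not exist the open fails with ENOENT, even though `etc/` itself is present.

```python
import errno
import os
import tempfile


def open_through_dangling_symlink():
    """Return the errno raised when writing through a dangling symlink."""
    with tempfile.TemporaryDirectory() as root:
        etc = os.path.join(root, "etc")
        os.mkdir(etc)
        link = os.path.join(etc, "resolv.conf")
        # Assumed target: the one shown in the lorax log in comment 20.
        # run/systemd/resolve does not exist inside this fake image root.
        os.symlink("../run/systemd/resolve/resolv.conf", link)
        try:
            # open() follows the symlink and tries to create the target.
            with open(link, "w+"):
                pass
        except OSError as exc:
            return exc.errno
        return None


if __name__ == "__main__":
    print(errno.errorcode[open_through_dangling_symlink()])
```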

Comment 1 Kevin Fenzi 2014-07-06 20:17:31 UTC
Also pylorax: 

writing .buildstamp file
doing post-install configuration
running runtime-postinstall.tmpl
template command error in runtime-postinstall.tmpl:
  append etc/resolv.conf 
  IOError: [Errno 2] No such file or directory: '/tmp/treebuild.20140706/output/work/x86_64/yumroot/etc/resolv.conf'
Traceback (most recent call last):
  File "/usr/bin/pungi", line 276, in <module>
    main()
  File "/usr/bin/pungi", line 158, in main
    mypungi.doBuildinstall()
  File "/usr/lib/python2.7/site-packages/pypungi/__init__.py", line 1390, in doBuildinstall
    workdir=workdir, outputdir=outputdir)
  File "/usr/lib/python2.7/site-packages/pylorax/__init__.py", line 274, in run
    rb.postinstall()
  File "/usr/lib/python2.7/site-packages/pylorax/treebuilder.py", line 132, in postinstall
    self._runner.run("runtime-postinstall.tmpl", configdir=configdir_path)
  File "/usr/lib/python2.7/site-packages/pylorax/ltmpl.py", line 181, in run
    self._run(commands)
  File "/usr/lib/python2.7/site-packages/pylorax/ltmpl.py", line 200, in _run
    f(*args)
  File "/usr/lib/python2.7/site-packages/pylorax/ltmpl.py", line 274, in append
    with open(self._out(filename), "a") as fobj:
IOError: [Errno 2] No such file or directory: '/tmp/treebuild.20140706/output/work/x86_64/yumroot/etc/resolv.conf'
<mock-chroot>++ echo -n '<mock-chroot>'

Comment 3 Dennis Gilmore 2014-07-07 16:00:54 UTC
We cannot compose TC1 without this being fixed, so proposing as a blocker.

Comment 4 Colin Walters 2014-07-07 19:05:13 UTC
Upstream isn't taking it at the moment, but Fedora disables systemd-resolved anyways, so we might as well carry the patch at least temporarily.  We can consider sorting out the image compose tools post TC1.

Building now:

http://koji.fedoraproject.org/koji/taskinfo?taskID=7114076

Comment 5 Lennart Poettering 2014-07-07 19:45:46 UTC
This has no place in systemd. I will remove this again as soon as TC1 is done.

The anaconda people need to clean up their stuff, this is not a systemd issue. They should not follow symlinks. They should replace whatever they find in /etc/resolv.conf.

I have no interest in making systemd the dumping ground for work-arounds like this for broken behaviour of anaconda.

Reassigning.

Anaconda folks, please, can you make anaconda properly overwrite what you find in /etc/resolv.conf if it already exists? Thanks

(And yupp, /etc/resolv.conf as symlink is not a weirdness we systemd folks made up out of thin air, it's pretty much common on most other systems, so that /etc can be read-only. I filed an RFE bug against NM to do the same, see bug 1116999).

Comment 6 Colin Walters 2014-07-07 20:29:20 UTC
Let's discuss architecture in https://bugzilla.redhat.com/show_bug.cgi?id=1116999 for F21 final and longer term discussion, and leave this bug for the TC1 compose issue.

Comment 7 Radek Vykydal 2014-07-08 08:45:12 UTC
Actually our goal is not to touch /etc/resolv.conf at all in Anaconda. And we don't do it in Fedora/master (passing domain and dns info via ifcfg files or connection settings to NM).

What we do with the file is:

- Copy it to target system at the end of installation, the use case here was no NM on target system I think. I believe systemd + NM would take care of creating and managing the file, and perhaps systemd + systemd-networkd for non-NM case?

- (not yet, but we have a bz for it) Copy it to target system root during installation for %post rpm scriptlets run in chroot.

Comment 8 Lennart Poettering 2014-07-08 13:06:54 UTC
(In reply to Radek Vykydal from comment #7)
> Actually our goal is not to touch /etc/resolv.conf at all in Anaconda. And
> we don't do it in Fedora/master (passing domain and dns info via ifcfg files
> or connection settings to NM).
> 
> What we do with the file is:
> 
> - Copy it to target system at the end of installation, the use case here was
> no NM on target system I think. I believe systemd + NM would take care of
> creating and managing the file, and perhaps systemd + systemd-networkd for
> non-NM case?
> 
> - (not yet, but we have a bz for it) Copy it to target system root during
> installation for %post rpm scriptlets run in chroot.

Not sure I follow.

During installation anaconda copies /etc/resolv.conf from the installing OS into the target OS, right? This copy routine is confused by the fact that there might be a symlink around already, as it follows the destination symlink? This bit should really be fixed, it should use a copy routine that does not follow destination symlinks, but simply replaces the entire destination with the new file.
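
A copy routine of the kind described here could look roughly like the following. This is a sketch, not anaconda's code: `replace_file` and the temporary-file prefix are hypothetical names, and the approach (copy to a temporary file in the destination directory, then rename it over the destination) is one common way to do it, not necessarily what any of the tools ended up using.

```python
import os
import shutil
import tempfile


def replace_file(src, dst):
    """Copy src over dst, replacing dst itself even if it is a symlink."""
    dst_dir = os.path.dirname(dst) or "."
    # Create the temporary file next to dst so the rename below stays on
    # one filesystem. The prefix is an arbitrary, hypothetical choice.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".resolv.conf.")
    os.close(fd)
    try:
        # Copy contents and permission bits of src into the temporary file.
        shutil.copy(src, tmp)
        # rename(2) replaces the directory entry for dst; it does not
        # follow a symlink at the destination, dangling or not.
        os.replace(tmp, dst)
    except BaseException:
        # On failure the temporary file is still there; clean it up.
        os.unlink(tmp)
        raise
```

With this, a call such as `replace_file("/etc/resolv.conf", os.path.join(sysroot, "etc/resolv.conf"))` (where `sysroot` is whatever the caller uses as the target root) leaves a regular file at the destination regardless of what was there before.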

Comment 9 Adam Williamson 2014-07-08 16:17:53 UTC
remember the initial bug here concerns livecd-tools - the script we use to compose live images. It wasn't about anaconda install, although the change *does* cause installs to fail after package installation too, among other fallout.

when creating a live image we're basically creating an entire system image and Doing Stuff in it, but it's not really running. Nothing is going to be 'managing' /etc/resolv.conf inside that image at the point we're building it. AIUI, anyway.

Do we need to collect together all the release-breaking borkage this very late and unannounced major change in behaviour has caused here, in https://bugzilla.redhat.com/show_bug.cgi?id=1116999 , or somewhere else? Lennart, could you please consider reverting it until the issues have been more thoroughly considered, as right now it is very clearly causing major borkage in multiple important workflows, only in order to implement some experimental new functionality which clearly is not as important?

Comment 10 Brian Lane 2014-07-08 16:30:18 UTC
The problem here is that systemd is dropping in a broken symlink with no consideration for the circumstances. If it wants to manage it as part of their networkd thing then creation of the symlink (and target file) needs to be conditional on that service/feature/whatever.

As far as I can tell this change wasn't tested at all and needs to be reverted (either upstream or in fedora) until such time as it actually works.

Also, comments like "broken behaviour of anaconda" don't help anyone. YOU are the one who changed, not anaconda, lorax or livecd-tools.

Comment 11 Lennart Poettering 2014-07-08 18:23:52 UTC
Adam, Colin already patched the systemd version to my knowledge for now. But this is only a temporary thing, the livecd/anaconda/lorax folks really should fix their stuff and not follow destination symlinks when they copy in their own resolv.conf.

Making /etc/resolv.conf a symlink to /run is hardly something surprising, and is what NM really should be doing too. 

"bcl", whoever you are [would be great if at least RH folks would use real names on bugzilla, btw, we are not 16...], just reassigning this back to systemd won't make the thing go away, we only applied a fedora-specific hack, this will come around again for 22 the latest.

Comment 12 David Shea 2014-07-08 18:30:25 UTC
(In reply to Lennart Poettering from comment #11)
> Making /etc/resolv.conf a symlink to /run is hardly something surprising,

Newsflash: everyone is surprised! Maybe you're the problem here!

Comment 13 Vít Ondruch 2014-07-08 20:17:43 UTC
*** Bug 1117020 has been marked as a duplicate of this bug. ***

Comment 14 Lennart Poettering 2014-07-09 10:41:15 UTC
David Shea, thank you for that! Much appreciated!

Comment 15 Tim Flink 2014-07-09 18:32:52 UTC
Discussed at the 2014-07-09 Fedora 21 alpha blocker review meeting. As the livecds are fixed for now, this isn't a release blocking bug for Fedora 21 alpha.

If the livecd problems reappear, please re-propose as a blocker.

Comment 16 Adam Williamson 2014-07-10 04:42:04 UTC
tflink: the problem 'disappeared' because the change was temporarily reverted in systemd. but if it gets re-applied in systemd without livecd-tools and anaconda being updated, we're going to hit this again.

Comment 17 Radek Vykydal 2014-07-10 10:54:44 UTC
(In reply to Lennart Poettering from comment #8)
 
> Not sure I follow.
> 
> During installation anaconda copies /etc/resolv.conf from the installing OS
> into the target OS, right? This copy routine is confused by the fact that
> there might be a symlink around already, as it follows the destination
> symlink? This bit should really be fixed, it should use a copy routine that
> does not follow destination symlinks, but simply replaces the entire
> destination with the new file.

Yes, we can make the copy routine more robust.

As for lorax failing to create empty /etc/resolv.conf on the symlink, I believe we can stop creating the file and let NM do it for us in installer environment.

Comment 18 Colin Walters 2014-07-10 15:06:01 UTC
(In reply to Radek Vykydal from comment #17)

> Yes, we can make the copy routine more robust.
> 
> As for lorax failing to create empty /etc/resolv.conf on the symlink, I
> believe we can stop creating the file and let NM do it for us in installer
> environment.

I'm fine with that, though then the ball moves to NM to clean it up.

Comment 19 Adam Williamson 2015-02-19 22:24:41 UTC
systemd-219 removes the workaround for this (thanks for the notice, folks, not like we're trying to build Alpha TC1 or anything) and hence breaks all 22 and Rawhide live composes. e.g. the last TC attempt: http://koji.fedoraproject.org/koji/taskinfo?taskID=8998028 , https://kojipkgs.fedoraproject.org//work/tasks/8028/8998028/mock_output.log :

Traceback (most recent call last):
  File "/usr/bin/livecd-creator", line 236, in <module>
    sys.exit(main())
  File "/usr/bin/livecd-creator", line 213, in main
    creator.configure()
  File "/usr/lib/python2.7/site-packages/imgcreate/creator.py", line 754, in configure
    kickstart.SelinuxConfig(self._instroot).apply(ksh.selinux)
  File "/usr/lib/python2.7/site-packages/imgcreate/kickstart.py", line 483, in apply
    self.relabel(ksselinux)
  File "/usr/lib/python2.7/site-packages/imgcreate/kickstart.py", line 443, in relabel
    f = file(path, "w+")
IOError: [Errno 2] No such file or directory: '/var/tmp/imgcreate-DJyFL_/install_root/etc/resolv.conf'

This is an automatic Alpha blocker: "Bugs which entirely prevent the composition of one or more of the release-blocking images required to be built for a currently-pending (pre-)release" - https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process#Automatic_blockers - so marking as such.

Note that 22 and Rawhide boot.iso images built since systemd-219 landed have no /etc/resolv.conf and hence fail to configure any repositories - the INSTALLATION SOURCE spoke reports "Error setting up base repository". This seems very likely to be another symptom of the same basic issue. For now we can treat it as part of this bug, I guess, if we need to split it out as a new bug we can do that later.

Comment 20 Adam Williamson 2015-02-19 22:35:15 UTC
On the boot.iso case, see the 2015-02-19 nightly compose log, http://kojipkgs.fedoraproject.org/mash/branched-20150219/logs/x86_64.log :

program.INFO: removing broken symbolic link /etc/resolv.conf -> ../run/systemd/resolve/resolv.conf

that's from lorax, I believe.
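
For reference, a check that produces this kind of behaviour can be sketched as below. lorax's actual implementation is not shown in this bug, so `remove_if_dangling` is a hypothetical helper and only illustrates how a broken symlink is usually detected: `os.path.islink()` looks at the link itself, while `os.path.exists()` follows it.

```python
import os


def remove_if_dangling(path):
    """Remove path if it is a symlink whose target is missing.

    Returns True if a link was removed, False otherwise.
    """
    # islink() inspects the link itself; exists() follows it and is
    # False when the target cannot be reached.
    if os.path.islink(path) and not os.path.exists(path):
        os.unlink(path)
        return True
    return False
```

Removing the link this way explains why the resulting image has no /etc/resolv.conf at all unless something creates the file afterwards.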

Comment 21 Lennart Poettering 2015-02-20 13:41:00 UTC
Michal has re-added the work-around to systemd.rpm, again.

This really has to go. The anaconda/livecd-tools/whatever had ample time to fix their copy routine to not get confused by symlinks. I made clear last time I would remove it by TC2 of the last cycle. I left it in instead for the whole cycle. But now it's time to kill this.

Comment 22 Adam Williamson 2015-02-20 17:00:08 UTC
Lennart: next time you want to take it out, though, please just *tell people a week ahead*. That way we can get all the fixes ready and deploy them together, instead of causing everything to break until QA/releng figures out what's gone wrong and everyone else has to scramble to fix it.

Comment 23 Colin Walters 2015-02-22 23:48:42 UTC
https://github.com/rhinstaller/anaconda/pull/6

Comment 24 Colin Walters 2015-02-23 04:25:16 UTC
https://github.com/rhinstaller/livecd-tools/pull/1

Comment 25 Adam Williamson 2015-02-25 03:11:21 UTC
As this was reverted for F22 the blocker status is resolved, but it really should be fixed properly for Rawhide in *all* affected areas well in advance of F23.

Comment 26 Brian Lane 2015-02-25 20:02:08 UTC
I've merged Colin's commit for livecd-tools and will build it for rawhide and f22.

But I'm NAKing the change to anaconda. Commit https://github.com/rhinstaller/anaconda/commit/633fb6244a6dbbf07cbb0e8547bd9282be073d7c explains that some users want to control various config files.

I don't think it is reasonable to ask Anaconda to try and discern meaning from the creation of a broken symlink. You made it, you get to keep the pieces. If you need the symlink there, create it when you need it, not at package install time.

Comment 27 Adam Williamson 2015-02-25 20:05:42 UTC
Should we open a new bug for that part of the problem, then? Because presumably someone needs to fix *something*, or else we won't have networking on network install images?

Comment 28 Brian Lane 2015-02-25 20:21:29 UTC
(In reply to Adam Williamson (Red Hat) from comment #27)
> Should we open a new bug for that part of the problem, then? Because
> presumably someone needs to fix *something*, or else we won't have
> networking on network install images?

No, the network images themselves should be fine.

Comment 29 Adam Williamson 2015-02-25 20:25:04 UTC
oh, OK. I'll check that with a Rawhide nightly at some point, I guess.

Comment 30 Colin Walters 2015-03-17 16:58:31 UTC
This is apparently still breaking the Atomic cloud image composes:

http://koji.fedoraproject.org/koji/taskinfo?taskID=9250227

But somehow works for mainline.  Was there a mainline-located workaround somewhere besides Anaconda?

Comment 31 Brian Lane 2015-03-17 20:55:34 UTC
Take it up with systemd, please don't reopen and move this bug around. This bug is for livecd-tools which has been fixed.

Remaining problems are as described in the various other bugs.

Comment 32 Colin Walters 2015-03-17 21:32:22 UTC
This bug is an additional new place to have a conversation: https://bugzilla.redhat.com/show_bug.cgi?id=1197204

