Bug 1204031

Summary: 22 Beta TC3 install images do not bring up network (unless updates image or kickstart used)
Product: [Fedora] Fedora
Reporter: Adam Williamson <awilliam>
Component: lorax
Assignee: Brian Lane <bcl>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 22
CC: anaconda-maint-list, bcl, dcbw, dennis, jpopelka, psimerda, robatino, satellitgo, sgallagh, thozza
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard: AcceptedBlocker
Fixed In Version: lorax-22.7-1.fc22
Doc Type: Bug Fix
Last Closed: 2015-03-29 04:44:15 UTC
Type: Bug
Bug Blocks: 1043125    

Description Adam Williamson 2015-03-20 08:14:34 UTC
Testing Fedora 22 Beta TC3:

https://dl.fedoraproject.org/pub/alt/stage/22_Beta_TC3/

Using the Server boot.iso, the network is not activated on boot unless you pass inst.updates or inst.ks (either of which triggers dracut to bring up the network), so anaconda fails to configure the default repos. The hub shows 'Error setting up base repository' for the INSTALLATION SOURCE spoke.

'ip addr' shows the ethernet interface (ens3, in my VM) as 'UP' but with no IP address at all, only a 'link/ether' (MAC address).

'systemctl status NetworkManager.service' shows 'active (running)', but with very few log messages.

There is no /etc/resolv.conf file.

There is no default route.

Proposing as a Beta blocker: Alpha criterion https://fedoraproject.org/wiki/Fedora_22_Alpha_Release_Criteria#Remote_package_sources , "When using a release-blocking dedicated installer image, the installer must be able to use either HTTP or FTP repositories (or both) as package sources. Release-blocking network install images must default to a valid publicly-accessible package source." - can't do that if the network doesn't work.

Comment 1 Adam Williamson 2015-03-20 08:32:38 UTC
/tmp/syslog shows activation of the network interface reaches stage 3, then:

dhclient started with pid 1342
Activation: Stage 3 of 5 (IP Configure Start) complete.
DHCPv4 client pid 1432 exited with status 127
DHCPv4 state changed unknown -> done
canceled DHCP transaction

all that happens within the space of one second.

Beta TC3 has:

systemd-219-8.fc22
NetworkManager-1.0.0-8.fc22
dhcp-client-4.3.2-2.fc22

Comment 2 Adam Williamson 2015-03-20 17:33:10 UTC
well, running dhclient manually gives us a big honkin' clue:

dhclient: error while loading shared libraries: libirs-export.so.91: cannot open shared object file: No such file or directory

That file is actually present. It's provided by bind99-libs and it's at /usr/lib64/bind99/libirs-export.so.91 .

bind99-libs also provides a /etc/ld.so.conf.d/bind99-x86_64.conf which adds /usr/lib64/bind99 , and that file is *also* present in anaconda's environment.
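
A quick way to list the libraries the dynamic linker cannot resolve for a binary is to filter 'ldd' output for 'not found' entries. This is only a minimal sketch: the helper name missing_libs is hypothetical, and the /usr/sbin/dhclient path in the usage line below is an assumption about where the binary lives.

```shell
# Hypothetical helper: read 'ldd' output on stdin and print the name of
# every library that the dynamic linker reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Usage (assumed path): ldd /usr/sbin/dhclient | missing_libs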

What's missing, though, is /etc/ld.so.conf , and that's a problem because:

[adamw@adam lorax (master)]$ cat /etc/ld.so.conf
include ld.so.conf.d/*.conf

So the line that should pull in the ld.so.conf.d files isn't there, /usr/lib64/bind99 never winds up in the linker search path, and dhclient can't find libirs-export.so.91.

lorax excludes /etc/ld.so.conf explicitly (and has done since 2011):

[adamw@adam lorax (f22-branch)]$ git blame share/runtime-cleanup.tmpl | grep ld.so.conf
e7e07059 (Will Woods         2011-06-22 14:20:02 -0400 194) removefrom glibc /etc/gai.conf /etc/ld.so.conf /etc/localtime /etc/rpc

so, that seems to be the immediate problem here. Why it only showed up now I don't know for sure, but we *did* pull an updated bind into TC3, so that probably moved stuff around and caused the problem.
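
The chain described in this comment (drop-in files present, but no /etc/ld.so.conf to include them) can be checked with a small script. This is a sketch under assumptions: the function name check_ld_conf and its ROOT argument are hypothetical, and it only looks for an 'include' line mentioning ld.so.conf.d rather than fully parsing the file.

```shell
# Hypothetical helper: check whether ROOT/etc/ld.so.conf exists and includes
# the ld.so.conf.d drop-ins. ROOT defaults to the running system.
check_ld_conf() {
    root="${1:-}"
    conf="$root/etc/ld.so.conf"
    if [ ! -f "$conf" ]; then
        echo "missing: $conf"
        # Without the main file, nothing includes these drop-ins.
        for f in "$root"/etc/ld.so.conf.d/*.conf; do
            if [ -f "$f" ]; then
                echo "ignored drop-in: $f"
            fi
        done
        return 1
    fi
    if grep -Eq '^[[:space:]]*include[[:space:]].*ld\.so\.conf\.d' "$conf"; then
        echo "ok: $conf includes ld.so.conf.d"
        return 0
    fi
    echo "no ld.so.conf.d include line in $conf"
    return 1
}
```

Calling check_ld_conf with no argument inspects the running system; passing a directory inspects an unpacked image tree instead.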

Comment 3 Adam Williamson 2015-03-20 17:36:27 UTC
I confirmed that after re-creating /etc/ld.so.conf by hand and running 'ldconfig' I can bring up the network connection.
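
A rough sketch of that manual workaround follows. The comment does not say what was written into the re-created file; this assumes it was the single include line from the stock /etc/ld.so.conf quoted in comment 2. The function name restore_ld_conf and its ROOT argument are hypothetical.

```shell
# Hypothetical helper: re-create ROOT/etc/ld.so.conf with the stock include
# line if it is missing. ROOT defaults to the running system.
restore_ld_conf() {
    root="${1:-}"
    conf="$root/etc/ld.so.conf"
    mkdir -p "$root/etc"
    if [ ! -f "$conf" ]; then
        # Assumed content: the include line shown in comment 2.
        printf 'include ld.so.conf.d/*.conf\n' > "$conf"
    fi
    # Rebuild the linker cache only when operating on the running system.
    if [ -z "$root" ]; then
        ldconfig
    fi
}
```

Run as root with no argument inside the installer environment, this writes the file and rebuilds the cache. With a directory argument it only writes the file into that tree.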

Comment 4 Adam Williamson 2015-03-20 17:37:09 UTC
CCing thozza for any needed input on what's the deal with all the moving stuff around in bind.

Comment 5 Adam Williamson 2015-03-20 18:09:04 UTC
An additional wrinkle here: the images as built actually have an OK ld cache that's built during image generation (when package %post scripts run ldconfig). The reason it breaks is systemd's 'ldconfig.service', which runs 'ldconfig -X' on boot. It re-generates the cache and, because of the missing /etc/ld.so.conf , loses all the libraries in directories listed in files in /etc/ld.so.conf.d .

I tested this: if I boot with 'systemd.confirm_spawn=true', which is kinda like the old 'interactive boot' thing - it asks you to confirm each process spawned during init - and say 'no' when it asks if it should run 'ldconfig -X', then things work correctly (network comes up, and I can run dhclient manually).

So I think at least as long as ldconfig.service is active for anaconda, we need to have /etc/ld.so.conf in the installer environment (i.e. lorax shouldn't strip it).

Comment 6 Brian Lane 2015-03-20 20:35:18 UTC
I think it makes more sense to just turn ldconfig.service off. This matches the behavior of the live images and of lorax master.
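
This report does not say how lorax-22.7 turns the service off. One general way to keep a systemd unit from running in an image tree is to mask it, which amounts to linking the unit name under etc/systemd/system to /dev/null (this is what 'systemctl mask' does). The sketch below illustrates that technique only; the function name mask_unit and its arguments are hypothetical, and it should not be read as the actual lorax change.

```shell
# Hypothetical helper: mask UNIT inside the tree at ROOT by linking it to
# /dev/null, the same result 'systemctl mask' produces.
mask_unit() {
    root="$1"
    unit="$2"
    mkdir -p "$root/etc/systemd/system"
    ln -sf /dev/null "$root/etc/systemd/system/$unit"
}
```

For example, mask_unit /path/to/image-root ldconfig.service would stop 'ldconfig -X' from running at boot in that tree (the path is a placeholder).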

Comment 7 Fedora Update System 2015-03-21 01:24:09 UTC
lorax-22.7-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/lorax-22.7-1.fc22

Comment 8 Fedora Update System 2015-03-22 04:34:24 UTC
Package lorax-22.7-1.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing lorax-22.7-1.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-4398/lorax-22.7-1.fc22
then log in and leave karma (feedback).

Comment 9 Tomáš Hozza 2015-03-23 06:58:06 UTC
(In reply to awilliam from comment #4)
> CCing thozza for any needed input on what's the deal with all the moving
> stuff around in bind.

I added bind99 since ISC DHCP is not able to function correctly when built against BIND 9.10.x. bind99 installs libraries and headers into a different location so they don't conflict with the original bind package.

This is due to Bug #1184173 and Bug #1199428

Comment 10 Adam Williamson 2015-03-23 17:06:09 UTC
Discussed at 2015-03-23 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-23/f22-blocker-review.2015-03-23-16.02.log.txt . Accepted as a Beta blocker per criterion cited in #c0.

Comment 11 Fedora Update System 2015-03-29 04:44:15 UTC
lorax-22.7-1.fc22 has been pushed to the Fedora 22 stable repository.  If problems still persist, please make note of it in this bug report.