Bug 1183807 - network spoke listed as not connected despite having assigned IP and hostname
Summary: network spoke listed as not connected despite having assigned IP and hostname
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 22
Hardware: arm
OS: Linux
unspecified
unspecified
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: ARMTracker F22BetaBlocker
TreeView+ depends on / blocked
 
Reported: 2015-01-19 21:20 UTC by Paul Whalen
Modified: 2015-05-27 13:16 UTC (History)
17 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-04-07 16:04:53 UTC


Attachments (Terms of Use)
anaconda log (15.98 KB, text/plain)
2015-01-19 21:20 UTC, Paul Whalen
no flags Details
program log (43.99 KB, text/plain)
2015-01-19 21:21 UTC, Paul Whalen
no flags Details
ifcfg log (1.50 KB, text/plain)
2015-01-19 21:22 UTC, Paul Whalen
no flags Details
journalctl (122.50 KB, text/plain)
2015-03-03 14:07 UTC, Paul Whalen
no flags Details
syslog (58.02 KB, text/plain)
2015-03-11 13:34 UTC, Paul Whalen
no flags Details
anaconda (4.74 KB, text/plain)
2015-03-11 13:38 UTC, Paul Whalen
no flags Details
dnf (3.96 KB, text/plain)
2015-03-11 13:38 UTC, Paul Whalen
no flags Details
ifcfg (1.59 KB, text/plain)
2015-03-11 13:39 UTC, Paul Whalen
no flags Details
packaging (832 bytes, text/plain)
2015-03-11 13:39 UTC, Paul Whalen
no flags Details
program log (14.41 KB, text/plain)
2015-03-11 13:40 UTC, Paul Whalen
no flags Details
storage log (21.94 KB, text/plain)
2015-03-11 13:40 UTC, Paul Whalen
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Bugzilla 1225479 None None None Never

Internal Links: 1225479

Description Paul Whalen 2015-01-19 21:20:17 UTC
Created attachment 981629 [details]
anaconda log

Description of problem:

Doing text or VNC installations on armhfp, network spoke is listed as not connected, when selected the network is configured as expected. Repo and software selections are successfully verified.  

Version-Release number of selected component (if applicable):

anaconda-22.15-1.fc22


Additional info:
When doing text installation I can select:

"8) [ ] Apply configuration in installer" 

and complete the installation (it is not selected by default).

Comment 1 Paul Whalen 2015-01-19 21:21:18 UTC
Created attachment 981631 [details]
program log

Comment 2 Paul Whalen 2015-01-19 21:22:18 UTC
Created attachment 981633 [details]
ifcfg log

Comment 3 Paul Whalen 2015-02-10 17:43:00 UTC
Still seeing this with anaconda 22.18-1 .

Comment 4 Paul Whalen 2015-03-02 15:20:26 UTC
This still occurs on F22 Alpha TC7 (anaconda 22.20.1-1) and prevents VNC installations from completing. In text I am able to use the "Apply configuration in installer" option and complete the installation.

Proposing as a blocker:

"When using a dedicated installer image, the installer must be able to complete an installation using the text, graphical and VNC installation interfaces."

Comment 5 Petr Schindler 2015-03-02 17:35:40 UTC
Discussed at today's blocker review meeting [1].

This bug was accepted as Alpha Blocker - This but is a clear violation of the Alpha criterion: "When using a dedicated installer image, the installer must be able to complete an installation using the text, graphical and VNC installation interfaces.

http://meetbot.fedoraproject.org/fedora-blocker-review/2015-03-02/

Comment 6 Adam Williamson 2015-03-02 22:37:50 UTC
bcl says he recalls dmarlin mentioning something about a bug with interface renaming on ARM; that could well be the issue here. Paul, could you attach syslog / journalctl from an affected run? That may help.

CCing dmarlin.

Comment 7 Paul Whalen 2015-03-03 14:07:54 UTC
Created attachment 997553 [details]
journalctl

Comment 8 Paul Whalen 2015-03-03 14:08:19 UTC
In journalctl:
<audit-1400> avc:  denied  { getattr } for  pid=1672 comm="hwclock" path="/dev/pts/0" dev="devpts" ino=3 scontext=system_u:system_r:hwclock_t:s0 tcontext=system_u:object_r:devpts_t:s0 tclass=chr_file permissive=1

After selecting "8) [ ] Apply configuration in installer":

<audit-1300> arch=40000028 syscall=197 per=800000 success=yes exit=0 a0=0 a1=bedd1408 a2=bedd1408 a3=0 items=0 ppid=1441 pid=1672 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=4294967295 comm="hwclock" exe="/usr/sbin/hwclock" subj=system_u:system_r:hwclock_t:s0 key=(null)
Mar 03 13:56:46 bpi.friendly-neighbours.net unknown: <audit-1327> proctitle=6877636C6F636B002D2D6863746F737973002D2D757463
Mar 03 13:56:46 bpi.friendly-neighbours.net unknown[1672]: <audit-1111> pid=1672 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:hwclock_t:s0 msg='changing system time exe="/usr/sbin/hwclock" hostname=? addr=? terminal=pts/0 res=success'

The issue does not occur on ARM platforms that include an RTC (Calxeda Highbank).

Full log attached.

Comment 9 Bastien Nocera 2015-03-04 08:42:06 UTC
I actually encountered the same problem on Fedlet (i686), trying to use VNC because of graphics trouble. I didn't use NetworkManager but just ran "dhclient" to get online.

I would recommend that Anaconda be changed to use GNetworkMonitor to check for network connectivity, as it supports NetworkManager not being available:
https://developer.gnome.org/gio/unstable/GNetworkMonitor.html

Comment 10 Adam Williamson 2015-03-04 18:39:04 UTC
note: "permissive=1"

anaconda runs in permissive mode, so SELinux denials should never actually break anything; they're just being logged, not enforced.

Comment 11 Adam Williamson 2015-03-04 19:02:37 UTC
Some thoughts on blockeriness:

* Affected platforms are somewhat more likely to be deployed with disk images than the installer
* The bug can be worked around in text mode
* The criterion states "When using a dedicated installer image", but there are no installer images for ARM: you have to use PXE boot
* PXE boot itself is a Beta requirement, not Alpha: Beta criterion "It must be possible to install by booting the installation kernel directly (including via PXE) and correctly specifying a remote source for the installer itself."

So...I'm wondering if we really made the right call in making this a blocker. Put all that together and it sounds a lot more like, at worst, a Beta blocker.

Comment 12 Mike Ruckman 2015-03-04 19:13:33 UTC
+1 to Beta blocker per the previous comment.

Comment 13 Stephen Gallagher 2015-03-04 19:17:17 UTC
Yeah, +1 to move this to Beta Blocker.

Comment 14 Paul Whalen 2015-03-04 19:25:50 UTC
+1 to move to Beta. Most people will likely use our images for development boards, and installations work fine on platforms where the installer would be preferred (highbank, midway).

Comment 15 Adam Williamson 2015-03-04 19:32:41 UTC
bcl is also OK with moving to Beta, so we have a solid consensus for that: I'm moving it.

Comment 16 Brian Lane 2015-03-09 21:51:12 UTC
I don't see the normal amount of NetworkManager info in the logs. Nothing about activation for example. I do see a bit of dhclient, but not what I'd expect to see. Anaconda depends on NetworkManager, passing the buck for now.


If you switch to a shell what does 'nmcli d' show?

Comment 17 Jirka Klimes 2015-03-11 08:18:13 UTC
Yeah, there are no NM logs in the file. Would you get anaconda logs from /tmp, especially /tmp/syslog?
http://fedoraproject.org/wiki/Anaconda/Logging

Comment 18 Paul Whalen 2015-03-11 13:34:42 UTC
Created attachment 1000440 [details]
syslog

Comment 19 Paul Whalen 2015-03-11 13:35:34 UTC
[anaconda root@wandq tmp]# nmcli d
DEVICE  TYPE      STATE      CONNECTION 
eth0    ethernet  connected  eth0       
lo      loopback  unmanaged  --

Comment 20 Paul Whalen 2015-03-11 13:38:00 UTC
Created attachment 1000441 [details]
anaconda

Comment 21 Paul Whalen 2015-03-11 13:38:20 UTC
Created attachment 1000442 [details]
dnf

Comment 22 Paul Whalen 2015-03-11 13:39:04 UTC
Created attachment 1000443 [details]
ifcfg

Comment 23 Paul Whalen 2015-03-11 13:39:34 UTC
Created attachment 1000445 [details]
packaging

Comment 24 Paul Whalen 2015-03-11 13:40:03 UTC
Created attachment 1000446 [details]
program log

Comment 25 Paul Whalen 2015-03-11 13:40:54 UTC
Created attachment 1000447 [details]
storage log

Comment 26 Paul Whalen 2015-03-11 15:40:05 UTC
in the syslog:

0:03:40,525 WARNING chronyd: System clock wrong by 1426077285.544833 seconds, adjustment started
12:38:26,070 WARNING chronyd: System clock was stepped by 1426077285.544833 seconds
12:38:26,108 INFO NetworkManager: <info>  (eth0): DHCPv4 state changed bound -> expire
00:10:30,403 INFO NetworkManager: <info>  (eth0): canceled DHCP transaction, DHCP client pid 1440
00:10:30,403 INFO NetworkManager: <info>  (eth0): DHCPv4 state changed expire -> done
00:10:30,404 INFO NetworkManager: <info>  (eth0): device state change: activated -> failed (reason 'ip-config-expired') [100 120 6]
00:10:30,405 INFO NetworkManager: <info>  NetworkManager state is now CONNECTED_LOCAL
00:10:30,405 INFO NetworkManager: <info>  NetworkManager state is now DISCONNECTED

Restarting NetworkManager allowed me to complete the installation in VNC

Comment 27 Jirka Klimes 2015-03-23 08:04:54 UTC
Yes, the problem is a big clock jump. The question is why it happens in the first place.

Unfortunately, dhclient cannnot handle clock jumps gracefully and that's why the connection fails. You can see more information about dhclient time handling in bug 1093803.

As far as NM is concerned, it is possible to run with an internal DHCP client code instead (dhcp=internal in NetworkManager.conf). However, it still lacks some features of dhclient.

Comment 28 Peter Robinson 2015-03-23 09:16:43 UTC
(In reply to Jirka Klimes from comment #27)
> Yes, the problem is a big clock jump. The question is why it happens in the
> first place.

Because a lot of cheap ARM devices don't have their RTC clock backed by a battery. The initial date varies depending on the device but I've seen anything from unix epoch through to 1st Jan 2010.

> Unfortunately, dhclient cannnot handle clock jumps gracefully and that's why
> the connection fails. You can see more information about dhclient time
> handling in bug 1093803.

It's never been an issue prior to now and the issue with the lack of batteries has been a long time (since at least F-13 when I started with ARM)

> As far as NM is concerned, it is possible to run with an internal DHCP
> client code instead (dhcp=internal in NetworkManager.conf). However, it
> still lacks some features of dhclient.

Does the NM variant not have that issue? Is the feature delta documented somewhere?

Comment 29 Paul Whalen 2015-03-23 13:47:17 UTC
This is working again in F22 Beta TC 3/4 (text and vnc confirmed).

Comment 30 Dan Williams 2015-03-26 14:19:29 UTC
(In reply to Peter Robinson from comment #28)
> (In reply to Jirka Klimes from comment #27)
> > Yes, the problem is a big clock jump. The question is why it happens in the
> > first place.
> 
> Because a lot of cheap ARM devices don't have their RTC clock backed by a
> battery. The initial date varies depending on the device but I've seen
> anything from unix epoch through to 1st Jan 2010.
> 
> > Unfortunately, dhclient cannnot handle clock jumps gracefully and that's why
> > the connection fails. You can see more information about dhclient time
> > handling in bug 1093803.
> 
> It's never been an issue prior to now and the issue with the lack of
> batteries has been a long time (since at least F-13 when I started with ARM)
> 
> > As far as NM is concerned, it is possible to run with an internal DHCP
> > client code instead (dhcp=internal in NetworkManager.conf). However, it
> > still lacks some features of dhclient.
> 
> Does the NM variant not have that issue? Is the feature delta documented
> somewhere?

A more limited set of supported options and no DHCPv6 support (yet), to name the two largest ones.  This is documented in 'man NetworkManager.conf' in the section describing the "dns" option.

Comment 31 Dan Williams 2015-03-26 14:23:51 UTC
This seems like a dhclient bug instead of a NetworkManager bug, since dhclient is the program that is failing when the clock is advancing.

One thing to make sure of though is that the connection is 'autoconnect'.  I know the ifcfg file sets that, but ISTR anaconda plays around with it a bit.  Can you jump to the command-prompt VT or somehow get a shell and run "nmcli dev show eth0" and check the 'autoconnect' line?

Comment 32 Paul Whalen 2015-03-30 14:46:34 UTC
(In reply to Dan Williams from comment #31)
> This seems like a dhclient bug instead of a NetworkManager bug, since
> dhclient is the program that is failing when the clock is advancing.
> 
> One thing to make sure of though is that the connection is 'autoconnect'.  I
> know the ifcfg file sets that, but ISTR anaconda plays around with it a bit.
> Can you jump to the command-prompt VT or somehow get a shell and run "nmcli
> dev show eth0" and check the 'autoconnect' line?

GENERAL.DEVICE:                         eth0
GENERAL.TYPE:                           ethernet
GENERAL.HWADDR:                         00:1F:7B:B4:02:34
GENERAL.MTU:                            1500
GENERAL.STATE:                          100 (connected)
GENERAL.CONNECTION:                     eth0
GENERAL.CON-PATH:                       /org/freedesktop/NetworkManager/ActiveConnection/0
WIRED-PROPERTIES.CARRIER:               on
IP4.ADDRESS[1]:                         192.168.0.50/24
IP4.GATEWAY:                            192.168.0.1
IP4.DNS[1]:                             192.168.0.80
IP6.ADDRESS[1]:                         fe80::21f:7bff:feb4:234/64
IP6.GATEWAY:

Comment 33 Dan Williams 2015-04-02 18:58:18 UTC
My fault sorry, that shows the current device information but not the connection profile.  So take the name in GENERAL.CONNECTION (eth0) and then do:

nmcli con show eth0 | grep -i auto

Comment 34 Paul Whalen 2015-04-02 19:37:57 UTC
[anaconda root@bpi ~]# nmcli con show eth0 | grep -i auto
connection.autoconnect:                 yes
connection.autoconnect-priority:        0
802-3-ethernet.auto-negotiate:          yes
802-3-ethernet.mtu:                     auto
ipv4.method:                            auto
ipv4.ignore-auto-routes:                no
ipv4.ignore-auto-dns:                   no
ipv6.method:                            auto
ipv6.ignore-auto-routes:                no
ipv6.ignore-auto-dns:                   no


The above is from 22_Beta_TC2. TC3+ has been working.

Comment 35 Adam Williamson 2015-04-07 02:51:34 UTC
Dan, where do we stand on this? Go/No-Go is in three days :)

Comment 36 Paul Whalen 2015-04-07 16:04:53 UTC
This seems to be resolved since 22_Beta_TC3, confirmed working in TC8. Closing for now.


Note You need to log in before you can comment on or make changes to this bug.