Bug 1151665

Summary:

NM repeatedly updates default route, causing GNetworkMonitor using apps to go crazy

Product:

[Fedora] Fedora

Reporter:

Changming Sun <me>

Component:

NetworkManager

Assignee:

Dan Winship <danw>

Status:

CLOSED EOL

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

low

Docs Contact:

Priority:

unspecified

Version:

CC:

danw, dcbw, fedora, mbarnes, mclasen, mcrha, mikhail.v.gavrilov, psimerda, redhat-bugzilla

Target Milestone:

---

Keywords:

Reopened

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

NetworkManager-0.9.10.0-12.git20140704.fc21

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2015-12-02 04:14:03 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
strace -f -s 256 evolution	none
radvdump	none
strace of /usr/libexec/evolution-calendar-factory	none
journalctl --since=2014-10-23 --no-pager -u NetworkManager.service	none
NetworkManager log	none
evolution-addressbook-factory strace with new NM	none
Network Manager log	none
Network Manager log after fixing issue	none
NetworkManager-0.9.10.0-10 log	none
Network Manager log (0.9.10.0-12)	none
strace of evolution-addressbook-factory	none
screenshot of htop	none
screenshot of htop	none
Network Manager log	none
strace of evolution-source-registry	none
strace of evolution-addressbook-factory	none
Network Manager log	none
Network Manager log	none
router ipv6 settings	none

Description Changming Sun 2014-10-11 02:00:26 UTC

Description of problem:
Evolution-calendar-factory uses 100% cpu for a very long time (more than 1 hour)
when I added a google account.

the backtrace of it is :

Thread 18 (Thread 0x7fd91789f700 (LWP 2121)):
#0  0x00007fd92132c1dd in poll () at /lib64/libc.so.6
#1  0x00007fd91f636e34 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f636f4c in g_main_context_iteration () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f636f89 in glib_worker_main () at /lib64/libglib-2.0.so.0
#4  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 17 (Thread 0x7fd8fdbdd700 (LWP 2176)):
#0  0x00007fd92132c1dd in poll () at /lib64/libc.so.6
#1  0x00007fd91f636e34 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f636f4c in g_main_context_iteration () at /lib64/libglib-2.0.so.0
#3  0x00007fd8fdbe524d in dconf_gdbus_worker_thread () at /usr/lib64/gio/modules/libdconfsettings.so
#4  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 16 (Thread 0x7fd8fd3dc700 (LWP 2177)):
#0  0x00007fd92132c1dd in poll () at /lib64/libc.so.6
#1  0x00007fd91f636e34 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f6371c2 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x00007fd926cb9a34 in source_registry_object_manager_thread () at /lib64/libedataserver-1.2.so.18
#4  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 15 (Thread 0x7fd8fcbdb700 (LWP 2178)):
#0  0x00007fd92132c1dd in poll () at /lib64/libc.so.6
#1  0x00007fd91f636e34 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f6371c2 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x00007fd91fc51d56 in gdbus_shared_thread_func () at /lib64/libgio-2.0.so.0
#4  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 14 (Thread 0x7fd8ee1fb700 (LWP 2196)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67b2bc in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f60bc4b in g_async_queue_pop_intern_unlocked () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65e115 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#5  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#6  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 13 (Thread 0x7fd8e7fff700 (LWP 2619)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 12 (Thread 0x7fd8e6ffd700 (LWP 2624)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67b2bc in g_cond_wait () at /lib64/libglib-2.0.so.0
#2  0x00007fd91fbf9933 in g_task_run_in_thread_sync () at /lib64/libgio-2.0.so.0
#3  0x00007fd8ed5dfbdb in g_tls_connection_gnutls_handshake () at /usr/lib64/gio/modules/libgiognutls.so
#4  0x00007fd91ff3d0e6 in soup_connection_connect_sync () at /lib64/libsoup-2.4.so.1
#5  0x00007fd91ff5fc45 in soup_session_process_queue_item () at /lib64/libsoup-2.4.so.1
#6  0x00007fd91ff6037e in soup_session_real_send_message () at /lib64/libsoup-2.4.so.1
#7  0x00007fd916c8248f in send_and_handle_redirection () at /usr/lib64/evolution-data-server/calendar-backends/libecalbackendcaldav.so
#8  0x00007fd916c8271d in open_calendar_wrapper () at /usr/lib64/evolution-data-server/calendar-backends/libecalbackendcaldav.so
#9  0x00007fd916c885c5 in caldav_do_open () at /usr/lib64/evolution-data-server/calendar-backends/libecalbackendcaldav.so
#10 0x00007fd92740e666 in cal_backend_open () at /lib64/libedata-cal-1.2.so.23
#11 0x00007fd927405993 in cal_backend_open_thread () at /lib64/libedata-cal-1.2.so.23
#12 0x00007fd927403752 in cal_backend_dispatch_thread () at /lib64/libedata-cal-1.2.so.23
#13 0x00007fd91f65e0d8 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#14 0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#15 0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#16 0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 11 (Thread 0x7fd8e67fc700 (LWP 2625)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd916c886e3 in caldav_synch_slave_loop () at /usr/lib64/evolution-data-server/calendar-backends/libecalbackendcaldav.so
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 10 (Thread 0x7fd8e4ff9700 (LWP 2628)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 9 (Thread 0x7fd8cffff700 (LWP 2632)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 8 (Thread 0x7fd8cf7fe700 (LWP 2633)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 7 (Thread 0x7fd8ceffd700 (LWP 2634)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 6 (Thread 0x7fd8ce7fc700 (LWP 2635)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 5 (Thread 0x7fd8cdffb700 (LWP 2636)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 4 (Thread 0x7fd8cd7fa700 (LWP 2637)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 3 (Thread 0x7fd8ccff9700 (LWP 2638)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 2 (Thread 0x7fd8abfff700 (LWP 2639)):
#0  0x00007fd921331e89 in syscall () at /lib64/libc.so.6
#1  0x00007fd91f67a799 in g_mutex_lock_slowpath () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f65e0e1 in g_thread_pool_thread_proxy () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f65d745 in g_thread_proxy () at /lib64/libglib-2.0.so.0
#4  0x00007fd92077a52a in start_thread () at /lib64/libpthread.so.0
#5  0x00007fd92133777d in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7fd927818840 (LWP 2096)):
#0  0x00007fd91f632574 in g_list_sort_real () at /lib64/libglib-2.0.so.0
#1  0x00007fd91f63259c in g_list_sort_real () at /lib64/libglib-2.0.so.0
#2  0x00007fd91f6325ad in g_list_sort_real () at /lib64/libglib-2.0.so.0
#3  0x00007fd91f6325ad in g_list_sort_real () at /lib64/libglib-2.0.so.0
#4  0x00007fd91f647376 in g_queue_sort () at /lib64/libglib-2.0.so.0
#5  0x00007fd91f60c47a in g_async_queue_sort_unlocked () at /lib64/libglib-2.0.so.0
#6  0x00007fd91f65e9ee in g_thread_pool_set_sort_function () at /lib64/libglib-2.0.so.0
#7  0x00007fd91fbf9be2 in task_thread_cancelled () at /lib64/libgio-2.0.so.0
#8  0x00007fd91f935f64 in _g_closure_invoke_va () at /lib64/libgobject-2.0.so.0
#9  0x00007fd91f94fb70 in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#10 0x00007fd91f9503bf in g_signal_emit () at /lib64/libgobject-2.0.so.0
#11 0x00007fd91fbb0318 in g_cancellable_cancel () at /lib64/libgio-2.0.so.0
#12 0x00007fd926f48af6 in backend_update_online_state_idle_cb () at /lib64/libebackend-1.2.so.7
#13 0x00007fd91f636afb in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#14 0x00007fd91f636e98 in g_main_context_iterate.isra () at /lib64/libglib-2.0.so.0
#15 0x00007fd91f6371c2 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#16 0x00007fd926f4c342 in dbus_server_run_server () at /lib64/libebackend-1.2.so.7
#17 0x00007fd91e4cfd60 in ffi_call_unix64 () at /lib64/libffi.so.6
#18 0x00007fd91e4cf7d1 in ffi_call () at /lib64/libffi.so.6
#19 0x00007fd91f9369ac in g_cclosure_marshal_generic_va () at /lib64/libgobject-2.0.so.0
#20 0x00007fd91f935f64 in _g_closure_invoke_va () at /lib64/libgobject-2.0.so.0
#21 0x00007fd91f94fb70 in g_signal_emit_valist () at /lib64/libgobject-2.0.so.0
#22 0x00007fd91f9503bf in g_signal_emit () at /lib64/libgobject-2.0.so.0
#23 0x00007fd926f4c61c in e_dbus_server_run () at /lib64/libebackend-1.2.so.7
#24 0x0000000000400e9c in main ()



Version-Release number of selected component (if applicable):


How reproducible:
Didn't know

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Milan Crha 2014-10-13 10:54:21 UTC

Thanks for a bug report. It seems to me (from the backtrace), that the Google account was trying to open its associated Calendar, when a network change had been received, which resulted in a notification of an online state change being discovered by the evolution-data-server code and propagated further, through some signals into GLib. I guess from the backtrace that the GLib got stuck in the sort function.

Could you provide your glib2 version, please? (rpm -q glib2)

Dan (or Philip), does this sound familiar to anyone of you, please? I found an upstream bug report []1, but it doesn't fit the above backtrace, thus it's possible the change there is unrelated to this issue.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=736806

Comment 2 Milan Crha 2014-10-13 11:01:44 UTC

Here are backtraces with line numbers: bug #1148247 comment #5, see the backtraces for evolution-addressbook-factory.

Comment 3 Mikhail 2014-10-13 11:13:29 UTC

(In reply to Milan Crha from comment #1)
> Could you provide your glib2 version, please? (rpm -q glib2)
$ rpm -q glib2
glib2-2.42.0-1.fc21.x86_64
glib2-2.42.0-1.fc21.i686

Comment 4 Changming Sun 2014-10-14 01:38:21 UTC

rpm -q glib2
glib2-2.42.0-1.fc21.x86_64

Comment 5 Milan Crha 2014-10-14 06:35:53 UTC

Thanks for the update. Let's move this to glib2 for further investigation.

Comment 6 Milan Crha 2014-10-15 07:27:15 UTC

*** Bug 1152692 has been marked as a duplicate of this bug. ***

Comment 7 Milan Crha 2014-10-15 07:31:12 UTC

Dan, could you have a look on this, please? There are happening more same issues in GTask all exhibiting the same symptoms.

Comment 8 Dan Winship 2014-10-20 13:04:29 UTC

Hm... so this is the same as https://bugzilla.gnome.org/show_bug.cgi?id=735325. The strace output there showed that the kernel was apparently spamming e-d-s with routing change notifications for some reason, but the reporter stopped being able to reproduce the bug before we could figure out why.

It would be useful to get "strace -f -s 256" output of e-d-s while it's stuck in the loop. (The loop is pretty tight, so you only need like 1/10s worth of output.) Also, it might be useful to run NetworkManager with debug logging (to see what it's doing/noticing about the routing). Add this to /etc/NetworkManager/NetworkManager.conf:

  [logging]
  level=DEBUG
  domains=DEFAULT

and restart NM.

Regardless, I guess I should also add some rate limiting to GNetworkMonitor...

Comment 9 Mikhail 2014-10-20 13:30:38 UTC

Created attachment 948558 [details]
strace -f -s 256 evolution

Comment 10 Dan Winship 2014-10-21 14:42:48 UTC

OK, so the complete netlink message is:

  6c 00 00 00   [ length 108 ]
  18 00 00 00   [ type RTM_NEWROUTE, flags 0 ]
  xx xx xx xx   [ serial # ]
  yy yy yy yy   [ pid ?? ]

  0a 00 00 00   [ AF_INET6, src len 0, dst len 0, TOS 0 ]
  fe 04 00 01   [ RT_TABLE_MAIN, RTPROT_STATIC, RT_SCOPE_UNIVERSE, RTN_UNICAST]
  00 00 00 00   [ flags 0 ]

  08 00 0f 00   [ attr of length 8, type RTA_TABLE ]
  fe 00 00 00   [ RT_TABLE_MAIN ]

  14 00 05 00   [ attr of length 20, type RTA_GATEWAY ]
  fe 80 00 00   [ fe80::16:da:e9:ff:fe:f8:a8:f7 ]
  00 00 00 00
  16 da e9 ff
  fe f8 a8 f7

  08 00 04 00   [ attr of length 8, type RTA_OIF ]
  02 00 00 00   [ device ifindex 2 ]

  08 00 06 00   [ attr of length 8, type RTA_PRIORITY ]
  00 04 00 00   [ route metric 1024 ]

  24 00 0c 00   [ attr of length 36, type RTA_CACHEINFO ]
  00 00 00 00
                [ truncated? ]

The one thing that this trace tells us that the truncated one in the gnome bug didn't is that the "new" route is actually the same one every time.

The other thing in this trace is that there are also thousands of

[pid 31632] recvmsg(10, {msg_name(0)=NULL, msg_iov(1)=[{"l\4\1\1\210\0\0\0\376s\0\0\233\0\0\0\1\1o\0+\0\0\0/org/freedesktop/NetworkManager/IP6Config/3\0\0\0\0\0\2\1s\0(\0\0\0org.freedesktop.NetworkManager.IP6Config\0\0\0\0\0\0\0\0\3\1s\0\21\0\0\0PropertiesChanged\0\0\0\0\0\0\0\10\1g\0\5a{sv}\0\0\0\0\0\0\200\0\0\0\0\0\0\0\6\0\0\0Routes\0\ta(ayuayu)\0\0\0`\0\0\0\0\0\0\0\20\0\0\0*\2\1\310\1\1\f\204\0\0\0\0\0\0\0\0@\0\0\0\20\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 576

meaning that NM is also receiving and reacting to the same flood of RTM_NEWLINK messages.


So... is there something "weird" about your network configuration? Do you have any weird (physical or virtual) network interfaces? Are you running any weird network-related daemons? Weird sysctl settings? Etc...

(The NM DEBUG logs mentioned before might still be useful too.)

Actually, another thing: install radvd, and then (as root) run "radvdump", and attach the output.

Comment 11 Dan Winship 2014-10-21 14:48:40 UTC

Pretty sure this is NM's fault.

Comment 12 Dan Winship 2014-10-21 15:37:09 UTC

Mikhail (or anyone else who can reproduce this reliably): can you try installing the NetworkManager package at http://koji.fedoraproject.org/koji/taskinfo?taskID=7926164 (and then restart NetworkManager).

I expect that just restarting NM will make the CPU usage go away temporarily, but with old NM it would come back eventually, and with the new it won't.

Comment 13 Mikhail 2014-10-22 03:07:55 UTC

Created attachment 949234 [details]
radvdump

Comment 14 Changming Sun 2014-10-22 08:41:04 UTC

Created attachment 949295 [details]
strace of /usr/libexec/evolution-calendar-factory

Comment 15 Mikhail 2014-10-23 01:52:21 UTC

Created attachment 949607 [details]
journalctl --since=2014-10-23 --no-pager -u NetworkManager.service

Comment 16 Mikhail 2014-10-24 11:26:13 UTC

Created attachment 950330 [details]
NetworkManager log

Comment 17 Mikhail 2014-10-24 11:27:01 UTC

Anything else?

Comment 18 Milan Crha 2014-10-24 12:11:28 UTC

Just a side note that evolution-data-server was issuing a server availability checks after each single "network-changed" signal, which can come in a quick rate in junks like 5+ in a row in certain cases. That caused unnecessary performance requirements and so on. I tried to address this upstream at [1]. The only difference is that it's not fully related to this, I think, because the backtrace as such is a new thing, not an old.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=712392

Comment 19 Dan Winship 2014-10-24 12:55:12 UTC

(In reply to Mikhail from comment #17)
> Anything else?

Yes:

(In reply to Dan Winship from comment #12)
> Mikhail (or anyone else who can reproduce this reliably): can you try
> installing the NetworkManager package at
> http://koji.fedoraproject.org/koji/taskinfo?taskID=7926164 (and then restart
> NetworkManager).
> 
> I expect that just restarting NM will make the CPU usage go away
> temporarily, but with old NM it would come back eventually, and with the new
> it won't.

Comment 20 Mikhail 2014-10-24 13:42:21 UTC

With this NM
http://koji.fedoraproject.org/koji/taskinfo?taskID=7926164 
issue still reproducible:


$ rpm -qa | grep NetworkManager
NetworkManager-wifi-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-debuginfo-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-pptp-gnome-0.9.8.2-6.fc21.x86_64
NetworkManager-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-glib-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-vpnc-gnome-0.9.9.0-6.git20140428.fc21.x86_64
NetworkManager-bluetooth-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-pptp-0.9.8.2-6.fc21.x86_64
NetworkManager-wwan-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-openvpn-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-config-connectivity-fedora-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-openvpn-gnome-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-adsl-0.9.10.0-8.1danw.git20140704.fc21.x86_64
NetworkManager-openconnect-0.9.8.4-4.fc21.x86_64
NetworkManager-vpnc-0.9.9.0-6.git20140428.fc21.x86_64

Comment 21 Mikhail 2014-10-24 13:45:19 UTC

Created attachment 950385 [details]
evolution-addressbook-factory strace with new NM

Comment 22 Mikhail 2014-10-24 13:47:28 UTC

Created attachment 950386 [details]
Network Manager log

Comment 23 Dan Winship 2014-10-24 18:43:06 UTC

OK... new test package with a different fix:

http://koji.fedoraproject.org/koji/taskinfo?taskID=7939674

BTW, are you getting constant high CPU usage, or just repeated bursts of high CPU usage? The NM log shows a second or two of high activity, followed by a 30-second gap, then another burst of activity, a 30-second gap, etc. (If evolution-data-server is constantly spinning rather than only sporadically spinning, then there may be a GNetworkMonitor-specific bug too...)

Comment 24 Mikhail 2014-10-26 18:38:37 UTC

(In reply to Dan Winship from comment #23)
> OK... new test package with a different fix:
> 
> http://koji.fedoraproject.org/koji/taskinfo?taskID=7939674

Thanks, with this update I don't see issue.

> BTW, are you getting constant high CPU usage, or just repeated bursts of
> high CPU usage?

If this occurs evolution-addressbook-factory process constant utilize CPU core.

> The NM log shows a second or two of high activity, followed
> by a 30-second gap, then another burst of activity, a 30-second gap, etc.
> (If evolution-data-server is constantly spinning rather than only
> sporadically spinning, then there may be a GNetworkMonitor-specific bug
> too...)

Subjectively, two days of work all right. Why is subjective because there is no clear instructions for reproducing problem. But NetworkManager log are still very huge.

Comment 25 Mikhail 2014-10-26 18:39:24 UTC

Created attachment 950824 [details]
Network Manager log after fixing issue

Comment 26 Milan Crha 2014-10-27 07:48:00 UTC

(In reply to Dan Winship from comment #23)
> (If evolution-data-server is constantly spinning rather than only
> sporadically spinning, then there may be a GNetworkMonitor-specific bug
> too...)

Or the way evolution used to react on the GNetworkMonitor "network-changed" signal. See comment #18.

Comment 27 Mikhail 2014-10-27 14:29:49 UTC

Hi, today come NetworkManager-0.9.10.0-10.git20140704.fc21.x86_64 and this issue returned :(

Comment 28 Mikhail 2014-10-27 14:30:32 UTC

Created attachment 951025 [details]
NetworkManager-0.9.10.0-10 log

Comment 29 Dan Winship 2014-10-27 18:42:36 UTC

(In reply to Mikhail from comment #24)
> Thanks, with this update I don't see issue.

Awesome

> But NetworkManager log are still very huge.

Oh, right, because you still have debugging logging enabled. You'll want to remove the [logging] section from /etc/NetworkManager/NetworkManager.conf (and restart NM)

(In reply to Mikhail from comment #27)
> Hi, today come NetworkManager-0.9.10.0-10.git20140704.fc21.x86_64 and this
> issue returned :(

Yes, the package I gave you was just a test build, and after I made it, someone else put out a new release build of NM, which has a higher version, but doesn't contain the test bug fix. A new, fixed package should be coming out soon. Until then, you can just downgrade to the build in koji.

Comment 30 Mikhail 2014-10-28 03:21:53 UTC

(In reply to Dan Winship from comment #29)
> Oh, right, because you still have debugging logging enabled. You'll want to
> remove the [logging] section from /etc/NetworkManager/NetworkManager.conf
> (and restart NM)

I just wanted to know, it is normal that NetworkManager always something to do.

Comment 31 Fedora Update System 2014-10-29 14:07:21 UTC

NetworkManager-0.9.10.0-12.git20140704.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/NetworkManager-0.9.10.0-12.git20140704.fc21

Comment 32 Fedora Update System 2014-10-31 01:26:12 UTC

Package NetworkManager-0.9.10.0-12.git20140704.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.10.0-12.git20140704.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-13942/NetworkManager-0.9.10.0-12.git20140704.fc21
then log in and leave karma (feedback).

Comment 33 Mikhail 2014-10-31 04:20:49 UTC

And it occurs again with new NetworkManager
$ rpm -qa | grep NetworkManager
NetworkManager-bluetooth-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-glib-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-pptp-gnome-0.9.8.2-6.fc21.x86_64
NetworkManager-vpnc-gnome-0.9.9.0-6.git20140428.fc21.x86_64
NetworkManager-pptp-0.9.8.2-6.fc21.x86_64
NetworkManager-wifi-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-config-connectivity-fedora-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-openvpn-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-openvpn-gnome-0.9.9.0-3.git20140128.fc21.x86_64
NetworkManager-debuginfo-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-adsl-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-wwan-0.9.10.0-12.git20140704.fc21.x86_64
NetworkManager-openconnect-0.9.8.4-4.fc21.x86_64
NetworkManager-vpnc-0.9.9.0-6.git20140428.fc21.x86_64

Comment 34 Mikhail 2014-10-31 04:22:37 UTC

Created attachment 952386 [details]
Network Manager log (0.9.10.0-12)

Comment 35 Mikhail 2014-10-31 04:25:35 UTC

Created attachment 952387 [details]
strace of evolution-addressbook-factory

Comment 36 Mikhail 2014-10-31 04:26:45 UTC

Created attachment 952388 [details]
screenshot of htop

Comment 37 Mikhail 2014-11-02 15:01:10 UTC

Created attachment 952921 [details]
screenshot of htop

Comment 38 Mikhail 2014-11-02 15:01:48 UTC

Created attachment 952922 [details]
Network Manager log

Comment 39 Mikhail 2014-11-02 15:02:35 UTC

Created attachment 952923 [details]
strace of evolution-source-registry

Comment 40 Mikhail 2014-11-02 15:03:04 UTC

Created attachment 952924 [details]
strace of evolution-addressbook-factory

Comment 41 Christian Stadelmann 2014-11-04 09:37:58 UTC

Same here. NetworkManager update to NetworkManager-0.9.10.0-12.git20140704.fc21 does not fix the 100% CPU issue in Evolution.

Comment 42 Milan Crha 2014-11-05 07:49:57 UTC

Upstream bug report for the task_thread_cancelled():
https://bugzilla.gnome.org/show_bug.cgi?id=738545

Comment 43 Dan Winship 2014-11-05 19:52:19 UTC

I've put a new test build up at http://koji.fedoraproject.org/koji/taskinfo?taskID=8048589. It doesn't actually have any new bug fixes, but adds a little bit more debugging in the hopes that I can figure out where this loop is coming from...

(Christian, if you want to try it, see comment 8 about enabling NM debug logging first.)

Comment 44 Mikhail 2014-11-05 20:55:45 UTC

Created attachment 954180 [details]
Network Manager log

Comment 45 Dan Winship 2014-11-05 22:35:03 UTC

new build at http://koji.fedoraproject.org/koji/taskinfo?taskID=8051195. This includes both more debugging, and a potential fix

Comment 46 Mikhail 2014-11-06 03:43:10 UTC

(In reply to Dan Winship from comment #45)
> new build at http://koji.fedoraproject.org/koji/taskinfo?taskID=8051195.
> This includes both more debugging, and a potential fix

I am not sure but seems already all ok with this update :)

Comment 47 Mikhail 2014-11-06 03:45:07 UTC

Created attachment 954301 [details]
Network Manager log

Comment 48 Mikhail 2014-11-06 03:46:18 UTC

And NetworkManager log become much smaller.

Comment 49 Fedora Update System 2014-11-10 06:00:43 UTC

NetworkManager-0.9.10.0-12.git20140704.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 50 Dan Williams 2015-04-07 22:40:23 UTC

Mikhail, do you know if you have more than one IPv6 router on your LAN?

Comment 51 Mikhail 2015-04-08 07:20:03 UTC

There should be no more than one physical router.

[mikhail@localhost ~]$ ping6 -I enp2s0 FF02::2
PING FF02::2(ff02::2) from fe80::96de:80ff:fe6b:dd24 enp2s0: 56 data bytes
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=1 ttl=64 time=4.73 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=2 ttl=64 time=0.466 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=3 ttl=64 time=0.389 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=4 ttl=64 time=0.456 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=5 ttl=64 time=0.372 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=6 ttl=64 time=0.387 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=7 ttl=64 time=0.456 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=8 ttl=64 time=0.417 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=9 ttl=64 time=0.437 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=10 ttl=64 time=0.402 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=11 ttl=64 time=0.461 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=12 ttl=64 time=0.408 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=13 ttl=64 time=0.493 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=14 ttl=64 time=0.368 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=15 ttl=64 time=0.458 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=16 ttl=64 time=0.440 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=17 ttl=64 time=0.419 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=18 ttl=64 time=0.452 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=19 ttl=64 time=0.502 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=20 ttl=64 time=0.458 ms
^C
--- FF02::2 ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 19001ms
rtt min/avg/max/mdev = 0.368/0.648/4.738/0.939 ms
[mikhail@localhost ~]$ ping6 -I enp2s0 FF02::1
PING FF02::1(ff02::1) from fe80::96de:80ff:fe6b:dd24 enp2s0: 56 data bytes
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=1 ttl=64 time=7.42 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=2 ttl=64 time=0.465 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=3 ttl=64 time=0.499 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=4 ttl=64 time=0.535 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=5 ttl=64 time=0.476 ms
64 bytes from fe80::faa9:d0ff:fe71:91ae: icmp_seq=5 ttl=64 time=48.0 ms (DUP!)
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=6 ttl=64 time=0.393 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=7 ttl=64 time=0.344 ms
64 bytes from fe80::faa9:d0ff:fe71:91ae: icmp_seq=7 ttl=64 time=92.0 ms (DUP!)
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=8 ttl=64 time=0.479 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=9 ttl=64 time=0.376 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=10 ttl=64 time=0.412 ms
64 bytes from fe80::faa9:d0ff:fe71:91ae: icmp_seq=10 ttl=64 time=62.0 ms (DUP!)
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=11 ttl=64 time=0.471 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=12 ttl=64 time=0.385 ms
64 bytes from fe80::faa9:d0ff:fe71:91ae: icmp_seq=12 ttl=64 time=107 ms (DUP!)
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=13 ttl=64 time=0.493 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=14 ttl=64 time=0.423 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=15 ttl=64 time=0.383 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=16 ttl=64 time=0.355 ms
64 bytes from fe80::faa9:d0ff:fe71:91ae: icmp_seq=16 ttl=64 time=97.1 ms (DUP!)
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=17 ttl=64 time=0.365 ms
64 bytes from fe80::16da:e9ff:fef8:a8f7: icmp_seq=18 ttl=64 time=0.311 ms
^C
--- FF02::1 ping statistics ---
18 packets transmitted, 18 received, +5 duplicates, 0% packet loss, time 17006ms
rtt min/avg/max/mdev = 0.311/18.305/107.269/34.833 ms

Comment 52 Dan Williams 2015-04-09 23:48:04 UTC

I mean, more than one machine sending out IPv6 Router Advertisement packets.  I have isolated a case where, if more than one router sends RAs, with one router including DNS servers/domains, and the other not, and the first router either stops responding or stops including servers/domains, NM could get into a tight loop sending a router solicitation, which could trigger the behavior you see.

Do you know which machine is sending the router advertisements, and do you control it?

Comment 53 Mikhail 2015-04-12 21:23:09 UTC

(In reply to Dan Williams from comment #52)
> I mean, more than one machine sending out IPv6 Router Advertisement packets.

In my LAN several computers with Linux Fedora, and phone with Android, but when I created this strace only one computer was worked. Can phone sending out IPv6 Router Advertisement packets I don't know.

> Do you know which machine is sending the router advertisements, and do you
> control it?

I can show my router settings. As I described above it can be phone, but I am not sure.

Comment 54 Mikhail 2015-04-12 21:23:37 UTC

Created attachment 1013767 [details]
router ipv6 settings

Comment 55 Fedora End Of Life 2015-11-04 15:36:30 UTC

This message is a reminder that Fedora 21 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 21. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '21'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 21 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged  change the 'version' to a later Fedora 
version prior this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 56 Fedora End Of Life 2015-12-02 04:14:08 UTC

Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.