Bug 1269086

Summary: regression in pxe install. very slow pxe installation (stall for an hour)
Product: Red Hat Enterprise Linux 7 Reporter: Levente Farkas <lfarkas>
Component: python-urlgrabberAssignee: Valentina Mukhamedzhanova <vmukhame>
Status: CLOSED DUPLICATE QA Contact: BaseOS QE - Apps <qe-baseos-apps>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1CC: lfarkas, mkolman, packaging-team-maint, pmancillas
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-14 12:08:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
anaconda lsof during wait
none
anaconda yum strace during wait
none
after install log file in tmp anaconda
none
after install log file in tmp ifcfg
none
after install log file in tmp lsof
none
after install log file in tmp packaging
none
after install log file in tmp program
none
after install log file in tmp storage
none
after install log file in tmp strace
none
after install log file in tmp X none

Description Levente Farkas 2015-10-06 09:34:00 UTC
we install rhel/centos system through pxe install using our internal gigabit connected ftp mirror site. it's working properly with rhel/centos 5 and 6 but when we try to install an os 7 system it's stall at the 
"Starting package installation process" 
and this steps wait VERY long time. this means 30-120 minutes! during this time it seems to nothing happened at the console (ctrl-alt-f2) only anaconda-yum process is running. 
there is no cpu load there is no network load. we try to debug this situation in the last month but we can't find any reason or solution to the problem.
while if we start a dvd install exactly the same machine, network etc. then this step only takes a few minutes or second. anyway the network is fast since the package installation is faster from the network then from a dvd, but this preparing stage has some kind of serious problem.

this is the process list while the system wait so long:
------------------------------------
root         1  1.6  0.1  57180  7072 ?        Ss   09:27   0:04 /usr/lib/systemd/systemd --switched-root --system --deserialize 24
root      1183  0.0  0.0  43016  2312 ?        Ss   09:28   0:00 /usr/lib/systemd/systemd-journald
root      1197  0.1  0.0  42572  2188 ?        Ss   09:28   0:00 /usr/lib/systemd/systemd-udevd
root      1202  0.0  0.1 304268  5328 ?        SLl  09:28   0:00 /sbin/multipathd
root      1263  0.0  0.0 327528  1704 ?        Ssl  09:28   0:00 /usr/sbin/rsyslogd -n
root      1270  0.0  0.0  11744  1888 tty2     Ss+  09:28   0:00 /bin/bash --login
dbus      1300  0.0  0.0  34916  1924 ?        Ssl  09:28   0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root      1302  0.0  0.2 415232 10480 ?        Ssl  09:28   0:00 /usr/sbin/NetworkManager --no-daemon
root      1310  0.0  0.4 103736 15672 ?        S    09:28   0:00  \_ /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp3s0.pid -lf /var/lib/NetworkManager/dhclient-dd603c24-c6c5-474d-af74-e0c378612354-enp3s0.lease -cf /var/lib/NetworkManager/dhclient-enp3s0.conf enp3s0
polkitd   1311  0.0  0.3 516084 11752 ?        Ssl  09:28   0:00 /usr/lib/polkit-1/polkitd --no-debug
root      1386  0.0  0.0  82488  3600 ?        Ss   09:28   0:00 /usr/sbin/sshd -D -f /etc/ssh/sshd_config.anaconda
root      1762  2.0  0.1  87120  4156 ?        Ss   09:31   0:00  \_ sshd: root@pts/5
root      1764  1.1  0.0  11740  1816 pts/5    Ss   09:31   0:00      \_ -bash
root      1771  0.0  0.0  19892  1372 pts/5    R+   09:32   0:00          \_ ps axufww
root      1389  0.1  0.1  24796  4228 ?        Ss   09:28   0:00 /usr/bin/tmux -u -f /usr/share/anaconda/tmux.conf start
root      1390 26.6  4.0 653540 153440 pts/0   Ssl+ 09:28   1:00  \_ /usr/bin/python /sbin/anaconda
root      1717 19.1  1.4 294072 55020 pts/0    S+   09:30   0:20  |   \_ /usr/bin/python /usr/libexec/anaconda/anaconda-yum --config /tmp/anaconda-yum.conf --tsfile /mnt/sysimage/anaconda-yum.yumtx --rpmlog /tmp/rpm-script.log --installroot /mnt/sysimage --release 7 --arch x86_64 --macro __dbi_htconfig hash nofsync %{__dbi_other} %{__dbi_perms} --macro __file_context_path /etc/selinux/targeted/contexts/files/file_contexts
root      1391  0.0  0.0 115348  1968 pts/1    Ss+  09:28   0:00  \_ bash --login
root      1392  0.0  0.0 107932   696 pts/2    Ss+  09:28   0:00  \_ tail -F /tmp/anaconda.log
root      1393  0.0  0.0 107932   696 pts/3    Ss+  09:28   0:00  \_ tail -F /tmp/storage.log
root      1394  0.0  0.0 107932   700 pts/4    Ss+  09:28   0:00  \_ tail -F /tmp/program.log
root      1402  0.0  0.0  22088  1072 tty1     Ss+  09:28   0:00 /usr/bin/tmux -u attach -t anaconda
root      1418  0.0  0.1 115484  4272 ?        Ss   09:28   0:00 /usr/sbin/lvmetad -f
root      1468  0.0  0.0  22664  2068 ?        Ss   09:28   0:00 /usr/lib/systemd/systemd-localed
chrony    1486  0.0  0.0  24728  1468 ?        S    09:28   0:00 /usr/sbin/chronyd -u chrony
------------------------------------

Comment 2 Brian Lane 2015-10-06 18:45:59 UTC
Please attach the logs from /tmp/*log asa individual text/plain attachments.

Also, when it gets stuck switch to the shell and attach the output from:

lsof -p <pid of anaconda-yum>
strace -o anaconda-yum.strace -p <pid of anaconda-yum>

You'll have to crtl-c the strace, let it gather for a minute or so.

Comment 3 Levente Farkas 2015-10-07 06:52:34 UTC
Created attachment 1080525 [details]
anaconda lsof during wait

Comment 4 Levente Farkas 2015-10-07 06:53:23 UTC
Created attachment 1080526 [details]
anaconda yum strace during wait

Comment 5 Levente Farkas 2015-10-07 06:54:05 UTC
Created attachment 1080527 [details]
after install log file in tmp anaconda

Comment 6 Levente Farkas 2015-10-07 06:54:22 UTC
Created attachment 1080528 [details]
after install log file in tmp ifcfg

Comment 7 Levente Farkas 2015-10-07 06:54:39 UTC
Created attachment 1080529 [details]
after install log file in tmp lsof

Comment 8 Levente Farkas 2015-10-07 06:55:15 UTC
Created attachment 1080530 [details]
after install log file in tmp packaging

Comment 9 Levente Farkas 2015-10-07 06:55:40 UTC
Created attachment 1080531 [details]
after install log file in tmp program

Comment 10 Levente Farkas 2015-10-07 06:55:58 UTC
Created attachment 1080532 [details]
after install log file in tmp storage

Comment 11 Levente Farkas 2015-10-07 06:56:16 UTC
Created attachment 1080533 [details]
after install log file in tmp strace

Comment 12 Levente Farkas 2015-10-07 06:56:34 UTC
Created attachment 1080534 [details]
after install log file in tmp X

Comment 13 Martin Kolman 2015-10-07 14:48:30 UTC
Looking at the packaging log the single longest operation seems to be "populate transaction set" which takes about 20 minutes:

05:11:41,337 DEBUG packaging:  populate transaction set
05:33:33,774 DEBUG packaging:  check transaction set
05:33:33,906 DEBUG packaging:  order transaction set
05:33:33,922 INFO packaging:  running transaction

That would certainly be something to investigate - if it really needs to take so long or if it is taking too long due to some bug. In case it really needs to take so long there should definitely be some feedback provided to the user ("populating transaction set 9%", etc.),

After that you can see yum installing individual packages for about 40 minutes. Considering it is installing 1200 packages, that amounts to 30 packages being installed per minute or one every 2 seconds. This is probably I/O bound - either by the disk or network when the packages are being downloaded (IIRC the packages are continually downloaded before being installed, not at once before the transaction).

Comment 14 Levente Farkas 2015-10-07 16:16:11 UTC
one more important note is that it's working in OS-6. everything is on a gigabit local network. the tftp server, the ftp server and workstation which try to install is 8, 16 and 4gb ram and at least intel i5 cpu. so imho the local setup can't be the bottleneck.
and imho if it's working (on OS-6) with yum than it should have to work on OS-7 too.
yesterday i try to install my new workstation with 16gb ram 250gb ssd and gigabit network has the same effect. so imho something was totally different in OS-6 yum package management then on OS-7.

i'd be happy to test or debug any kind of tipp patch enhancement.

Comment 15 Levente Farkas 2015-10-19 09:46:52 UTC
is there any progress with this regression?

Comment 16 Paul Mancillas 2015-11-05 21:05:12 UTC
We had the same problem using ftp. When we changed the protocol to http the installation completed much faster. So it's definitely a problem with the ftp transfer.

Comment 17 Levente Farkas 2015-11-06 08:57:23 UTC
unfortunately it's not working for us. if we replace ftp to http it's still VERY slow. which means from a CD it takes a few second from the network it takes a half an hour (both ftp and http).

Comment 18 Levente Farkas 2015-11-06 12:12:27 UTC
sorry. it was our fault. since our http server redirect to ftp. and yes it seems only ftp protocol slow while with http it's fast again. so we've got a workaround now at least. anyway we use vsftpd for ftp.

Comment 19 Chris Lumens 2016-01-11 21:37:25 UTC
Reassigning to urlgrabber - we don't implement the fetching of anything, just give it the URLs and tell it to go.  This appears to only be happening with FTP, not with HTTP.

Comment 20 Valentina Mukhamedzhanova 2016-01-14 12:08:20 UTC
This looks like a duplicate of bug 1152628 and bug 1218272, and should be fixed in the latest version of curl.

*** This bug has been marked as a duplicate of bug 1218272 ***