Bug 1494834
| Summary: | NFS gets hung after upgrade to 7.4 (CentOS) | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Nikolaos Milas <nmilas> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED NOTABUG | QA Contact: | Filesystem QE <fs-qe> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 7.4 | CC: | nmilas, yoyang |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-10-25 13:00:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Nikolaos Milas
2017-09-23 12:35:25 UTC
After further tests, it seems the problem occurs mainly when mounting at boot time (through /etc/fstab). Mounting manually has worked multiple times, but it is important for us to be able to mount the NFS share at boot time, through /etc/fstab. Mounting through /etc/fstab fails every time. Please advise.
Created attachment 1332697
Full messages file after reboot, including nfs mounts in /etc/fstab
Full output from /var/log/messages after reboot, when nfs mounts exist in /etc/fstab.
The following actions were performed during this period (note how NFS hangs):
[Parallel Session 1 right after boot (see parallel session 2 below)]
[root@hesperia1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 46G 7.6G 39G 17% /
devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 1.9G 8.6M 1.9G 1% /run
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/mapper/vg2-lv1 100G 94G 6.2G 94% /hesperiamount
/dev/vda1 497M 216M 282M 44% /boot
10.201.40.34:/data/col1/noc-bkups-1 11T 1.1T 9.4T 10% /mnt/dd2500-1
10.201.40.34:/data/col1/hesperia-mount 11T 1.1T 9.4T 10% /hesperiamount2
tmpfs 380M 0 380M 0% /run/user/998
tmpfs 380M 0 380M 0% /run/user/1001
tmpfs 380M 0 380M 0% /run/user/0
[root@hesperia1 ~]# ls -la /hesperiamount2/isnet1/
total 1315
drwxr-xr-x 16 isnet1 isnet 3604 Apr 26 15:26 .
drwxrwxrwx 6 root root 274 Mar 2 2017 ..
drwxr-xr-x 5 isnet1 isnet 857 Jul 17 09:47 AGGELIS_DEBUG
-rwxr-xr-x 1 isnet1 isnet 15929 Mar 24 2017 alert_processes
-rw-r--r-- 1 isnet1 isnet 240 Mar 24 2017 ALERT_PROCESSES.ini
-rw------- 1 isnet1 isnet 819474 Apr 26 15:47 .bash_history
-rw-r--r-- 1 isnet1 isnet 220 Apr 9 2016 .bash_logout
-rw-r--r-- 1 isnet1 isnet 193 Aug 2 2016 .bash_profile
-rw-r--r-- 1 isnet1 isnet 3625 Apr 25 2016 .bashrc
-rw-r--r-- 1 isnet1 isnet 3515 Apr 9 2016 .bashrc_old
-rwxr-xr-x 1 isnet1 isnet 42 Feb 17 2017 ChangeTonano
-rwxr-xr-x 1 isnet1 isnet 12010 Mar 24 2017 check_processes_work_well
-rw-r--r-- 1 isnet1 isnet 138 Jan 15 2017 CHECK_PROCESSES_WORK_WELL.ini
-rwxr-xr-x 1 isnet1 isnet 10287 Feb 5 2017 check_processes_work_well.save
drwx------ 4 isnet1 isnet 207 Feb 23 2017 .config
-rwxr-xr-x 1 isnet1 isnet 606 Mar 24 2017 cronreleaserealtime
-rwxr-xr-x 1 isnet1 isnet 1308 Mar 24 2017 cronumasep500
drwxr-xr-x 2 isnet1 isnet 412 Mar 26 2017 Database
-rwxr-xr-x 1 isnet1 isnet 266 Mar 24 2017 GetReleaseToLocalhost
-rwxr-xr-x 1 isnet1 isnet 346 Feb 17 2017 GetUmasepLastFile
-rwxr-xr-x 1 isnet1 isnet 211 Feb 13 2017 GetUmasepToLocalhost
-rwxr-xr-x 1 isnet1 isnet 117 Apr 26 15:05 GetUmasepToLocalhostHTTP
drwxr-xr-x 3 isnet1 isnet 164 Feb 19 2017 hesperiamount
-rwxr-xr-x 1 isnet1 isnet 12203 Mar 24 2017 kernel_email
-rw-r--r-- 1 isnet1 isnet 76 Mar 24 2017 KERNEL_EMAIL.ini
-rw-r--r-- 1 isnet1 isnet 172 Nov 3 2015 .kshrc
-rw-r--r-- 1 isnet1 isnet 135581 Jun 1 16:58 Latest_SEP_500_estimations_2017_06_01.txt
-rw------- 1 isnet1 isnet 43 Jul 17 08:09 .lesshst
drwxr-xr-x 3 isnet1 isnet 155 Feb 23 2017 .local
drwxr-x--- 2 isnet1 isnet 11237 Sep 29 04:02 log
drwx------ 2 isnet1 isnet 101 Apr 27 2016 Mail
-rw------- 1 isnet1 isnet 7941 Apr 26 15:26 .mysql_history
-rw------- 1 isnet1 isnet 17 Feb 11 2017 .nano_history
-rw------- 1 isnet1 isnet 200281 Apr 8 14:57 nohup.out
-rw-r--r-- 1 isnet1 isnet 675 Feb 23 2017 .profile
lrwxrwxrwx 1 root root 23 Feb 20 2017 release -> /hesperiamount/release1
drwxr-xr-x 2 isnet1 isnet 353 Mar 24 2017 RELEASE_ALERT_IMAGES
-rwxr-xr-x 1 isnet1 isnet 18967 Mar 24 2017 release_epam_realtime
-rw-r--r-- 1 isnet1 isnet 382 Feb 19 2017 RELEASE_EPAM_REALTIME.ini
-rwxr-xr-x 1 isnet1 isnet 18970 Mar 24 2017 release_ephin_realtime
-rw-r--r-- 1 isnet1 isnet 382 Jan 16 2017 RELEASE_EPHIN_REALTIME.ini
drwxr-xr-x 6 isnet1 isnet 413 Jan 14 2017 release_local
drwxr-xr-x 2 isnet1 isnet 174 Feb 13 2017 RELEASE_realtime
-rwxr-xr-x 1 isnet1 isnet 237 Jan 19 2017 ReleaseToComp1
-rwxr-xr-x 1 isnet1 isnet 95 Jul 4 2016 sarlmove
-rw-r--r-- 1 isnet1 isnet 66 Apr 25 2016 .selected_editor
-rwxr-xr-x 1 isnet1 isnet 1610 May 20 20:18 send_email
-rwxr-xr-x 1 isnet1 isnet 1433 Mar 24 2017 send_email.py
-rw-r--r-- 1 isnet1 isnet 1222 Mar 24 2017 send_email.pyc
drwx------ 2 isnet1 isnet 275 Apr 26 2016 .ssh
-rwxr-xr-x 1 isnet1 isnet 21019 Apr 8 14:34 umasep500_1_minute
-rw-r--r-- 1 isnet1 isnet 253 Apr 26 14:53 UMASEP_500.ini
drwxr-xr-x 4 isnet1 isnet 207 Jan 10 2017 UMASEP_500MEV_IMAGES
drwxr-xr-x 2 isnet1 isnet 19063 Sep 17 00:02 UMASEP_realtime
-rwxr-xr-x 1 isnet1 isnet 107 Jun 22 2016 UmasepToComp1
drwxr-xr-x 2 isnet1 isnet 403 Apr 9 18:46 webform
-rw------- 1 isnet1 isnet 171 Feb 14 2017 .Xauthority
[root@hesperia1 ~]# rpcdebug -v -m rpc -s all
rpc xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache
Module Valid flags
rpc xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache
[root@hesperia1 ~]#
[root@hesperia1 ~]# rpcdebug -v -m nfs -s all
nfs vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state
Module Valid flags
nfs vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# Disconnecting: Timeout, server not responding.
<session hung>
[Parallel Session 2 (right after boot) on another terminal]
[root@hesperia1 ~]# rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
sending incremental file list
RELEASE_ALERT_IMAGES/release_alert_merged_plots.png
310091 100% 8.82MB/s 0:00:00 (xfer#1, to-check=1062/1153)
Disconnecting: Timeout, server not responding.
[New Terminal Session Follows, when the above hung, but before they display "Timeout, server not responding"]
[root@hesperia1 ~]# ps axjf
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
0 2 0 0 ? -1 S 0 0:00 [kthreadd]
2 3 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/0]
2 4 0 0 ? -1 S 0 0:00 \_ [kworker/0:0]
2 5 0 0 ? -1 S< 0 0:00 \_ [kworker/0:0H]
2 6 0 0 ? -1 S 0 0:00 \_ [kworker/u8:0]
2 7 0 0 ? -1 S 0 0:00 \_ [migration/0]
2 8 0 0 ? -1 S 0 0:00 \_ [rcu_bh]
2 9 0 0 ? -1 S 0 0:00 \_ [rcu_sched]
2 10 0 0 ? -1 S 0 0:00 \_ [watchdog/0]
2 11 0 0 ? -1 S 0 0:00 \_ [watchdog/1]
2 12 0 0 ? -1 S 0 0:00 \_ [migration/1]
2 13 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/1]
2 14 0 0 ? -1 S 0 0:00 \_ [kworker/1:0]
2 15 0 0 ? -1 S< 0 0:00 \_ [kworker/1:0H]
2 16 0 0 ? -1 S 0 0:00 \_ [watchdog/2]
2 17 0 0 ? -1 S 0 0:00 \_ [migration/2]
2 18 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/2]
2 19 0 0 ? -1 S 0 0:00 \_ [kworker/2:0]
2 20 0 0 ? -1 S< 0 0:00 \_ [kworker/2:0H]
2 21 0 0 ? -1 S 0 0:00 \_ [watchdog/3]
2 22 0 0 ? -1 S 0 0:00 \_ [migration/3]
2 23 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/3]
2 24 0 0 ? -1 S 0 0:00 \_ [kworker/3:0]
2 25 0 0 ? -1 S< 0 0:00 \_ [kworker/3:0H]
2 27 0 0 ? -1 S 0 0:00 \_ [kdevtmpfs]
2 28 0 0 ? -1 S< 0 0:00 \_ [netns]
2 29 0 0 ? -1 S 0 0:00 \_ [khungtaskd]
2 30 0 0 ? -1 S< 0 0:00 \_ [writeback]
2 31 0 0 ? -1 S< 0 0:00 \_ [kintegrityd]
2 32 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 33 0 0 ? -1 S< 0 0:00 \_ [kblockd]
2 34 0 0 ? -1 S< 0 0:00 \_ [md]
2 35 0 0 ? -1 S 0 0:00 \_ [kworker/0:1]
2 36 0 0 ? -1 S 0 0:00 \_ [kworker/1:1]
2 37 0 0 ? -1 S 0 0:00 \_ [kworker/2:1]
2 38 0 0 ? -1 S 0 0:00 \_ [kworker/3:1]
2 40 0 0 ? -1 S 0 0:00 \_ [kswapd0]
2 41 0 0 ? -1 SN 0 0:00 \_ [ksmd]
2 42 0 0 ? -1 SN 0 0:00 \_ [khugepaged]
2 43 0 0 ? -1 S< 0 0:00 \_ [crypto]
2 51 0 0 ? -1 S< 0 0:00 \_ [kthrotld]
2 52 0 0 ? -1 S 0 0:00 \_ [kworker/u8:1]
2 53 0 0 ? -1 S< 0 0:00 \_ [kmpath_rdacd]
2 54 0 0 ? -1 S< 0 0:00 \_ [kpsmoused]
2 55 0 0 ? -1 S< 0 0:00 \_ [ipv6_addrconf]
2 74 0 0 ? -1 S< 0 0:00 \_ [deferwq]
2 106 0 0 ? -1 S 0 0:00 \_ [kauditd]
2 286 0 0 ? -1 S< 0 0:00 \_ [ata_sff]
2 300 0 0 ? -1 S 0 0:00 \_ [scsi_eh_0]
2 301 0 0 ? -1 S< 0 0:00 \_ [scsi_tmf_0]
2 302 0 0 ? -1 S 0 0:00 \_ [scsi_eh_1]
2 303 0 0 ? -1 S< 0 0:00 \_ [scsi_tmf_1]
2 304 0 0 ? -1 S< 0 0:00 \_ [ttm_swap]
2 356 0 0 ? -1 S 0 0:00 \_ [kworker/3:2]
2 357 0 0 ? -1 S< 0 0:00 \_ [kworker/2:1H]
2 359 0 0 ? -1 S 0 0:00 \_ [kworker/2:2]
2 399 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 400 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 411 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 412 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 425 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 426 0 0 ? -1 S< 0 0:00 \_ [xfsalloc]
2 427 0 0 ? -1 S< 0 0:00 \_ [xfs_mru_cache]
2 428 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/dm-0]
2 429 0 0 ? -1 S< 0 0:00 \_ [xfs-data/dm-0]
2 430 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/dm-0]
2 431 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/dm-0]
2 432 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/dm-]
2 433 0 0 ? -1 S< 0 0:00 \_ [xfs-log/dm-0]
2 434 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/d]
2 435 0 0 ? -1 S 0 0:00 \_ [xfsaild/dm-0]
2 542 0 0 ? -1 S< 0 0:00 \_ [rpciod]
2 543 0 0 ? -1 S< 0 0:00 \_ [xprtiod]
2 587 0 0 ? -1 S 0 0:00 \_ [kworker/1:2]
2 592 0 0 ? -1 S< 0 0:00 \_ [kworker/0:1H]
2 602 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/vda1]
2 603 0 0 ? -1 S< 0 0:00 \_ [xfs-data/vda1]
2 604 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/vda1]
2 605 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/vda1]
2 606 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/vda]
2 607 0 0 ? -1 S< 0 0:00 \_ [xfs-log/vda1]
2 608 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/v]
2 609 0 0 ? -1 S 0 0:00 \_ [xfsaild/vda1]
2 611 0 0 ? -1 S< 0 0:00 \_ [kworker/3:1H]
2 614 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 615 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 622 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/dm-2]
2 623 0 0 ? -1 S< 0 0:00 \_ [xfs-data/dm-2]
2 624 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/dm-2]
2 625 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/dm-2]
2 626 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/dm-]
2 627 0 0 ? -1 S< 0 0:00 \_ [xfs-log/dm-2]
2 628 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/d]
2 629 0 0 ? -1 S 0 0:00 \_ [xfsaild/dm-2]
2 782 0 0 ? -1 S< 0 0:00 \_ [kworker/1:1H]
2 976 0 0 ? -1 S< 0 0:00 \_ [nfsiod]
2 11912 0 0 ? -1 S 0 0:00 \_ [kworker/1:3]
2 12394 0 0 ? -1 S 0 0:00 \_ [kworker/2:3]
2 12528 0 0 ? -1 S 0 0:00 \_ [kworker/0:2]
0 1 1 1 ? -1 Ss 0 0:01 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
1 505 505 505 ? -1 Ss 0 0:12 /usr/lib/systemd/systemd-journald
1 533 533 533 ? -1 Ss 0 0:00 /usr/sbin/lvmetad -f
1 541 541 541 ? -1 Ss 0 0:00 /usr/lib/systemd/systemd-udevd
1 654 654 654 ? -1 S<sl 0 0:00 /sbin/auditd
1 682 682 682 ? -1 Ss 0 0:00 /usr/sbin/irqbalance --foreground
1 683 683 683 ? -1 Ss 0 0:00 /usr/lib/systemd/systemd-logind
1 684 684 684 ? -1 Ssl 0 0:16 /usr/sbin/rsyslogd -n
1 685 685 685 ? -1 Ssl 999 0:00 /usr/lib/polkit-1/polkitd --no-debug
1 687 687 687 ? -1 Ss 81 0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
1 703 703 703 ? -1 Ssl 0 0:00 /usr/sbin/gssproxy -D
1 715 715 715 ? -1 Ssl 0 0:00 /usr/sbin/NetworkManager --no-daemon
1 953 953 953 ? -1 Ssl 0 0:00 /usr/bin/python -Es /usr/sbin/tuned -l -P
1 961 961 961 ? -1 Ss 0 0:00 /usr/sbin/sshd -D
961 11768 11768 11768 ? -1 Ss 0 0:00 \_ sshd: root@pts/0
11768 11770 11770 11770 pts/0 11770 Ss+ 0 0:00 | \_ -bash
961 12423 12423 12423 ? -1 Ss 0 0:00 \_ sshd: root@pts/1
12423 12425 12425 12425 pts/1 12445 Ss 0 0:00 | \_ -bash
12425 12445 12445 12425 pts/1 12445 S+ 0 0:00 | \_ rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
12445 12447 12445 12425 pts/1 12445 D+ 0 0:01 | \_ rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
12447 12448 12445 12425 pts/1 12445 S+ 0 0:00 | \_ rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
961 12568 12568 12568 ? -1 Ss 0 0:00 \_ sshd: root@pts/2
12568 12571 12571 12571 pts/2 12594 Ss 0 0:00 \_ -bash
12571 12594 12594 12571 pts/2 12594 R+ 0 0:00 \_ ps axjf
1 986 986 986 ? -1 Ss 27 0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
986 1293 986 986 ? -1 Sl 27 0:00 \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log
1 1021 1021 1021 ? -1 Ss 0 0:00 /usr/sbin/httpd -DFOREGROUND
1021 2826 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2828 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2829 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2832 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2835 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 4799 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 8606 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1 1040 1040 1040 ? -1 Ss 0 0:00 /usr/sbin/crond -n
1040 12330 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
12330 12350 12350 12350 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /dev/null 2>&1
12350 12360 12350 12350 ? -1 S 1001 0:00 | \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_well
1040 12332 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
12332 12345 12345 12345 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperiamount/isnet1/log/UMASEP_ftp_get_
12345 12358 12345 12345 ? -1 S 1001 0:00 | \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
12358 12367 12345 12345 ? -1 S 1001 0:00 | \_ ftp -p -n -v spaceweather.uma.es
1040 12481 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
12481 12504 12504 12504 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /dev/null 2>&1
12504 12512 12504 12504 ? -1 S 1001 0:00 | \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_well
1040 12483 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
12483 12498 12498 12498 ? -1 Ss 1001 0:00 \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperiamount/isnet1/log/UMASEP_ftp_get_
12498 12508 12498 12498 ? -1 S 1001 0:00 \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
12508 12515 12498 12498 ? -1 S 1001 0:00 \_ ftp -p -n -v spaceweather.uma.es
1 1121 1121 1121 tty1 1121 Ss+ 0 0:00 /sbin/agetty --noclear tty1 linux
1 1365 1365 1365 ? -1 Ss 998 0:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
1 1620 1620 1620 ? -1 Ss 0 0:00 /usr/libexec/postfix/master -w
1620 1687 1620 1620 ? -1 S 89 0:00 \_ pickup -l -t unix -u
1620 1688 1620 1620 ? -1 S 89 0:00 \_ qmgr -l -t unix -u
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# less /var/log/messages
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# ls -la /hesperiamount2/is
<session hung>
[New Terminal Session Follows]
[root@hesperia1 ~]# top
top - 14:10:09 up 11 min, 3 users, load average: 3.44, 2.27, 1.11
Tasks: 157 total, 3 running, 153 sleeping, 0 stopped, 1 zombie
%Cpu(s): 48.6 us, 0.9 sy, 0.0 ni, 50.3 id, 0.0 wa, 0.0 hi, 0.1 si, 0.1 st
KiB Mem : 3881424 total, 2761644 free, 439432 used, 680348 buff/cache
KiB Swap: 4063228 total, 4063228 free, 0 used. 2879808 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13508 release1 20 0 382188 94528 7760 R 99.7 2.4 0:06.70 python
13517 release1 20 0 376016 88396 7760 R 99.7 2.3 0:06.95 python
1 root 20 0 125408 3840 2420 S 0.3 0.1 0:01.78 systemd
9 root 20 0 0 0 0 S 0.3 0.0 0:00.51 rcu_sched
715 root 20 0 469628 8676 6428 S 0.3 0.2 0:00.24 NetworkManager
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u8:0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/1
12 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/1
13 root 20 0 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/1
15 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
16 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/2
17 root rt 0 0 0 0 S 0.0 0.0 0:00.02 migration/2
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/2
20 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H
21 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/3
22 root rt 0 0 0 0 S 0.0 0.0 0:00.05 migration/3
23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/3
25 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/3:0H
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
28 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
29 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
30 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
31 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
32 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
33 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
34 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 md
35 root 20 0 0 0 0 S 0.0 0.0 0:00.15 kworker/0:1
36 root 20 0 0 0 0 S 0.0 0.0 0:00.10 kworker/1:1
37 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kworker/2:1
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# ps axjf
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
0 2 0 0 ? -1 S 0 0:00 [kthreadd]
2 3 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/0]
2 5 0 0 ? -1 S< 0 0:00 \_ [kworker/0:0H]
2 6 0 0 ? -1 S 0 0:00 \_ [kworker/u8:0]
2 7 0 0 ? -1 S 0 0:00 \_ [migration/0]
2 8 0 0 ? -1 S 0 0:00 \_ [rcu_bh]
2 9 0 0 ? -1 S 0 0:00 \_ [rcu_sched]
2 10 0 0 ? -1 S 0 0:00 \_ [watchdog/0]
2 11 0 0 ? -1 S 0 0:00 \_ [watchdog/1]
2 12 0 0 ? -1 S 0 0:00 \_ [migration/1]
2 13 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/1]
2 15 0 0 ? -1 S< 0 0:00 \_ [kworker/1:0H]
2 16 0 0 ? -1 S 0 0:00 \_ [watchdog/2]
2 17 0 0 ? -1 S 0 0:00 \_ [migration/2]
2 18 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/2]
2 20 0 0 ? -1 S< 0 0:00 \_ [kworker/2:0H]
2 21 0 0 ? -1 S 0 0:00 \_ [watchdog/3]
2 22 0 0 ? -1 S 0 0:00 \_ [migration/3]
2 23 0 0 ? -1 S 0 0:00 \_ [ksoftirqd/3]
2 25 0 0 ? -1 S< 0 0:00 \_ [kworker/3:0H]
2 27 0 0 ? -1 S 0 0:00 \_ [kdevtmpfs]
2 28 0 0 ? -1 S< 0 0:00 \_ [netns]
2 29 0 0 ? -1 S 0 0:00 \_ [khungtaskd]
2 30 0 0 ? -1 S< 0 0:00 \_ [writeback]
2 31 0 0 ? -1 S< 0 0:00 \_ [kintegrityd]
2 32 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 33 0 0 ? -1 S< 0 0:00 \_ [kblockd]
2 34 0 0 ? -1 S< 0 0:00 \_ [md]
2 35 0 0 ? -1 S 0 0:00 \_ [kworker/0:1]
2 36 0 0 ? -1 S 0 0:00 \_ [kworker/1:1]
2 37 0 0 ? -1 S 0 0:00 \_ [kworker/2:1]
2 38 0 0 ? -1 S 0 0:00 \_ [kworker/3:1]
2 40 0 0 ? -1 S 0 0:00 \_ [kswapd0]
2 41 0 0 ? -1 SN 0 0:00 \_ [ksmd]
2 42 0 0 ? -1 SN 0 0:00 \_ [khugepaged]
2 43 0 0 ? -1 S< 0 0:00 \_ [crypto]
2 51 0 0 ? -1 S< 0 0:00 \_ [kthrotld]
2 52 0 0 ? -1 S 0 0:00 \_ [kworker/u8:1]
2 53 0 0 ? -1 S< 0 0:00 \_ [kmpath_rdacd]
2 54 0 0 ? -1 S< 0 0:00 \_ [kpsmoused]
2 55 0 0 ? -1 S< 0 0:00 \_ [ipv6_addrconf]
2 74 0 0 ? -1 S< 0 0:00 \_ [deferwq]
2 106 0 0 ? -1 S 0 0:00 \_ [kauditd]
2 286 0 0 ? -1 S< 0 0:00 \_ [ata_sff]
2 300 0 0 ? -1 S 0 0:00 \_ [scsi_eh_0]
2 301 0 0 ? -1 S< 0 0:00 \_ [scsi_tmf_0]
2 302 0 0 ? -1 S 0 0:00 \_ [scsi_eh_1]
2 303 0 0 ? -1 S< 0 0:00 \_ [scsi_tmf_1]
2 304 0 0 ? -1 S< 0 0:00 \_ [ttm_swap]
2 356 0 0 ? -1 S 0 0:00 \_ [kworker/3:2]
2 357 0 0 ? -1 S< 0 0:00 \_ [kworker/2:1H]
2 399 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 400 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 411 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 412 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 425 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 426 0 0 ? -1 S< 0 0:00 \_ [xfsalloc]
2 427 0 0 ? -1 S< 0 0:00 \_ [xfs_mru_cache]
2 428 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/dm-0]
2 429 0 0 ? -1 S< 0 0:00 \_ [xfs-data/dm-0]
2 430 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/dm-0]
2 431 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/dm-0]
2 432 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/dm-]
2 433 0 0 ? -1 S< 0 0:00 \_ [xfs-log/dm-0]
2 434 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/d]
2 435 0 0 ? -1 S 0 0:00 \_ [xfsaild/dm-0]
2 542 0 0 ? -1 S< 0 0:00 \_ [rpciod]
2 543 0 0 ? -1 S< 0 0:00 \_ [xprtiod]
2 592 0 0 ? -1 S< 0 0:00 \_ [kworker/0:1H]
2 602 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/vda1]
2 603 0 0 ? -1 S< 0 0:00 \_ [xfs-data/vda1]
2 604 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/vda1]
2 605 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/vda1]
2 606 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/vda]
2 607 0 0 ? -1 S< 0 0:00 \_ [xfs-log/vda1]
2 608 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/v]
2 609 0 0 ? -1 S 0 0:00 \_ [xfsaild/vda1]
2 611 0 0 ? -1 S< 0 0:00 \_ [kworker/3:1H]
2 614 0 0 ? -1 S< 0 0:00 \_ [kdmflush]
2 615 0 0 ? -1 S< 0 0:00 \_ [bioset]
2 622 0 0 ? -1 S< 0 0:00 \_ [xfs-buf/dm-2]
2 623 0 0 ? -1 S< 0 0:00 \_ [xfs-data/dm-2]
2 624 0 0 ? -1 S< 0 0:00 \_ [xfs-conv/dm-2]
2 625 0 0 ? -1 S< 0 0:00 \_ [xfs-cil/dm-2]
2 626 0 0 ? -1 S< 0 0:00 \_ [xfs-reclaim/dm-]
2 627 0 0 ? -1 S< 0 0:00 \_ [xfs-log/dm-2]
2 628 0 0 ? -1 S< 0 0:00 \_ [xfs-eofblocks/d]
2 629 0 0 ? -1 S 0 0:00 \_ [xfsaild/dm-2]
2 782 0 0 ? -1 S< 0 0:00 \_ [kworker/1:1H]
2 976 0 0 ? -1 S< 0 0:00 \_ [nfsiod]
2 12394 0 0 ? -1 S 0 0:00 \_ [kworker/2:3]
2 12528 0 0 ? -1 S 0 0:00 \_ [kworker/0:2]
2 12770 0 0 ? -1 S 0 0:00 \_ [kworker/1:0]
2 12925 0 0 ? -1 S 0 0:00 \_ [kworker/1:3]
2 12940 0 0 ? -1 S 0 0:00 \_ [kworker/3:0]
2 13265 0 0 ? -1 S 0 0:00 \_ [kworker/2:0]
2 13745 0 0 ? -1 S 0 0:00 \_ [kworker/3:3]
2 13902 0 0 ? -1 S 0 0:00 \_ [kworker/0:0]
0 1 1 1 ? -1 Ss 0 0:01 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
1 505 505 505 ? -1 Ss 0 0:12 /usr/lib/systemd/systemd-journald
1 533 533 533 ? -1 Ss 0 0:00 /usr/sbin/lvmetad -f
1 541 541 541 ? -1 Ss 0 0:00 /usr/lib/systemd/systemd-udevd
1 654 654 654 ? -1 S<sl 0 0:00 /sbin/auditd
1 682 682 682 ? -1 Ss 0 0:00 /usr/sbin/irqbalance --foreground
1 683 683 683 ? -1 Ss 0 0:00 /usr/lib/systemd/systemd-logind
1 684 684 684 ? -1 Ssl 0 0:16 /usr/sbin/rsyslogd -n
1 685 685 685 ? -1 Ssl 999 0:00 /usr/lib/polkit-1/polkitd --no-debug
1 687 687 687 ? -1 Ss 81 0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
1 703 703 703 ? -1 Ssl 0 0:00 /usr/sbin/gssproxy -D
1 715 715 715 ? -1 Ssl 0 0:00 /usr/sbin/NetworkManager --no-daemon
1 953 953 953 ? -1 Ssl 0 0:00 /usr/bin/python -Es /usr/sbin/tuned -l -P
1 961 961 961 ? -1 Ss 0 0:00 /usr/sbin/sshd -D
961 12568 12568 12568 ? -1 Ss 0 0:00 \_ sshd: root@pts/2
12568 12571 12571 12571 pts/2 12571 Ds+ 0 0:00 | \_ -bash
961 12817 12817 12817 ? -1 Ss 0 0:00 \_ sshd: root@pts/3
12817 12819 12819 12819 pts/3 12945 Ss 0 0:00 | \_ -bash
12819 12945 12945 12819 pts/3 12945 D+ 0 0:00 | \_ ls --color=auto -la /hesperiamount2/
961 13072 13072 13072 ? -1 Ss 0 0:00 \_ sshd: root@pts/4
13072 13101 13101 13101 pts/4 13946 Ss 0 0:00 \_ -bash
13101 13946 13946 13101 pts/4 13946 R+ 0 0:00 \_ ps axjf
1 986 986 986 ? -1 Ss 27 0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
986 1293 986 986 ? -1 Sl 27 0:01 \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --log
1 1021 1021 1021 ? -1 Ss 0 0:00 /usr/sbin/httpd -DFOREGROUND
1021 2826 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2828 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2829 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2832 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 2835 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 4799 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1021 8606 1021 1021 ? -1 S 48 0:00 \_ /usr/sbin/httpd -DFOREGROUND
1 1040 1040 1040 ? -1 Ss 0 0:00 /usr/sbin/crond -n
1040 13678 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
13678 13697 13697 13697 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /dev/null 2>&1
13697 13710 13697 13697 ? -1 S 1001 0:00 | \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_well
1040 13680 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
13680 13699 13699 13699 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperiamount/isnet1/log/UMASEP_ftp_get_
13699 13702 13699 13699 ? -1 S 1001 0:00 | \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
13702 13703 13699 13699 ? -1 S 1001 0:00 | \_ ftp -p -n -v spaceweather.uma.es
1040 13853 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
13853 13876 13876 13876 ? -1 Ss 1001 0:00 | \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /dev/null 2>&1
13876 13886 13876 13876 ? -1 S 1001 0:00 | \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_well
1040 13855 1040 1040 ? -1 S 0 0:00 \_ /usr/sbin/CROND -n
13855 13864 13864 13864 ? -1 Ss 1001 0:00 \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperiamount/isnet1/log/UMASEP_ftp_get_
13864 13885 13864 13864 ? -1 S 1001 0:00 \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
13885 13888 13864 13864 ? -1 S 1001 0:00 \_ ftp -p -n -v spaceweather.uma.es
1 1121 1121 1121 tty1 1121 Ss+ 0 0:00 /sbin/agetty --noclear tty1 linux
1 1365 1365 1365 ? -1 Ss 998 0:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
1 1620 1620 1620 ? -1 Ss 0 0:00 /usr/libexec/postfix/master -w
1620 1687 1620 1620 ? -1 S 89 0:00 \_ pickup -l -t unix -u
1620 1688 1620 1620 ? -1 S 89 0:00 \_ qmgr -l -t unix -u
1 12447 12445 12425 ? -1 D 0 0:01 rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
12447 12448 12445 12425 ? -1 Z 0 0:00 \_ [rsync] <defunct>
1 13736 13696 13696 ? -1 S 1001 0:00 python umasep500_1_minute
[root@hesperia1 ~]#
[root@hesperia1 ~]# ps -l 12447
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
1 D 0 12447 1 0 80 0 - 29685 rpc_wa ? 0:01 rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
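The hung processes in the listings above are the ones whose STAT column starts with `D` (uninterruptible sleep), such as the rsync child waiting in `rpc_wa`. A minimal sketch of how such rows could be picked out of saved `ps axjf` output follows. The function name `find_uninterruptible` and the `SAMPLE` text are illustrative assumptions; the sample rows are copied from the second listing above.

```python
# Sketch only: filters `ps axjf`-style text for processes in state D.
# Assumes the column order PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND.

SAMPLE = r"""
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
  961 12568 12568 12568 ?           -1 Ss       0   0:00  \_ sshd: root@pts/2
12568 12571 12571 12571 pts/2    12571 Ds+      0   0:00  |   \_ -bash
12819 12945 12945 12819 pts/3    12945 D+       0   0:00  |       \_ ls --color=auto -la /hesperiamount2/
    1 12447 12445 12425 ?           -1 D        0   0:01 rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
"""


def find_uninterruptible(ps_output):
    """Return (pid, stat, command) for every row whose STAT starts with 'D'."""
    rows = []
    for line in ps_output.splitlines():
        # Split into at most 10 fields so the command keeps its own spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[1].isdigit():
            continue  # skips the header line and blank lines
        stat = fields[6]
        if stat.startswith("D"):
            # Drop the tree-drawing characters that `ps f` puts before the command.
            command = fields[9].lstrip("|\\_ ")
            rows.append((int(fields[1]), stat, command))
    return rows


for pid, stat, command in find_uninterruptible(SAMPLE):
    print(pid, stat, command)
```

On a live system the same function could be fed the captured output of `ps axjf` instead of the sample string.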
[root@hesperia1 log]# cat /proc/self/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=1929660k,nr_inodes=482415,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio,net_cls 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
configfs /sys/kernel/config configfs rw,relatime 0 0
/dev/mapper/centos-root / xfs rw,relatime,attr2,inode64,noquota 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=26,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=12052 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
/dev/mapper/vg2-lv1 /hesperiamount xfs rw,relatime,attr2,inode64,noquota 0 0
/dev/vda1 /boot xfs rw,relatime,attr2,inode64,noquota 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
10.201.40.34:/data/col1/noc-bkups-1 /mnt/dd2500-1 nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.201.40.34,mountvers=3,mountport=2052,mountproto=tcp,local_lock=all,addr=10.201.40.34 0 0
10.201.40.34:/data/col1/hesperia-mount /hesperiamount2 nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.201.40.34,mountvers=3,mountport=2052,mountproto=tcp,local_lock=all,addr=10.201.40.34 0 0
tmpfs /run/user/998 tmpfs rw,nosuid,nodev,relatime,size=388144k,mode=700,uid=998,gid=997 0 0
tmpfs /run/user/1001 tmpfs rw,nosuid,nodev,relatime,size=388144k,mode=700,uid=1001,gid=1002 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=388144k,mode=700 0 0
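To compare the options the kernel actually applied to the NFS mounts (for example `vers=3`, `hard`, `timeo=600`, `rsize=1048576`) with what was requested, the mount table can be reduced to its NFS entries. The sketch below does this for text in `/proc/self/mounts` format. The function name `parse_nfs_mounts` and the `SAMPLE_MOUNTS` text are illustrative assumptions; the sample lines are copied from the output above.

```python
# Sketch only: extracts NFS entries and their options from mount-table text.
# Each line has the form: device mountpoint fstype options dump pass

SAMPLE_MOUNTS = """\
/dev/vda1 /boot xfs rw,relatime,attr2,inode64,noquota 0 0
10.201.40.34:/data/col1/hesperia-mount /hesperiamount2 nfs rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.201.40.34,mountvers=3,mountport=2052,mountproto=tcp,local_lock=all,addr=10.201.40.34 0 0
"""


def parse_nfs_mounts(mounts_text):
    """Return {mount_point: {option: value}} for nfs and nfs4 entries."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, sep, value = item.partition("=")
            # Flag-style options such as "hard" carry no value; store True.
            options[key] = value if sep else True
        result[fields[1]] = options
    return result


for mount_point, options in parse_nfs_mounts(SAMPLE_MOUNTS).items():
    print(mount_point, options["vers"], options["rsize"], options["timeo"])
```

On a live system, the contents of `/proc/self/mounts` could be read and passed in instead of the sample string.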
[root@hesperia1 log]# showmount --all
clnt_create: RPC: Program not registered
[root@hesperia1 log]# mount -l -t nfs
10.201.40.34:/data/col1/noc-bkups-1 on /mnt/dd2500-1 type nfs (rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.201.40.34,mountvers=3,mountport=2052,mountproto=tcp,local_lock=all,addr=10.201.40.34)
10.201.40.34:/data/col1/hesperia-mount on /hesperiamount2 type nfs (rw,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,acregmin=1800,acregmax=1800,acdirmin=1800,acdirmax=1800,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.201.40.34,mountvers=3,mountport=2052,mountproto=tcp,local_lock=all,addr=10.201.40.34)
[root@hesperia1 log]# showmount -e 10.201.40.34
Export list for 10.201.40.34:
/data/col1/hesperia-mount 195.251.204.197
/data/col1/noc-bkups-1 195.251.204.192/28
Problem solved by changing the NFS Export Options (of the NFS shared directory, at the data storage system) from secure to insecure. That is, I changed from:
rw,no_root_squash,no_all_squash,secure,nolog
to:
rw,no_root_squash,no_all_squash,insecure,nolog
I don't know if the behavior I had described can be explained/expected by using the "secure" option, but after I changed to "insecure" everything works fine, using the latest packages - latest kernel and latest rpms on CentOS 7.4 (3.10.0-693.2.2.el7.x86_64 and rpcbind-0.2.0-42.el7.x86_64).
I can't tell whether this issue needs further examination and/or source code changes/improvements.
The problem, after a couple of days, started occurring again, so the above setting evidently did not resolve the issue in the end.
Here is a test performed today (2017-10-06), for which I am attaching a TCPdump between the box under investigation and the storage server (which exports directories).
I have booted using kernel 3.10.0-693.2.2.el7.x86_64 with debugging. I attach a TCP dump for this session (recorded using the command you see at Terminal Window 1 below), named hesperia-nfs-003.zip. I also attach the messages log for the session (hesperia-messages-20171006-01.txt).
The nfs mounts in /etc/fstab are as follows:
----------------------------------------------------------
/etc/fstab:
-----------
[root@hesperia1 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Mon Jul 6 14:29:42 2015
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root / xfs defaults 0 0
UUID=7a3ae70a-8ef3-463b-8f5b-be4e2e7be894 /boot xfs defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
/dev/mapper/vg2-lv1 /hesperiamount xfs defaults 0 0
#
10.201.40.34:/data/col1/noc-bkups-1 /mnt/dd2500-1 nfs hard,intr,nolock,nfsvers=3,tcp,rsize=1048600,wsize=1048600,bg 0 0
10.201.40.34:/data/col1/hesperia-mount /hesperiamount2 nfs hard,intr,nolock,nfsvers=3,tcp,rsize=1048600,wsize=1048600,bg 0 0
#
# 10.201.40.34:/data/col1/noc-bkups-1 /mnt/dd2500-1 nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
# 10.201.40.34:/data/col1/hesperia-mount /hesperiamount2 nfs auto,noatime,nolock,bg,nfsvers=3,intr,tcp,actimeo=1800 0 0
----------------------------------------------------------
As you can see below, I run the rsync command, and a bit later all sessions hang.
----------------------------------------------------------
Terminal Window 1
-----------------
[root@hesperia1 ~]# rpcdebug -v -m rpc -s all
rpc        xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache
Module     Valid flags
rpc        xprt call debug nfs auth bind sched trans svcsock svcdsp misc cache
[root@hesperia1 ~]# rpcdebug -v -m nfs -s all
nfs        vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state
Module     Valid flags
nfs        vfs dircache lookupcache pagecache proc xdr file root callback client mount fscache pnfs pnfs_ld state
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# tcpdump -w dumps/hesperia-nfs-003 -i eth0 -s 0 host 10.201.40.34 &
[1] 1608
[root@hesperia1 ~]# tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes

<Later...>

[root@hesperia1 ~]# Disconnecting: Timeout, server not responding.
----------------------------------------------------------

----------------------------------------------------------
Terminal Window 2
-----------------
[root@hesperia1 ~]# rsync -azv --del --stats --progress /hesperiamount/isnet1/ /hesperiamount2/isnet1
sending incremental file list
RELEASE_ALERT_IMAGES/release_alert_merged_plots.png
      315851 100%    8.44MB/s    0:00:00 (xfer#1, to-check=1062/1153)

<Later...>

Disconnecting: Timeout, server not responding.
----------------------------------------------------------

----------------------------------------------------------
Terminal Window 3
-----------------
[root@hesperia1 ~]# ps axjf
 PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
    0     2     0     0 ?           -1 S        0   0:00 [kthreadd]
    2     3     0     0 ?           -1 S        0   0:00  \_ [ksoftirqd/0]
    2     4     0     0 ?           -1 S        0   0:00  \_ [kworker/0:0]
    2     5     0     0 ?           -1 S<       0   0:00  \_ [kworker/0:0H]
    2     6     0     0 ?           -1 S        0   0:00  \_ [kworker/u8:0]
    2     7     0     0 ?           -1 S        0   0:00  \_ [migration/0]
    2     8     0     0 ?           -1 S        0   0:00  \_ [rcu_bh]
    2     9     0     0 ?           -1 S        0   0:00  \_ [rcu_sched]
    2    10     0     0 ?           -1 S        0   0:00  \_ [watchdog/0]
    2    11     0     0 ?           -1 S        0   0:00  \_ [watchdog/1]
    2    12     0     0 ?           -1 S        0   0:00  \_ [migration/1]
    2    13     0     0 ?           -1 S        0   0:00  \_ [ksoftirqd/1]
    2    14     0     0 ?           -1 S        0   0:00  \_ [kworker/1:0]
    2    15     0     0 ?           -1 S<       0   0:00  \_ [kworker/1:0H]
    2    16     0     0 ?           -1 S        0   0:00  \_ [watchdog/2]
    2    17     0     0 ?           -1 S        0   0:00  \_ [migration/2]
    2    18     0     0 ?           -1 S        0   0:00  \_ [ksoftirqd/2]
    2    19     0     0 ?           -1 S        0   0:00  \_ [kworker/2:0]
    2    20     0     0 ?           -1 S<       0   0:00  \_ [kworker/2:0H]
    2    21     0     0 ?           -1 S        0   0:00  \_ [watchdog/3]
    2    22     0     0 ?           -1 S        0   0:00  \_ [migration/3]
    2    23     0     0 ?           -1 S        0   0:00  \_ [ksoftirqd/3]
    2    24     0     0 ?           -1 S        0   0:00  \_ [kworker/3:0]
    2    25     0     0 ?           -1 S<       0   0:00  \_ [kworker/3:0H]
    2    27     0     0 ?           -1 S        0   0:00  \_ [kdevtmpfs]
    2    28     0     0 ?           -1 S<       0   0:00  \_ [netns]
    2    29     0     0 ?           -1 S        0   0:00  \_ [khungtaskd]
    2    30     0     0 ?           -1 S<       0   0:00  \_ [writeback]
    2    31     0     0 ?           -1 S<       0   0:00  \_ [kintegrityd]
    2    32     0     0 ?           -1 S<       0   0:00  \_ [bioset]
    2    33     0     0 ?           -1 S<       0   0:00  \_ [kblockd]
    2    34     0     0 ?           -1 S<       0   0:00  \_ [md]
    2    35     0     0 ?           -1 S        0   0:00  \_ [kworker/0:1]
    2    36     0     0 ?           -1 S        0   0:00  \_ [kworker/1:1]
    2    37     0     0 ?           -1 S        0   0:00  \_ [kworker/2:1]
    2    38     0     0 ?           -1 S        0   0:00  \_ [kworker/3:1]
    2    40     0     0 ?           -1 S        0   0:00  \_ [kswapd0]
    2    41     0     0 ?           -1 SN       0   0:00  \_ [ksmd]
    2    42     0     0 ?           -1 SN       0   0:00  \_ [khugepaged]
    2    43     0     0 ?           -1 S<       0   0:00  \_ [crypto]
    2    51     0     0 ?           -1 S<       0   0:00  \_ [kthrotld]
    2    52     0     0 ?           -1 S        0   0:00  \_ [kworker/u8:1]
    2    53     0     0 ?           -1 S<       0   0:00  \_ [kmpath_rdacd]
    2    54     0     0 ?           -1 S<       0   0:00  \_ [kpsmoused]
    2    55     0     0 ?           -1 S<       0   0:00  \_ [ipv6_addrconf]
    2    74     0     0 ?           -1 S<       0   0:00  \_ [deferwq]
    2   106     0     0 ?           -1 S        0   0:00  \_ [kworker/3:2]
    2   107     0     0 ?           -1 S        0   0:00  \_ [kauditd]
    2   226     0     0 ?           -1 S        0   0:00  \_ [kworker/0:2]
    2   288     0     0 ?           -1 S<       0   0:00  \_ [ata_sff]
    2   296     0     0 ?           -1 S        0   0:00  \_ [scsi_eh_0]
    2   299     0     0 ?           -1 S<       0   0:00  \_ [scsi_tmf_0]
    2   300     0     0 ?           -1 S        0   0:00  \_ [scsi_eh_1]
    2   301     0     0 ?           -1 S<       0   0:00  \_ [scsi_tmf_1]
    2   303     0     0 ?           -1 S        0   0:00  \_ [kworker/u8:2]
    2   304     0     0 ?           -1 S        0   0:00  \_ [kworker/u8:3]
    2   305     0     0 ?           -1 S<       0   0:00  \_ [ttm_swap]
    2   316     0     0 ?           -1 S        0   0:00  \_ [kworker/1:2]
    2   320     0     0 ?           -1 S<       0   0:00  \_ [kworker/2:1H]
    2   331     0     0 ?           -1 S        0   0:00  \_ [kworker/2:2]
    2   400     0     0 ?           -1 S<       0   0:00  \_ [kdmflush]
    2   401     0     0 ?           -1 S<       0   0:00  \_ [bioset]
    2   412     0     0 ?           -1 S<       0   0:00  \_ [kdmflush]
    2   413     0     0 ?           -1 S<       0   0:00  \_ [bioset]
    2   426     0     0 ?           -1 S<       0   0:00  \_ [bioset]
    2   427     0     0 ?           -1 S<       0   0:00  \_ [xfsalloc]
    2   428     0     0 ?           -1 S<       0   0:00  \_ [xfs_mru_cache]
    2   429     0     0 ?           -1 S<       0   0:00  \_ [xfs-buf/dm-0]
    2   430     0     0 ?           -1 S<       0   0:00  \_ [xfs-data/dm-0]
    2   431     0     0 ?           -1 S<       0   0:00  \_ [xfs-conv/dm-0]
    2   432     0     0 ?           -1 S<       0   0:00  \_ [xfs-cil/dm-0]
    2   433     0     0 ?           -1 S<       0   0:00  \_ [xfs-reclaim/dm-]
    2   434     0     0 ?           -1 S<       0   0:00  \_ [xfs-log/dm-0]
    2   435     0     0 ?           -1 S<       0   0:00  \_ [xfs-eofblocks/d]
    2   436     0     0 ?           -1 S        0   0:00  \_ [xfsaild/dm-0]
    2   538     0     0 ?           -1 S<       0   0:00  \_ [rpciod]
    2   539     0     0 ?           -1 S<       0   0:00  \_ [xprtiod]
    2   596     0     0 ?           -1 S<       0   0:00  \_ [kworker/0:1H]
    2   597     0     0 ?           -1 S<       0   0:00  \_ [xfs-buf/vda1]
    2   598     0     0 ?           -1 S<       0   0:00  \_ [xfs-data/vda1]
    2   599     0     0 ?           -1 S<       0   0:00  \_ [xfs-conv/vda1]
    2   600     0     0 ?           -1 S<       0   0:00  \_ [xfs-cil/vda1]
    2   601     0     0 ?           -1 S<       0   0:00  \_ [xfs-reclaim/vda]
    2   602     0     0 ?           -1 S<       0   0:00  \_ [xfs-log/vda1]
    2   603     0     0 ?           -1 S<       0   0:00  \_ [xfs-eofblocks/v]
    2   604     0     0 ?           -1 S        0   0:00  \_ [xfsaild/vda1]
    2   607     0     0 ?           -1 S<       0   0:00  \_ [kworker/3:1H]
    2   608     0     0 ?           -1 S<       0   0:00  \_ [kworker/1:1H]
    2   612     0     0 ?           -1 S<       0   0:00  \_ [kdmflush]
    2   613     0     0 ?           -1 S<       0   0:00  \_ [bioset]
    2   620     0     0 ?           -1 S<       0   0:00  \_ [xfs-buf/dm-2]
    2   621     0     0 ?           -1 S<       0   0:00  \_ [xfs-data/dm-2]
    2   622     0     0 ?           -1 S<       0   0:00  \_ [xfs-conv/dm-2]
    2   623     0     0 ?           -1 S<       0   0:00  \_ [xfs-cil/dm-2]
    2   624     0     0 ?           -1 S<       0   0:00  \_ [xfs-reclaim/dm-]
    2   625     0     0 ?           -1 S<       0   0:00  \_ [xfs-log/dm-2]
    2   626     0     0 ?           -1 S<       0   0:00  \_ [xfs-eofblocks/d]
    2   627     0     0 ?           -1 S        0   0:00  \_ [xfsaild/dm-2]
    2   963     0     0 ?           -1 S<       0   0:00  \_ [nfsiod]
    2  1785     0     0 ?           -1 S        0   0:00  \_ [kworker/3:3]
    0     1     1     1 ?           -1 Ss       0   0:01 /usr/lib/systemd/systemd --switched-root --system --deserialize 20
    1   506   506   506 ?           -1 Rs       0   0:56 /usr/lib/systemd/systemd-journald
    1   541   541   541 ?           -1 Ss       0   0:00 /usr/lib/systemd/systemd-udevd
    1   549   549   549 ?           -1 Ss       0   0:00 /usr/sbin/lvmetad -f
    1   652   652   652 ?           -1 S<sl     0   0:00 /sbin/auditd
    1   675   675   675 ?           -1 Ssl    999   0:00 /usr/lib/polkit-1/polkitd --no-debug
    1   677   677   677 ?           -1 Ss       0   0:00 /usr/lib/systemd/systemd-logind
    1   679   679   679 ?           -1 Ssl      0   1:03 /usr/sbin/rsyslogd -n
    1   684   684   684 ?           -1 Ss       0   0:00 /usr/sbin/irqbalance --foreground
    1   686   686   686 ?           -1 Ss      81   0:00 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --sys
    1   705   705   705 ?           -1 Ssl      0   0:00 /usr/sbin/gssproxy -D
    1   713   713   713 ?           -1 Ssl      0   0:00 /usr/sbin/NetworkManager --no-daemon
    1   943   943   943 ?           -1 Ssl      0   0:00 /usr/bin/python -Es /usr/sbin/tuned -l -P
    1   957   957   957 ?           -1 Ss       0   0:00 /usr/sbin/sshd -D
  957  1507  1507  1507 ?           -1 Ss       0   0:00  \_ sshd: root@pts/0
 1507  1510  1510  1510 pts/0     1510 Ss+      0   0:00  |   \_ -bash
 1510  1608  1608  1510 pts/0     1510 S       72   0:00  |       \_ tcpdump -w dumps/hesperia-nfs-003 -i eth0 -s 0 host 10.201.
  957  1655  1655  1655 ?           -1 Ss       0   0:00  \_ sshd: root@pts/1
 1655  1658  1658  1658 pts/1     1688 Ss       0   0:00  |   \_ -bash
 1658  1688  1688  1658 pts/1     1688 D+       0   0:01  |       \_ rsync -azv --del --stats --progress /hesperiamount/isnet1/
 1688  1689  1688  1658 pts/1     1688 S+       0   0:07  |           \_ rsync -azv --del --stats --progress /hesperiamount/isne
 1689  1690  1688  1658 pts/1     1688 S+       0   0:00  |               \_ rsync -azv --del --stats --progress /hesperiamount/
  957  1803  1803  1803 ?           -1 Ss       0   0:00  \_ sshd: root@pts/2
 1803  1806  1806  1806 pts/2     1833 Ss       0   0:00      \_ -bash
 1806  1833  1833  1806 pts/2     1833 R+       0   0:00          \_ ps axjf
    1   974   974   974 ?           -1 Ss       0   0:00 /usr/sbin/httpd -DFOREGROUND
  974  1448   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1449   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1450   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1451   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1452   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1485   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
  974  1541   974   974 ?           -1 S       48   0:00  \_ /usr/sbin/httpd -DFOREGROUND
    1   978   978   978 ?           -1 Ss       0   0:00 /usr/sbin/crond -n
  978  1559   978   978 ?           -1 S        0   0:00  \_ /usr/sbin/CROND -n
 1559  1565  1565  1565 ?           -1 Ss    1001   0:00  |   \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /d
 1565  1582  1565  1565 ?           -1 S     1001   0:00  |       \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_
  978  1561   978   978 ?           -1 S        0   0:00  \_ /usr/sbin/CROND -n
 1561  1569  1569  1569 ?           -1 Ss    1001   0:00  |   \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperia
 1569  1579  1569  1569 ?           -1 S     1001   0:00  |       \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
 1579  1594  1569  1569 ?           -1 S     1001   0:00  |           \_ ftp -p -n -v spaceweather.uma.es
  978  1724   978   978 ?           -1 S        0   0:00  \_ /usr/sbin/CROND -n
 1724  1740  1740  1740 ?           -1 Ss    1001   0:00  |   \_ /bin/sh -c /hesperiamount/isnet1/check_processes_work_well > /d
 1740  1750  1740  1740 ?           -1 S     1001   0:00  |       \_ /usr/bin/python /hesperiamount/isnet1/check_processes_work_
  978  1726   978   978 ?           -1 S        0   0:00  \_ /usr/sbin/CROND -n
 1726  1742  1742  1742 ?           -1 Ss    1001   0:00      \_ /bin/sh -c /hesperiamount/isnet1/GetUmasepLastFile >> /hesperia
 1742  1744  1742  1742 ?           -1 S     1001   0:00          \_ /bin/sh /hesperiamount/isnet1/GetUmasepLastFile
 1744  1751  1742  1742 ?           -1 S     1001   0:00              \_ ftp -p -n -v spaceweather.uma.es
    1  1011  1011  1011 tty1      1011 Ss+      0   0:00 /sbin/agetty --noclear tty1 linux
    1  1040  1040  1040 ?           -1 Ss      27   0:00 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
 1040  1312  1040  1040 ?           -1 Sl      27   0:00  \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-d
    1  1229  1229  1229 ?           -1 Ss     998   0:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
    1  1358  1358  1358 ?           -1 Ss       0   0:00 /usr/libexec/postfix/master -w
 1358  1360  1358  1358 ?           -1 S       89   0:00  \_ pickup -l -t unix -u
 1358  1361  1358  1358 ?           -1 S       89   0:00  \_ qmgr -l -t unix -u
    1  1607  1576  1576 ?           -1 S     1001   0:00 python umasep500_1_minute
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]#
[root@hesperia1 ~]# ls -la /hesperiamount2/
total 12
drwxrwxrwx   6 root     root   274 Mar  2  2017 .
dr-xr-xr-x. 22 root     root  4096 Sep 22 06:50 ..
drwxr-xr-x  16 isnet1   isnet 3604 Apr 26 15:26 isnet1
drwxr-xr-x   3 root     root   153 Mar  2  2017 ocloud_store
drwxrwxr-x  19 release1 isnet 1660 Feb 28  2017 release1
drwxrwxrwx   6 root     root   457 Oct  6 08:31 .snapshot
[root@hesperia1 ~]#

<Session hangs - Much later...>

[root@hesperia1 ~]# Disconnecting: Timeout, server not responding.
----------------------------------------------------------

In any new terminal window (SSH session) that I open, if I attempt to list the mounted directory, the session hangs:

----------------------------------------------------------
Terminal Window 4
-----------------
[root@hesperia1 ~]# ls -la /hesperiamount2
<hangs forever>
----------------------------------------------------------

What is going wrong?

Created attachment 1335178 [details]
TCP Dump between the box and the NFS server - Test on 2017-10-06
The TCP dump records packets during the test performed on 2017-10-06; please see the associated report below
Created attachment 1335179 [details]
/var/log/messages file for the period that the test on 2017-10-06 was performed
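
In the `ps axjf` listing from the 2017-10-06 test, the first rsync process (PID 1688) is in state `D+`, i.e. uninterruptible sleep, which is what a process blocked on a stalled hard NFS mount typically looks like. As a minimal sketch (the function name `filter_dstate` is hypothetical, and it assumes the `ps axjf` column order shown in this report, where STAT is the 7th field), such processes can be picked out of a long listing like this:

```shell
#!/usr/bin/env bash
# filter_dstate: read `ps axjf`-style rows on stdin and print the header line
# plus every row whose STAT column (7th field) starts with "D"
# (uninterruptible sleep). Assumes the column order
# PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND.
filter_dstate() {
    awk 'NR == 1 || $7 ~ /^D/'
}
```

Typical use would be `ps axjf | filter_dstate`. A process in D state is only a hint: it shows that something is blocked in the kernel, not why.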
The problem was finally traced to a Cisco ASA bug (this firewall device sits between the connected networks); bug CSCuq80704 was resolved by an ASA software update. The ASA was incorrectly dropping NFS packets with:

Drop-reason: (tcp-paws-fail) TCP packet failed PAWS test

...which caused NFS traffic to stall. Since the ASA software upgrade the problem has not occurred again. I can't tell why this started happening only recently, after many months without problems.

I think this case may be closed.
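
For readers unfamiliar with the drop reason: PAWS (Protection Against Wrapped Sequences, RFC 7323) rejects a TCP segment whose timestamp value (TSval) is older than the most recent timestamp accepted on that connection (TS.Recent), compared in 32-bit wraparound arithmetic. The following is a simplified sketch of that comparison only. It is not the ASA's implementation, the function name `paws_check` is made up for illustration, and it ignores the special cases in the RFC (RST segments, long-idle connections).

```shell
#!/usr/bin/env bash
# paws_check TSVAL TS_RECENT
# Simplified PAWS acceptance test (RFC 7323): returns 0 (accept) unless TSVAL
# is strictly older than TS_RECENT in modulo-2^32 arithmetic, in which case
# it returns 1 (the segment would be dropped).
paws_check() {
    local tsval=$1 ts_recent=$2
    # Distance from tsval forward to ts_recent, modulo 2^32.
    local diff=$(( (ts_recent - tsval) & 0xFFFFFFFF ))
    # diff == 0: timestamps are equal, accept.
    # 0 < diff < 2^31: ts_recent is ahead of tsval, so tsval is old.
    if (( diff != 0 && diff < 0x80000000 )); then
        return 1
    fi
    return 0
}
```

For example, `paws_check 100 200` fails (the segment's timestamp is behind the last accepted one), while `paws_check 5 4294967290` passes because the timestamp clock has wrapped around. A middlebox that applies this test with a wrong idea of TS.Recent will drop valid segments, and on a hard NFS mount the client then blocks until the traffic gets through.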