Bug 1120961

Summary: [autofs] submount test fail
Product: Red Hat Enterprise Linux 6
Reporter: JianHong Yin <jiyin>
Component: autofs
Assignee: Ian Kent <ikent>
Status: CLOSED CURRENTRELEASE
QA Contact: XuWang <xuw>
Severity: high
Priority: unspecified
Version: 6.6
CC: eguan, ikent, jiyin
Target Milestone: rc
Flags: jiyin: needinfo-
Hardware: s390x
OS: Linux
Doc Type: Bug Fix
Last Closed: 2015-02-13 01:17:18 UTC
Type: Bug

Description JianHong Yin 2014-07-18 04:09:52 UTC
Description of problem:
[autofs] The submount test fails on s390x; the test harness reports that it seems to be hung.

Version-Release number of selected component (if applicable):
------------------------------------------------
Time & CURDIR : [2014-07-17 08:21:23 @/mnt/tests/CoreOS/autofs/submount-test]
Case Name     : /CoreOS/autofs/submount-test
$HOSTNAME     : ibm-z10-38.rhts.eng.bos.redhat.com
Distro Info   : RedHatEnterpriseServer 6.6 : RHEL-6.6-20140716.0
NVR & host    : Linux ibm-z10-38.rhts.eng.bos.redhat.com 2.6.32-491.el6.s390x #1 SMP Sat Jul 12 11:37:48 EDT 2014 s390x s390x s390x GNU/Linux
cmdline       :
	root=/dev/mapper/vg_ibmz1038-lv_root rd_NO_LUKS rd_LVM_LV=vg_ibmz1038/lv_swap LANG=en_US.UTF-8 rd_DASD=0.0.23b1 rd_NO_MD rd_DASD=0.0.21b1  KEYTABLE=us rd_DASD=0.0.20b1  rd_DASD=0.0.22b1 SYSFONT=latarcyrheb-sun16 rd_LVM_LV=vg_ibmz1038/lv_root rd_NO_DM BOOT_IMAGE=0
Package Info  :
	autofs-5.0.5-106.el6.s390x
------------------------------------------------

How reproducible:
-

Steps to Reproduce:
1. Run the submount test case (/CoreOS/autofs/submount-test).

Actual results:
MARK-LWD-LOOP -- 2014-07-17 10:13:26 --
{Info} Submount test seems to be hung
[New LWP 35451]
[New LWP 35446]
[New LWP 35444]
[New LWP 35442]
[New LWP 35423]
[New LWP 35421]
[New LWP 35005]
[New LWP 34958]
[New LWP 3248]
[New LWP 3245]
[New LWP 3244]
[New LWP 35461]
[New LWP 35460]
[Thread debugging using libthread_db enabled]
0x000003fffd795788 in sigwait () from /lib64/libpthread.so.0

Thread 14 (Thread 0x3ffdf3cd910 (LWP 35460)):
#0  0x000003fffd78f686 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000003ffdffd84c6 in mount_mount () from /usr/lib64/autofs/mount_autofs.so
#2  0x000002aac665672e in do_mount ()
#3  0x000003fffc89b2c0 in ?? () from /usr/lib64/autofs/parse_sun.so
#4  0x000003fffc89bbec in ?? () from /usr/lib64/autofs/parse_sun.so
#5  0x000003fffc89dad2 in parse_mount () from /usr/lib64/autofs/parse_sun.so
#6  0x000003fffc8d222a in lookup_mount () from /usr/lib64/autofs/lookup_file.so
#7  0x000002aac66579a6 in ?? ()
#8  0x000002aac6657cd8 in lookup_nss_mount ()
#9  0x000002aac6651cda in ?? ()
#10 0x000003fffd78a43e in start_thread () from /lib64/libpthread.so.0
#11 0x000003fffd4538f2 in thread_start () from /lib64/libc.so.6

Thread 13 (Thread 0x3ffdf7cd910 (LWP 35461)):
#0  0x000003fffd443b1a in _xmknod () from /lib64/libc.so.6
#1  0x000003fffd4438ee in mkfifo () from /lib64/libc.so.6
#2  0x000002aac664e16e in handle_mounts ()
#3  0x000003fffd78a43e in start_thread () from /lib64/libpthread.so.0
#4  0x000003fffd4538f2 in thread_start () from /lib64/libc.so.6

Thread 12 (Thread 0x3fffd319910 (LWP 3244)):
#0  0x000003fffd78f686 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x000002aac6667488 in ?? ()
#2  0x000003fffd78a43e in start_thread () from /lib64/libpthread.so.0
#3  0x000003fffd4538f2 in thread_start () from /lib64/libc.so.6
...

Expected results:
The test passes.

Additional info:
  J:696911 	autofs(1)-M.1.1-:/submount-test.0@2014-07-17_16:30:47.. -
  https://beaker.engineering.redhat.com/jobs/696911
  http://lab-02.rhts.eng.bos.redhat.com/beaker/logs/tasks/22789+/22789980/TESTOUT.log

Comment 2 Ian Kent 2014-07-18 04:31:52 UTC
The backtrace looks more like the test running too slowly than
an actual hang.

I'll check.
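
Telling a slow run from a hang is easier when two backtraces taken some time apart can be compared thread by thread. The following is a minimal sketch, assuming the text format of the gdb trace in the description ("Thread N (Thread 0x... (LWP n)):" followed by "#k" frame lines). The function names frames_by_lwp and changed_threads are hypothetical helpers, not part of autofs or gdb.

```python
import re

# Header line such as: Thread 13 (Thread 0x3ffdf7cd910 (LWP 35461)):
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# Frame line such as: #1  0x000003fffd4438ee in mkfifo () from /lib64/libc.so.6
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(")


def frames_by_lwp(backtrace_text):
    """Map each LWP id to its list of frame function names, innermost first."""
    result = {}
    current = None
    for line in backtrace_text.splitlines():
        line = line.strip()
        match = THREAD_RE.match(line)
        if match:
            current = int(match.group(2))
            result[current] = []
            continue
        match = FRAME_RE.match(line)
        if match and current is not None:
            result[current].append(match.group(1))
    return result


def changed_threads(earlier, later):
    """Return the LWP ids whose call stacks differ between two parsed traces."""
    all_lwps = set(earlier) | set(later)
    return sorted(lwp for lwp in all_lwps if earlier.get(lwp) != later.get(lwp))
```

If changed_threads() returns an empty list for two traces taken well apart, no thread has moved and no thread has been created or exited, which points toward a hang rather than a slow run. A non-empty list only shows that something changed, so the differing stacks still need to be read.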

Comment 3 Ian Kent 2014-07-18 12:17:49 UTC
I extended the time the test is allowed to run and submitted
two jobs but neither ran as long as the one above.

It would be worthwhile to check whether the test is actually
still making progress when this happens, so we can find out
whether there are some particularly slow machines in the lab
or whether automount has in fact stopped working.

I can give instructions on how to do that if you're willing
to check.

Jobs were:
https://beaker.engineering.redhat.com/jobs/698198
https://beaker.engineering.redhat.com/jobs/698130
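
One possible way to check whether the daemon is still doing work, offered as a rough sketch rather than the instructions mentioned above: sample the accumulated CPU time of every automount thread twice and compare. It assumes a Linux /proc filesystem, where fields 14 and 15 of /proc/<pid>/task/<tid>/stat are utime and stime. The function names are hypothetical.

```python
import os


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one task stat line."""
    # The command name is in parentheses and may itself contain spaces or
    # parentheses, so split after the last ')'. The remaining fields start
    # at field 3 (state), which puts utime and stime at indexes 11 and 12.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[11]) + int(rest[12])


def sample_threads(pid):
    """Map each thread id of process `pid` to its accumulated CPU ticks."""
    ticks = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as stat_file:
                ticks[int(tid)] = parse_cpu_ticks(stat_file.read())
        except OSError:
            pass  # the thread exited between listdir() and open()
    return ticks


def made_progress(before, after):
    """True if any thread is new or has used CPU time since `before`."""
    return any(after[tid] != before.get(tid) for tid in after)
```

Calling sample_threads() on the automount pid, sleeping for a while, calling it again and passing both results to made_progress() gives a first hint. Threads blocked in pthread_cond_wait use no CPU, so a False result is not proof of a hang on its own, and it is best combined with a fresh backtrace.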

Comment 4 Ian Kent 2015-02-12 08:18:22 UTC
See comment #3. Can we re-run this test, please?

Comment 5 JianHong Yin 2015-02-12 10:04:21 UTC
(In reply to Ian Kent from comment #4)
> See comment #3. Can we re-run this test, please?

Re-running on RHEL-6.6 and RHEL-6.7-20150211.n.0:
  https://beaker.engineering.redhat.com/jobs/880896
  https://beaker.engineering.redhat.com/jobs/880897

Waiting for the jobs to reach "done".