Bug 510025

Summary: automount debug mode seems to change more than log behavior
Product: [Fedora] Fedora Reporter: Fabien Dupont <evenit>
Component: autofsAssignee: Ian Kent <ikent>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: low    
Version: 11CC: ikent, jmoyer, mike.hanafey
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: 5.0.4-42 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-11-04 12:23:01 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
TestCase #1 : strace file
none
/var/log/messages
none
Fix map stale initialization for file map instance none

Description Fabien Dupont 2009-07-07 13:22:13 UTC
Description of problem:

I have problems in the process of automounting NFSv3 shares located on a NetApp appliance. Some of them are not mounted, even though all shares have the same configuration. This is one problem and I thought I could find what was wrong by activating debug logging.

But when debug logging is activated, the shares are normally mounted, leaving me with no trace of the problem exposed before. Automount seems to have a different behavior in debug mode (except log level, of course).

Version-Release number of selected component (if applicable): 5.0.4-33


How reproducible:

I don't know exactly as I can't get the point on the supposed NFS share configuration problem.

Comment 1 Ian Kent 2009-07-07 13:53:09 UTC
(In reply to comment #0)
> Description of problem:
> 
> I have problems in the process of automounting NFSv3 shares located on a NetApp
> appliance. Some of them are not mounted, even though all shares have the same
> configuration. This is one problem and I thought I could find what was wrong by
> activating debug logging.
> 
> But when debug logging is activated, the shares are normally mounted, leaving
> me with no trace of the problem exposed before. Automount seems to have a
> different behavior in debug mode (except log level, of course).

No, it doesn't.
Enabling logging slows things down so some problems can take
somewhat longer to show up.

Continue to run with debug logging enabled to see if the problem
shows up.

What is your autofs map configuration?
What are the actual symptoms, exactly?

Ian

Comment 2 Fabien Dupont 2009-07-08 07:11:55 UTC
Created attachment 350898 [details]
TestCase #1 : strace file

Comment 3 Fabien Dupont 2009-07-08 07:12:14 UTC
We've gone a bit further by tracing automount process launched from command line with a simple '/usr/sbin/automount'. And the result is still quite strange...

Our configuration is :
   - 5 NetApp appliances available as nfs[1..5].local (DNS A records).
   - Each NFS server has its own indirect map '/etc/auto.nfsX'.
   - Each indirect map references all the shares with the same options, except for one share on nfs1 which has -fstype=nfs4, as it's an NFSv4 test share.
   - All shares from nfsX server are mounted in /env/export/nfsX : nfs2.local:/vol/pub is mounted in /env/export/nfs2/pub.

Symptoms have been detected while running test for fc10 to fc11 migration. Everything works fine on fc10, but we have this problem on fc11 : not all the shares are mounted by automount.

We've checked autofs module and it's present in 'lsmod' output.
We've added 'ipv6' module which seems mandatory for NFSv4 share (googled this).

Thus we have performed the following experiment :

1. Start automount manually, "stracing" it :

root# strace -v -f -o automount-20090708-001.log /usr/sbin/automount

2. 'ls' some shares on nfs1 : everything works fine.

root# ls /env/export/nfs1/home
[prints list of home directories]

root# ls /env/export/nfs1/testHome
[print list of NFSv4 test content]

3. 'ls' some shares on nfs2 : works great for the first, then fails on any other share on nfs2.

root# ls /env/export/nfs2/source
[prints share's content]

root# ls /env/export/nfs2/pub
ls: cannot access /env/export/nfs2/pub: No such file or directory

4. 'ls' some shares on nfs[3..5] : same as nfs2, works great for the first share on nfsX, then fails on any other share on the same nfsX, whichever is the first.

I join the trace file as attachment to the report.
Your help is really appreciated.

Comment 4 Fabien Dupont 2009-07-08 07:26:03 UTC
I have forgotten to mention that, in /var/log/messages, we have the following error message for each failure :

update_negative_cache: key "<shareName>" not found in map.

Comment 5 Ian Kent 2009-07-08 08:27:09 UTC
(In reply to comment #4)
> I have forgotten to mention that, in /var/log/messages, we have the following
> error message for each failure :
> 
> update_negative_cache: key "<shareName>" not found in map.  

Somehow we need to get a debug log showing this problem.
Ian

Comment 6 Fabien Dupont 2009-07-08 11:39:08 UTC
I understand fully your point of view. But as soon as we put automount in debug mode everything works fine.

Then, we wanted to put some debug messages in source code from SRPM download here : http://download.fedora.redhat.com/pub/fedora/linux/updates/11/SRPMS/autofs-5.0.4-33.src.rpm. We compiled it with 'rpmbuild -bc --short-circuit' to just generate binaries, so that we could use the binary without installing a new RPM. Unfortunately, we couldn't manage to reproduce the issue, even in normal logging mode.

This gave us a strange idea : what if the binary package was the bad guy...
We made a fresh install of our computer : autofs package is autofs-5.0.4-33.i586.rpm. Then we applied the following protocol :

1. Download the SRPM : same source as upward.

# wget http://download.fedora.redhat.com/pub/fedora/linux/updates/11/SRPMS/autofs-5.0.4-33.src.rpm

2. Install dependencies for the package compilation :

# yum install rpm-build
# yum install hesiod-devel openldap-devel bison flex libxml2-devel
# yum install cyrus-sasl-devel openssl-devel libtirpc-devel

3. Recompile the package :

# rpmbuild --rebuild --target i586 autofs-5.0.4-33.src.rpm

4. Replace the package bundled with fresh install :

# service autofs stop
# yum erase autofs
# yum localinstall --nogpgcheck /root/rpmbuild/RPMS/i586/autofs-5.0.4-33.i586.rpm
# cp /etc/auto.master.rpmsave /etc/auto.master
# service autofs start

Then believe it or not, but everything works fine with this recompiled package !

A difference between recompiled and original package is that /usr/sbin/automount is not stripped in recompiled package. I'm not sure why stripping would/could modify its behavior : just a thing we've seen...

Comment 7 Mike Hanafey 2009-10-13 17:58:23 UTC
I have the same problem.

When running "automount -vdf" it works, but running as a service it does not.

I turned logging to "debug" in /etc/sysconfig/autofs, and the problem still occurs. Here is some failures from the log:

Oct 13 13:48:06 java automount[6857]: handle_packet: type = 3
Oct 13 13:48:06 java automount[6857]: handle_packet_missing_indirect: token 248, name site, request pid 6881
Oct 13 13:48:06 java automount[6857]: attempting to mount entry /data/site
Oct 13 13:48:06 java automount[6857]: lookup_mount: lookup(file): looking up site
Oct 13 13:48:06 java automount[6857]: update_negative_cache: key "site" not found in map.
Oct 13 13:48:06 java automount[6857]: dev_ioctl_send_fail: token = 248
Oct 13 13:48:06 java automount[6857]: failed to mount /data/site
Oct 13 13:48:06 java automount[6857]: handle_packet: type = 3
Oct 13 13:48:06 java automount[6857]: handle_packet_missing_indirect: token 249, name site, request pid 6881
Oct 13 13:48:06 java automount[6857]: attempting to mount entry /data/site
Oct 13 13:48:06 java automount[6857]: lookup_mount: lookup(file): looking up site
Oct 13 13:48:06 java automount[6857]: dev_ioctl_send_fail: token = 249
Oct 13 13:48:06 java automount[6857]: failed to mount /data/site
Oct 13 13:48:20 java automount[6857]: handle_packet: type = 3
Oct 13 13:48:20 java automount[6857]: handle_packet_missing_indirect: token 250, name TractorTest, request pid 6888
Oct 13 13:48:20 java automount[6857]: attempting to mount entry /data/TractorTest
Oct 13 13:48:20 java automount[6857]: lookup_mount: lookup(file): looking up TractorTest
Oct 13 13:48:20 java automount[6857]: update_negative_cache: key "TractorTest" not found in map.
Oct 13 13:48:20 java automount[6857]: dev_ioctl_send_fail: token = 250
Oct 13 13:48:20 java automount[6857]: handle_packet: type = 3
Oct 13 13:48:20 java automount[6857]: handle_packet_missing_indirect: token 251, name TractorTest, request pid 6888
Oct 13 13:48:20 java automount[6857]: failed to mount /data/TractorTest
Oct 13 13:48:20 java automount[6857]: attempting to mount entry /data/TractorTest
Oct 13 13:48:20 java automount[6857]: lookup_mount: lookup(file): looking up TractorTest
Oct 13 13:48:20 java automount[6857]: dev_ioctl_send_fail: token = 251
Oct 13 13:48:20 java automount[6857]: failed to mount /data/TractorTest

Comment 8 Mike Hanafey 2009-10-13 18:14:00 UTC
Additional details: After a "service autofs stop", "service autofs start" the first automount request works. So from the comment above, a reference to "/data/site" works fine if it comes immediately after automount starts, but the next automount attempt again fails.

Here is the successful followed by failure log:
Oct 13 14:01:43 java automount[7346]: handle_packet: type = 3
Oct 13 14:01:43 java automount[7346]: handle_packet_missing_indirect: token 257, name site, request pid 7365
Oct 13 14:01:43 java automount[7346]: attempting to mount entry /data/site
Oct 13 14:01:43 java automount[7346]: lookup_mount: lookup(file): looking up site
Oct 13 14:01:43 java automount[7346]: lookup_mount: lookup(file): site -> dna:/export/cx12g/site
Oct 13 14:01:43 java automount[7346]: parse_mount: parse(sun): expanded entry: dna:/export/cx12g/site
Oct 13 14:01:43 java automount[7346]: parse_mount: parse(sun): gathered options:
Oct 13 14:01:43 java automount[7346]: parse_mount: parse(sun): dequote("dna:/export/cx12g/site") -> dna:/export/cx12g/site
Oct 13 14:01:43 java automount[7346]: parse_mount: parse(sun): core of entry: options=, loc=dna:/export/cx12g/site
Oct 13 14:01:43 java automount[7346]: sun_mount: parse(sun): mounting root /data, mountpoint site, what dna:/export/cx12g/site, fstype nfs, options (null)
Oct 13 14:01:43 java automount[7346]: mount_mount: mount(nfs): root=/data name=site what=dna:/export/cx12g/site, fstype=nfs, options=(null)
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.140485
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.333052
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 236768 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.359474
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.833121
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 596297 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.735065
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.489497
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 612280 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.327940
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.568922
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 448430 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.943978
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.389946
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 666961 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.943978
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.389946
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 666961 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto udp version 0x30
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v3 random selection time: 0.551631
Oct 13 14:01:43 java automount[7346]: get_nfs_info: nfs v2 random selection time: 0.208786
Oct 13 14:01:43 java automount[7346]: get_nfs_info: host dna cost 380208 weight 0
Oct 13 14:01:43 java automount[7346]: get_nfs_info: called for host dna proto tcp version 0x30
Oct 13 14:01:43 java automount[7346]: prune_host_list: selected subset of hosts that support NFS3 over UDP
Oct 13 14:01:43 java automount[7346]: mount_mount: mount(nfs): calling mkdir_path /data/site
Oct 13 14:01:43 java automount[7346]: mount_mount: mount(nfs): calling mount -t nfs dna:/export/cx12g/site /data/site
Oct 13 14:01:43 java automount[7346]: mount(nfs): mounted dna:/export/cx12g/site on /data/site
Oct 13 14:01:43 java automount[7346]: dev_ioctl_send_ready: token = 257
Oct 13 14:01:43 java automount[7346]: mounted /data/site
Oct 13 14:01:49 java automount[7346]: handle_packet: type = 3
Oct 13 14:01:49 java automount[7346]: handle_packet_missing_indirect: token 258, name TractorTest, request pid 7373
Oct 13 14:01:49 java automount[7346]: attempting to mount entry /data/TractorTest
Oct 13 14:01:49 java automount[7346]: lookup_mount: lookup(file): looking up TractorTest
Oct 13 14:01:49 java automount[7346]: update_negative_cache: key "TractorTest" not found in map.
Oct 13 14:01:49 java automount[7346]: dev_ioctl_send_fail: token = 258
Oct 13 14:01:49 java automount[7346]: failed to mount /data/TractorTest
Oct 13 14:01:49 java automount[7346]: handle_packet: type = 3
Oct 13 14:01:49 java automount[7346]: handle_packet_missing_indirect: token 259, name TractorTest, request pid 7373
Oct 13 14:01:49 java automount[7346]: attempting to mount entry /data/TractorTest
Oct 13 14:01:49 java automount[7346]: lookup_mount: lookup(file): looking up TractorTest
Oct 13 14:01:49 java automount[7346]: dev_ioctl_send_fail: token = 259
Oct 13 14:01:49 java automount[7346]: failed to mount /data/TractorTest

Comment 9 Ian Kent 2009-10-14 02:00:26 UTC
(In reply to comment #8)
> Additional details: After a "service autofs stop", "service autofs start" the
> first automount request works. So from the comment above, a reference to
> "/data/site" works fine if it comes immediately after automount starts, but the
> next automount attempt again fails.

Post the whole debug log.

Comment 10 Mike Hanafey 2009-10-14 14:58:04 UTC
Created attachment 364760 [details]
/var/log/messages

/var/log/messages is attached:

[mkh@java] ~
% ls /data/pihome
lost+found  lsf  mkh-pi  root  SystemRCS

[mkh@java] ~
% ls /data/pidata
ls: cannot access /data/pidata: No such file or directory

# service autofs start (in root window)

[mkh@java] ~
% ls /data/pihome
lost+found  lsf  mkh-pi  root  SystemRCS

[mkh@java] ~
% ls /data/pidata
ls: cannot access /data/pidata: No such file or directory

[mkh@java] ~
% ls /data/pidata
ls: cannot access /data/pidata: No such file or directory

# umount /data/pihome (in root window)

[mkh@java] ~
% ls /data/pihome
lost+found  lsf  mkh-pi  root  SystemRCS

# umount /data/pihome (in root window)

[mkh@java] ~
% ls /data/pidata
ls: cannot access /data/pidata: No such file or directory

# umount /data/pihome (in root window)

[mkh@java] ~
% ls /data/pihome
lost+found  lsf  mkh-pi  root  SystemRCS

[mkh@java] ~

Comment 11 Ian Kent 2009-10-14 15:17:48 UTC
(In reply to comment #10)
> Created an attachment (id=364760) [details]
> /var/log/messages

Right, what's in the file map /etc/auto.data?

Ian

Comment 12 Mike Hanafey 2009-10-15 17:42:45 UTC
Here are auto.master and auto.data. fyi: pi is RedHat 5.3, dna is Solaris 8
% cat /etc/auto.master
#
#/net	-hosts

/cDNA/com	auto.cDNA
/cDNA.test/com	auto.cDNAtest
/data		auto.data
/apps		auto.apps
/apps.test	auto.appsTest
/home		auto.home


#==============================================================================

% cat /etc/auto.data
# $Id$
#
pidata					pi:/data01
pihome					pi:/export/home
scratch					espsfs03:/cropgen
Tractor					helix:/export/home/Tractor
TractorTest				helix:/export/home/TractorTest
site					dna:/export/cx12g/site
GenomeScan				dna:/export/cx13g/data/GenomeScan
miRNA					dna:/export/cx13h/data/miRNA
SvnWebSite				dna:/export/cx12g/svn/SvnWebSite
Icoria					dna:/export/cx11h/Icoria
Analysis				dna:/export/cx11h/Analysis
454RNAi					dna:/export/cx11h/454RNAi
lspdb					dna:/export/cx9g/cdna.prod/lspdb
working					dna:/export/cx10f/working
MyriadTrace				dna:/export/cx10g/MyriadTrace
MyriadTrace2				dna:/export/cx10f/MyriadTrace
IncyteOvergos				dna:/export/cx15g/data/lassap/IncyteOvergos
ssrFingerprint				dna:/export/cx14a/ssrFingerprint
SoybeanImport				dna:/export/cx14e/data/SoybeanImport
Resequence				dna:/export/cx11e/data/Resequence
genomes					dna:/export/cx9e/genomes
romeo					dna:/export/cx13f/data/romeo
cvs					dna:/export/cx12f/cvs
AcrobatDocs				dna:/export/cx12f/data/AcrobatDocs
patches					dna:/export/cx10d/data/patches
CAT					dna:/export/cx11e/data/CAT
snps-discover				dna:/export/cx12d/snps-discover
Teamware				dna:/export/cx11f/data/Teamware
Arabidopsis				dna:/export/cx10g/Arabidopsis
maizeBAC				dna:/export/cx13e/data/maizeBAC
lassap					dna:/export/cx15g/data/lassap
mpss					dna:/export/cx13e/data/mpss
hpcdb					eshpcfs01g:/biolib/db
hpctrace				espsfs:/biolib/trace/AgBiotech
admin					dna:/export/cx11f/data/admin
legato					dna:/export/cx12f/data/legato
phrap.screen				dna:/export/cx9g/cdna.prod/phrap.screen
download				dna:/export/cx10d/data/download
xpress					dna:/export/cx10a/data/xpress
misc					dna:/export/cx10d/data/misc
ws					dna:/export/cx10d/ws
tmp					dna:/export/cx10d/data/tmp
gt					dna:/export/cx10d/data/gt
stage					dna:/export/cx11f/data/stage
microb					bioservrp:/data3/microb/seq
microb2					bioservrp:/data1/microb
ricarc					dna:/export/cx11d/ricarc
JT					dna:/export/cx11h/Icoria/JT

Comment 13 Ian Kent 2009-10-16 06:54:43 UTC
Created attachment 365021 [details]
Fix map stale initialization for file map instance

Don't know how this got through testing of the file map
read optimization changes, but it did.

Comment 14 Ian Kent 2009-10-16 06:58:15 UTC
A package with the change posted above is included in the
package located at:
http://kojipkgs.fedoraproject.org/packages/autofs/5.0.4/42

I will request the package be pushed to the testing repository
but could you test this now please.

Comment 15 Ian Kent 2009-10-16 07:02:05 UTC
(In reply to comment #6)
> 
> Then believe it or not, but everything works fine with this recompiled package
> !
> 
> A difference between recompiled and original package is that
> /usr/sbin/automount is not stripped in recompiled package. I'm not sure why
> stripping would/could modify its behavior : just a thing we've seen...  

Based on what I've just seen it looks like you were mistaken.
There was definitely a bug here.

Comment 16 Mike Hanafey 2009-10-16 13:21:01 UTC
(In reply to comment #14)
> A package with the change posted above is included in the
> package located at:
> http://kojipkgs.fedoraproject.org/packages/autofs/5.0.4/42
> 
> I will request the package be pushed to the testing repository
> but could you test this now please.  

Works for me!

Comment 17 Fedora Update System 2009-10-20 00:42:40 UTC
autofs-5.0.4-42 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/autofs-5.0.4-42

Comment 18 Fedora Update System 2009-10-21 00:49:27 UTC
autofs-5.0.4-42 has been pushed to the Fedora 11 testing repository.  If problems still persist, please make note of it in this bug report.
 If you want to test the update, you can install it with 
 su -c 'yum --enablerepo=updates-testing update autofs'.  You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F11/FEDORA-2009-10652

Comment 19 Fedora Update System 2009-11-04 12:22:55 UTC
autofs-5.0.4-42 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.