Bug 64119 - Initial samba service start failed on service creation
Summary: Initial samba service start failed on service creation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: clumanager
Version: 2.1
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jason Baron
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2002-04-25 21:49 UTC by Tim Burke
Modified: 2013-03-06 05:55 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2002-04-26 14:55:14 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2002:226 0 normal SHIPPED_LIVE Fixes for clumanager addressing starvation and service hangs 2002-10-08 04:00:00 UTC

Description Tim Burke 2002-04-25 21:49:11 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.2.1) Gecko/20010901

Description of problem:
I created a new samba service by running cluadmin on clug.  When it asked which
cluster member to start the service on, I chose the other cluster member,
cluh.  The service start failed, but subsequent starts succeeded.

Version-Release number of selected component (if applicable):


How reproducible:
Didn't try

Steps to Reproduce:

	Added a new samba service:

Service name: smb1
Preferred member [None]: ?
Choose the preferred member for the service.
Enter clug (or 0), cluh (or 1), or None.
Preferred member [None]: 1
Relocate when the preferred member joins the cluster (yes/no/?) [no]: yes
User script (e.g., /usr/foo/script or None) [None]:
Status check interval [0]: 5
Do you want to add an IP address to the service (yes/no/?) [no]: yes

        IP Address Information

IP address: 172.16.65.161
Netmask (e.g. 255.255.255.0 or None) [None]:
Broadcast (e.g. X.Y.Z.255 or None) [None]:
Do you want to (a)dd, (m)odify, (d)elete or (s)how an IP address, or are you
(f)inished adding IP addresses [f]:
Do you want to add a disk device to the service (yes/no/?) [no]: yes

Disk Device Information

Device special file (e.g., /dev/sdb4): /dev/sdb6
Filesystem type (e.g., ext2, or ext3): ext3
Mount point (e.g., /usr/mnt/service1) [None]: /mnt/smb1
Mount options (e.g., rw,nosuid,sync): rw,sync
Forced unmount support (yes/no/?) [yes]:
Would you like to allow NFS access to this filesystem (yes/no/?)  [no]:
Would you like to share to Windows clients (yes/no/?)  [no]: yes

You will now be prompted for the Samba configuration:
Samba share name: smb1

The samba config file /etc/samba/smb.conf.smb1 does not exist.

Would you like a default config file created (yes/no/?)  [no]: yes

Successfully created daemon lock directory /var/cache/samba/smb1.
Please run `mkdir /var/cache/samba/smb1` on the other cluster member.

Successfully created /etc/samba/smb.conf.smb1.
Please remember to make necessary customizations and then copy the file
over to the other cluster member.

########################
At this point, on another window on this same system:
vi /etc/samba/smb.conf.smb1 - just to set the share writable
scp !$ cluh:/etc/samba/
#########################

Do you want to (a)dd, (m)odify, (d)elete or (s)how DEVICES, or are you
(f)inished adding DEVICES [f]:
name: smb1
preferred node: cluh
relocate: yes
user script: None
monitor interval: 5
IP address 0: 172.16.65.161
  netmask 0: None
  broadcast 0: None
device 0: /dev/sdb6
  mount point, device 0: /mnt/smb1
  mount fstype, device 0: ext3
  mount options, device 0: rw,sync
  force unmount, device 0: yes
  samba share, device 0: smb1
Add smb1 service as shown? (yes/no/?) yes
  0) clug
  1) cluh
  c) cancel


Choose member to start service on: 1
Error: Failed to start service smb1
cluadmin>

Looking at cluh's /var/log/messages:
Apr 25 17:13:16 cluh syslogd 1.4.1: restart.
Apr 25 17:14:06 cluh kernel: kjournald starting.  Commit interval 5 seconds
Apr 25 17:14:06 cluh kernel: EXT3 FS 2.4-0.9.11, 3 Oct 2001 on sd(8,21), internal journal
Apr 25 17:14:06 cluh kernel: EXT3-fs: mounted filesystem with ordered data mode.
Apr 25 17:15:29 cluh rpc.mountd: export request from 172.16.65.159
Apr 25 17:16:55 cluh sshd(pam_unix)[3333]: session opened for user root by (uid=0)
Apr 25 17:17:20 cluh sshd(pam_unix)[3333]: session closed for user root
Apr 25 17:18:14 cluh rpc.mountd: authenticated unmount request from tim.boston.redhat.com:1005 for /mnt/nfs1 (/mnt/nfs1)
Apr 25 17:33:45 cluh sshd(pam_unix)[18261]: session opened for user root by (uid=0)
Apr 25 17:33:45 cluh sshd(pam_unix)[18261]: session closed for user root
Apr 25 17:33:54 cluh clusvcmgrd[1071]: <warning> Cannot get service name for service #1
Apr 25 17:33:54 cluh clusvcmgrd[18342]: <warning> Cannot get service name for service #1
Apr 25 17:33:54 cluh clusvcmgrd[18342]: <warning> Cannot get service name for service #1
Apr 25 17:33:54 cluh clusvcmgrd: [18343]: <err> service error: Cannot get service name for service entry 1, err=2
Apr 25 17:33:54 cluh clusvcmgrd[18342]: <warning> Cannot get service name for service #1
Apr 25 17:33:54 cluh clusvcmgrd[18342]: <warning> Cannot get service name for service #1
Apr 25 17:33:54 cluh clusvcmgrd: [18361]: <err> service error: Cannot get service name for service entry 1, err=2
Apr 25 17:33:54 cluh clusvcmgrd[18342]: <warning> Cannot get service name for service #1

##################################
These systems are running the new stuff:
[root@cluh nfs]# rpm -qa | grep clumanager
clumanager-1.0.11-1

I thought those "can't get service name" messages were cleaned up?  Or were
they cleaned up and submitted after version 11?
###################################

So now I try to start the service on clug instead (the node on which
I am running cluadmin), and this succeeds:

cluadmin> service enable
  0) smb1
  c) cancel

Choose service to enable: 0
Are you sure? (yes/no/?) yes
  0) clug
  1) cluh
  c) cancel

Choose member: 0
Enabling smb1 on member clug. Service  enabled.
cluadmin>

OK, that worked, now try disabling the service and then starting it over
on the other member which previously failed:

cluadmin> service disable smb1
Are you sure? (yes/no/?) yes
Disabling smb1. Service smb1 disabled.
cluadmin> service enable smb1
  0) clug  1) cluh
  c) cancel

Choose member: 1
Are you sure? (yes/no/?) yes
Enabling smb1 on member cluh. Service smb1 enabled.
cluadmin>

Now this one worked.  I didn't do anything over on cluh.  So why did it
fail the first time?  (resarray krap?)

Looking at /var/log/messages over on cluh, there are no new entries complaining
that it can't find the service name.



Actual Results:  Initially, the service failed to start.

(Although a prior NFS service I created to start on the other cluster
member worked fine.)

Expected Results:  Service should have started fine initially.

Additional info:

Comment 1 Jason Baron 2002-04-26 14:55:09 UTC
This has nothing to do with resarray.
Based on the log file, the service manager tried to start the service
but errored out because the service script could not get the service name.
Subsequently, the service manager tried to stop and disable the
service; these operations failed as well because the service script
again could not get the service name. This is likely because the
getconfig program, as used by the service scripts, reads service
configuration from the /etc/cluster.conf file, which had likely not
yet been written out to be consistent with the shared database.
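The failure mode described above (service scripts reading /etc/cluster.conf before the new service entry has been written out from the shared database) suggests an admin-side workaround: poll the local config file until the new service appears before attempting the first start. A minimal sketch, assuming the service name shows up as plain text in the config file; the `wait_for_service` helper, the grep-based check, and the retry counts are illustrative, not part of clumanager's actual interface:

```shell
#!/bin/sh
# Wait for a newly added service to appear in the local cluster config
# before starting it. Path, pattern, and retry logic are assumptions
# for illustration; clumanager's config format may need a stricter check.

wait_for_service() {
    name=$1
    conf=${2:-/etc/cluster.conf}
    tries=${3:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Plain-text match on the service name (an assumption).
        if grep -q "$name" "$conf" 2>/dev/null; then
            return 0    # entry visible in the local config
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1            # timed out waiting for propagation
}
```

On the reporter's setup this would run on cluh, e.g. `wait_for_service smb1 && ...` before the first enable attempt, giving the config write time to land.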



Comment 2 Lon Hohberger 2002-07-17 22:05:24 UTC
Patch in pool.

