Bug 993046

Summary: Change group failing for Active Directory "domain users" on rhs2.1 samba share directory
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Lalatendu Mohanty <lmohanty>
Component: samba
Assignee: Christopher R. Hertel <crh>
Status: CLOSED NOTABUG
QA Contact: Lalatendu Mohanty <lmohanty>
Severity: high
Priority: high
Version: 2.1
CC: jarrpa, rwheeler, sac, sbhaloth, sbose, sdharane, shaines, vagarwal
Doc Type: Bug Fix
Last Closed: 2013-08-19 08:46:03 UTC
Type: Bug
Bug Blocks: 956495    
Attachments:
Description                                Flags
smb.conf file from the RHS node            none
strace log while running chgrp command     none
winbind logs                               none

Description Lalatendu Mohanty 2013-08-05 13:52:59 UTC
Created attachment 782845
smb.conf file from the RHS node

Description of problem:

To integrate an RHS 2.1 Samba share with Windows Active Directory, we need to change the group of the shared directory to "<AD domain>+domain users" so that domain users in Active Directory can access it. "domain users" is a user group in Active Directory; any user created in the AD domain is automatically added to it. For the Samba share to work correctly with AD, the shared directory should therefore belong to "<AD domain>+domain users".

In our case the domain name is "RHSQE-DC", so the command below should work, but it fails:

chgrp "RHSQE-DC+domain users" /mnt/samba/gfs-vol1/rhsdata01
chgrp: invalid group: `RHSQE-DC+domain users'

Note: We tried the same steps on Anshi (i.e. RHS 2.0) and they worked fine.

Steps to Reproduce:

We are following the document below for Active Directory integration; it contains the exact steps.
https://access.redhat.com/site/articles/410303

1. Create an Active Directory setup.
2. Install the RHS-2.1-20130805.n.1 ISO.
3. Join the RHS node to the Active Directory domain.
4. Mount the Samba volume as a FUSE mount with the "-o acl" option,
   e.g.: mount -t glusterfs -o acl 10.70.35.174:/gfs-vol1 /mnt/samba/gfs-vol1/
5. Create a directory in the mount point, e.g.: mkdir /mnt/samba/gfs-vol1/rhsdata01
6. Run the chgrp command on the directory,
   e.g.: chgrp "RHSQE-DC+domain users" /mnt/samba/gfs-vol1/rhsdata01

Note: We are doing step #4 because the shared directory should belong to "RHSQE-DC+domain users", and with the glusterfs VFS plug-in Samba does not automatically mount the volume on RHS nodes.

During our testing, joining the domain works fine for the RHS node.
[root@BigBend-1 gfs-vol1]# net join -S rhsqe-dc1 -U administrator
Enter administrator's password:

Using short domain name -- RHSQE-DC
Joined 'BIGBEND-1' to dns domain 'rhsqe-dc.com'
No DNS domain configured for bigbend-1. Unable to perform DNS Update.
DNS update failed!

[root@BigBend-1 gfs-vol1]# net ads testjoin
Join is OK

The winbind command "wbinfo --domain-groups", which lists the domain groups, correctly includes "domain users":

[root@BigBend-1 gfs-vol1]# wbinfo --domain-groups
winrmremotewmiusers__
domain computers
domain controllers
schema admins
enterprise admins
cert publishers
domain admins
domain users
domain guests
group policy creator owners
ras and ias servers
allowed rodc password replication group
denied rodc password replication group
read-only domain controllers
enterprise read-only domain controllers
cloneable domain controllers
dnsadmins
dnsupdateproxy
hobbits

[root@BigBend-1 gfs-vol1]# gluster v info
 
Volume Name: gfs-vol1
Type: Distribute
Volume ID: 2f032a2d-b970-49fd-a8b0-30c0e9a3952d
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.70.35.174:/bricks/gfs-vol1-b1
Brick2: 10.70.35.174:/bricks/gfs-vol1-b2


10.70.35.174:/gfs-vol1 on /mnt/samba/gfs-vol1 type fuse.glusterfs (rw,allow_other,max_read=131072)


Below are the global settings in /etc/samba/smb.conf for AD integration. I have also attached smb.conf to this bug for reference.

[global]
#--authconfig--start-line--

# Generated by authconfig on 2013/08/05 07:16:30
# DO NOT EDIT THIS SECTION (delimited by --start-line--/--end-line--)
# Any modification may be deleted or altered by authconfig in future

   workgroup = RHSQE-DC
   password server = RHSQE-DC1.RHSQE-DC.COM
   realm = RHSQE-DC.COM
   security = ads
   netbios name = BigBend-1

  idmap uid = 10000-19999
  idmap gid = 10000-19999
  idmap config RHSQE-DC:backend = ad
  idmap config RHSQE-DC:default = yes
  idmap config RHSQE-DC:range = 10000000-19999999
  idmap config RHSQE-DC:schema_mode = rfc2307
  winbind nss info = rfc2307
  winbind enum users = no
  winbind enum groups = no
  winbind separator = +
  winbind use default domain = yes
  winbind nested groups = yes

   template shell = /bin/bash
   winbind offline logon = false

 

Version-Release number of selected component (if applicable):

rpm -qa | grep -i samba
samba-client-3.6.9-156.2.el6rhs.x86_64
samba-common-3.6.9-156.2.el6rhs.x86_64
samba4-libs-4.0.0-55.el6.rc4.x86_64
samba-glusterfs-3.6.9-156.2.el6rhs.x86_64
samba-winbind-clients-3.6.9-156.2.el6rhs.x86_64
samba-3.6.9-156.2.el6rhs.x86_64
samba-winbind-3.6.9-156.2.el6rhs.x86_64

How reproducible:

Always

Actual results:

chgrp "RHSQE-DC+domain users" /mnt/samba/gfs-vol1/rhsdata01
chgrp: invalid group: `RHSQE-DC+domain users'

Expected results:

The chgrp command above should succeed.

Additional info:

To me it looks like a winbind issue. winbind should correctly get the group information from AD when we run the command, and the command should succeed.
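
chgrp resolves the group name through NSS, so one way to separate a name-resolution failure from a chgrp or filesystem problem is to query the group directly with getent. The following is only a minimal sketch; the helper name group_resolves is hypothetical and is not part of Samba or coreutils.

```shell
# Hypothetical helper: succeed if the group name resolves through NSS
# (files, winbind, ...), which is the lookup chgrp depends on.
group_resolves() {
    getent group "$1" >/dev/null 2>&1
}

# Example with the group from this report; prints a hint when the lookup fails.
group_resolves "RHSQE-DC+domain users" || echo "group not visible through NSS"
```

If the helper fails while "wbinfo --domain-groups" still lists the group, the problem is likely in the mapping from the AD group to a POSIX GID rather than in the domain join itself.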

Comment 4 Lalatendu Mohanty 2013-08-06 13:34:56 UTC
I tested the same AD integration against the latest BigBend ISO today and hit the same issue.
To see whether the latest RHEL 6.4 has the issue too, I tested against RHEL 6.4, and everything works fine there.

After working through this issue, I am fairly sure that winbind is the culprit here. winbindd is responsible for getting the SID from AD and converting it to a UID/GID, and it is failing to do that here.

Check the winbindd log from the BigBend RHS node below, in particular the message "Could not fetch our SID - did we join?". I don't see a similar message on Anshi or RHEL 6.4.

cat /var/log/samba/log.winbindd

[2013/08/06 05:10:35,  0] winbindd/winbindd.c:1376(main)
  winbindd version 3.6.9-156.2.el6rhs started.
  Copyright Andrew Tridgell and the Samba Team 1992-2011
[2013/08/06 05:10:35.050191,  0] winbindd/winbindd_cache.c:3168(initialize_winbindd_cache)
  initialize_winbindd_cache: clearing cache and re-creating with version number 2
[2013/08/06 05:10:35.164180,  0] winbindd/winbindd_util.c:635(init_domain_list)
  Could not fetch our SID - did we join?
[2013/08/06 05:10:35.164233,  0] winbindd/winbindd.c:1136(winbindd_register_handlers)
  unable to initialize domain list
[2013/08/06 05:11:47,  0] winbindd/winbindd.c:1376(main)
  winbindd version 3.6.9-156.2.el6rhs started.
  Copyright Andrew Tridgell and the Samba Team 1992-2011
[2013/08/06 05:11:47.214599,  0] winbindd/winbindd_cache.c:3168(initialize_winbindd_cache)
  initialize_winbindd_cache: clearing cache and re-creating with version number 2
[2013/08/06 05:21:31.380774,  0] winbindd/winbindd.c:240(winbindd_sig_term_handler)
  Got sig[15] terminate (is_parent=1)
[2013/08/06 05:26:06,  0] winbindd/winbindd.c:1376(main)
  winbindd version 3.6.9-156.2.el6rhs started.
  Copyright Andrew Tridgell and the Samba Team 1992-2011
[2013/08/06 05:26:06.267879,  0] winbindd/winbindd_cache.c:3168(initialize_winbindd_cache)
  initialize_winbindd_cache: clearing cache and re-creating with version number 2

Comment 5 Sumit Bose 2013-08-06 15:18:13 UTC
Just two things come to mind that should be checked. Please make sure that winbind is listed in /etc/nsswitch.conf for passwd and group.

It is not recommended to use 'winbind use default domain = yes'; see man smb.conf for details. Setting it to 'yes' might force you to use just 'domain users' instead of `RHSQE-DC+domain users' with chgrp.
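
The nsswitch.conf check can be scripted. The following is a minimal sketch, assuming the usual "database: source ..." line layout; the helper name nss_has_winbind is hypothetical.

```shell
# Hypothetical helper: succeed only if both the passwd and group lines
# of the given nsswitch.conf-style file list "winbind" as a source.
nss_has_winbind() {
    conf=${1:-/etc/nsswitch.conf}
    for db in passwd group; do
        # Match the line whose first field is "<db>:" and scan its sources.
        awk -v db="$db" '
            $1 == (db ":") {
                for (i = 2; i <= NF; i++) if ($i == "winbind") found = 1
            }
            END { exit (found ? 0 : 1) }
        ' "$conf" 2>/dev/null || return 1
    done
    return 0
}

nss_has_winbind /etc/nsswitch.conf || echo "winbind missing for passwd or group"
```

The helper takes the file path as an argument so it can also be pointed at a copy of the file collected from another node.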

Comment 6 Lalatendu Mohanty 2013-08-06 17:06:44 UTC
Hei Sumit,

I tried the two things you mentioned. Below are the commands and their output, plus the contents of nsswitch.conf, which does list winbind. Let me know if you want me to try something else.

#############################################

net join -S rhsqe-dc1 -U administrator
Enter administrator's password:

Using short domain name -- RHSQE-DC
Joined 'BIGBEND-1' to dns domain 'rhsqe-dc.com'
No DNS domain configured for bigbend-1. Unable to perform DNS Update.
DNS update failed!

[root@BigBend-1 ~]# net ads testjoin
Join is OK
[root@BigBend-1 ~]# wbinfo -u
administrator
guest
krbtgt
hobbit1
[root@BigBend-1 ~]# wbinfo -g
winrmremotewmiusers__
domain computers
domain controllers
schema admins
enterprise admins
cert publishers
domain admins
domain users
domain guests
group policy creator owners
ras and ias servers
allowed rodc password replication group
denied rodc password replication group
read-only domain controllers
enterprise read-only domain controllers
cloneable domain controllers
dnsadmins
dnsupdateproxy
hobbits

[root@BigBend-1 ~]# chgrp "domain users" /mnt/samba/gfs-vol1/rhsdata01
chgrp: invalid group: `domain users'

In /etc/nsswitch.conf I have the entries below:

passwd:     files winbind
shadow:     files winbind
group:      files winbind

Comment 7 Sumit Bose 2013-08-06 19:56:04 UTC
Can you attach the output of 'strace -f -s 128 chgrp "domain users" /mnt/samba/gfs-vol1/rhsdata0' ?

Does 6.4 work with the same AD server or a different one? I'm asking because your idmap configuration expects the POSIX UIDs and GIDs to be set on the AD side in rfc2307 attributes. Does the 'domain users' group on your AD server have a gidNumber attribute?

Can you check with "wbinfo -n 'domain users'" and "wbinfo -Y S-....." if winbind can translate the name to a SID and then the SID to a GID?
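
The two lookups can be chained so that the SID does not have to be copied by hand. This is a rough sketch, assuming "wbinfo -n" prints the SID as the first field of its output (as in the output quoted in this report); both helper names are hypothetical.

```shell
# Hypothetical helper: extract the SID (first whitespace-separated field)
# from a line of `wbinfo -n` output.
sid_from_wbinfo_line() {
    printf '%s\n' "$1" | awk '$1 ~ /^S-[0-9]/ { print $1; exit }'
}

# Hypothetical helper: resolve a group name to a SID, then the SID to a GID.
# Needs a running winbindd; returns non-zero if either step fails.
group_name_to_gid() {
    line=$(wbinfo -n "$1") || return 1
    sid=$(sid_from_wbinfo_line "$line")
    [ -n "$sid" ] || return 1
    wbinfo -Y "$sid"
}

# Parsing demo on a line in the format reported in this bug:
sid_from_wbinfo_line 'S-1-5-21-218531155-2581107591-316442423-513 SID_DOM_GROUP (2)'
# prints S-1-5-21-218531155-2581107591-316442423-513

# On a joined node: group_name_to_gid 'domain users'
```

A failure in the first step points at name lookup, while a failure in the second step points at the idmap configuration.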

Comment 8 Lalatendu Mohanty 2013-08-07 07:23:10 UTC
Created attachment 783700
strace log while running chgrp command

Comment 9 Lalatendu Mohanty 2013-08-07 07:37:02 UTC
Hei Sumit,

I have attached the strace log for the chgrp command in my previous comment.

Yes, RHEL 6.4 works with the same AD domain that the rhs2.1 (i.e. BigBend) server is joined to. I have only one AD server and one domain.

Not sure if the 'domain users' group on the AD server has a gidNumber. I tried to check but couldn't confirm it. However, "wbinfo -n 'domain users'" and "wbinfo -Y <SID>" give more information: "wbinfo -Y <SID>" fails on the rhs2.1 server but passes on RHEL 6.4.

The commands and output below are from rhs2.1
####################################
[root@BigBend-1 ~]# wbinfo -n 'domain users'
S-1-5-21-218531155-2581107591-316442423-513 SID_DOM_GROUP (2)


[root@BigBend-1 ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
failed to call wbcSidToGid: WBC_ERR_DOMAIN_NOT_FOUND
Could not convert sid S-1-5-21-218531155-2581107591-316442423-513 to gid

The commands and output below are from RHEL 6.4
###################################
[root@RHEL6 ~]# wbinfo -n 'domain users'
S-1-5-21-218531155-2581107591-316442423-513 SID_DOM_GROUP (2)

[root@RHEL6 ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
16777222

So it seems winbind is not able to convert the SID to a GID on the RHS2.1/BigBend server.

Comment 10 Sumit Bose 2013-08-07 09:04:38 UTC
Can you set 'log level = 10' in smb.conf, stop winbind, remove winbind logs, start winbind, call "wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513" and then attach the winbind log files to the ticket?

Comment 11 Lalatendu Mohanty 2013-08-07 14:37:55 UTC
Sumit,

Today I tried a few permutations of smb.conf settings on the rhs2.1 server and came to the conclusion that if I remove "idmap config RHSQE-DC:backend = ad" from smb.conf, both "wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513" and the command 'chgrp "RHSQE-DC+domain users" /mnt/samba/gfs-vol1/rhsdata01/' work fine.

If "idmap config RHSQE-DC:backend = ad" is not set in smb.conf, idmap uses tdb as its database by default. I am not sure whether "ad" should also work as the back-end database for idmap. I chose these settings (e.g. rfc2307, default domain yes and the ad database) by following the AD integration document from the Red Hat reference architecture team, i.e. https://access.redhat.com/site/articles/410303.

So the smb.conf settings below work fine:

[global]
#--authconfig--start-line--

# Generated by authconfig on 2013/08/07 09:06:12
# DO NOT EDIT THIS SECTION (delimited by --start-line--/--end-line--)
# Any modification may be deleted or altered by authconfig in future

   workgroup = RHSQE-DC
   password server = RHSQE-DC1.RHSQE-DC.COM
   realm = RHSQE-DC.COM
   security = ads
   netbios name = rhs21ISoTest

   idmap uid = 10000-19999
   idmap gid = 10000-19999
   idmap config RHSQE-DC:default = yes
   idmap config RHSQE-DC:range = 10000000-19999999

   idmap config RHSQE-DC:schema_mode = rfc2307
   winbind nss info = rfc2307

   winbind enum users = false
   winbind enum groups = false
   winbind separator = +
   winbind use default domain = yes
   winbind nested groups = yes

   template shell = /bin/bash
   winbind offline logon = false
#--authconfig--end-line--

##########################

With the above smb.conf file, I get the GID of "domain users" as 10002, and sometimes as 10006, for the same AD if I redo the settings on the rhs2.1 server.

[root@rhs21ISoTest ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
10002

#re-do the settings (i.e. stop samba and winbind, remove the cached tdb files, do a net join to AD, start winbind and samba; the detailed steps are at the end)

[root@rhs21ISoTest ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
10006

Is it a bug? I am not able to find the reason behind the GID change. Kindly let me know.

Below are the steps I perform after any smb.conf change. Kindly let me know if they are fine.

 1. Change smb.conf, vi /etc/samba/smb.conf
 2. service smb stop;service winbind stop
 3. rm -f /var/lib/samba/*
 4. kdestroy;klist
 5. net join -S rhsqe-dc1 -U administrator
 6. service winbind start;service smb start

Comment 12 Lalatendu Mohanty 2013-08-07 14:57:44 UTC
If you guys think that removing "idmap config RHSQE-DC:backend = ad" is not a big issue, we can remove "TestBlocker" from the bug

Comment 13 Sumit Bose 2013-08-07 16:08:16 UTC
I would not recommend using the tdb idmap backend. When running in a cluster, each node might have a different set of POSIX UIDs and GIDs. Additionally, when the related tdb is deleted, new IDs are assigned; this is what you see, and it is expected behavior for the tdb backend.

Please try with the AD backend again and attach the winbind logs.

Comment 14 Lalatendu Mohanty 2013-08-07 18:48:35 UTC
Created attachment 784074
winbind logs


Below are the steps performed for collecting the logs

service smb stop;service winbind stop
cd /var/log/samba
mv log.winbindd* old/
rm -f /var/lib/samba/*
kdestroy;klist
net join -S rhsqe-dc1 -U administrator
service winbind start;service smb start
wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
tar cvzf winbindd-logs.tar.gz log.winbindd*

Comment 15 Lalatendu Mohanty 2013-08-07 19:09:29 UTC
I can see below logs in log.winbindd for "wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513"


                  sid                      : S-1-5-21-218531155-2581107591-316442423-513
[2013/08/07 14:41:02.426741,  1] ../librpc/ndr/ndr.c:284(ndr_print_function_debug)
       wbint_Sid2Gid: struct wbint_Sid2Gid
          out: struct wbint_Sid2Gid
              gid                      : *
                  gid                      : 0x0000000000000000 (0)
              result                   : NT_STATUS_NONE_MAPPED
[2013/08/07 14:41:02.426877,  5] winbindd/winbindd_sid_to_gid.c:90(winbindd_sid_to_gid_recv)
  Could not convert sid S-1-5-21-218531155-2581107591-316442423-513: NT_STATUS_NONE_MAPPED
[2013/08/07 14:41:02.426942, 10] winbindd/winbindd.c:707(wb_request_done)
  wb_request_done[4326:SID_TO_GID]: NT_STATUS_NONE_MAPPED

And also in log.winbindd-idmap

[2013/08/07 14:41:02.426352, 10] winbindd/idmap_util.c:268(idmap_sid_to_gid)
  sid [S-1-5-21-218531155-2581107591-316442423-513] is not mapped
[2013/08/07 14:41:02.426404, 10] lib/gencache.c:183(gencache_set_data_blob)
  Adding cache entry with key = IDMAP/SID2GID/S-1-5-21-218531155-2581107591-316442423-513 and timeout = Wed Aug  7 14:43:02 2013
   (120 seconds ahead)
[2013/08/07 14:41:02.426495,  1] ../librpc/ndr/ndr.c:284(ndr_print_function_debug)
       wbint_Sid2Gid: struct wbint_Sid2Gid
          out: struct wbint_Sid2Gid
              gid                      : *
                  gid                      : 0x0000000000000000 (0)
              result                   : NT_STATUS_NONE_MAPPED
[2013/08/07 14:41:02.426611,  4] winbindd/winbindd_dual.c:1317(child_handler)
  Finished processing child request 59
[2013/08/07 14:41:02.426647, 10] winbindd/winbindd_dual.c:1333(child_handler)
  Writing 3508 bytes to parent

Comment 16 Sumit Bose 2013-08-09 15:05:54 UTC
From the logs I get the impression that the POSIX ID attributes are missing here. Can you check on AD with 'Active Directory Users and Computers' whether the group has the 'Unix Attributes' tab and whether a GID is set there?

If there is no 'Unix Attributes' tab, then the Unix services are not installed on the server. In this case I would recommend to use the idmap rid backend for testing. This gives consistent UIDs and GIDs on multiple servers (if the idmap configuration in smb.conf is the same) and does not require any settings on the AD side. It still would make me wonder why it works with 6.4.

Comment 17 Lalatendu Mohanty 2013-08-12 11:43:46 UTC
I checked the AD and, as you suspected, the Unix attributes were missing. In Windows 2012 (the AD server) the NIS server, i.e. "Identity Management for UNIX", is deprecated and not installed by default.

I have installed "Identity Management for UNIX"/"Unix attributes" by following the TechNet article below.

http://technet.microsoft.com/en-us/library/cc731178.aspx#BKMK_command

I also explicitly added "domain users" to the NIS group, assigned it a GID, cleared the cache on the RHS node and ran the command below to check whether it can convert the SID to a GID. However, it failed again with a new error.

wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
failed to call wbcSidToGid: WBC_ERR_DOMAIN_NOT_FOUND
Could not convert sid S-1-5-21-218531155-2581107591-316442423-513 to gid

I also tried "rid" as the idmap back-end (i.e. "idmap config RHSQE-DC:backend = rid") and it works fine (see below).

[root@rhs21ISoTest ~]# wbinfo -n 'domain users'
S-1-5-21-218531155-2581107591-316442423-513 SID_DOM_GROUP (2)
[root@rhs21ISoTest ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
10000513
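
The value 10000513 is consistent with the formula given in the idmap_rid man page, ID = RID - BASE_RID + LOW_RANGE_ID, applied to the range 10000000-19999999 from smb.conf. The following sketch reproduces the arithmetic; it assumes base_rid is left at 0, and the function name is hypothetical.

```shell
# Sketch of the idmap_rid formula: ID = RID - BASE_RID + LOW_RANGE_ID.
# The third argument (base_rid) is assumed to default to 0.
rid_to_unix_id() {
    rid=$1; low=$2; base=${3:-0}
    echo $((rid - base + low))
}

# The RID of "domain users" is the last component of its SID.
sid=S-1-5-21-218531155-2581107591-316442423-513
rid_to_unix_id "${sid##*-}" 10000000
# prints 10000513
```

Because the result depends only on the SID and the configured range, every node with the same idmap settings should arrive at the same GID, which is the consistency property mentioned in comment 16.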

Regarding RHEL 6.4: its smb.conf doesn't have the idmap "ad" backend settings, as the document I referred to did not suggest those entries for RHEL 6.4. The other reason is that when I started the integration I was not fully aware of what these smb.conf entries mean, so I never thought a single entry would make so much difference. I will also check RHEL 6.4 with the ad backend.

Comment 19 Sumit Bose 2013-08-12 13:11:48 UTC
Since the GID is now 10000, you might need to tune 'idmap config RHSQE-DC:range = 10000000-19999999' accordingly for the AD backend. But if the RID backend works for you, I would suggest using it for further testing, since it does not require any modifications in AD.

Comment 20 Lalatendu Mohanty 2013-08-12 13:31:20 UTC
Yup, after changing it from "idmap config RHSQE-DC:range = 10000000-19999999" to "idmap config RHSQE-DC:range = 10000-19999999" it works fine now.

[root@rhs21ISoTest ~]# wbinfo -Y S-1-5-21-218531155-2581107591-316442423-513
10000

chgrp command is working fine as expected

[root@rhs21ISoTest ~]# chgrp "domain users" /mnt/samba/gfs-vol1/rhsdata01/
[root@rhs21ISoTest ~]# echo $?
0
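
The range change matters because, with the ad backend, the configured range acts as a filter on the IDs stored in AD. The following is a small sketch of that check with the two ranges tried above; the helper name id_in_range is hypothetical.

```shell
# Hypothetical helper: succeed if an ID falls inside an idmap range "LOW-HIGH".
id_in_range() {
    id=$1
    low=${2%-*}     # text before the dash
    high=${2#*-}    # text after the dash
    [ "$id" -ge "$low" ] && [ "$id" -le "$high" ]
}

id_in_range 10000 10000000-19999999 || echo "10000 is outside 10000000-19999999"
id_in_range 10000 10000-19999999 && echo "10000 is inside 10000-19999999"
```

Both messages are printed: the GID assigned in AD falls below the original range and inside the widened one.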

I also agree with using rid as the back-end, as it does not require changes on AD. I think we can remove the test blocker, but we need to document rid as the preferred back-end and also the steps for anyone who wants to use "ad" as the back-end (i.e. for a Windows 2012 server, NIS needs to be installed explicitly and "domain users" and any other required groups should be added to the NIS domain).

I will also test against a Windows 2008 AD server to check how it fares.

Comment 21 Lalatendu Mohanty 2013-08-12 14:08:45 UTC
Below are the conclusions we can draw from this bug.

For Active Directory integration of RHS 2.1 with Windows 2012 server, we prefer rid as the idmap backend ("idmap config RHSQE-DC:backend = rid"). The default back-end, tdb, is not recommended, even though it works in some configurations.

For customers already using "Identity Management for UNIX" the AD back-end would be useful. For other/new customers the RID back-end is easier to set up.
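
For the RID case, the per-domain idmap lines can be generated instead of typed by hand. This sketch prints only the two settings that appear in this report (backend and range); the function name is hypothetical, and the output is meant to be reviewed and merged into the [global] section manually.

```shell
# Hypothetical helper: print the per-domain idmap lines for the rid backend.
print_rid_idmap() {
    domain=$1; range=$2
    printf '   idmap config %s:backend = rid\n' "$domain"
    printf '   idmap config %s:range = %s\n' "$domain" "$range"
}

print_rid_idmap RHSQE-DC 10000000-19999999
```

Other settings from the original file, such as "schema_mode = rfc2307", are left out here on the assumption that they are only relevant to the ad backend.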

To use "ad" as the idmap back-end, we need to install "Identity Management for UNIX"/"Unix attributes" on the AD server (Windows Server 2012), and the required groups/users should be added to the NIS domain. The idmap range in smb.conf must also be adjusted to cover the GIDs assigned in AD. I am not sure how this works with Windows 2008 Active Directory.



The following TechNet article can be used as a reference for installing "Identity Management for UNIX" on a Windows AD server.
http://technet.microsoft.com/en-us/library/cc731178.aspx#BKMK_command

Comment 22 Scott Haines 2013-08-16 13:20:30 UTC
Per 08/16 scrum-of-scrum meeting, not a blocker.

Comment 23 Lalatendu Mohanty 2013-08-19 08:46:03 UTC
We have communicated our findings to the original authors of https://access.redhat.com/site/articles/410303 from the reference architecture team.

I don't think we have any action items pending on this bug, hence closing it.