Bug 994683

Summary: App Create Error: 'Cannot validate input uid'
Product: OpenShift Online
Reporter: Thomas Wiest <twiest>
Component: Pod
Assignee: Dan McPherson <dmcphers>
Status: CLOSED DUPLICATE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 2.x
CC: abhgupta, dmcphers, dtrainor, jhonce, twiest
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1003647
Environment:
Last Closed: 2014-04-16 17:59:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1003647    
Attachments:
Description: Stack trace from gear creation providing more information on this error
Flags: none

Description Thomas Wiest 2013-08-07 18:47:43 UTC
Description of problem:
We're sporadically getting this error when trying to create apps in PROD.

DEBUG: rhc app create -k -a 'd1308070951drupal' -t 'php-5.3' 
Application Options
-------------------
  Namespace:  openshiftnagios
  Cartridges: php-5.3
  Gear Size:  default
  Scaling:    no

Creating application 'd1308070951drupal' ... 
Unable to complete the requested operation due to: Cannot validate input uid: value should be a number.
Reference ID: 58852561d1deebdb2171df7fb28dd1e9
Reference ID: 58852561d1deebdb2171df7fb28dd1e9
DEBUG: rhc cartridge add -k -a 'd1308070951drupal' -c 'mysql-5.1' 
Adding mysql-5.1 to application 'd1308070951drupal' ... Application 'd1308070951drupal' not found.
Adding mysql-5.1 to application 'd1308070951drupal' ... Application 'd1308070951drupal' not found.
Exception: No such file or directory - /tmp/d20130807-2873-1mwdmsh/d1308070951drupal
/usr/local/lib/rhc_helper.rb:183:in `chdir'
/usr/local/lib/rhc_helper.rb:183:in `create_drupal'
/usr/local/bin/nagios-ctl-app:132:in `main'
/usr/local/bin/nagios-ctl-app:161
Connection to ex-srv1.prod.rhcloud.com closed.
Exit Code: 0



Version-Release number of selected component (if applicable):
We saw it in both 2.0.30 and after upgrading to 2.0.31.


How reproducible:
Very sporadic, only seen in PROD.

Steps to Reproduce:
1. unknown, just seen in PROD.


Actual results:
Sporadically fails with "Cannot validate input uid"


Expected results:
No error

Comment 1 Abhishek Gupta 2013-08-15 22:41:59 UTC
Can you please provide the broker, mcollective and platform (node) logs for a request that fails with this error?

Comment 3 Abhishek Gupta 2013-08-19 20:59:40 UTC
Providing snippets of logs from the broker and mcollective in the previous comment. Seems like useradd is failing because it is unable to lock /etc/passwd.

Comment 4 Mrunal Patel 2013-08-19 21:13:03 UTC
Most likely some of these files were left behind by a crashing useradd/userdel operation.

/etc/passwd.lock
/etc/shadow.lock
/etc/group.lock
/etc/gshadow.lock

Comment 5 Mrunal Patel 2013-09-04 19:42:42 UTC
Have we seen this issue again in PROD?

Thanks,
Mrunal

Comment 6 Thomas Wiest 2013-09-05 16:10:18 UTC
We've changed how we're doing these creates now.

We're now using the --from-code method of creating a drupal quickstart.

So, no, we're no longer seeing this problem, but I don't know if that's because it's fixed or because we changed how we're doing the creates.

Comment 7 Mrunal Patel 2013-09-10 19:28:24 UTC
Lowering severity for now, since we haven't been able to find the root cause from the logs and we haven't seen the issue again. It will be easier to debug if we can inspect the nodes if/when the issue is seen.

Comment 8 Dan Trainor 2013-09-25 00:55:34 UTC
Hey Thomas, I'm at a client site right now and they've encountered this.

Can you please tell me more about which component this --from-code method was changed in?  I'd like to settle this one.  I've attached a full stack trace in the hopes it helps.

Comment 9 Dan Trainor 2013-09-25 00:56:34 UTC
Created attachment 802523 [details]
Stack trace from gear creation providing more information on this error

Comment 10 Mrunal Patel 2014-02-10 23:34:10 UTC
Dan Trainor,
Did you see any of these files when the issue occurred?
/etc/passwd.lock
/etc/shadow.lock
/etc/group.lock
/etc/gshadow.lock

If so, do you have timestamps so we can correlate with the node call?
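
For anyone checking a node by hand, a minimal Ruby sketch of that check might look like the following. It only reports which of the four lock files listed above exist and when each was last modified, so the times can be compared against the node call. The constant LOCK_FILES and the helper leftover_locks are hypothetical names used for illustration; they are not part of OpenShift or shadow-utils.

```ruby
# Sketch only: report leftover account-database lock files and their mtimes.
# LOCK_FILES and leftover_locks are hypothetical names, not OpenShift code.
LOCK_FILES = %w[
  /etc/passwd.lock
  /etc/shadow.lock
  /etc/group.lock
  /etc/gshadow.lock
].freeze

# Returns [path, mtime-in-UTC] pairs for every path that currently exists.
def leftover_locks(paths = LOCK_FILES)
  paths.select { |path| File.exist?(path) }
       .map { |path| [path, File.mtime(path).utc] }
end

leftover_locks.each do |path, mtime|
  puts "#{path} last modified #{mtime}"
end
```

A lock file that is present only briefly while useradd or userdel runs is expected; one whose modification time is much older than the failing request is the more interesting case.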

Comment 11 Jhon Honce 2014-04-14 23:21:18 UTC
The message "Cannot validate input uid: value should be a number" is being emitted from the MCollective client on the Broker when attempting to query a Node.  From the openshift.ddl:

action "has_uid_or_gid", :description => "Returns whether this system has already taken the uid or gid" do
    display :always

    input :uid,
        :prompt         => "uid/gid",
        :description    => "uid/gid",
        :type           => :number,
        :optional       => false

mcollective_application_container_proxy.rb#has_uid_or_gid? is populating +uid+ with some value that MCollective's validators do not consider a number, even though the DDL declares the input as :type => :number.
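
To illustrate the failure mode, here is a rough Ruby sketch. It assumes the client-side check amounts to something like value.is_a?(Numeric); that is an assumption about MCollective's behaviour, not a copy of its code. InputValidationError, validate_number_input and coerce_uid are hypothetical names introduced only for this example.

```ruby
# Sketch only: a stand-in for a :number type check, NOT MCollective's code.
class InputValidationError < StandardError; end

# Assumed check: anything that is not a Ruby Numeric is rejected.
def validate_number_input(name, value)
  unless value.is_a?(Numeric)
    raise InputValidationError,
          "Cannot validate input #{name}: value should be a number"
  end
  value
end

# Hypothetical guard a caller could apply before making the RPC call:
# returns an Integer, or nil when the raw value cannot be parsed as one.
def coerce_uid(raw)
  return raw if raw.is_a?(Integer)
  Integer(raw.to_s, 10)
rescue ArgumentError
  nil
end

[1234, "1234", nil].each do |candidate|
  begin
    validate_number_input(:uid, candidate)
    puts "#{candidate.inspect} accepted"
  rescue InputValidationError => e
    puts "#{candidate.inspect} rejected: #{e.message}"
  end
end
```

Under that assumption, both a uid that arrives as a String and a nil uid would produce the message seen in this bug, so logging the class of the value passed to has_uid_or_gid? may help narrow down which case is occurring.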

Comment 12 Abhishek Gupta 2014-04-16 16:55:02 UTC
This seems to be a duplicate of bug 1039641 and was fixed by Dan with --> https://github.com/openshift/origin-server/pull/4300

Comment 13 Dan McPherson 2014-04-16 17:59:47 UTC

*** This bug has been marked as a duplicate of bug 1039641 ***