Bug 1274591 - No execute permission on node when running debug.sh from master
Status: CLOSED CURRENTRELEASE
Product: OpenShift Origin
Classification: Red Hat
Component: Networking
Version: 3.x
Hardware: All Unspecified
Priority: medium
Severity: low
Assigned To: Dan Winship
QA Contact: zhaozhanqi
Depends On:
Blocks:
Reported: 2015-10-23 01:37 EDT by zhaozhanqi
Modified: 2015-11-23 16:14 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-23 16:14:37 EST
Type: Bug


Attachments: None

Description zhaozhanqi 2015-10-23 01:37:16 EDT
Description of problem:
When running the script https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh on the OSE master, it fails with a "Permission denied" error on the node:


+ ssh root@10.66.79.154 env KUBECONFIG=/tmp/openshift-sdn-debug-v9h5Nnl0A/.kubeconfig /tmp/openshift-sdn-debug-v9h5Nnl0A/debug.sh --node
Warning: Permanently added '10.66.79.154' (ECDSA) to the list of known hosts.
env: /tmp/openshift-sdn-debug-v9h5Nnl0A/debug.sh: Permission denied
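
A minimal sketch of this failure mode, not taken from debug.sh itself: a script copied into a temporary directory without its execute bit cannot be run by path, while handing it to an interpreter or restoring the bit works. The temporary directory and the script contents below are hypothetical, and other causes (for example a noexec mount) are not covered.

```shell
#!/bin/sh
# Hypothetical reproduction; the paths are illustrative, not those used by debug.sh.
tmpdir=$(mktemp -d)
printf '#!/bin/sh\necho "node info"\n' > "$tmpdir/debug.sh"
chmod 644 "$tmpdir/debug.sh"    # the copied file lacks the execute bit

# Running by path fails with "Permission denied" and a non-zero exit status.
direct_status=0
env "$tmpdir/debug.sh" 2>/dev/null || direct_status=$?

# Option 1: hand the file to the interpreter explicitly.
via_sh=$(sh "$tmpdir/debug.sh")

# Option 2: restore the execute bit, then run by path.
chmod +x "$tmpdir/debug.sh"
via_exec=$("$tmpdir/debug.sh")

echo "direct_status=$direct_status via_sh=$via_sh via_exec=$via_exec"
rm -rf "$tmpdir"
```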


Version-Release number of selected component (if applicable):
https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh

How reproducible:
always

Steps to Reproduce:
1. Set up an OSE multi-node environment.
2. Make sure the master can ssh to all nodes without a password.
3. Run the debug.sh script below on the master:
   https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh


Actual results:
As described above.

Expected results:
No such error; the script collects the node information.

Additional info:
Comment 1 zhaozhanqi 2015-10-23 04:49:24 EDT
I also hit the following error when running the script on a local machine (neither master nor node):

Analyzing openshift-161.lab.eng.nay.redhat.com (10.66.79.161)
sed: can't read ${CONFIG_FILE}: No such file or directory
Could not find node name in ${CONFIG_FILE}
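
The literal ${CONFIG_FILE} in these messages suggests that the variable reference reached sed unexpanded. Whether that is what happened inside debug.sh is not confirmed in this report; the sketch below only shows one way such a message can arise, namely single quotes suppressing expansion. The file name node-config.yaml and the nodeName key are hypothetical.

```shell
#!/bin/sh
# Hypothetical config file; the name and contents are illustrative only.
tmpdir=$(mktemp -d)
CONFIG_FILE="$tmpdir/node-config.yaml"
echo 'nodeName: node-1' > "$CONFIG_FILE"

# Single quotes: sed is handed the literal file name ${CONFIG_FILE} and fails.
literal_status=0
sed -n 's/^nodeName: //p' '${CONFIG_FILE}' 2>/dev/null || literal_status=$?

# Double quotes: the variable expands to the real path.
node_name=$(sed -n 's/^nodeName: //p' "$CONFIG_FILE")

echo "literal_status=$literal_status node_name=$node_name"
rm -rf "$tmpdir"
```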
Comment 2 Dan Winship 2015-10-23 10:04:41 EDT
Both of these problems are fixed by https://github.com/openshift/openshift-sdn/pull/191
Comment 3 Dan Winship 2015-10-27 14:48:13 EDT
fixed in openshift-sdn master
Comment 4 zhaozhanqi 2015-10-27 22:58:17 EDT
I still hit some errors; see https://paste.fedoraproject.org/284326/44600099/
Comment 5 Dan Winship 2015-10-28 11:18:58 EDT
paste urls get deleted after a while; it's better to attach files to the bug. Anyway, the relevant bit here is:

> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: line 106: no: No such file or directory
> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: eval: line 103: syntax error near unexpected token `newline'
> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: eval: line 103: `pod_node=value>'

That suggests that this command is unexpectedly outputting junk:

> oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'

what does that output if you run it?

(Also, this seems unrelated to the original 'execute permission' problem. At first I thought it might be because you were using "sh -x", but I can't reproduce the problem here even with that. So I think it's because there's something unexpected about your setup that debug.sh isn't dealing with.)
Comment 6 zhaozhanqi 2015-10-28 22:54:09 EDT
The container ID here is wrong; the 'docker://' prefix should be filtered out:

openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3

oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'
openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0swp1:default:10.1.1.6:docker://a67b8a16750f openshift-145.lab.eng.nay.redhat.com:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244 openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-build:xiama:<no value>:docker://ae39ee4ccefd openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-x4oan:xiama:10.1.1.13:docker://00ce994dece2
Comment 7 zhaozhanqi 2015-10-28 23:34:12 EDT
Oh, sorry, my mistake; please ignore comment 6.

The root cause appears to be '<no value>', which leaves pod_id empty.

output:

> oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'
openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0swp1:default:10.1.1.6:docker://a67b8a16750f openshift-145.lab.eng.nay.redhat.com:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244 openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-build:xiama:<no value>:docker://ae39ee4ccefd openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-x4oan:xiama:10.1.1.13:docker://00ce994dece2
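
Why '<no value>' breaks the parsing can be sketched as follows, assuming the script splits this output on whitespace and then on ':' (an assumption about debug.sh that is consistent with the `pod_node=value>` error in comment 5, but not confirmed here). The records are shortened from the output above, with node-1 as a placeholder host name. The workaround shown is only an illustration and is not the fix that was merged.

```shell
#!/bin/sh
# Two records shaped like the template output; "node-1" is a placeholder host.
output='node-1:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 node-1:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244'

# Naive whitespace split: the space inside "<no value>" yields three tokens
# for two pods, and the last token begins with "value>".
naive_count=0
for rec in $output; do
    naive_count=$((naive_count + 1))
    last_node=${rec%%:*}
done

# Illustrative workaround: drop the placeholder before splitting, then skip
# pods that have no IP. The container ID keeps its "docker://" prefix because
# the last variable given to read receives the rest of the line.
cleaned=$(printf '%s\n' "$output" | sed 's/<no value>//g')
pods_with_ip=""
for rec in $cleaned; do
    IFS=: read -r node name ns ip cid <<EOF
$rec
EOF
    [ -n "$ip" ] || continue
    pods_with_ip="$pods_with_ip$name=$ip "
done

echo "naive_count=$naive_count last_node=$last_node"
echo "pods_with_ip=$pods_with_ip"
```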

Some pods, such as the build pod, do not have a 'podIP':

# oc get pod nodejs-example-1-build
NAME                     READY     STATUS      RESTARTS   AGE
nodejs-example-1-build   0/1       Completed   0          2h 

<--snip--->

"status": {
        "phase": "Succeeded",
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "lastProbeTime": null,
                "lastTransitionTime": "2015-10-29T01:16:42Z",
                "reason": "ContainersNotReady",
                "message": "containers with unready status: [sti-build]"
            }
        ],
        "hostIP": "10.66.79.145",
        "startTime": "2015-10-29T01:13:31Z",
        "containerStatuses": [
Comment 9 zhaozhanqi 2015-11-02 04:17:20 EST
I think you mean 'https://raw.githubusercontent.com/danwinship/openshift-sdn/debug-pods-with-no-ip/hack/debug.sh'.

I tested with this script, and the issue is fixed.

I will verify this bug once it's merged.
Comment 10 Dan Winship 2015-11-02 08:31:25 EST
oops, yes, that branch. it's merged now.
Comment 11 zhaozhanqi 2015-11-04 01:06:12 EST
verified this bug