Bug 1274591 - No execute permission on node when running debug.sh from master
Summary: No execute permission on node when running debug.sh from master
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Networking
Version: 3.x
Hardware: All
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Dan Winship
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-10-23 05:37 UTC by zhaozhanqi
Modified: 2015-11-23 21:14 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-11-23 21:14:37 UTC
Target Upstream Version:
Embargoed:



Description zhaozhanqi 2015-10-23 05:37:16 UTC
Description of problem:
When running the script https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh on an OSE master, the remote invocation on the node fails with a "Permission denied" error:


+ ssh root@10.66.79.154 env KUBECONFIG=/tmp/openshift-sdn-debug-v9h5Nnl0A/.kubeconfig /tmp/openshift-sdn-debug-v9h5Nnl0A/debug.sh --node
Warning: Permanently added '10.66.79.154' (ECDSA) to the list of known hosts.
env: /tmp/openshift-sdn-debug-v9h5Nnl0A/debug.sh: Permission denied
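
For readers hitting the same "Permission denied" message: the following is a minimal local sketch, not the fix that went into openshift-sdn. It assumes the copied script simply lacks its execute bit, which the thread does not confirm. The temporary directory and the script body are hypothetical stand-ins, and no ssh is involved.

```shell
#!/bin/sh
# Hypothetical stand-in for the copied debug.sh; not the real script.
tmpdir=$(mktemp -d)
script="$tmpdir/debug.sh"
printf '#!/bin/sh\necho "node ok"\n' > "$script"
chmod 0644 "$script"          # readable, but no execute bit

# Running the file directly through env fails, as in the report.
direct_status=0
direct_output=$(env "$script" --node 2>&1) || direct_status=$?

# Handing the file to an interpreter only needs read permission.
interp_output=$(env sh "$script" --node)

# Alternatively, restore the execute bit after copying.
chmod +x "$script"
fixed_output=$(env "$script" --node)

echo "direct: status=$direct_status output=$direct_output"
echo "via interpreter: $interp_output"
echo "after chmod +x: $fixed_output"
rm -rf "$tmpdir"
```

Either workaround avoids the error in this sketch; which one suits a real deployment depends on how the script reaches the node.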


Version-Release number of selected component (if applicable):
https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh

How reproducible:
always

Steps to Reproduce:
1. Set up an OSE multi-node environment.
2. Make sure the master can ssh to all nodes without a password.
3. Run the debug.sh script below on the master:
   https://raw.githubusercontent.com/openshift/openshift-sdn/master/hack/debug.sh


Actual results:
As described above.

Expected results:
The error does not occur and the script collects the node information.

Additional info:

Comment 1 zhaozhanqi 2015-10-23 08:49:24 UTC
The following error also occurs when running the script on a local machine (neither master nor node):

Analyzing openshift-161.lab.eng.nay.redhat.com (10.66.79.161)
sed: can't read ${CONFIG_FILE}: No such file or directory
Could not find node name in ${CONFIG_FILE}
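
The literal `${CONFIG_FILE}` in the message suggests the variable was never expanded before sed saw it. One way that can happen is single quoting; this is an assumption, since the thread does not show the offending line. The sketch below reproduces the same message under that assumption. The file contents and the `nodeName` key are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical config file; the real node config is not shown in this bug.
CONFIG_FILE=$(mktemp)
printf 'nodeName: node1.example.com\n' > "$CONFIG_FILE"

# Single quotes pass the text ${CONFIG_FILE} to sed verbatim, so sed looks
# for a file with that literal name and fails.
quoted_err=$(sed -n 's/^nodeName: *//p' '${CONFIG_FILE}' 2>&1) || true

# Double quotes let the shell expand the variable first.
node_name=$(sed -n 's/^nodeName: *//p' "${CONFIG_FILE}")

echo "single-quoted: $quoted_err"
echo "double-quoted: $node_name"
rm -f "$CONFIG_FILE"
```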

Comment 2 Dan Winship 2015-10-23 14:04:41 UTC
Both of these problems are fixed by https://github.com/openshift/openshift-sdn/pull/191

Comment 3 Dan Winship 2015-10-27 18:48:13 UTC
fixed in openshift-sdn master

Comment 4 zhaozhanqi 2015-10-28 02:58:17 UTC
I am still seeing some errors; see https://paste.fedoraproject.org/284326/44600099/

Comment 5 Dan Winship 2015-10-28 15:18:58 UTC
paste urls get deleted after a while; it's better to attach files to the bug. Anyway, the relevant bit here is:

> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: line 106: no: No such file or directory
> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: eval: line 103: syntax error near unexpected token `newline'
> /tmp/openshift-sdn-debug-41J6TeEeo/debug.sh: eval: line 103: `pod_node=value>'

That suggests that this command is unexpectedly outputting junk:

> oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'

what does that output if you run it?

(Also, this seems unrelated to the original 'execute permission' problem. At first I thought it might be because you were using "sh -x", but I can't reproduce the problem here even with that. So I think it's because there's something unexpected about your setup that debug.sh isn't dealing with.)

Comment 6 zhaozhanqi 2015-10-29 02:54:09 UTC
The container ID here is wrong; the 'docker://' prefix should be filtered out:

openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3

oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'
openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0swp1:default:10.1.1.6:docker://a67b8a16750f openshift-145.lab.eng.nay.redhat.com:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244 openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-build:xiama:<no value>:docker://ae39ee4ccefd openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-x4oan:xiama:10.1.1.13:docker://00ce994dece2

Comment 7 zhaozhanqi 2015-10-29 03:34:12 UTC
Oh, sorry, my mistake; please ignore comment 6.

The root cause seems to be that '<no value>' makes pod_id come out empty.

output:

> oc get pods --all-namespaces --template '{{range .items}}{{if .status.containerStatuses}}{{if not .spec.hostNetwork}}{{.spec.nodeName}}:{{.metadata.name}}:{{.metadata.namespace}}:{{.status.podIP}}:{{printf "%.21s" (index .status.containerStatuses 0).containerID}} {{end}}{{end}}{{end}}'
openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0swp1:default:10.1.1.6:docker://a67b8a16750f openshift-145.lab.eng.nay.redhat.com:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244 openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-build:xiama:<no value>:docker://ae39ee4ccefd openshift-145.lab.eng.nay.redhat.com:nodejs-example-2-x4oan:xiama:10.1.1.13:docker://00ce994dece2

Some pods, such as the build pod, do not have a 'podIP':

# oc get pod nodejs-example-1-build
NAME                     READY     STATUS      RESTARTS   AGE
nodejs-example-1-build   0/1       Completed   0          2h 

<--snip--->

"status": {
        "phase": "Succeeded",
        "conditions": [
            {
                "type": "Ready",
                "status": "False",
                "lastProbeTime": null,
                "lastTransitionTime": "2015-10-29T01:16:42Z",
                "reason": "ContainersNotReady",
                "message": "containers with unready status: [sti-build]"
            }
        ],
        "hostIP": "10.66.79.145",
        "startTime": "2015-10-29T01:13:31Z",
        "containerStatuses": [
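
To illustrate why '<no value>' is a problem for a whitespace-separated record list, here is a minimal sketch using two records copied from the output above. It is not the fix that was merged into debug.sh. The placeholder `none` and the variable names are hypothetical, and the sketch assumes the records are split on whitespace and the fields on ':'.

```shell
#!/bin/sh
# Two records taken from the output quoted in this bug.
output='openshift-145.lab.eng.nay.redhat.com:docker-registry-2-0nyal:default:10.1.1.5:docker://042d5f6ff8b3 openshift-145.lab.eng.nay.redhat.com:nodejs-example-1-build:xiama:<no value>:docker://6e5c97786244'

# Naive word splitting: "<no value>" contains a space, so the second
# record is cut in two and 2 records turn into 3 words.
naive_count=0
for word in $output; do
    naive_count=$((naive_count + 1))
done

# Workaround sketch: swap the marker for a space-free placeholder,
# then skip any pod whose IP field is that placeholder.
cleaned=$(printf '%s\n' "$output" | sed 's/<no value>/none/g')
pods_with_ip=""
for record in $cleaned; do
    pod_name=$(printf '%s\n' "$record" | cut -d: -f2)
    pod_ip=$(printf '%s\n' "$record" | cut -d: -f4)
    if [ "$pod_ip" = "none" ]; then
        continue
    fi
    pods_with_ip="${pods_with_ip}${pods_with_ip:+ }${pod_name}"
done

echo "naive word count: $naive_count"
echo "pods with an IP: $pods_with_ip"
```

Filtering inside the template itself (for example with an extra `{{if .status.podIP}}` guard) would be another option; the sketch post-processes the output only to keep the example self-contained.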

Comment 9 zhaozhanqi 2015-11-02 09:17:20 UTC
I think you mean https://raw.githubusercontent.com/danwinship/openshift-sdn/debug-pods-with-no-ip/hack/debug.sh

I tested with this script and the issue is fixed.

I will verify this bug once the change is merged.

Comment 10 Dan Winship 2015-11-02 13:31:25 UTC
oops, yes, that branch. it's merged now.

Comment 11 zhaozhanqi 2015-11-04 06:06:12 UTC
verified this bug

