Bug 1410160 - [TRACKER] User Acceptance Test (UAT) Feedback on CNS 3.4
Summary: [TRACKER] User Acceptance Test (UAT) Feedback on CNS 3.4
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: doc-Container_Native_Storage_with_OpenShift
Version: cns-3.4
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: storage-doc
QA Contact: storage-qa-internal@redhat.com
URL:
Whiteboard:
Depends On:
Blocks: 1385248
 
Reported: 2017-01-04 15:52 UTC by Anjana Suparna Sriram
Modified: 2017-01-23 07:20 UTC
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-01-23 07:20:52 UTC
Target Upstream Version:



Description Anjana Suparna Sriram 2017-01-04 15:52:21 UTC
Additional info: This tracker aims to capture all the feedback from the UAT team.

Comment 2 Erin Boyd 2017-01-17 16:07:10 UTC
This link should be updated to point to 3.4 when it's available:
The OpenShift cluster must be up and running. For information on setting up OpenShift cluster, see https://access.redhat.com/documentation/en/openshift-container-platform/3.3/paged/installation-and-configuration.

Comment 3 Erin Boyd 2017-01-17 16:07:33 UTC
This command will not reload iptables:

[root@cns11 ~]# systemctl reload iptables
Failed to reload iptables.service: Unit is masked.
[root@cns11 ~]#

Comment 4 Divya 2017-01-17 16:22:39 UTC
(In reply to Erin Boyd from comment #2)
> This link should be updated to point to 3.4 when it's available:
> The OpenShift cluster must be up and running. For information on setting up
> OpenShift cluster, see
> https://access.redhat.com/documentation/en/openshift-container-platform/3.3/
> paged/installation-and-configuration.

All the links in the CNS guide will be updated to point to 3.4 tomorrow. Bug https://bugzilla.redhat.com/show_bug.cgi?id=1389070 is used to track the same.

Comment 5 Erin Boyd 2017-01-17 16:38:58 UTC
This link also needs to be updated to 3.4:
It is recommended to persist the logs for the Heketi container. For more information on persisting logs, refer https://access.redhat.com/documentation/en/openshift-container-platform/3.3/paged/installation-and-configuration/chapter-27-aggregating-container-logs.

Comment 6 Erin Boyd 2017-01-17 16:39:36 UTC
The 'NOTE' needs to be moved above this section so they don't see it after the fact:

 After the router is running, the clients have to be setup to access the services in the OpenShift cluster. Execute the following steps to set up the DNS.

    On the client, edit the /etc/dnsmasq.conf file and add the following line to the file:

    address=/.cloudapps.mystorage.com/<Router_IP_Address>

    where, Router_IP_Address is the IP address of one of the nodes running the router.

    Note
    Ensure you do not edit the /etc/dnsmasq.conf file until the router has started.
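
For reference, a minimal sketch of the whole client-side sequence (my own illustration, not a doc excerpt; it assumes dnsmasq and bind-utils are installed and that 192.168.122.211 is the node actually running the router):

    # append the wildcard entry and restart dnsmasq on the client
    echo "address=/.cloudapps.mystorage.com/192.168.122.211" >> /etc/dnsmasq.conf
    systemctl restart dnsmasq
    # any hostname under the wildcard domain should now resolve to the router node
    dig +short test.cloudapps.mystorage.com @127.0.0.1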

Comment 7 Erin Boyd 2017-01-17 16:58:19 UTC
We are blocked as it cannot find the router package with the current repos listed in the doc.

Comment 8 Scott Creeley 2017-01-17 17:05:07 UTC
(In reply to Erin Boyd from comment #7)
> We are blocked as it cannot find the router package with the current repos
> listed in the doc.

Events:
  FirstSeen	LastSeen	Count	From			SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  4m		4m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned storage-project-router-1-deploy to cns3.rhs
  4m		1m		5	{kubelet cns3.rhs}			Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "Error: image openshift3/ose-pod not found"

  3m	5s	15	{kubelet cns3.rhs}		Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"openshift3/ose-pod:v3.4.0.39\""

Comment 10 Scott Creeley 2017-01-17 18:57:14 UTC
The topology file must match what 'oc get nodes' returns... maybe add a note or some instructions to section 4.2 step 1 for building the topology.json file.

[root@cnsmaster ansible]# oc get nodes
NAME       STATUS    AGE
cns1.rhs   Ready     3h
cns2.rhs   Ready     3h
cns3.rhs   Ready     3h

I used the IP address for both the node.hostnames.manage section and the node.hostnames.storage section.

example:

            "nodes": [
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.122.211"
                            ],
                            "storage": [
                                "192.168.122.211"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/vdc",
                        "/dev/vdd"
                    ]
                },

Then I ran the deploy but got the following failures (notice the nodes Not Found during labeling):

Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    2h
Using namespace "storage-project".
template "deploy-heketi" created
serviceaccount "heketi-service-account" created
template "heketi" created
template "glusterfs" created
Error from server: nodes "192.168.122.211" not found
Error from server: nodes "192.168.122.212" not found
Error from server: nodes "192.168.122.213" not found
daemonset "glusterfs" created
Waiting for GlusterFS pods to start ... Timed out waiting for pods matching 'glusterfs-node=pod'.
No resources found
Error from server: deploymentconfig "heketi" not found
Error from server: services "heketi" not found
Error from server: routes "heketi" not found
Error from server: services "heketi-storage-endpoints" not found
serviceaccount "heketi-service-account" deleted
template "deploy-heketi" deleted
template "heketi" deleted
Error from server: nodes "192.168.122.211" not found
Error from server: nodes "192.168.122.212" not found
Error from server: nodes "192.168.122.213" not found
daemonset "glusterfs" deleted
template "glusterfs" deleted

I changed the topology to the actual node hostname for the manage section based on the 'oc get nodes' values (left the IP for the storage section):

        {
            "nodes": [
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "cns1.rhs"
                            ],
                            "storage": [
                                "192.168.122.211"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/vdc",
                        "/dev/vdd"
                    ]
                },


This resulted in a better cns-deploy run:

Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    2h
Using namespace "storage-project".
template "deploy-heketi" created
serviceaccount "heketi-service-account" created
template "heketi" created
template "glusterfs" created
node "cns1.rhs" labeled
node "cns2.rhs" labeled
node "cns3.rhs" labeled
daemonset "glusterfs" created
Waiting for GlusterFS pods to start ... OK
service "deploy-heketi" created
route "deploy-heketi" created

Comment 11 Scott Creeley 2017-01-17 19:03:10 UTC
Section 4.1 - is there an easy way to validate that the router is configured and working properly? If so, can we add it to the docs?

Error:
Failed to communicate with deploy-heketi service.
Please verify that a router has been properly configured.

Output:
[O]penShift, [K]ubernetes? [O/o/K/k]: O
Using OpenShift CLI.
NAME              STATUS    AGE
storage-project   Active    2h
Using namespace "storage-project".
template "deploy-heketi" created
serviceaccount "heketi-service-account" created
template "heketi" created
template "glusterfs" created
node "cns1.rhs" labeled
node "cns2.rhs" labeled
node "cns3.rhs" labeled
daemonset "glusterfs" created
Waiting for GlusterFS pods to start ... OK
service "deploy-heketi" created
route "deploy-heketi" created
deploymentconfig "deploy-heketi" created
Waiting for deploy-heketi pod to start ... OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4818    0  4818    0     0  11201      0 --:--:-- --:--:-- --:--:-- 11204
Failed to communicate with deploy-heketi service.
Please verify that a router has been properly configured.
deploymentconfig "deploy-heketi" deleted
route "deploy-heketi" deleted
service "deploy-heketi" deleted
pod "deploy-heketi-1-y0dyr" deleted
Error from server: deploymentconfig "heketi" not found
Error from server: services "heketi" not found
Error from server: routes "heketi" not found
Error from server: services "heketi-storage-endpoints" not found
serviceaccount "heketi-service-account" deleted
template "deploy-heketi" deleted
template "heketi" deleted
node "cns1.rhs" labeled
node "cns2.rhs" labeled
node "cns3.rhs" labeled
daemonset "glusterfs" deleted
template "glusterfs" deleted

Comment 12 Scott Creeley 2017-01-17 19:10:27 UTC
(In reply to Scott Creeley from comment #11)
> section 4.1 - is there an easy way to validate the router is configured and
> working properly?  If so, can we add to docs?
> 
> Error:
> Failed to communicate with deploy-heketi service.
> Please verify that a router has been properly configured.
> 
> Output:
> [O]penShift, [K]ubernetes? [O/o/K/k]: O
> Using OpenShift CLI.
> NAME              STATUS    AGE
> storage-project   Active    2h
> Using namespace "storage-project".
> template "deploy-heketi" created
> serviceaccount "heketi-service-account" created
> template "heketi" created
> template "glusterfs" created
> node "cns1.rhs" labeled
> node "cns2.rhs" labeled
> node "cns3.rhs" labeled
> daemonset "glusterfs" created
> Waiting for GlusterFS pods to start ... OK
> service "deploy-heketi" created
> route "deploy-heketi" created
> deploymentconfig "deploy-heketi" created
> Waiting for deploy-heketi pod to start ... OK
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time 
> Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  4818    0  4818    0     0  11201      0 --:--:-- --:--:-- --:--:--
> 11204
> Failed to communicate with deploy-heketi service.
> Please verify that a router has been properly configured.
> deploymentconfig "deploy-heketi" deleted
> route "deploy-heketi" deleted
> service "deploy-heketi" deleted
> pod "deploy-heketi-1-y0dyr" deleted
> Error from server: deploymentconfig "heketi" not found
> Error from server: services "heketi" not found
> Error from server: routes "heketi" not found
> Error from server: services "heketi-storage-endpoints" not found
> serviceaccount "heketi-service-account" deleted
> template "deploy-heketi" deleted
> template "heketi" deleted
> node "cns1.rhs" labeled
> node "cns2.rhs" labeled
> node "cns3.rhs" labeled
> daemonset "glusterfs" deleted
> template "glusterfs" deleted

The router service and pod seem to be running, though. A firewall issue, maybe?
[root@cnsmaster heketi]# oc get service
NAME                     CLUSTER-IP       EXTERNAL-IP   PORT(S)                   AGE
storage-project-router   172.30.197.228   <none>        80/TCP,443/TCP,1936/TCP   1h
[root@cnsmaster heketi]# oc get pods
NAME                             READY     STATUS    RESTARTS   AGE
storage-project-router-1-pinov   1/1       Running   0          1h

Comment 13 Scott Creeley 2017-01-17 19:12:36 UTC
(In reply to Scott Creeley from comment #10)
> topology file must match what 'oc get nodes' returns...maybe add a note or
> some instructions to section 4.2 step 1 for building topology.json file
> 
> [root@cnsmaster ansible]# oc get nodes
> NAME       STATUS    AGE
> cns1.rhs   Ready     3h
> cns2.rhs   Ready     3h
> cns3.rhs   Ready     3h
> 
> I used ipaddress for both node.hostnames.manage section and
> node.hostnames.storage section
> 
> example:
> 
>             "nodes": [
>                 {
>                     "node": {
>                         "hostnames": {
>                             "manage": [
>                                 "192.168.122.211"
>                             ],
>                             "storage": [
>                                 "192.168.122.211"
>                             ]
>                         },
>                         "zone": 1
>                     },
>                     "devices": [
>                         "/dev/vdc",
>                         "/dev/vdd"
>                     ]
>                 },
> 
> then I ran deploy but get the following failures (notice the nodes Not Found
> during labeling):
> 
> Using OpenShift CLI.
> NAME              STATUS    AGE
> storage-project   Active    2h
> Using namespace "storage-project".
> template "deploy-heketi" created
> serviceaccount "heketi-service-account" created
> template "heketi" created
> template "glusterfs" created
> Error from server: nodes "192.168.122.211" not found
> Error from server: nodes "192.168.122.212" not found
> Error from server: nodes "192.168.122.213" not found
> daemonset "glusterfs" created
> Waiting for GlusterFS pods to start ... Timed out waiting for pods matching
> 'glusterfs-node=pod'.
> No resources found
> Error from server: deploymentconfig "heketi" not found
> Error from server: services "heketi" not found
> Error from server: routes "heketi" not found
> Error from server: services "heketi-storage-endpoints" not found
> serviceaccount "heketi-service-account" deleted
> template "deploy-heketi" deleted
> template "heketi" deleted
> Error from server: nodes "192.168.122.211" not found
> Error from server: nodes "192.168.122.212" not found
> Error from server: nodes "192.168.122.213" not found
> daemonset "glusterfs" deleted
> template "glusterfs" deleted
> 
> I changed the topology to actual node/hostname for the manage section based
> on 'oc get nodes' values (left ip for storage section):
> 
>         {
>             "nodes": [
>                 {
>                     "node": {
>                         "hostnames": {
>                             "manage": [
>                                 "cns1.rhs"
>                             ],
>                             "storage": [
>                                 "192.168.122.211"
>                             ]
>                         },
>                         "zone": 1
>                     },
>                     "devices": [
>                         "/dev/vdc",
>                         "/dev/vdd"
>                     ]
>                 },
> 
> 
> resulting in better cns-deploy
> 
> Using OpenShift CLI.
> NAME              STATUS    AGE
> storage-project   Active    2h
> Using namespace "storage-project".
> template "deploy-heketi" created
> serviceaccount "heketi-service-account" created
> template "heketi" created
> template "glusterfs" created
> node "cns1.rhs" labeled
> node "cns2.rhs" labeled
> node "cns3.rhs" labeled
> daemonset "glusterfs" created
> Waiting for GlusterFS pods to start ... OK
> service "deploy-heketi" created
> route "deploy-heketi" created

Or maybe make the sample topology.json match the doc, where you do show the FQDN of the node/host and not the IP address?

Comment 14 Erin Boyd 2017-01-17 21:24:28 UTC
After cns-deploy ran, I had to do this manually for the heketi-cli command to work in the next steps:
 export  HEKETI_CLI_SERVER=http://heketi-storage-project.cloudapps.mystorage.com
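
A quick check that the export took effect (a sketch, assuming heketi-cli is installed on the master and the route hostname matches the one created by cns-deploy):

    export HEKETI_CLI_SERVER=http://heketi-storage-project.cloudapps.mystorage.com
    # should print the cluster, nodes and devices if the server is reachable
    heketi-cli topology info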

Comment 15 Erin Boyd 2017-01-17 22:53:38 UTC
In this part:
# cat glusterfs-storageclass.yaml


apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name:gluster_container
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://127.0.0.1:8081"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-secret"

A space is needed after name:, and I believe you cannot use an underscore in the name (so use a hyphen instead).
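
For comparison, the corrected fragment would look like this (keeping the example resturl and secret values from the snippet above):

    apiVersion: storage.k8s.io/v1beta1
    kind: StorageClass
    metadata:
      name: gluster-container        # space after the colon, hyphen instead of underscore
    provisioner: kubernetes.io/glusterfs
    parameters:
      resturl: "http://127.0.0.1:8081"
      restuser: "admin"
      secretNamespace: "default"
      secretName: "heketi-secret"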

Comment 16 Divya 2017-01-18 05:15:46 UTC
(In reply to Erin Boyd from comment #6)
> The 'NOTE' needs to be moved above this section so they don't see it after
> the fact:
> 
>  After the router is running, the clients have to be setup to access the
> services in the OpenShift cluster. Execute the following steps to set up the
> DNS.
> 
>     On the client, edit the /etc/dnsmasq.conf file and add the following
> line to the file:
> 
>     address=/.cloudapps.mystorage.com/<Router_IP_Address>
> 
>     where, Router_IP_Address is the IP address of one of the nodes running
> the router.
> 
>     Note
>     Ensure you do not edit the /etc/dnsmasq.conf file until the router has
> started.

I have moved the Note above Step 6. 

Link to the doc: http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Setting_the_environment-Preparing_RHOE

Comment 17 Humble Chirammal 2017-01-18 05:50:04 UTC
(In reply to Scott Creeley from comment #13)
> (In reply to Scott Creeley from comment #10)

> or maybe make the sample topology.json match the doc - where you do show the
> fqdn of the node/host - not the ipaddr??

The only requirement is that the 'storage' field value has to be an IP address. The 'manage' field can be a hostname or an IP. Can you please share the cns-deploy command and the topology file you used in this setup?

Comment 18 Humble Chirammal 2017-01-18 06:28:59 UTC
(In reply to Humble Chirammal from comment #17)
> (In reply to Scott Creeley from comment #13)
> > (In reply to Scott Creeley from comment #10)
> 
> > or maybe make the sample topology.json match the doc - where you do show the
> > fqdn of the node/host - not the ipaddr??
> 
> The only requirement is, 'storage' filed value has to be IP address. The
> 'manage' can be hostname or IP. Can you please share the cns-deploy command
> you used and the topology file used in this setup?

To clarify the above: in a standalone heketi setup with a storage cluster, we can have a hostname or an IP in the 'manage' field. In a Kubernetes or CNS setup, we need to have the hostname as the 'manage' field value, and this is also documented here: http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Setting_the_environment-Preparing_RHOE.

The pending issue here is:

---snip--
> deploymentconfig "deploy-heketi" created
> Waiting for deploy-heketi pod to start ... OK
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time 
> Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  4818    0  4818    0     0  11201      0 --:--:-- --:--:-- --:--:--
> 11204
> Failed to communicate with deploy-heketi service.

--/snip--

This looks to be a 'router' issue, which causes the 'curl' to fail while deploying CNS. It is most likely caused by a wrong router-node IP address entry in the client's dnsmasq configuration.
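
A quick check for that on the client (a sketch; it assumes the router pod uses the host network, so its pod IP equals the node IP):

    # find which node/IP the router is actually running on
    oc get pods -o wide -n storage-project | grep router
    # the dnsmasq wildcard entry on the client should point at that IP
    grep cloudapps /etc/dnsmasq.conf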

Comment 19 Bhavana 2017-01-18 07:12:29 UTC
(In reply to Erin Boyd from comment #15)
> In this part:
> # cat glusterfs-storageclass.yaml
> 
> 
> apiVersion: storage.k8s.io/v1beta1
> kind: StorageClass
> metadata:
>   name:gluster_container
> provisioner: kubernetes.io/glusterfs
> parameters:
>   resturl: "http://127.0.0.1:8081"
>   restuser: "admin"
>   secretNamespace: "default"
>   secretName: "heketi-secret"
> 
> Need a space after name, and I believe you cannot use underscore (so use
> hyphen instead)

The space has been introduced after name, and the underscore has been replaced with a hyphen.

http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-OpenShift_Creating_Persistent_Volumes-Dynamic_Prov

Comment 20 Bhavana 2017-01-18 07:14:28 UTC
(In reply to Erin Boyd from comment #14)
> After cns-deploy ran I had to do this manually for heketi-cli command to
> work in the next steps:
>  export 
> HEKETI_CLI_SERVER=http://heketi-storage-project.cloudapps.mystorage.com

An additional step has been added in the "Deploying Container-Native Storage" chapter:

http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#idm139689974892432

Comment 21 Mohamed Ashiq 2017-01-18 07:42:22 UTC
(In reply to Erin Boyd from comment #3)
> This command will not reload IP tables:
> 
> [root@cns11 ~]# systemctl reload iptables
> Failed to reload iptables.service: Unit is masked.
> [root@cns11 ~]#

From the OpenShift installation side, there is no reason to land in this state (iptables getting masked). I think this is a setup issue. QE has created many setups; have you ever faced this, or is this expected (did I miss something)?
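
If a node does end up in that state, one possible way to recover (my own sketch, not a documented step; it assumes the iptables service, not firewalld, is the intended firewall):

    systemctl unmask iptables
    systemctl enable iptables
    systemctl restart iptables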

Comment 22 Humble Chirammal 2017-01-18 08:10:42 UTC
(In reply to Erin Boyd from comment #14)
> After cns-deploy ran I had to do this manually for heketi-cli command to
> work in the next steps:
>  export 
> HEKETI_CLI_SERVER=http://heketi-storage-project.cloudapps.mystorage.com

This is a requirement as mentioned in the doc.

Comment 23 Humble Chirammal 2017-01-18 08:22:42 UTC
Summary of the issues reported:
--------------------------------------------------

1) The image pull error ("Back-off pulling image \"openshift3/ose-pod:v3.4.0.39\""):

Solution: Add the internal repos; this is a requirement for the pre-GA test.

2) iptables error:

> [root@cns11 ~]# systemctl reload iptables
> Failed to reload iptables.service: Unit is masked.
> [root@cns11 ~]#

This is definitely beyond the scope of CNS. It may be an OpenShift installer issue, where the unit was masked and failed to be unmasked at installation time.

3) cns-deploy failure:

> Error from server: nodes "192.168.122.211" not found
> Error from server: nodes "192.168.122.212" not found
> Error from server: nodes "192.168.122.213" not found

In a CNS deployment, the 'manage' field has to be filled with the hostname, as mentioned in section 4.2: http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Setting_the_environment-Preparing_RHOE

4) 'curl' command failing:


> Waiting for deploy-heketi pod to start ... OK
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time 
> Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  4818    0  4818    0     0  11201      0 --:--:-- --:--:-- --:--:--
> 11204
> Failed to communicate with deploy-heketi service.

This is an issue with the router configuration, most likely due to wrong DNS settings.

@Erin/Scott, I was not able to identify any issue/bug on the CNS side from these reported scenarios. Please feel free to point out if I missed anything here.

Comment 24 Scott Creeley 2017-01-18 14:07:24 UTC
(In reply to Humble Chirammal from comment #17)
> (In reply to Scott Creeley from comment #13)
> > (In reply to Scott Creeley from comment #10)
> 
> > or maybe make the sample topology.json match the doc - where you do show the
> > fqdn of the node/host - not the ipaddr??
> 
> The only requirement is, 'storage' filed value has to be IP address. The
> 'manage' can be hostname or IP. Can you please share the cns-deploy command
> you used and the topology file used in this setup?

@Humble - see the original comment 10 - I used the IP address rather than the hostname in the manage section and had cns-deploy fail (once I changed to using the hostname in the manage section, the "Not Found" errors went away).

command used:

cns-deploy -n storage-project -g topology.json

Here is the bad topology.json before I changed to the hostname:

{
    "clusters": [
        {
            "nodes": [
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.122.211"
                            ],
                            "storage": [
                                "192.168.122.211"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/vdc",
                        "/dev/vdd"
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.122.212"
                            ],
                            "storage": [
                                "192.168.122.212"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/vdc",
                        "/dev/vdd"
                    ]
                },
                {
                    "node": {
                        "hostnames": {
                            "manage": [
                                "192.168.122.213"
                            ],
                            "storage": [
                                "192.168.122.213"
                            ]
                        },
                        "zone": 1
                    },
                    "devices": [
                        "/dev/vdc",
                        "/dev/vdd"
                    ]
                }
            ]
        }
    ]
}

I can reproduce it each time:

template "deploy-heketi" created
serviceaccount "heketi-service-account" created
template "heketi" created
template "glusterfs" created
Error from server: nodes "192.168.122.211" not found
Error from server: nodes "192.168.122.212" not found
Error from server: nodes "192.168.122.213" not found
daemonset "glusterfs" created

Possibly something with my setup, not sure.

Comment 25 Scott Creeley 2017-01-18 20:49:53 UTC
UAT Testing Notes:

-----------------
Environment:
-----------------
1.  4 VM Nodes running RHEL 7.3
       cnsmaster.rhs (192.168.122.210) = OCP Master + CNS Client + Docker 1.12.5
       cns1.rhs (192.168.122.211) = OCP Node + Storage Node + 2 RAW 500GB devices + Docker 1.12.5
       cns2.rhs (192.168.122.212) = OCP Node + Storage Node + 2 RAW 500GB devices + Docker 1.12.5
       cns3.rhs (192.168.122.213) = OCP Node + Storage Node + 2 RAW 500GB devices + Docker 1.12.5

2.  OCP 3.4 installed using official "quick installation guide" - interactive mode


3.  Verification of OCP

[root@cnsmaster heketi]# oc get nodes
NAME            STATUS                     AGE
cns1.rhs        Ready                      7m
cns2.rhs        Ready                      7m
cns3.rhs        Ready                      7m
cnsmaster.rhs   Ready,SchedulingDisabled   7m

[root@cnsmaster heketi]# oc get pods -o wide
NAME                       READY     STATUS    RESTARTS   AGE       IP                NODE
docker-registry-2-n981p    1/1       Running   0          2m        10.131.0.3        cns2.rhs
docker-registry-2-nmczg    1/1       Running   0          2m        10.130.0.4        cns1.rhs
registry-console-1-fvjmr   1/1       Running   0          2m        10.129.0.3        cns3.rhs
router-1-5525o             1/1       Running   0          3m        192.168.122.212   cns2.rhs
router-1-tufuk             1/1       Running   0          3m        192.168.122.211   cns1.rhs





----------------
Installation of CNS following: https://doc-stage.usersys.redhat.com/documentation/en/red-hat-gluster-storage/3.1/single/container-native-storage-for-openshift-container-platform/
----------------
                   commands:    Result:    doc instructions:
                 ------------   -------    -----------------
section 3        - good         success    good

section 4.1      - good         success    good with  minor nits
                                           - point of confusion, 4.1 Step 6 - mentions "clients" plural...should that be a single client?  
                                             Assuming "client" == "heketi-client node or OCP Master+Heketi-client"
                                             Meaning, the commands in these steps should only be run on the client and not on the storage nodes, correct?
                                             I could see someone getting confused and thinking "clients" = "storage nodes"
                                           - Also, the doc should more clearly specify that the <ip address of router> should be the node where the router is actually running; otherwise readers might pick any of the nodes.

section 4.2      - good         success    good

section 5.1      - good         success    good

section 5.2.1.1  - good         success    bug
                                           - step 2 references "gluster_container" - should be "gluster-container"  (no underscores for names)

section 5.2.1.2  - good         success    good

section 5.2.1.3  - good         success    bugs
                                           - step 1 uses the wrong storage class name: "gluster_container" should be "gluster-container"
                                           - step 1 calls glusterfs-pvc-claim1.json while step 2 references glusterfs-pvc-claim1.yaml (the names and yaml vs. json should be consistent; yaml is the more accepted format - a sample claim is sketched after this list)

section 5.2.1.4  - good         success    minor nit
                                           - step 1 command "oc get persistentvolume,persistentvolumeclaim"  - can be shortened to "oc get pv,pvc" - so we are consistent, standard is using shortened names

section 5.2.1.5  - good         success    good

section 5.2.1.6  - good         success    good
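
To illustrate the naming consistency point from section 5.2.1.3, a minimal glusterfs-pvc-claim1.yaml might look like this (my own sketch; the claim name and size are placeholders, and the beta storage-class annotation is assumed to be the form this OCP release expects):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: claim1
      annotations:
        volume.beta.kubernetes.io/storage-class: gluster-container
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi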

Comment 26 Bhavana 2017-01-19 07:53:36 UTC
The suggested doc comments in comment 25 are incorporated. The following is the updated link:

http://ccs-jenkins.gsslab.brq.redhat.com:8080/job/doc-Red_Hat_Gluster_Storage-3.4-Container_Native_Storage_with_OpenShift_Platform-branch-master/lastSuccessfulBuild/artifact/tmp/en-US/html-single/index.html

Regarding the comment on section 4.1: I had a discussion with Ashiq and Talur and made the minor changes accordingly to reduce the confusion about the client and the IP address of the router. Detailed information about the master and client is already explained in Section 3.2, Environment Requirements.

The link includes all the other UAT doc comments suggested in this bug.

