Bug 1611635 - infra: softserve machines, regression tests fail
Summary: infra: softserve machines, regression tests fail
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: project-infrastructure
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-08-02 14:10 UTC by Kotresh HR
Modified: 2018-08-13 10:09 UTC
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-13 10:09:15 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Client logs with TRACE enabled (109.45 KB, text/plain)
2018-08-02 14:11 UTC, Kotresh HR

Description Kotresh HR 2018-08-02 14:10:28 UTC
Description of problem:
On the machines obtained from softserve, all test cases fail to run because the FUSE mount fails.

Please find the client logs attached.

Version-Release number of selected component (if applicable):
mainline (source installed)

How reproducible:
Reproduced 2/2, on two different machines.

Steps to Reproduce:
1. Build gluster source code
2. Run any .t file (prove -v <.t file>)
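The two steps above boil down to a build followed by a single `prove` invocation. A minimal sketch (the build commands are shown as comments since they need a full Gluster build environment; the test file name is illustrative):

```shell
# Step 1: build and install gluster from source (run on the softserve machine):
#   ./autogen.sh && ./configure && make && sudo make install

# Step 2: run any single regression test through the TAP harness:
TEST_FILE="tests/basic/mount.t"   # any regression .t file
CMD="prove -v $TEST_FILE"
echo "$CMD"
```

`prove -v` prints each TAP assertion as it runs, which is what makes the failing mount lines visible in the test output.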


Actual results:
Test cases fail.

Expected results:
Test cases should not fail.

Additional info:

Comment 1 Kotresh HR 2018-08-02 14:11:05 UTC
Created attachment 1472715 [details]
Client logs with TRACE enabled

Comment 2 Nigel Babu 2018-08-02 14:31:10 UTC
Did you follow the very specific and detailed instructions that you need to follow?

https://github.com/gluster/softserve/wiki/Running-Regressions-on-clean-Centos-7-machine

After you get the machine, you need to run an Ansible playbook on it, then reboot and run the playbook again to pick up the last few IPv6-disable fixes. Only then can you run tests.

Comment 3 Kotresh HR 2018-08-02 14:42:15 UTC
Yes, I did follow all the steps.

Lookup on root is fairly trivial. The other observation is this: if I create a volume from the CLI and mount it, it works perfectly. But if I run the test case, it fails at the lines where it mounts the volume.

Comment 4 Raghavendra G 2018-08-02 17:17:20 UTC
[2018-07-31 10:41:38.308584] I [rpc-clnt.c:2087:rpc_clnt_reconfig] 0-master-client-0: changing port to 49152 (from 0)

> Got a valid port 49152 for brick. reconfiguring.

[2018-07-31 10:41:38.308606] T [socket.c:861:__socket_disconnect] 0-master-client-0: disconnecting 0x7f36f0078be0, sock=12
[2018-07-31 10:41:38.308819] T [socket.c:865:__socket_disconnect] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x1ee)[0x7f3703988368] (--> /usr/local/lib/glusterfs/4.2dev/rpc-transport/socket.so(+0x608e)[0x7f36f810d08e] (--> /usr/local/lib/glusterfs/4.2dev/rpc-transport/socket.so(+0xe02b)[0x7f36f811502b] (--> /usr/local/lib/libgfrpc.so.0(rpc_transport_disconnect+0x96)[0x7f370374a45b] (--> /usr/local/lib/glusterfs/4.2dev/xlator/protocol/client.so(+0x54292)[0x7f36f5e0f292] ))))) 0-master-client-0: tearing down socket connection
[2018-07-31 10:41:38.308877] T [socket.c:3002:socket_event_handler] 0-master-client-0: (sock:12) socket_event_poll_in returned 0
[2018-07-31 10:41:38.308898] T [socket.c:2960:socket_event_handler] 0-master-client-0: client (sock:12) in:1, out:0, err:16
[2018-07-31 10:41:38.308923] T [socket.c:236:socket_dump_info] 0-master-client-0: $$$ client: disconnecting from (af:2,sock:12) 23.253.56.86 non-SSL (errno:0:Success)
[2018-07-31 10:41:38.308938] D [socket.c:3021:socket_event_handler] 0-transport: EPOLLERR - disconnecting (sock:12) (non-SSL)
[2018-07-31 10:41:38.308959] D [MSGID: 0] [client.c:2242:client_rpc_notify] 0-master-client-0: got RPC_CLNT_DISCONNECT
[2018-07-31 10:41:38.308977] D [MSGID: 0] [client.c:2284:client_rpc_notify] 0-master-client-0: disconnected (skipped notify)
[2018-07-31 10:41:38.308997] T [rpc-clnt.c:404:rpc_clnt_reconnect] 0-master-client-0: attempting reconnect
[2018-07-31 10:41:38.309011] T [socket.c:3409:socket_connect] 0-master-client-0: connecting 0x7f36f0078be0, sock=-1
[2018-07-31 10:41:38.309028] T [name.c:243:af_inet_client_get_remote_sockaddr] 0-master-client-0: option remote-port missing in volume master-client-0. Defaulting to 24007

> Even after getting a valid port for the remote brick, why is this defaulting to 24007 again? Something is fishy; this needs a deeper look. I observed a similar pattern for client-1 and client-2 too.

[2018-07-31 10:41:38.309568] D [MSGID: 0] [common-utils.c:339:gf_resolve_ip6] 0-resolver: returning ip-23.253.56.86 (port-24007) for hostname: builderhrk500.cloud.gluster.org and port: 24007
[2018-07-31 10:41:38.312685] D [MSGID: 0] [common-utils.c:339:gf_resolve_ip6] 0-resolver: returning ip-104.130.69.104 (port-24007) for hostname: builderhrk500.cloud.gluster.org and port: 24007
[2018-07-31 10:41:38.312749] T [socket.c:961:__socket_nodelay] 0-master-client-0: NODELAY enabled for socket 14
[2018-07-31 10:41:38.312779] T [socket.c:1049:__socket_keepalive] 0-master-client-0: Keep-alive enabled for socket: 14, (idle: 20, interval: 2, max-probes: 9, timeout: 0)

Comment 5 Nigel Babu 2018-08-13 10:09:15 UTC
This was caused by a bug somewhere in the Ansible playbook: the /etc/hosts file had entries with an incorrect IP. Rather than tracking this down, we're deprecating the old instructions in favor of new ones, which will run faster in any case.
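This root cause matches the symptom in comment 4, where the same hostname resolved to two different IPs: a stale /etc/hosts entry shadows DNS because `nsswitch.conf` normally consults `files` before `dns`. A minimal sketch of spotting such a mismatch (the hosts file is a constructed sample; the IPs are the ones from the logs above, used here purely for illustration):

```shell
# Sample /etc/hosts with a stale entry pinning the builder hostname
# to an old IP (constructed example; do not edit the real /etc/hosts):
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1      localhost
23.253.56.86   builderhrk500.cloud.gluster.org
EOF

# What the resolver sees first ("files" precedes "dns" in nsswitch.conf):
awk '$2 == "builderhrk500.cloud.gluster.org" { print $1 }' /tmp/hosts.sample

# Compare against what DNS actually returns, e.g.:
#   dig +short builderhrk500.cloud.gluster.org
# If the two answers differ, the /etc/hosts entry is stale.
```

When the stale entry wins, clients connect to the old IP and the connection is torn down, which is consistent with the disconnect/reconnect loop in the TRACE logs.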

