Description of problem: The OCS upgrade playbook does not enable server.tcp-user-timeout on all volumes. The ansible playbook should set this option, as it is an important part of OCS upgrades. The "server.tcp-user-timeout" option specifies the maximum time (in seconds) that data transmitted by the application can remain unacknowledged by the brick. It is used to detect forced disconnections and dead connections early (for example, if a node dies unexpectedly or a firewall is activated), which lets applications reduce the overall failover time.
OCS upgrade playbook path: ../openshift-ansible/playbooks/openshift-glusterfs/upgrade.yml
Version-Release number of selected component (if applicable): All OCS versions
How reproducible: Always
Actual results: The OCS upgrade playbook does not set server.tcp-user-timeout to 42 on all volumes.
Expected results: The OCS upgrade playbook should set server.tcp-user-timeout to 42 on all volumes.
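For illustration, the expected result could be reached by hand roughly as follows. This is only a sketch, assuming the standard gluster CLI is available on a node of the trusted storage pool; the helper name print_timeout_cmds is hypothetical and is not part of openshift-ansible. The value 42 is the one requested in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of openshift-ansible): reads volume names on
# stdin, one per line, and prints the matching "gluster volume set" command.
TIMEOUT=42   # value requested in this report

print_timeout_cmds() {
    while IFS= read -r vol; do
        # Skip blank lines
        [ -n "$vol" ] || continue
        printf 'gluster volume set %s server.tcp-user-timeout %s\n' "$vol" "$TIMEOUT"
    done
}

# Dry run on a gluster node (prints the commands without running them):
#   gluster volume list | print_timeout_cmds
# To apply, pipe the printed commands to sh:
#   gluster volume list | print_timeout_cmds | sh
# To check one volume afterwards:
#   gluster volume get <VOLNAME> server.tcp-user-timeout
```

Printing the commands first rather than running them directly makes it possible to review the list of volumes before anything is changed.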
Seems like a legit bug.
The requirement to set server.tcp-user-timeout arose because RHGS 3.4.0 was missing it in the code defaults. With https://bugzilla.redhat.com/show_bug.cgi?id=1623874 and https://code.engineering.redhat.com/gerrit/#/c/150699/, it is now part of the RHGS defaults. If we ask all users to upgrade to v3.11.0 or later, we don't have to make any changes in openshift-ansible. Closing the bug for that reason.