Bug 1334501

Summary: Watch cache regression: changes behavior of "resource version too old" error
Product: OpenShift Container Platform
Reporter: Jessica Forrester <jforrest>
Component: Node
Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED ERRATA
QA Contact: DeShuai Ma <dma>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: adellape, aos-bugs, jliggitt, jokerman, mmccomas, qixuan.wang
Target Milestone: ---
Target Release: 3.2.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, when the etcd watch cache was enabled, the API server would return an HTTP 410 response when a watch was attempted with a resourceVersion that was too old. The expected result was an HTTP 200 status with a single watch event of type ERROR. This bug fix updates the API server to produce the same result in this case, regardless of whether the watch cache is enabled. The "410 Gone" error is now returned as a watch error event, rather than as an HTTP 410 response.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-27 15:07:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Jessica Forrester 2016-05-09 19:05:43 UTC
Tracking this kubernetes issue https://github.com/kubernetes/kubernetes/issues/25151

Comment 5 Qixuan Wang 2016-06-07 16:33:19 UTC
https://github.com/kubernetes/kubernetes/pull/25369

Could you give me some suggestions on how to verify this bug? Thanks.

Comment 6 Jordan Liggitt 2016-06-07 18:19:01 UTC
The steps in https://github.com/kubernetes/kubernetes/issues/25151#issue-153080519 should be good: request a watch on a version that is too old, and ensure the 410 Gone error is returned as a watch error event, rather than as an HTTP 410 response.
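
Editor's sketch: the check described here can be expressed as a small client-side helper that treats both delivery styles as "resource version expired". The function name `watch_expired` and its two parameters are hypothetical; the event shape is taken from the error event quoted in Comment 7, and nothing here is the actual client code.

```python
import json


def watch_expired(http_status, first_line):
    """Return True if the server reported the requested resourceVersion as too old.

    Covers both behaviors discussed in this bug:
    - regression: the HTTP response itself has status 410
    - expected:   HTTP 200 whose first watch event is type ERROR with code 410
    """
    if http_status == 410:
        return True
    if http_status != 200 or not first_line:
        return False
    event = json.loads(first_line)
    status = event.get("object", {})
    return event.get("type") == "ERROR" and status.get("code") == 410
```

A verifier would want the second branch to be the one that fires: status 200 plus an ERROR event, not a bare 410 response.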

To make testing easier, set the watch cache size to something low for a particular resource type, create enough of that resource type to exceed the cache size, and then request resourceVersion=1.
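
Editor's sketch: the reason for exceeding the cache size can be illustrated with a toy bounded window of events. This is a simplified model under assumed semantics, not the Kubernetes watch cache implementation; `ToyWatchCache` and `TooOldResourceVersion` are hypothetical names.

```python
from collections import deque


class TooOldResourceVersion(Exception):
    """Raised when the requested version predates the cached window."""


class ToyWatchCache:
    """Illustrative bounded event window; NOT the real watch cache."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # With maxlen set, appending to a full deque drops the oldest entry.
        self.events = deque(maxlen=capacity)

    def add(self, resource_version):
        self.events.append(resource_version)

    def events_since(self, resource_version):
        """Return cached versions newer than resource_version.

        Once the window is full and has moved past the requested version,
        events in between are lost, so the request cannot be served.
        """
        is_full = len(self.events) == self.capacity
        if is_full and resource_version + 1 < self.events[0]:
            raise TooOldResourceVersion(resource_version, self.events[0])
        return [rv for rv in self.events if rv > resource_version]
```

In this model, a cache of size 3 holding versions 2 to 4 can still answer a request for version 1, but after two more events the window holds 4 to 6 and the same request raises `TooOldResourceVersion`. A small cache size therefore makes the error easy to trigger with only a few objects.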

Comment 7 DeShuai Ma 2016-06-12 06:33:52 UTC
Tested on OpenShift v3.2.1.1-1-g33fa4ea

Steps to verify:
1. Enable the watch cache:
kubernetesMasterConfig:
  apiServerArguments:
    watch-cache: ["true"]
    watch-cache-sizes: ["builds#50","deploymentconfigs#50"]

2. Watch an old resource version with curl; the response should be HTTP 200 with a 410 Gone watch error event.
    [root@dhcp-128-7 Desktop]# curl -k -vvv -H "Authorization: Bearer 82z8aFmWWHrzBH8-nwPPgRxs2sLepbw0re75hqaJgTs" "https://104.197.173.141:8443/oapi/v1/namespaces/dma/builds?watch=1&resourceVersion=1"
    * About to connect() to 104.197.173.141 port 8443 (#0)
    *   Trying 104.197.173.141... connected
    * Connected to 104.197.173.141 (104.197.173.141) port 8443 (#0)
    * Initializing NSS with certpath: sql:/etc/pki/nssdb
    * warning: ignoring value of ssl.verifyhost
    * skipping SSL peer certificate verification
    * NSS: client certificate not found (nickname not specified)
    * SSL connection using TLS_RSA_WITH_AES_128_CBC_SHA
    * Server certificate:
    *       subject: CN=10.240.0.29
    *       start date: Jun 12 03:16:37 2016 GMT
    *       expire date: Jun 12 03:16:38 2018 GMT
    *       common name: 10.240.0.29
    *       issuer: CN=openshift-signer@1465701391
    > GET /oapi/v1/namespaces/dma/builds?watch=1&resourceVersion=1 HTTP/1.1
    > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.3.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
    > Host: 104.197.173.141:8443
    > Accept: */*
    > Authorization: Bearer 82z8aFmWWHrzBH8-nwPPgRxs2sLepbw0re75hqaJgTs
    >
    < HTTP/1.1 200 OK
    < Cache-Control: no-store
    < Transfer-Encoding: chunked
    < Date: Sun, 12 Jun 2016 05:40:07 GMT
    < Content-Type: text/plain; charset=utf-8
    < Transfer-Encoding: chunked
    <
    {"type":"ERROR","object":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"too old resource version: 1 (1046)","reason":"Gone","code":410}}
    * Connection #0 to host 104.197.173.141 left intact
    * Closing connection #0
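
Editor's sketch: the event line in the transcript above can be checked programmatically. The helper `parse_too_old_event` is hypothetical, and the regular expression is inferred from the single message shown in this comment; the message text is not a documented, stable format.

```python
import json
import re

# Pattern inferred from the one message in the transcript; an assumption.
TOO_OLD = re.compile(r"too old resource version: (\d+) \((\d+)\)")


def parse_too_old_event(line):
    """Return (requested, reported) versions from a Gone ERROR event, else None."""
    event = json.loads(line)
    status = event.get("object", {})
    if event.get("type") != "ERROR" or status.get("reason") != "Gone":
        return None
    match = TOO_OLD.search(status.get("message", ""))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Applied to the event above, the first number is the requested resourceVersion and the second is the number the server printed in parentheses, which appears to be the version it compared the request against.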

Comment 9 errata-xmlrpc 2016-06-27 15:07:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1343