Bug 1469259 - Async Hypervisor Checkin payload should be checked on submission
Async Hypervisor Checkin payload should be checked on submission
Status: NEW
Product: Candlepin
Classification: Community
Component: candlepin (Show other bugs)
Unspecified Unspecified
medium Severity low
: ---
: ---
Assigned To: candlepin-bugs
Katello QA List
: Triaged
Depends On:
  Show dependency treegraph
Reported: 2017-07-10 14:27 EDT by Shayne Riley
Modified: 2017-08-01 14:41 EDT (History)
4 users (show)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments (Terms of Use)

  None (edit)
Description Shayne Riley 2017-07-10 14:27:46 EDT
Description of problem:
It is possible for a client to submit an async hypervisor checkin and get a successful 200 response back without ever realizing that their payload is incorrect, which would normally return a 400 for other requests.

BUT, if the client is prudent enough to use the job id from the response and retrieves said job, it is only then that the client knows that the job failed, but that's it.

Additionally, even if the client was able to look at the logs and find the stack trace for it (which it can't), the stack trace doesn't make any mention of the fact that the payload wasn't formatted right. Instead the stack trace is for a Job Excecution Exception "Caused by: java.lang.NullPointerException: null
	at org.candlepin.pinsetter.tasks.HypervisorUpdateJob.toExecute(HypervisorUpdateJob.java:214)"

If one were to look up that line (214) in source it reads:
            log.debug("Hypervisor consumers for create/update: {}", hypervisors.getHypervisors().size());

But the crazy part is in the line JUST BEFORE that (line 213):
            HypervisorList hypervisors = (HypervisorList) Util.fromJson(json, HypervisorList.class);
...which indicates that Util.fromJson(String, Class) EATS the exception that Jackson throws, and returns null instead (and looking at that source, the analysis appears to be correct).

To sum up the issue in its entirety:
- A client can submit an async hypervisor checkin with 1 or more random characters of text, as long as the owner is real, and gets back a 200 success, even though the job is doomed to fail.
- No sanitization of the client input takes place before it is stored into the DB. Rather, there is only an insufficient check for null or empty text, at which point the text is compressed via Deflate and then stored.
- Once the job starts, the text is decompressed and then deserialized via Util.fromJson method, which eats an exception and returns null.
- HypervisorUpdateJob.toExecute doesn't check to see if the resulting HyervisorList hypervisors variable is null, NOR does it check to see if hypervisors.getHypervisors() is null (which can happen if the customer sends an empty json object for a payload: {}
- An NPE is thrown, and the job state becomes "FAILED"
- The client checks the job, and finds out that the job is failed, but has no idea why.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

1. Submit any non-JSON formatted text to "Content-Type: text/plain" POST $CPHOST/hypervisors/$owner  ... make sure the owner exists, though.

OR, Submit this JSON formatted text: {}
..it'll have the same effect.

2. Use the job id from the response and look up the job: GET $CPHOST/jobs/$jobId

Actual results:
After Step 1, a 200 response is returned.
After Step 2, the state of the job is "FAILED" with no explanation why (Hint: it was the client's fault).

Expected results:
The payload is checked BEFORE getting stored in the DB for correctness and a 400-series status is returned with a sufficient explaination why it was no good.

Note You need to log in before you can comment on or make changes to this bug.