Bug 1302857

Summary: matrix report filtering regression
Product: [Retired] Beaker Reporter: Bill Peck <bpeck>
Component: web UIAssignee: Dan Callaghan <dcallagh>
Status: CLOSED CURRENTRELEASE QA Contact: tools-bugs <tools-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 22CC: asavkov, dcallagh, dowang, jburke, jstancek, mjia, pbunyan, rjoost
Target Milestone: 22.2Keywords: Patch, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-15 01:51:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Bill Peck 2016-01-28 19:19:48 UTC
Description of problem:
Unable to generate report from whiteboard filter

Version-Release number of selected component (if applicable):
webui

How reproducible:
everytime

Steps to Reproduce:
1. go to https://beaker.engineering.redhat.com/matrix
2. enter "kernel-2.6.32-609.el6 KernelTier1 [restraint]] into filter
3. click Filter
4. Select "kernel-2.6.32-609.el6 KernelTier1 [restraint]"
5. click Generate

Actual results:
No report

Expected results:
report

Additional info:
some whiteboard entries will generate a report, but most do not.

Comment 1 Bill Peck 2016-01-29 16:28:32 UTC
It looks like the problematic whiteboards are ones that have whitespace in front.  either a tab or spaces.   If you edit the whiteboard to remove this white space it works as expected. 

For example here is a cloned job xml:
  <whiteboard>
		kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins
	 [RS:1953781]</whiteboard>

Notice the extra tab and newline.

This whiteboard text is generated from the following code:

            cloned_jobxml = self.proxy.hub.taskactions.to_xml('RS:' + rs_id, True, False)
            clonedroot = etree.XML(cloned_jobxml)
.
.
.
            whiteboard = clonedroot.find("whiteboard")
            whiteboard.text = whiteboard.text + ' [RS:%s%s]' % (str(rs_id), machine_text)


Since this code hasn't changed I think the problem is somewhere in the generation of the xml from beaker.

Thanks.

Comment 2 Bill Peck 2016-01-29 20:46:34 UTC
I think we have found a work around.  We were generating the job with --pretty-xml and then submitting that.  If we remove pretty-xml it looks like we don't get the extra tab or newline.

I'm curious to hear your opinion on what is the expected behavior.

Comment 3 Dan Callaghan 2016-01-31 22:57:09 UTC
One thing that changed with the XML refactoring we did in 22.0 is that cloning now preserves the whiteboard whitespace exactly as is -- which means if you submit with surrounding whitespace, it will stay there when you clone it and we will store it as well.

I think previously there was some automatic stripping happening in xmltramp and then the pretty printing was adding it back in. (I need to confirm if that's right or not.)

So I'm guessing the job matrix doesn't handle that properly, now that we have job whiteboards containing surrounding whitespace or tabs and newlines etc.

I think the right answer is to make sure that whatever is generating your job XML is not inserting extra unwanted whitespace, so you probably want it to produce:

  <whiteboard>kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins [RS:1953781]</whiteboard>

and not:

  <whiteboard>
		kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins
	 [RS:1953781]</whiteboard>

However... we clearly have something to fix in the job matrix.

And if there is something we can change in how the server or client handles whitespace in the whiteboard, to make this easier/better, we can do that too. I will have to dig into it more to figure out the best option.

Comment 4 Dan Callaghan 2016-02-09 02:10:50 UTC
(In reply to Dan Callaghan from comment #3)
> I think previously there was some automatic stripping happening in xmltramp
> and then the pretty printing was adding it back in. (I need to confirm if
> that's right or not.)

It is right. xmltramp implicitly strips surrounding whitespace from element text, and xml.dom.minidom inserts surrounding spaces around element text when pretty printing.

Both of those behaviours seem dodgy to me, since they are changing the element's text content. I guess they are designed with HTML in mind, where whitespace is insignificant. With the switch to lxml in Beaker 22 both of those behaviours are gone on the server side, we always preserve the exact text content without stripping or inserting whitespace.

The problem is that the bkr workflow commands are still using xml.dom.minidom with its pretty printing behaviour assuming that the server will then strip the whitespace back out. Even if we fixed that now on the beaker-client side, there would still be old client versions floating around, and it's possible that people have other job XML generators which are making the same assumption about the whiteboard stripping.

So I think that means we really need to restore the whitespace stripping behaviour on the server side during job parsing, that we were implicitly getting from xmltramp. It actually affects more than just <whiteboard/> as well, I noticed that the notify cc suffers the same problem when the workflow commands pretty print it. Basically any piece of the job XML where data is conveyed as text content rather than an attribute value.

Comment 5 Dan Callaghan 2016-02-09 02:16:03 UTC
I do think we should fix up the job matrix filtering stuff to handle whiteboards with surrounding whitespace too.

Comment 6 Dan Callaghan 2016-02-09 02:30:36 UTC
Looks like the <cc/> element text is already stripped, so that one is fine.

The only other place in the job XML where we look at text content is for <ks_append/> and <kickstart/>. The workflow commands already do some CDATA mangling to ensure that xml.dom.minidom doesn't mess them up with its pretty-printing. Also, the previous server code using xmltramp wasn't implicitly stripping those either: it wasn't calling unicode() on them but iterating their text nodes and concatenating them, which preserves all whitespace. Also we *can't* strip the kickstart and ksappend text content because the trailing newline after a kickstart command or %end block *is* significant.

So the only thing that needs changing is whitespace stripping for <whiteboard/>.

Comment 7 Dan Callaghan 2016-02-09 02:34:23 UTC
http://gerrit.beaker-project.org/4672

Comment 8 Dan Callaghan 2016-02-09 02:38:43 UTC
The job matrix *almost* already handles whiteboards containing weird whitespace. The issue seems to be just with embedded newlines. The LF is being turned into CRLF somewhere...

Comment 9 Dan Callaghan 2016-02-09 03:09:25 UTC
It's actually done by the browser. When an <option/>'s value attribute contains an embedded LF, Firefox converts it to CRLF at form submission time. It's step 4 in this algorithm for constructing form data:

https://html.spec.whatwg.org/multipage/forms.html#constructing-form-data-set

Doesn't seem to be something we can easily control or avoid on the browser side, short of rewriting the job matrix filtering to be done by Javascript instead of old-fashioned form submission. That's on our list to do eventually, but in the meantime I guess we can just convert CRLF back to LF on the server side...

http://gerrit.beaker-project.org/4673

Comment 12 Dan Callaghan 2016-03-15 01:51:23 UTC
Beaker 22.2 has been released.