Red Hat Bugzilla – Bug 1302857
matrix report filtering regression
Last modified: 2016-03-14 21:51:23 EDT
Description of problem:
Unable to generate report from whiteboard filter
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. go to https://beaker.engineering.redhat.com/matrix
2. enter "kernel-2.6.32-609.el6 KernelTier1 [restraint]" into filter
3. click Filter
4. Select "kernel-2.6.32-609.el6 KernelTier1 [restraint]"
5. click Generate
Some whiteboard entries will generate a report, but most do not.
It looks like the problematic whiteboards are the ones that have leading whitespace, either a tab or spaces. If you edit the whiteboard to remove this whitespace, it works as expected.
For example, here is a cloned job XML:
kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins
Notice the extra tab and newline.
This whiteboard text is generated from the following code:
cloned_jobxml = self.proxy.hub.taskactions.to_xml('RS:' + rs_id, True, False)
clonedroot = etree.XML(cloned_jobxml)
whiteboard = clonedroot.find("whiteboard")
whiteboard.text = whiteboard.text + ' [RS:%s%s]' % (str(rs_id), machine_text)
Since this code hasn't changed, I think the problem is somewhere in the generation of the XML from Beaker.
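To illustrate how the suffix ends up after the stray whitespace, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and the job XML, the exact whitespace, and the helper name append_rs_suffix are all assumptions made for illustration, not the real cloned job.

```python
import xml.etree.ElementTree as ET

# Hypothetical pretty-printed job XML: the whiteboard text is wrapped in
# newlines and tabs. The exact whitespace in the real job is an assumption.
PRETTY_JOB_XML = (
    "<job>\n"
    "\t<whiteboard>\n"
    "\t\tkernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins\n"
    "\t</whiteboard>\n"
    "</job>"
)


def append_rs_suffix(job_xml, rs_id, machine_text=""):
    """Append ' [RS:...]' to the whiteboard text without stripping it first."""
    root = ET.fromstring(job_xml)
    whiteboard = root.find("whiteboard")
    # The parser keeps the text exactly as written, so the suffix lands
    # after the trailing newline and tab rather than after "Jenkins".
    whiteboard.text = whiteboard.text + " [RS:%s%s]" % (rs_id, machine_text)
    return whiteboard.text


result = append_rs_suffix(PRETTY_JOB_XML, 1953781)
print(repr(result))
```

The printed repr makes the leading and trailing whitespace visible, which is what the job matrix filter then has to match.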
I think we have found a workaround. We were generating the job with --pretty-xml and then submitting that. If we remove --pretty-xml, it looks like we don't get the extra tab or newline.
I'm curious to hear your opinion on what is the expected behavior.
One thing that changed with the XML refactoring we did in 22.0 is that cloning now preserves the whiteboard whitespace exactly as is -- which means if you submit with surrounding whitespace, it will stay there when you clone it and we will store it as well.
I think previously there was some automatic stripping happening in xmltramp and then the pretty printing was adding it back in. (I need to confirm if that's right or not.)
So I'm guessing the job matrix doesn't handle that properly, now that we have job whiteboards containing surrounding whitespace or tabs and newlines etc.
I think the right answer is to make sure that whatever is generating your job XML is not inserting extra unwanted whitespace, so you probably want it to produce:
<whiteboard>kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins [RS:1953781]</whiteboard>
kernel-2.6.18-408.el5.kpq1 KernelTier1 - Jenkins
However... we clearly have something to fix in the job matrix.
And if there is something we can change in how the server or client handles whitespace in the whiteboard, to make this easier/better, we can do that too. I will have to dig into it more to figure out the best option.
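As a rough sketch of the client-side advice above, a job generator could strip the whiteboard text before appending its suffix and submitting. The function name normalise_whiteboard and the sample XML are hypothetical, and xml.etree.ElementTree is used here only so the example is self-contained. It is not how any bkr command is actually implemented.

```python
import xml.etree.ElementTree as ET


def normalise_whiteboard(job_xml, suffix=""):
    """Return job XML whose whiteboard has no surrounding whitespace.

    Hypothetical helper: strips the whiteboard text, then appends `suffix`.
    """
    root = ET.fromstring(job_xml)
    whiteboard = root.find("whiteboard")
    if whiteboard is not None:
        # An empty <whiteboard/> has text None, so fall back to "".
        whiteboard.text = (whiteboard.text or "").strip() + suffix
    return ET.tostring(root, encoding="unicode")


pretty = "<job><whiteboard>\n\tKernelTier1 - Jenkins\n</whiteboard></job>"
print(normalise_whiteboard(pretty, " [RS:1953781]"))
```

Stripping before appending keeps the suffix attached to the visible text, so the stored whiteboard matches what a user would type into the filter.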
(In reply to Dan Callaghan from comment #3)
> I think previously there was some automatic stripping happening in xmltramp
> and then the pretty printing was adding it back in. (I need to confirm if
> that's right or not.)
It is right. xmltramp implicitly strips surrounding whitespace from element text, and xml.dom.minidom inserts surrounding spaces around element text when pretty printing.
Both of those behaviours seem dodgy to me, since they are changing the element's text content. I guess they are designed with HTML in mind, where whitespace is insignificant. With the switch to lxml in Beaker 22, both of those behaviours are gone on the server side: we always preserve the exact text content without stripping or inserting whitespace.
The problem is that the bkr workflow commands are still using xml.dom.minidom with its pretty-printing behaviour, assuming that the server will then strip the whitespace back out. Even if we fixed that now on the beaker-client side, there would still be old client versions floating around, and it's possible that people have other job XML generators which are making the same assumption about the whiteboard stripping.
So I think that means we really need to restore, on the server side during job parsing, the whitespace stripping behaviour that we were implicitly getting from xmltramp. It actually affects more than just <whiteboard/> as well: I noticed that the notify cc suffers the same problem when the workflow commands pretty print it. Basically, it affects any piece of the job XML where data is conveyed as text content rather than an attribute value.
I do think we should fix up the job matrix filtering stuff to handle whiteboards with surrounding whitespace too.
Looks like the <cc/> element text is already stripped, so that one is fine.
The only other place in the job XML where we look at text content is for <ks_append/> and <kickstart/>. The workflow commands already do some CDATA mangling to ensure that xml.dom.minidom doesn't mess them up with its pretty-printing. Also, the previous server code using xmltramp wasn't implicitly stripping those either: it wasn't calling unicode() on them but iterating their text nodes and concatenating them, which preserves all whitespace. Also, we *can't* strip the <kickstart/> and <ks_append/> text content because the trailing newline after a kickstart command or %end block *is* significant.
So the only thing that needs changing is whitespace stripping for <whiteboard/>.
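A minimal sketch of that distinction, assuming a simplified job layout: the whiteboard text is stripped at parse time while <ks_append/> content is passed through untouched. The function read_job_text_fields is hypothetical and uses xml.etree.ElementTree rather than Beaker's actual lxml-based parsing code.

```python
import xml.etree.ElementTree as ET


def read_job_text_fields(job_xml):
    """Return (whiteboard, ks_appends) from job XML.

    Hypothetical helper: the whiteboard is stripped, ks_append text is not,
    because trailing newlines in kickstart snippets are significant.
    """
    root = ET.fromstring(job_xml)
    whiteboard = (root.findtext("whiteboard") or "").strip()
    ks_appends = [el.text or "" for el in root.iter("ks_append")]
    return whiteboard, ks_appends


SAMPLE = (
    "<job><whiteboard>\n\tKernelTier1\n</whiteboard>"
    "<recipeSet><recipe><ks_appends>"
    "<ks_append><![CDATA[%post\necho hi\n%end\n]]></ks_append>"
    "</ks_appends></recipe></recipeSet></job>"
)
whiteboard, ks_appends = read_job_text_fields(SAMPLE)
print(repr(whiteboard), repr(ks_appends))
```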
The job matrix *almost* already handles whiteboards containing weird whitespace. The issue seems to be just with embedded newlines. The LF is being turned into CRLF somewhere...
It's actually done by the browser. When an <option/>'s value attribute contains an embedded LF, Firefox converts it to CRLF at form submission time. It's step 4 in this algorithm for constructing form data:
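Separately from that algorithm, one way the job matrix could tolerate the browser's conversion is to normalise line endings on both sides before comparing. This is only a sketch: normalise_newlines and whiteboard_matches are hypothetical names, not functions in Beaker, and the actual fix may have been done differently.

```python
def normalise_newlines(value):
    """Convert CRLF and bare CR line endings to LF."""
    return value.replace("\r\n", "\n").replace("\r", "\n")


def whiteboard_matches(stored, submitted):
    """Compare a stored whiteboard with a form-submitted value.

    The submitted value may have had its LFs turned into CRLFs by the
    browser, so both sides are normalised before comparison.
    """
    return normalise_newlines(stored) == normalise_newlines(submitted)


# The stored value has bare LFs; the submitted one arrives with CRLFs.
print(whiteboard_matches("\n\tKernelTier1\n", "\r\n\tKernelTier1\r\n"))
```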
Beaker 22.2 has been released.