The __pmDecodeResult function contains multiple security vulnerabilities. The description below is based on the 64-bit variant of the code. Parts of this also apply to the 32-bit variant, but it is probably best to remove the code duplication and use the 64 bit variant everywhere. The value of numpid is not validated against the overall PDU size. Processing a crafted PDU could read past the end of the PDU, crashing the process or disclosing information. The embedded numval counts are not checked, either, with similar results. In the valfmt != PM_VAL_INSITU case, the extracted pointer may point outside the area which holds such values. This can result in crashes or information disclosure. The length field inside the value is not validated against the PDU size. Values can overlap with each other or with other parts of the PDU, but this might not be a problem. The nvsize/vbsize calculations can overflow. This will be fixed implicitly when the contributing values are checked against the PDU size. On 32-bit architectures, the malloc size calculation overflow leads to a heap-based buffer overflow. The nvsize/vbsize overflows look as if they could lead to a heap-based buffer overflow on all architectures. pmcd uses __pmDecodeResult, but apparently only after store authorization, so the function is only exposed to localhost in the default configuration.
Created attachment 600963 [details] Proposed patch to resolve issues in PCP result PDU handling This one is obviously the most complex of all the decoding routines. I went with the approach of not collapsing the 32bit variant into the 64bit variant, mainly to not make additional functional change without first discussing with the original author. The 32bit case is much, much simpler anyway. I think I covered everything else listed, review would be much appreciated. Thanks!
(In reply to comment #2) > Created attachment 600963 [details] > Proposed patch to resolve issues in PCP result PDU handling > > This one is obviously the most complex of all the decoding routines. Indeed, this one is a bit of a challenge. pduvp = &vlp->vlist[j]; if (sizeof(__pmValue_PDU) > (size_t)(pduend - (char *)pduvp)) goto corrupt; index = ntohl(vlp->vlist[j].value.pval); pduvbp = (pmValueBlock *)&pdubuf[index]; I think you need to check here that index isn't negative and that it points into the pmValueBlocks area, and not the earlier parts in the PDU. &pdubuf[index] contains a multiplication that an overflow, so you also need an upper limit for the index (perhaps the overall PDU length). if (pduvbp->vlen < 0 || pduvbp->vlen > pp->hdr.len) goto corrupt; vbsize += PM_PDU_SIZE_BYTES(pduvbp->vlen); need = nvsize + vbsize; pduvb->vlen should be checked accurately against the PDU end, otherwise you can trigger an overread by up to 64 KB (the PDU size limit), quite likely triggering a crash. Can multiple values share the same external data? The code does not guard against that. Aggressive sharing leads to quadratic memory use here, so it would be better to impose a limit. vbsize can also overflow if the PDU size limit is ever increased from 64K. /* pmValueBlocks (if any) are copied across "as is" */ index = vsize / sizeof(__pmPDU); memcpy((void *)&newbuf[nvsize], (void *)&pp->data[index], vbsize); Okay, this pretty strongly assumes that there is no sharing, otherwise this copy operation can go way beyond the original PDU size. Is it really necessary to compute vbsize from scratch? It seems to me that you could just copy the entire original PDU unchanged, or at least the pmValueBlocks part. Its length is determined by the final vsize value and the overall PDU length. /* * in the input PDU buffer, pval is an index to the * start of the pmValueBlock, in units of __pmPDU */ index = sizeof(__pmPDU) * ntohl(vp->value.pval) + offset; nvp->value.pval = (pmValueBlock *)&newbuf[index]; This is currently problematic because index could point somewhere into the expanded vlists area, so the values stored there could be invalid while they were previously correct. In the 32 bit case, there is: /* salvage pmValueBlocks from end of PDU */ index = ntohl(pduvp->value.lval); pduvbp = (pmValueBlock *)&pdubuf[index]; index should be guarded check for negative values, and compared to the PDU size (so that the multiplication hidden in &pdubuf[index] does not overflow). if (sizeof(pmValueBlock) > (size_t)(pduend - (char *)pduvbp)) goto corrupt; if (pduvbp->vlen < 0 || pduvbp->vlen > pp->hdr.len) goto corrupt; pduvp->value.pval = pduvbp; __ntohpmValueBlock(pduvbp); Is pduvb->vlen valid before calling __ntohpmValueBlock(pduvbp)?
Created attachment 601673 [details] Revised PCP result PDU decoding patch (In reply to comment #3) > (In reply to comment #2) > pduvp = &vlp->vlist[j]; > if (sizeof(__pmValue_PDU) > (size_t)(pduend - (char *)pduvp)) > goto corrupt; > index = ntohl(vlp->vlist[j].value.pval); > pduvbp = (pmValueBlock *)&pdubuf[index]; > > I think you need to check here that index isn't negative *nod*, done, thanks. > and that it points > into the pmValueBlocks area, and not the earlier parts in the PDU. This proved tricky (we don't know where the start of the value blocks area will be until end of the first pass over the pdubuf). In the revised patch I've added a pointer to track where the earliest valueblock start is. Then after the first pass we can compare this to the calculated vsize (just making a note here, as its not immediately obvious). > triggering a crash. Can multiple values share the same external data? The As I think you found later, no they don't. > It seems to me that you could > just copy the entire original PDU unchanged, > or at least the pmValueBlocks part. That part is copied as-is (after swabbing) into the new buffer via a single memcpy. The vlist part cannot be copied directly though, as the pointer size on 64 bit platforms is too large to place directly in-line in the original buffer (this is the fundamental reason why a separate 64 bit code path had to be added at some point in the now distant past). > Is pduvb->vlen valid before calling __ntohpmValueBlock(pduvbp)? Nope, good catch. This patch should address all of the issues, with the check to ensure valueblock indices don't point into the vlists being the most subtle (and hence possibly wrong!). Testing suggests its OK, but again I'm appreciating any/all feedback on this - thanks!
(In reply to comment #4) > In the revised patch I've added a pointer to track where the earliest > valueblock start is. Then after the first pass we can compare this to > the calculated vsize (just making a note here, as its not immediately > obvious). This should work in principle, yes. > > It seems to me that you could > > just copy the entire original PDU unchanged, > > or at least the pmValueBlocks part. > > That part is copied as-is (after swabbing) into the new buffer via a > single memcpy. What I meant was: Could you copy the remaining PDU after only determining the vlist length? Then you don't have to determine the size of the pmValueBlocks part explicitly, and could do the bulk of the checking in the second path (which fills in the pointers). I'm still happy about the current code. This PDU type is very difficult to parse, particularly using the pointer-based approach. However, I could not spot any remaining bugs. From the patch description (which is based on my earlier comment): Values can overlap with each other or with other parts of the PDU, but this might not be a problem. You should remove that because overlaps are most definitely a problem in the old code.
Created attachment 602218 [details] Revised PCP result PDU decoding patch Update the patch header to reflect the point Florian made in the previous comment.
Created attachment 602222 [details] Fix regression in PCP archive reading This patch fixes a regression introduced by the other patch for this bug, which results in PCP archives no longer being able to be replayed.
Created attachment 602400 [details] Revised PCP result PDU decoding patch This update resolves two additional bugs in the earlier patches, which were exposed by pcpqa tests 003 and 202 (and a number of others, for the latter problem) respectively. Test 003 exposes a problem decoding a corner case of zero-length aggregate value, such as the sample.aggregate.null metric from pmdasample. Test 202 exposes a problem decoding mark records (from PCP archives), which have a zero-length initial __pmPDU component in the result_t structure.
Upstream patch: http://oss.sgi.com/cgi-bin/gitweb.cgi?p=pcp/pcp.git;a=commit;h=49c679c44425915a8d6aa4af5f90b35384843c12 http://oss.sgi.com/cgi-bin/gitweb.cgi?p=pcp/pcp.git;a=commit;h=d08105f0f36c24e3f6d1e28e6d2289001e03f589 This issue has been addressed in pcp-3.6.5
This issue was addressed in Fedora and EPEL via the following security updates: Fedora-16: https://admin.fedoraproject.org/updates/pcp-3.6.5-1.fc16 Fedora-17: https://admin.fedoraproject.org/updates/pcp-3.6.5-1.fc17 Rawhide: https://admin.fedoraproject.org/updates/pcp-3.6.5-1.fc18 EPEL-5: https://admin.fedoraproject.org/updates/pcp-3.6.5-1.el5 EPEL-6: https://admin.fedoraproject.org/updates/pcp-3.6.5-1.el6