Bug 725798 - yum-m-p segmentation fault on rpmforge repository, due to missing pkgId in metadata
Summary: yum-m-p segmentation fault on rpmforge repository, due to missing pkgId in me...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: yum-metadata-parser
Version: 5.6
Hardware: x86_64
OS: Linux
urgent
urgent
Target Milestone: rc
: ---
Assignee: James Antill
QA Contact: Karel Srot
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-07-26 15:25 UTC by Orion Poplawski
Modified: 2014-01-21 06:24 UTC (History)
11 users (show)

Fixed In Version: yum-metadata-parser-1.1.2-4.el5
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-07-25 10:26:50 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2012:1115 0 normal SHIPPED_LIVE yum-metadata-parser bug fix update 2012-07-25 14:26:00 UTC

Description Orion Poplawski 2011-07-26 15:25:29 UTC
Description of problem:

# yum -y upgrade yum
Loaded plugins: downloadonly
CoRA                                                                     | 2.1 kB     00:00     
addons                                                                   |  951 B     00:00     
adobe-linux                                                              |  951 B     00:00     
base                                                                     | 2.1 kB     00:00     
epel                                                                     | 3.7 kB     00:00     
extras                                                                   | 2.1 kB     00:00     
rpmforge                                                                 | 1.1 kB     00:00     
rpmforge: [###                                                                      ] 471/10722Segmentation fault

# yum -y upgrade yum -d9 
Loading "downloadonly" plugin
Not loading "fastestmirror" plugin, as it is disabled
Running "config" handler for "downloadonly" plugin
Config time: 0.164
Yum Version: 3.2.22
COMMAND: yum -y upgrade yum -d9 
Installroot: /
Ext Commands:

   yum
Setting up Package Sacks
Running "postreposetup" handler for "downloadonly" plugin
Segmentation fault

Version-Release number of selected component (if applicable):
yum-3.2.22-33.el5.centos

How reproducible:
Every time

Comment 1 Ben 2011-07-26 15:36:29 UTC
I can confirm this happening on over 25 of my RHEL5.6 boxes.  And not just with yum but with yum-updatesd and rhn_check:

Jul 26 14:13:56 raine kernel: yum-updatesd-he[18464]: segfault at 0000000000000000 rip 0000003360479b80 rsp 00007fff5f282268 error 4
Jul 26 15:13:55 raine kernel: yum-updatesd-he[25007]: segfault at 0000000000000000 rip 0000003360479b80 rsp 00007fff0e44d6b8 error 4
Jul 26 15:17:05 raine kernel: rhn_check[25672]: segfault at 0000000000000000 rip 0000003360479b80 rsp 00007fff5401a2d8 error 4
Jul 26 16:13:55 raine kernel: yum-updatesd-he[31721]: segfault at 0000000000000000 rip 0000003360479b80 rsp 00007fffdc7e8148 error 4
Jul 26 16:13:55 raine kernel: yum-updatesd-he[31721]: segfault at 0000000000000000 rip 0000003360479b80 rsp 00007fffdc7e8148 error 4

Comment 2 Ben 2011-07-26 15:37:34 UTC
(In reply to comment #1)

Apologies, I meant to add:

All of these servers also have the rpmforge repo enabled.

Comment 3 Orion Poplawski 2011-07-26 15:42:03 UTC
(gdb) bt
#0  0x000000343c079b80 in strlen () from /lib64/libc.so.6
#1  0x00002aaab3bfb958 in IA__g_string_chunk_insert_len (chunk=0x1091b60, string=0x0, len=-1)
    at gstring.c:224
#2  0x00002aaab39a9d2b in update_package_cb (p=0x10d8920, user_data=<value optimized out>)
    at sqlitecache.c:361
#3  0x00002aaab39a6b33 in primary_parser_package_end (data=0x7fffffffc220, 
    name=0x10b25fb "package") at xml-parser.c:365
#4  primary_sax_end_element (data=0x7fffffffc220, name=0x10b25fb "package") at xml-parser.c:473
#5  0x00002aaab3e8c988 in xmlParseEndTag1 (ctxt=0x1097600, line=1785) at parser.c:7428
#6  0x00002aaab3e9a84a in xmlParseElement__internal_alias (ctxt=0x1097600) at parser.c:8750
#7  0x00002aaab3e9662e in xmlParseContent__internal_alias (ctxt=0x1097600) at parser.c:8564
#8  0x00002aaab3e9a7e2 in xmlParseElement__internal_alias (ctxt=0x1097600) at parser.c:8724
#9  0x00002aaab3e9ac2a in xmlParseDocument__internal_alias (ctxt=0x1097600) at parser.c:9332
#10 0x00002aaab3e9b8ec in xmlSAXUserParseFile__internal_alias (sax=0x2aaab3baca40, 
    user_data=0x7fffffffc220, filename=<value optimized out>) at parser.c:12572
#11 0x00002aaab39a77eb in yum_xml_parse_primary (
    filename=0xfffb94 "/var/cache/yum/rpmforge/primary.xml.gz", 
    count_callback=<value optimized out>, package_callback=<value optimized out>, 
    user_data=<value optimized out>, err=<value optimized out>) at xml-parser.c:593
#12 0x00002aaab39a9919 in update_packages (self=<value optimized out>, 
    args=<value optimized out>, update_info=0x7fffffffc330) at sqlitecache.c:412
#13 py_update (self=<value optimized out>, args=<value optimized out>, 
    update_info=0x7fffffffc330) at sqlitecache.c:537
#14 0x00002aaab39a9be5 in py_update_primary (self=0x0, args=0xaf6730) at sqlitecache.c:565

(gdb) up
#2  0x00002aaab39a9d2b in update_package_cb (p=0x10d8920, user_data=<value optimized out>)
    at sqlitecache.c:361
361         g_hash_table_insert (update_info->all_packages,
(gdb) list
356     static void
357     update_package_cb (Package *p, gpointer user_data)
358     {
359         UpdateInfo *update_info = (UpdateInfo *) user_data;
360
361         g_hash_table_insert (update_info->all_packages,
362                              g_string_chunk_insert (update_info->package_ids_chunk,
363                                                     p->pkgId),
364                              GINT_TO_POINTER (1));
365
(gdb) print update_info
$1 = (UpdateInfo *) 0x7fffffffc330
(gdb) print *update_info
$2 = {db = 0x1087e90, remove_handle = 0x1090fd0, count_from_md = 10722, packages_seen = 471, 
  add_count = 471, del_count = 0, current_packages = 0x1092840, all_packages = 0x1092800, 
  package_ids_chunk = 0x1091b60, timer = 0x108e280, python_callback = 0xfee960, 
  info_init = 0x2aaab39aa0f0 <package_writer_info_init>, 
  info_clean = 0x2aaab39a94e0 <package_writer_info_clean>, 
  create_tables = 0x2aaab39a8900 <yum_db_create_primary_tables>, 
  write_package = 0x2aaab39a9ff0 <write_package_to_db>, 
  xml_parse = 0x2aaab39a7790 <yum_xml_parse_primary>, user_data = 0xef9120}
(gdb) print *p
$3 = {pkgKey = 0, pkgId = 0x0, name = 0x12afd90 "mhash-devel", arch = 0x12afd9c "x86_64", 
  version = 0x12afda5 "0.9.9", epoch = 0x12afda3 "0", release = 0x12afdab "1.el5.rf", 
  summary = 0x12afdb8 "Header files and libraries for developing apps which will use mhash", 
  description = 0x12afdfc "The mhash-devel package contains the header files and libraries needed\nto develop programs that use the mhash library.\n\nInstall the mhash-devel package if you want to develop applications that\nwill us"..., url = 0x12afef5 "http://mhash.sourceforge.net/", 
  time_file = 1311155083, time_build = 1311154870, rpm_license = 0x12aff3e "LGPL", 
  rpm_vendor = 0x12aff43 "Dag Apt Repository, http://dag.wieers.com/apt/", 
  rpm_group = 0x12aff72 "Development/Libraries", 
  rpm_buildhost = 0x12aff88 "lisse.hasselt.wieers.com", 
  rpm_sourcerpm = 0x12affa1 "mhash-0.9.9-1.el5.rf.src.rpm", rpm_header_start = 440, 
  rpm_header_end = 4747, rpm_packager = 0x12afed9 "Dag Wieers <dag>", 
  size_package = 19967, size_installed = 36750, size_archive = 38772, 
  location_href = 0x12aff13 "RPMS/mhash-devel-0.9.9-1.el5.rf.x86_64.rpm", location_base = 0x0, 
  checksum_type = 0x12afdb4 "sha", requires = 0x10de260, provides = 0x10e2780, 
  conflicts = 0x0, obsoletes = 0x10e2790, files = 0x0, changelogs = 0x0, chunk = 0x11ca900}

Comment 4 James Antill 2011-07-26 15:46:29 UTC
 According to investigations on IRC this is almost certainly:

commit 6dd9030c162d484f3871a6cee267980b94c9c902
Author: Paul Nasrat <pauln>
Date:   Tue Nov 27 14:07:31 2007 +0000

    Fix segmentation fault experienced with a malformed primary.xml

diff --git a/sqlitecache.c b/sqlitecache.c
index 749cbc1..bc20b56 100644
--- a/sqlitecache.c
+++ b/sqlitecache.c
@@ -358,6 +358,11 @@ update_package_cb (Package *p, gpointer user_data)
 {
     UpdateInfo *update_info = (UpdateInfo *) user_data;
 
+    /* TODO: Wire in logging of skipped packages */
+    if (p->pkgId == NULL) {
+        return;
+    }
+
     g_hash_table_insert (update_info->all_packages,
                          g_string_chunk_insert (update_info->package_ids_chunk,
                                                 p->pkgId),

Comment 5 James Antill 2011-07-26 15:47:06 UTC
 Yeh, that traceback pretty much confirms it.

Comment 6 James Antill 2011-07-26 16:15:23 UTC
 And from the metadata:

  <package type="rpm">
    <name>mhash-devel</name>
    <arch>x86_64</arch>
    <version epoch="0" ver="0.9.9" rel="1.el5.rf"/>
    <checksum type="sha" pkgid="YES"/>

Comment 7 Leonard den Ottolander 2011-07-26 17:09:14 UTC
Are you suggesting the test itself is the problem? Seems more like the test should be extended to catch this case where pkgid=YES but no pkgid is actually set:

<checksum type="sha" pkgid="YES"/></checksum>

The empty checksum is the problem. p->pkgId is set to 0x0 in this case, reading from the trace in comment 3, so the test for NULL does not catch it.

Extending the test in the patch mentioned in comment #4 to

if (p->pkgId == NULL || p->pkgId == 0) {
    return;
}

seems a possible solution.

Comment 8 seth vidal 2011-07-26 17:27:13 UTC
The root cause here is createrepo --cachedir. The cache for specific pkgs is damaged and is just an empty file. It appears to be happening on dag's system for one reason or another. He knows how to correct it.

Additionally, yum-metadata-parser upstream already has a patch for the segfault and has so for a while.

Hope this helps.

Comment 9 Leonard den Ottolander 2011-07-26 17:35:55 UTC
I misread comment 4. Thought you indicated that patch as the culprit, but since that patch is not in the code I now understand it is the proposed fix. You can also disregard my confusion about 0x0 which is an address (so indeed NULL not 0) not a value.

Comment 10 Dag Wieers 2011-07-26 17:57:20 UTC
Let me clarify that this 'corrupt' metadata is caused by using RHEL5's createrepo using the --cachedir option. With the help of Seth Vidal we found no less than 86 zero-length cache files in the cachedir for various repositories (we have more than 40 large repositories and almost each of them was affected in some way by this during 19/7 and 25/7), even when there has not been any problems reported with the filesystem, the disk-backend or the KVM virtualization.

One way to prevent this, is to make createrepo report empty cache-files and not take them into account or replace them instantly and use that. It should not be using empty checksums to generate repository metadata or generate corrupt metadata.

Comment 11 Dag Wieers 2011-07-26 18:01:58 UTC
Ok, apparently I am slow to comment as Seth already mentioned it more than 30 minutes :-/

Comment 12 Normando 2011-07-26 22:58:07 UTC
Sorry Dag, but is there a chanse to fix ASAP?

Comment 13 Dag Wieers 2011-07-26 23:13:22 UTC
No, we are waiting for our main mirror (HEAnet) to pull the latest changes. HEAnet's rsync however is pretty slow lately and takes 9 hours for an unchanged repository. It takes 16 hours when pushing updates as we recently did. We don't control it :-/

Comment 14 Normando 2011-07-26 23:31:40 UTC
Thanks Dag. I am happy to hear that!

Comment 15 grant haywood 2011-07-27 19:47:37 UTC
Is disabling rpmforge temporarily an acceptable workaround here?

Comment 16 James Antill 2011-07-27 20:05:11 UTC
yes, you can work around it by disabling the repo.

Comment 17 Dag Wieers 2011-07-27 21:06:33 UTC
The problem has been fixed since 5 am (CET) this morning, now 18 hours ago. If you still have this problem, you should clean your yum cache.

Probably safest to do 'yum clean all'

Comment 23 errata-xmlrpc 2012-07-25 10:26:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1115.html


Note You need to log in before you can comment on or make changes to this bug.