Bug 917100

Summary: Some public Bugzilla tickets not showing up in internet search engines
Product: [Community] Bugzilla Reporter: Jacob Hunt <jhunt>
Component: Bugzilla General Assignee: Simon Green <sgreen>
Status: CLOSED CURRENTRELEASE QA Contact: tools-bugs <tools-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.2 CC: ebaak, jingwang
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: 4.4rc2-3.1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-22 03:09:00 EDT Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Description Jacob Hunt 2013-03-01 12:15:51 EST
Description of problem:

While investigating a kernel panic, I searched Red Hat's site for related panics and found none.  Later I found out that there were publicly readable Bugzilla tickets describing the panic I saw, but they had not been crawled by internet search engines.

The bugs in question were 565668 and 654210.

From looking at https://bugzilla.redhat.com/robots.txt, then http://bugzilla.redhat.com/sitemap_index.xml, then the files https://bugzilla.redhat.com/sitemap[1-4].xml.gz, at least the two tickets listed above are not in any of the four sitemaps.
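
The manual check above can be approximated with a short script. This is only a sketch: it assumes each sitemap entry is a `<loc>` element holding a `show_bug.cgi?id=N` URL, and the helper names `bug_ids_in_sitemap` and `missing_bugs` are hypothetical. The sample XML is made up for illustration and is not taken from the real sitemaps.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# Assumption: bug links look like .../show_bug.cgi?id=12345
BUG_ID_RE = re.compile(r"[?&]id=(\d+)")


def bug_ids_in_sitemap(xml_text):
    """Return the set of bug IDs referenced by <loc> entries in one sitemap."""
    root = ET.fromstring(xml_text)
    ids = set()
    for loc in root.iter(SITEMAP_NS + "loc"):
        match = BUG_ID_RE.search(loc.text or "")
        if match:
            ids.add(int(match.group(1)))
    return ids


def missing_bugs(wanted, sitemaps):
    """Return the wanted bug IDs that appear in none of the given sitemaps."""
    found = set()
    for xml_text in sitemaps:
        found |= bug_ids_in_sitemap(xml_text)
    return sorted(set(wanted) - found)


# Made-up sample standing in for one decompressed sitemap file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://bugzilla.redhat.com/show_bug.cgi?id=100</loc></url>
  <url><loc>https://bugzilla.redhat.com/show_bug.cgi?id=101</loc></url>
</urlset>"""

print(missing_bugs([100, 565668, 654210], [SAMPLE]))
```

For the real files, each `sitemapN.xml.gz` would first need to be downloaded and decompressed (for example with `gzip.decompress`) before being passed to `missing_bugs`.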


Version-Release number of selected component (if applicable):

Red Hat Bugzilla 4.2.5-7.1


How reproducible:

Random, in that some public bugs are indexed and others are not.

  
Actual results:

It seems random which public bugs are indexed in the sitemaps.


Expected results:

All public bugs should be searchable in search engines.
Comment 4 wangjing 2013-04-23 02:17:56 EDT
Hi Simon,

I'm not clear about the steps. Could you tell us if you know them?
Thanks!
Comment 5 Simon Green 2013-04-23 03:24:43 EDT
(In reply to comment #4)
> I'm not clear about the steps. Could you tell us if you know them?
> Thanks!

Hi Jing,

In a nutshell, not all public bugs were referred to in our sitemap file. Once this bug is fixed, all public bugs will be shown in our site map.
Comment 6 wangjing 2013-04-23 05:34:08 EDT
(In reply to comment #5)
> (In reply to comment #4)
> In a nutshell, not all public bugs were referred to in our sitemap file.
> Once this bug is fixed, all public bugs will be shown in our site map.

hi~Simon~

1)what is our sitemap file?
2)what kind of public bugs cannot be shown now?
3)does the public bug mean that it's not in any group?

thanks!
Comment 7 Simon Green 2013-04-23 07:42:23 EDT
(In reply to comment #6)
> 1)what is our sitemap file?

https://bugzilla.redhat.com/sitemap_index.xml (and files referenced in it)

> 2)what kind of public bugs cannot be shown now?

Whole blocks of bugs are currently missing (between the last bug in one file and the first bug in the next file)

> 3)does the public bug mean that it's not in any group?

Correct.

  -- simon
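
The symptom described in this comment (blocks of bugs missing between the last bug in one file and the first bug in the next) can be illustrated with a small check. This is a sketch only: it assumes each sitemap file can be summarized by the first and last bug ID it lists, and the function name and data layout are hypothetical rather than Red Hat Bugzilla's actual code.

```python
def bugs_lost_between_files(public_ids, file_ranges):
    """Return public bug IDs that fall strictly between consecutive files.

    public_ids:  iterable of public bug IDs that should be listed.
    file_ranges: list of (first_id, last_id) tuples, one per sitemap file,
                 in file order.
    """
    public_ids = list(public_ids)
    lost = []
    # Compare the end of each file with the start of the following one.
    for (_, prev_last), (next_first, _) in zip(file_ranges, file_ranges[1:]):
        lost.extend(i for i in public_ids if prev_last < i < next_first)
    return sorted(lost)


# Toy data: file 1 ends at bug 3 and file 2 starts at bug 6, so 4 and 5 are lost.
print(bugs_lost_between_files(range(1, 9), [(1, 3), (6, 8)]))
```

When consecutive files are contiguous, for example `[(1, 4), (5, 8)]`, the function returns an empty list.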
Comment 8 wangjing 2013-04-25 22:53:24 EDT
(In reply to comment #7)
> (In reply to comment #6)
> > 2)what kind of public bugs cannot be shown now?
> Whole blocks of bugs are currently missing (between the last bug in one file
> and the first bug in the next file)

I can't see any bugs displayed on this page. I have attached a screenshot. Is there a problem?
Comment 10 Simon Green 2013-04-25 22:56:07 EDT
(In reply to comment #8)
> I can't see any bugs displayed on this page. I have attached a screenshot.
> Is there a problem?

As per the screenshot, the bugs are in https://bugzilla.redhat.com/sitemapX.xml.gz (where X is a number between 1 and 4).

  -- simon
Comment 12 wangjing 2013-04-28 04:02:25 EDT
(In reply to comment #10)
> (In reply to comment #8)
> As per the screenshot, the bugs are in
> https://bugzilla.redhat.com/sitemapX.xml.gz (where X is a number between 1
> and 4).

simon,hi~
1) Which classifications or products are these public bugs from?
2) About 'sitemapX.xml.gz' (where X is a number between 1 and 4): I guess the steps and expected results are:

steps: search for a bug list in a given classification or product where the 'Groups' field is 'None'.

expected results:
the bugs in the bug list are split across these four pages, ordered by bug ID:
https://bugzilla.redhat.com/sitemap1.xml.gz 
https://bugzilla.redhat.com/sitemap2.xml.gz 
https://bugzilla.redhat.com/sitemap3.xml.gz 
https://bugzilla.redhat.com/sitemap4.xml.gz 

right?
thanks!
Comment 13 Simon Green 2013-04-28 18:10:49 EDT
(In reply to comment #12)
> simon,hi~
> 1) Which classifications or products are these public bugs from?

All classifications and products (except those that don't have any public bugs)

> 2) About 'sitemapX.xml.gz' (where X is a number between 1 and 4): I guess
> the steps and expected results are:
> 
> steps: search for a bug list in a given classification or product where the
> 'Groups' field is 'None'.
> 
> expected results:
> the bugs in the bug list are split across these four pages, ordered by bug
> ID:
> https://bugzilla.redhat.com/sitemap1.xml.gz 
> https://bugzilla.redhat.com/sitemap2.xml.gz 
> https://bugzilla.redhat.com/sitemap3.xml.gz 
> https://bugzilla.redhat.com/sitemap4.xml.gz 
> 
> right?

Nearly. A single sitemap file is limited to 50,000 links (as per the sitemap specification). Because of this, there will be at least 11 sitemapX.xml.gz files once this bug is fixed.

  -- simon
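
The arithmetic behind the "at least 11 files" estimate can be sketched as follows. The 50,000-link limit comes from the comment above. The function names are hypothetical, and the inference that 11 files implies more than 500,000 public bugs assumes every file except the last is filled to the limit.

```python
import math

# Per-file limit on links, as stated in the comment above.
MAX_URLS_PER_SITEMAP = 50000


def sitemap_files_needed(public_bug_count):
    """Number of sitemap files needed to list every public bug exactly once."""
    return math.ceil(public_bug_count / MAX_URLS_PER_SITEMAP)


def split_into_sitemaps(bug_ids, limit=MAX_URLS_PER_SITEMAP):
    """Split bug IDs into consecutive chunks of at most `limit`, with no gaps."""
    ordered = sorted(bug_ids)
    return [ordered[i:i + limit] for i in range(0, len(ordered), limit)]


# 500,000 links fit in 10 files; one more public bug requires an 11th file.
print(sitemap_files_needed(500000), sitemap_files_needed(500001))
```

Because each chunk starts exactly where the previous one ended, no bug ID can fall between the last entry of one file and the first entry of the next.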