Bug 1435506 - first search result for "list of files in RPM package" is the *Romanian* documentation
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora Documentation
Classification: Fedora
Component: about-fedora
Version: devel
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Petr Bokoc
QA Contact: Fedora Docs QA
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-03-24 01:19 UTC by skierpage
Modified: 2022-08-25 23:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-08 15:22:09 UTC
Embargoed:



Description skierpage 2017-03-24 01:19:55 UTC
(This bug report is not about these specific pages, but about the multiple undifferentiated versions of documentation pages on docs.fedoraproject.org and the poor search experience this engenders.)

I Googled for "list of files in RPM package" (without the quotes). The first result I get is "4.2.3. Listing the files in a package - Fedora Documentation", which looks promising. But the URL is https://docs.fedoraproject.org/ro/Fedora_Draft_Documentation/0.1/html/RPM_Guide/ch04s02s03.html , the *Romanian* documentation. It's mostly in English, but the previous and next links are "Înapoi" and "Înainte".

Here's another example of search failing to return an appropriate page. If I Google "Fedora uninstall RPM package", the second result is "4.2.5 Removing Packages - Fedora Documentation", which again looks promising. But the URL is https://docs.fedoraproject.org/en-US/Fedora/16/html/System_Administrators_Guide/sec-Removing.html , the Fedora *16* documentation. Fedora 16 reached End of Life four years ago!

How reproducible:
Every time for me, even in a private window.

Steps to Reproduce:
1. https://www.google.com/search?q=list+of+files+in+RPM+package
2. https://www.google.com/search?q=Fedora+uninstall+RPM

Actual results:
The only result from docs.fedoraproject.org on the first page isn't even English documentation. I would expect to get https://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/RPM_Guide/ch04s02s03.html as my first result.

Expected results:
Pages must identify their language [1] and give rel="alternate" links to the page in other languages [2]. Pages should also use a canonical URL to identify the "master" version of the documentation (Fedora 25 right now) [3], so that the Fedora 16 documentation of some feature gets a lower search ranking than the documentation for the current Fedora version. I think you could also put language information for the canonical (most recent Fedora version) URLs into a sitemap on docs.fedoraproject.org [4] instead of updating all these old pages. As far as I can tell, docs.fedoraproject.org pages do none of these things, so it's hit or miss what Google returns.

[1] https://www.w3.org/International/questions/qa-html-language-declarations
[2] https://support.google.com/webmasters/answer/189077?hl=en
[3] https://support.google.com/webmasters/answer/139066?hl=en
[4] https://support.google.com/webmasters/answer/2620865?hl=en&ref_topic=2370587
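
To make the request above concrete, here is a minimal Python sketch of the tags it asks for. It assumes the /<lang>/Fedora/<version>/html/... URL layout seen in the examples in this report, treats Fedora 25 as the canonical release (as stated above), and uses a hypothetical two-language list. It also assumes the same page exists in the current release and in each language, which is not guaranteed. This is an illustration only, not how docs.fedoraproject.org is actually built.

```python
from urllib.parse import urlsplit

CURRENT_VERSION = "25"        # "Fedora 25 right now", per this report
LANGUAGES = ["en-US", "ro"]   # hypothetical subset of published languages


def page_language(url: str) -> str:
    """Language code for the <html lang="..."> attribute (first path segment)."""
    return urlsplit(url).path.lstrip("/").split("/")[0]


def head_links(url: str) -> list:
    """Build canonical and hreflang <link> tags for a docs page URL."""
    parts = urlsplit(url)
    lang, product, _version, *rest = parts.path.lstrip("/").split("/")
    base = f"{parts.scheme}://{parts.netloc}"
    tail = "/".join(rest)
    # Canonical: same language, but the current release instead of the old one.
    canonical = f"{base}/{lang}/{product}/{CURRENT_VERSION}/{tail}"
    tags = [f'<link rel="canonical" href="{canonical}">']
    # Alternates: one entry per language, each pointing at the current release.
    for alt in LANGUAGES:
        href = f"{base}/{alt}/{product}/{CURRENT_VERSION}/{tail}"
        tags.append(f'<link rel="alternate" hreflang="{alt}" href="{href}">')
    return tags
```

For the Fedora 16 URL quoted earlier, the first tag returned points at the same page under /en-US/Fedora/25/, followed by one rel="alternate" tag per language.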

Comment 1 Paul W. Frields 2019-09-25 14:16:19 UTC
Petr, is this still an issue on the latest docs.fp.o site?

Comment 2 skierpage 2019-09-25 21:28:56 UTC
(In reply to Paul W. Frields from comment #1)
> Petr, is this still an issue on the latest docs.fp.o site?

Still true for me in a Firefox private window. For the second search, the Fedora 16 URL is down to the third result; the second result is now "Removing Packages - Fedora Documentation" for Fedora 23, which reached EOL in 2016. It looks exactly the same as the Fedora 16 result: both have the green breadcrumb
  https://docs.fedoraproject.org › en-US › Fedora › html › sec-Removing
What's sad is that when you follow these old URLs, the left-hand nav only offers versions up to Fedora 26. There's no obvious way to find documentation for currently supported Fedora releases.

So here are more suggestions:
* Change the <title> to "xyz - Fedora 23 Documentation" or "xyz - Fedora 30 Docs Site" instead of making users scan the URL (which is hard on a phone).
* Maybe there's some way to include the version in the green breadcrumb that Google shows.
* Nuke the entire Fedora Documentation tree that only has versions up to 26! If there's any value at all to this old documentation, you can point users to the Internet Archive's Wayback Machine ( https://web.archive.org/web/20160515102459/https://docs.fedoraproject.org/en-US/Fedora/23/html/System_Administrators_Guide/index.html ) or, if you must, a separate docs-obsolete.fedoraproject.org.
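
The <title> suggestion in the list above could be sketched roughly as follows. This assumes the built pages carry a title of the form "xyz - Fedora Documentation" (as the search results suggest) and that the release number can be read from the /Fedora/<version>/ part of the URL. The function and pattern names are made up for illustration.

```python
import re

# Release number in paths such as /en-US/Fedora/23/html/...
VERSION_IN_PATH = re.compile(r"/Fedora/(\d+)/")
# Assumed title shape: "<title>xyz - Fedora Documentation</title>"
TITLE = re.compile(r"(<title>)(.*?) - Fedora Documentation(</title>)", re.S)


def versioned_title(html: str, url: str) -> str:
    """Insert the Fedora release number into the page title, if one is found."""
    found = VERSION_IN_PATH.search(url)
    if found is None:
        return html  # no version in the URL: leave the page untouched
    version = found.group(1)
    return TITLE.sub(
        lambda t: f"{t.group(1)}{t.group(2)} - Fedora {version} Documentation{t.group(3)}",
        html,
        count=1,
    )
```

Pages whose URL carries no numeric release (for example the Fedora_Draft_Documentation tree) are returned unchanged by this sketch.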

Comment 3 Petr Bokoc 2019-09-27 13:26:35 UTC
Hello,

Yes, I see similar results in Google. "list+of+files+in+RPM+package" is in 5th place in a private window, not first, but it's still the top hit for docs.fp.o. We've had this kind of problem before, although this is the first time I've seen Google prioritize Romanian over English; previous reports were only about landing on outdated but still English docs.

This is kind of a long running problem with no easy solutions besides just removing old docs altogether. Good point with the Internet Archive; I actually never thought of that.

Docs for Fedora up to version 26 were built using an old system which is unmaintained, and the way it worked actually prevents us from making *any* updates other than running a script to insert something into each page. Basically we can't rebuild the site, we can only edit the built sources; the reasons for that are complicated. I do think we might be able to fix the headers, though, and I think we should try to find a way to insert a big red banner into each page saying "this is really outdated, go to docs.fp.o". I don't think noting the version in the page title would be enough because neither mobile Firefox nor Chrome display the page title as you browse, only in the tab list...
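
For illustration, a script that edits the built pages to add such a banner might look something like the sketch below. It is only a sketch under assumptions: the banner markup, the eol-banner class name, and the function name are invented here, and it assumes each built page has an ordinary <body> tag.

```python
import re

# Hypothetical banner markup; the wording paraphrases the idea above.
BANNER = (
    '<div class="eol-banner" style="background:#c00;color:#fff;padding:1em">'
    "This documentation is outdated. Current docs: "
    '<a href="https://docs.fedoraproject.org/">docs.fedoraproject.org</a>'
    "</div>"
)
BODY_OPEN = re.compile(r"<body\b[^>]*>", re.I)


def add_banner(html: str) -> str:
    """Insert the banner right after the opening <body> tag, at most once."""
    if 'class="eol-banner"' in html:
        return html  # already processed, so re-running the script is safe
    return BODY_OPEN.sub(lambda m: m.group(0) + BANNER, html, count=1)
```

Making the function idempotent matters here because the built sources are edited in place, so the script may be run over the same files more than once.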

Anyway, for 27 and later, on our new system, we have more control over the content and we could probably handle this properly with a separate UI layout for EOLed versions that has a similar banner.

The problem is that the person who does the most work on the site and who could fix this the easiest is only rarely available to take on a bigger task like this one, so it's stalled. I'll talk to him and see if we can get this done in some reasonable timeframe. I absolutely agree that it's a problem.

Oh and the open issues are here:
* https://pagure.io/fedora-docs/docs-fp-o/issue/118 (for pre-27 docs)
* https://pagure.io/fedora-docs/docs-fp-o/issue/116 (27 and later)

I linked your bug report in the pre-27 one, and I'll close this if you don't mind; we're using Pagure issues nowadays. (Closing down the whole Fedora Docs product category here in Bugzilla is also something I should get around to soon...)

Cheers,
Petr

Comment 4 skierpage 2022-08-25 23:21:19 UTC
FYI, I think someone changed https://docs.fedoraproject.org/robots.txt to add
  Disallow: /*/Fedora*/
so few or no old obsolete doc pages now show up in Google search results \o/. I still think you should just nuke the entire tree.
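
As a rough check of what that rule covers, here is a small Python sketch of Google-style robots.txt wildcard matching, where "*" matches any run of characters and a trailing "$" anchors the end of the path. The helper names are made up, and this is a simplification rather than a full robots.txt parser.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern with '*' and trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where the pattern had '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, pattern: str = "/*/Fedora*/") -> bool:
    """True if the URL path matches the Disallow pattern from its start."""
    return robots_pattern_to_regex(pattern).match(path) is not None
```

Under this reading, both URL paths reported in this bug (the /ro/Fedora_Draft_Documentation/ page and the /en-US/Fedora/16/ page) match the rule, while a path with no /Fedora.../ segment does not.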

