classification
Title: Prevent Python 2 documentation from appearing in Web search results
Type: enhancement
Stage:
Components:
Versions: Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Terry Davis, ammar2, bittner, mdk, terry.reedy, xtreak
Priority: normal
Keywords:

Created on 2020-03-19 09:35 by bittner, last changed 2020-03-26 19:46 by terry.reedy.

Messages (6)
msg364597 - Author: Peter Bittner (bittner) * Date: 2020-03-19 09:35
Currently, when you do a Web search (e.g. using Google, Bing, Yahoo!, or DuckDuckGo) for a Python module or function call, you'll find a link to the related Python 2 documentation first.

How to reproduce:

1. Simply search for "os.environ" in your favorite search engine.
2. Find a link to the Python documentation in the first 3 results. Typically, this will point to the Python 2 docs first.

(Side note: Google now seems to actively adjust its results, ranking Python 3 pages higher. Apparently, it is the only popular search engine that does so.)

Expected result:

- When searching for Python modules, functions, builtins, etc. on the Web, no search results for Python 2 should pop up at all if the same content exists for Python 3

Possible implementation:

- Add a "noindex" meta tag to the header of the generated HTML documentation
- see https://support.google.com/webmasters/answer/93710
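For illustration only, here is a minimal sketch of what the "noindex" idea amounts to at the HTML level. The helper name `add_noindex` and the post-processing approach are assumptions made for this example; the real documentation build would more likely set the tag in its page template.

```python
import re

# The robots meta tag that asks search engines not to index a page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def add_noindex(html: str) -> str:
    """Insert a robots "noindex" meta tag right after the opening <head> tag.

    Hypothetical helper: it post-processes already generated HTML and
    returns the page unchanged if a robots meta tag is already present.
    """
    if 'name="robots"' in html:
        return html  # already carries a robots directive; leave it alone
    # Match the first <head> or <head ...attributes> tag, case-insensitively.
    # The \b keeps the pattern from matching <header>.
    return re.sub(
        r"<head\b[^>]*>",
        lambda m: m.group(0) + "\n    " + NOINDEX_TAG,
        html,
        count=1,
        flags=re.IGNORECASE,
    )


page = "<html><head><title>os</title></head><body></body></html>"
print(add_noindex(page))
```

Running the helper twice leaves the page unchanged after the first pass, so it could be applied to a whole output directory without duplicating the tag.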
msg364721 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-03-20 23:10
I completely agree with the goal.  But I think that this is a duplicate of at least one previous issue, either here or on the website tracker.
msg364725 - Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2020-03-21 04:39
See also the approach to use robots.txt https://github.com/python/pythondotorg/issues/1030
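As a rough sketch of how a robots.txt rule would behave, the standard library's `urllib.robotparser` can evaluate one locally. The `Disallow: /2/` rule below is an assumption chosen for this example, not necessarily what the linked issue proposes, and `is_crawlable` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule set for illustration: block every crawler from the /2/ tree.
ROBOTS_TXT = """\
User-agent: *
Disallow: /2/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from memory, no network access
    return parser.can_fetch(agent, url)


for url in (
    "https://docs.python.org/2/library/os.html",
    "https://docs.python.org/3/library/os.html",
):
    print(url, is_crawlable(url))
```

Note that robots.txt controls crawling rather than indexing, so a blocked page that is linked from elsewhere may still be listed by a search engine.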
msg364868 - Author: Terry Davis (Terry Davis) Date: 2020-03-23 17:57
It seems like using "noindex" to tell search engines not to look at Python 2 docs at all would have the side-effect of breaking Python 2 documentation searches.
I have gotten into the habit of searching for "python 3" explicitly, which I think is a reasonable work-around, especially since Python 2 and 3 could be considered different languages.
msg365050 - Author: Ammar Askar (ammar2) * (Python triager) Date: 2020-03-26 04:44
Instead of noindex maybe the 3.x documentation can be marked as the canonical one: https://support.google.com/webmasters/answer/139066

This should still allow the old docs to be crawled but emphasize the latest docs on the website:

> Google Search result usually points to the canonical page, unless one of the duplicates is explicitly better suited for a user: for example, the search result will probably point to the mobile page if the user is on a mobile device, even if the desktop page is marked as canonical.

Presumably "better suited" means that if you search for "python2 timeit" you'd still find the py2 docs.
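A minimal sketch of how such a canonical link could be derived for a Python 2 page follows. The function name `canonical_link` is hypothetical, and the sketch assumes the same path exists under the Python 3 docs, which is not true for modules that were renamed or removed.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_link(py2_url: str, target_version: str = "3") -> str:
    """Build a <link rel="canonical"> tag pointing at the Python 3 page.

    Assumes URLs shaped like https://docs.python.org/<version>/<path>,
    where <version> starts with "2" for Python 2 pages (e.g. "2" or "2.7").
    """
    parts = urlsplit(py2_url)
    segments = parts.path.split("/")
    # segments[0] is "" because the path starts with "/";
    # segments[1] is the version component.
    if len(segments) > 1 and segments[1].startswith("2"):
        segments[1] = target_version
    # Drop any fragment so the canonical URL names the page itself.
    href = urlunsplit(parts._replace(path="/".join(segments), fragment=""))
    return f'<link rel="canonical" href="{href}">'


print(canonical_link("https://docs.python.org/2/library/timeit.html"))
```

A real implementation would need a mapping for pages whose location changed between versions, and should omit the tag where no Python 3 equivalent exists.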

Terry Reedy, could you link the earlier issue so this information can be posted there, or are you referring to the issue Karthik linked?
msg365110 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2020-03-26 19:46
Sorry, the 'previous issue' is a vague memory which I cannot find.  For some reason, searching this tracker for 'Google' returns 1500+ hits.
History
Date                 User         Action  Args
2020-03-26 19:46:58  terry.reedy  set     messages: + msg365110
2020-03-26 04:44:31  ammar2       set     nosy: + ammar2; messages: + msg365050
2020-03-23 17:57:09  Terry Davis  set     nosy: + Terry Davis; messages: + msg364868
2020-03-21 04:39:25  xtreak       set     nosy: + xtreak; messages: + msg364725
2020-03-20 23:10:20  terry.reedy  set     nosy: + terry.reedy; messages: + msg364721
2020-03-19 13:21:27  xtreak       set     nosy: + mdk
2020-03-19 09:35:58  bittner      create