This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: robotsparser deny all with some rules
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: EricG, adiboo67, iritkatriel, nico.bonefato, quentin-maire, vstinner
Priority: normal Keywords:

Created on 2019-03-06 09:42 by quentin-maire, last changed 2022-04-11 14:59 by admin.

Messages (6)
msg337285 - (view) Author: wats0ns (quentin-maire) Date: 2019-03-06 09:42
RobotsParser parses a "Disallow: ?" rule as a deny-all, but this is a valid rule that should be interpreted as "Disallow: /?*" or "Disallow: /*?*".
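A minimal sketch of the report, for readers who want to try it. The first part feeds a robots.txt containing "Disallow: ?" to `urllib.robotparser.RobotFileParser` and prints what it answers for a URL without a query string; the result may vary between Python versions, so no output is claimed for it. The second part is a hypothetical helper (`wildcard_rule_matches`, not part of the standard library) showing how the "/*?*" interpretation proposed above would behave under simple wildcard matching. The `example.com` URL and the helper's name and semantics are assumptions made for illustration.

```python
import re
from urllib import robotparser

# Part 1: reproduce the report with the standard-library parser.
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: ?",
])
# The report says this URL is denied; the value is printed rather than
# asserted because behavior may differ between Python versions.
print(rp.can_fetch("*", "http://example.com/page"))


def wildcard_rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper, not part of urllib.robotparser.

    '*' matches any run of characters and a trailing '$' anchors the
    end of the path; everything else is matched literally from the
    start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


# Part 2: the proposed reading of "Disallow: ?" as "Disallow: /*?*".
print(wildcard_rule_matches("/*?*", "/page"))       # False: no query string
print(wildcard_rule_matches("/*?*", "/page?id=1"))  # True: has a query string
```

Under this reading, only URLs that carry a query string would be blocked, rather than every URL on the site.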
msg338293 - (view) Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2019-03-18 22:13
Can you provide a link to documentation showing that "Disallow: ?" shouldn't be the same as deny all?  Thanks!
msg338298 - (view) Author: wats0ns (quentin-maire) Date: 2019-03-18 23:20
I can't find any documentation about it, but all of the robots.txt checkers I have found behave like this. You can test on this site: http://www.eskimoz.fr/robots.txt. I believe this is how it's implemented now in most parsers.
msg390073 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-02 15:48
I removed almost all messages from this issue since most of them looked like spam. I also blocked the user accounts that posted spam. If that was a mistake, contact me.

This is the Python bug tracker, not a forum for asking questions about how to use Python or for reporting bugs in your website.

Multiple comments were written in French, whereas this bug tracker is in English.

I am even tempted to close the issue, since it has attracted so many spam comments.
msg408351 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-12-12 00:11
I restored one non-spam message from the OP that was deleted.

Changing to enhancement because this is not a bug (i.e., deviation from documentation).

I don't know enough about this to have a view on whether this enhancement request should be accepted.
msg416852 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2022-04-06 10:21
I removed two comments: none of the mentioned URLs contains a "Disallow: ?" rule, and the comments didn't add any value to this issue. They look like regular spam (SEO).
History
Date User Action Args
2022-04-11 14:59:12 admin set github: 80388
2022-04-06 10:21:58 vstinner set messages: + msg416852
2022-04-06 10:21:10 vstinner set messages: - msg416767
2022-04-06 10:21:08 vstinner set messages: - msg416847
2022-04-06 09:17:05 adiboo67 set messages: + msg416847
2022-04-05 10:27:26 adiboo67 set nosy: + adiboo67; messages: + msg416767
2021-12-12 00:11:21 iritkatriel set versions: + Python 3.11, - Python 3.5; nosy: + iritkatriel; messages: + msg408351; type: behavior -> enhancement
2021-12-12 00:08:06 iritkatriel set nosy: + quentin-maire; messages: + msg338298
2021-09-29 17:08:26 vstinner set messages: - msg402889
2021-09-29 16:17:35 nico.bonefato set nosy: + nico.bonefato; messages: + msg402889
2021-04-02 15:48:12 vstinner set nosy: + vstinner; messages: + msg390073
2021-04-02 15:46:06 vstinner set messages: - msg338298
2021-04-02 15:46:05 vstinner set messages: - msg365770
2021-04-02 15:45:26 vstinner set messages: - msg370275
2021-04-02 15:44:53 vstinner set messages: - msg367546
2021-04-02 15:44:49 vstinner set messages: - msg366509
2021-04-02 15:44:33 vstinner set messages: - msg374629
2021-04-02 15:44:05 vstinner set messages: - msg377125
2021-04-02 15:42:31 vstinner set messages: - msg372112
2021-04-02 15:41:58 vstinner set messages: - msg377058
2021-04-02 15:41:37 vstinner set messages: - msg376032
2021-04-02 15:41:22 vstinner set messages: - msg374642
2021-04-02 15:40:51 vstinner set messages: - msg378070
2021-04-02 15:39:54 vstinner set messages: - msg379615
2021-04-02 15:39:52 vstinner set messages: - msg379616
2021-04-02 15:38:37 vstinner set messages: - msg385859
2021-04-02 15:37:42 vstinner set messages: - msg381443
2021-04-02 15:36:49 vstinner set title: référencement naturel -> robotsparser deny all with some rules
2021-04-02 15:36:09 vstinner set messages: - msg390072
2021-04-02 15:36:07 vstinner set messages: - msg390071
2021-04-02 15:33:20 EricG set messages: + msg390072
2021-04-02 15:30:36 EricG set nosy: + EricG, - jeanotlapin, nico702, ideeanimationanniversaire; messages: + msg390071; title: robotsparser deny all with some rules -> référencement naturel
2021-01-28 13:35:19 jeanotlapin set nosy: + jeanotlapin; messages: + msg385859
2020-11-19 17:30:47 ideeanimationanniversaire set nosy: + ideeanimationanniversaire; messages: + msg381443
2020-10-25 23:01:14 nico702 set messages: + msg379616
2020-10-25 22:55:38 nico702 set nosy: + nico702, - matthieuhemea; messages: + msg379615
2020-10-05 18:11:29 matthieuhemea set nosy: + matthieuhemea, - Patrick Valibus 410 Gone, Jmgray47, arnaud, calamina, amiir.mascud, jeanotlapin; messages: + msg378070
2020-09-18 15:20:07 jeanotlapin set nosy: + jeanotlapin; messages: + msg377125
2020-09-17 15:15:46 amiir.mascud set nosy: + amiir.mascud; messages: + msg377058
2020-08-28 12:00:03 calamina set nosy: + calamina; messages: + msg376032
2020-07-31 13:24:17 arnaud set nosy: + arnaud; messages: + msg374642
2020-07-31 04:34:49 Jmgray47 set nosy: + Jmgray47; messages: + msg374629
2020-06-22 20:35:43 Patrick Valibus 410 Gone set nosy: + Patrick Valibus 410 Gone, - cheryl.sabella, quentin-maire, lagustais, artasca, Fred AYERS, mathias44; messages: + msg372112
2020-05-28 23:54:56 mathias44 set nosy: + mathias44; messages: + msg370275
2020-04-28 17:20:52 Fred AYERS set nosy: + Fred AYERS; messages: + msg367546
2020-04-15 12:57:20 artasca set nosy: + artasca; messages: + msg366509
2020-04-04 16:46:51 lagustais set nosy: + lagustais; messages: + msg365770
2019-03-18 23:20:00 quentin-maire set messages: + msg338298
2019-03-18 22:13:37 cheryl.sabella set nosy: + cheryl.sabella; messages: + msg338293
2019-03-06 09:42:01 quentin-maire create