classification
Title: robotparser should support specifying SSL context
Type: enhancement Stage: patch review
Components: Library (Lib) Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Tchinmai7, berker.peksag
Priority: normal Keywords: patch

Created on 2021-03-22 23:08 by Tchinmai7, last changed 2021-04-07 01:56 by Tchinmai7.

Pull Requests
URL Status Linked Edit
PR 24984 closed Tchinmai7, 2021-03-23 00:03
PR 24986 open Tchinmai7, 2021-03-23 00:20
Messages (3)
msg389352 - (view) Author: Tarun Chinmai Sekar (Tchinmai7) * Date: 2021-03-22 23:25
IMO this could be enhanced by adding an sslcontext parameter to the read method.

A sample change could look like:
```
def read(self, sslcontext=None):
    """Reads the robots.txt URL and feeds it to the parser."""
    try:
        if sslcontext:
            # Use the caller-supplied SSL context for the request.
            f = urllib.request.urlopen(self.url, context=sslcontext)
        else:
            f = urllib.request.urlopen(self.url)
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            self.disallow_all = True
        elif 400 <= err.code < 500:
            self.allow_all = True
    else:
        raw = f.read()
        self.parse(raw.decode("utf-8").splitlines())
```

Happy to send a PR if this proposal makes sense.
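
For illustration, here is how a caller might exercise the proposed signature. This is only a sketch: `ContextRobotFileParser` and `load_robots` are hypothetical names, and the subclass stands in for the proposed change because the stdlib `RobotFileParser.read()` does not accept this parameter in the versions discussed here.

```python
import ssl
import urllib.error
import urllib.request
import urllib.robotparser


class ContextRobotFileParser(urllib.robotparser.RobotFileParser):
    """Hypothetical subclass mirroring the proposed read(sslcontext=None)."""

    def read(self, sslcontext=None):
        try:
            # context=None is urlopen's default, so one call covers both cases.
            f = urllib.request.urlopen(self.url, context=sslcontext)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif 400 <= err.code < 500:
                self.allow_all = True
        else:
            with f:
                raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())


def load_robots(url, sslcontext=None):
    """Hypothetical helper: build the parser and read robots.txt in one step."""
    rp = ContextRobotFileParser(url)
    rp.read(sslcontext=sslcontext)
    return rp
```

A call would then look like `load_robots("https://example.com/robots.txt", ssl.create_default_context())`, where the URL is a placeholder.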
msg390395 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2021-04-07 01:21
I'm not opposed to the idea, but what's the practical use case here? I haven't seen a case where you needed to pass a custom SSLContext in order to fetch the robots.txt file.
msg390396 - (view) Author: Tarun Chinmai Sekar (Tchinmai7) * Date: 2021-04-07 01:56
I am writing a web scraper that runs in a container where the CA certificates are stored in a non-standard location. The CA certificates are managed by certifi. Allowing the sslcontext to be overridden would let the user construct an SSLContext and pass it in.
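
A rough sketch of that use case with the existing API, as a possible workaround rather than the proposed change: build a context from a CA bundle at a custom path, fetch robots.txt with `urllib.request.urlopen(..., context=...)`, and feed the lines to `RobotFileParser.parse()`. The helper names `make_context` and `fetch_robots` are hypothetical, and the bundle path is assumed to come from something like `certifi.where()` (certifi is a third-party package, not used directly here).

```python
import ssl
import urllib.request
import urllib.robotparser


def make_context(cafile=None):
    """Return a verifying SSLContext.

    cafile is a path to a PEM CA bundle, e.g. the value of certifi.where()
    (assumption: certifi is installed). With cafile=None the system default
    CA certificates are loaded instead.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch_robots(url, context=None):
    """Hypothetical helper: fetch robots.txt using `context` and parse it."""
    rp = urllib.robotparser.RobotFileParser(url)
    with urllib.request.urlopen(url, context=context) as resp:
        rp.parse(resp.read().decode("utf-8").splitlines())
    return rp
```

Unlike `RobotFileParser.read()`, this sketch does not translate HTTP 401/403 or other 4xx responses into `disallow_all` / `allow_all`; an `HTTPError` simply propagates to the caller.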
History
Date User Action Args
2021-04-07 01:56:13 Tchinmai7 set messages: + msg390396
2021-04-07 01:21:06 berker.peksag set nosy: + berker.peksag; messages: + msg390395; versions: - Python 3.6, Python 3.7, Python 3.8, Python 3.9
2021-03-23 00:20:43 Tchinmai7 set pull_requests: + pull_request23746
2021-03-23 00:03:17 Tchinmai7 set keywords: + patch; stage: patch review; pull_requests: + pull_request23744
2021-03-22 23:25:16 Tchinmai7 set messages: + msg389352
2021-03-22 23:21:30 Tchinmai7 set components: + Library (Lib)
2021-03-22 23:08:07 Tchinmai7 create