classification
Title: module 'urllib' has no attribute 'request'
Type:
Stage: resolved
Components:
Versions: Python 3.6
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: berker.peksag, brett.cannon, martin.panter, orsenthil, piyush-kgp
Priority: normal
Keywords:

Created on 2019-04-23 05:56 by piyush-kgp, last changed 2019-04-23 17:08 by brett.cannon. This issue is now closed.

Files
File name Uploaded Description
Screen Shot 2019-04-23 at 11.22.48 AM.png piyush-kgp, 2019-04-23 05:56
Messages (5)
msg340690 - Author: Piyush (piyush-kgp) Date: 2019-04-23 05:56
The current way to use one of the `urllib.request` APIs is like this:
```
import urllib.request
urllib.request.urlretrieve
```

Can we change this to:
```
import urllib
urllib.request.urlretrieve
```
This will require adding 1 line at https://github.com/python/cpython/blob/master/Lib/urllib/__init__.py

This is required because help on `urllib` says that `request` is part of `urllib` suggesting that `urllib.request` should be available if I `import urllib`.
Moreover `import urllib.request` is not at all intuitive.

I can submit a PR if others think what I'm proposing makes sense.
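
To make the proposal concrete: the message does not say what the one added line would be, so the sketch below assumes it is an eager `from . import request` in `urllib/__init__.py`. It reproduces that effect on a throwaway package instead of touching the real `urllib`; `eager_pkg` and `sub` are hypothetical names used only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_with_eager_init():
    """Build a package whose __init__.py imports its submodule, then import it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "eager_pkg"
        pkg_dir.mkdir()
        # Assumed shape of the proposed one-line change, applied to a toy package.
        (pkg_dir / "__init__.py").write_text("from . import sub\n")
        (pkg_dir / "sub.py").write_text("VALUE = 42\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("eager_pkg")
            # Only the package was requested, yet the submodule is bound and loaded.
            return hasattr(pkg, "sub"), "eager_pkg.sub" in sys.modules
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("eager_pkg", None)
            sys.modules.pop("eager_pkg.sub", None)


has_attr, was_loaded = import_with_eager_init()
print(has_attr, was_loaded)
```

With such a line in place, the submodule's import cost is paid by every `import eager_pkg`, whether or not the caller needs the submodule.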
msg340691 - Author: Senthil Kumaran (orsenthil) (Python committer) Date: 2019-04-23 06:04
I vaguely recollect the reason for this. It came from translating the Python 2 code (urllib, urllib2, urlparse, robotparser, etc.) into a single package. We wanted to keep the imports as urllib.request, urllib.response, and urllib.parse.

That said, if there is no other request currently open for this, let's keep this issue; I am +1 on the suggestion. We have to consider whether there are any potential drawbacks for old code that already uses the second-level import.
msg340697 - Author: Berker Peksag (berker.peksag) (Python committer) Date: 2019-04-23 07:55
What about other packages in the stdlib? For example, you can see the same behavior in the email package:

>>> import email
>>> email.message.EmailMessage()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'email' has no attribute 'message'

IMO, this is how imports work in Python and IIRC os.path is the only exception in the stdlib. I think this needs to be discussed on python-ideas first.
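
The behavior described above can be checked on a throwaway package, which avoids depending on what the running interpreter has already imported. This is a minimal sketch; `demo_pkg` and `sub` are hypothetical names, and the package's `__init__.py` is deliberately empty.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def submodule_visibility():
    """Return whether pkg.sub is visible before and after importing the submodule."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")  # empty package initializer
        (pkg_dir / "sub.py").write_text("VALUE = 1\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("demo_pkg")
            before = hasattr(pkg, "sub")  # submodule not imported yet
            importlib.import_module("demo_pkg.sub")
            after = hasattr(pkg, "sub")  # import binds it on the parent package
            return before, after
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)
            sys.modules.pop("demo_pkg.sub", None)


before, after = submodule_visibility()
print(before, after)
```

Importing a package runs only its `__init__.py`; a submodule becomes an attribute of the package once something imports that submodule.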
msg340706 - Author: Martin Panter (martin.panter) (Python committer) Date: 2019-04-23 08:57
The “urllib” package also contains “urllib.parse”, which is a lot more lightweight than “urllib.request”. In a quick experiment, importing “urllib.request” is more than 2 times slower than importing “urllib.parse” on its own. Importing “urllib” by itself is not much faster than importing “urllib.parse”, so I guess a lot of the measured time is unavoidable Python startup, and “urllib.request” itself is probably many times slower than “urllib.parse”.

The reason for the slowness is the dependencies and initialization. The “urllib.parse” module only imports a few commonly-used modules. On the other hand, importing “urllib.request” imports many heavyweight high-level modules directly and indirectly (email submodules in particular, also things like SSL, multithreading, HTTP client, temporary files). Some of these dependencies also compile lots of regular expressions at import time.

The slowdown can be a problem for things like command-line programs. Just today I found “circusd --help” on a Raspberry Pi took ~5 s to produce output.
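
A rough way to repeat this kind of measurement is to time a fresh interpreter per import, so that modules cached by the current process do not skew the result. This is only a sketch of one possible experiment, not the one used above; `time_import` is a hypothetical helper, and the numbers include interpreter startup and vary by machine.

```python
import subprocess
import sys
import time


def time_import(module, repeats=3):
    """Best wall-clock time (seconds) to start Python and import `module`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


timings = {
    name: time_import(name)
    for name in ("urllib", "urllib.parse", "urllib.request")
}
for name, seconds in timings.items():
    print(f"{name:15s} {seconds * 1000:.1f} ms")
```

For a per-module breakdown of where the time goes, CPython 3.7 and later also accept `python -X importtime -c "import urllib.request"`.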

The case of “os.path” is different: it isn’t a submodule of “os”. It is just a pointer to “posixpath”, “ntpath”, etc, depending on “os.name”.
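
The aliasing can be seen directly from the interpreter. A small sketch, assuming a CPython build where `os.name` is either `"nt"` or `"posix"`:

```python
import ntpath
import os
import posixpath
import sys

# os.path is whichever platform module the os module selected at import time.
expected = ntpath if os.name == "nt" else posixpath

print(os.path.__name__)          # "posixpath" or "ntpath", not "os.path"
print(os.path is expected)
print(sys.modules["os.path"] is os.path)
```

The `os` module registers the chosen module under the name `os.path` in `sys.modules`, which is why `import os.path` works even though `os` is not a package.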
msg340737 - Author: Brett Cannon (brett.cannon) (Python committer) Date: 2019-04-23 17:08
I'm -1 on pulling `request` up to implicitly be part of the `urllib` namespace without an import. `os.path` is the only exception that I know of in the stdlib and that's historical (it predates packages existing in the language). Otherwise the proposed change is suggesting that we automatically import all submodules to the top of a package which is expensive (as Martin pointed out), and simply not how Python's import is meant to be used.

And the statement that "`request` is part of `urllib`" is true today, just like saying any submodule is part of a package. If that wording is confusing then a PR to tweak the wording could be considered.

Since there are 3 core devs who are against this idea I'm closing this as "not a bug". Thanks for the idea regardless, Piyush!
History
Date User Action Args
2019-04-23 17:08:29 brett.cannon set status: open -> closed; resolution: not a bug; messages: + msg340737; stage: resolved
2019-04-23 15:02:50 rhettinger set nosy: + brett.cannon
2019-04-23 08:57:12 martin.panter set nosy: + martin.panter; messages: + msg340706
2019-04-23 07:55:09 berker.peksag set nosy: + berker.peksag; messages: + msg340697
2019-04-23 06:04:21 orsenthil set nosy: + orsenthil; messages: + msg340691
2019-04-23 05:56:21 piyush-kgp create