urllib.request.open(someURL).read() returns a bytes object so writing it requires binary mode #49669
Comments
There needs to be something somewhere in the documentation that makes
I am not sure what documentation should be changed, but I do think
I wanted to read from a web page, make some string replacements, and

with open('url.html', 'w') as fil:
    fil.write(urllib.request.open(aURL).read()).replace(str1, str2)

The first thing that happened was an error telling me that I can't write
So I converted it to a string, but that put a b' at the beginning of the
Instead of str(thebytes) I did the proper thing: thebytes.decode(), and
But then I found that Non-ASCII characters created problems -- they were
So I tried decoding using different codecs but couldn't find one that
Finally I realized that the whole thing was a delusion: obviously
I went back to the relevant documentation multiple times, including
I apologize in advance if the requested documentation exists and I
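For readers who hit the same error, here is a minimal sketch of the decode-then-write flow the report is circling around. It is not taken from the issue: the helper name replace_in_page, the sample bytes, and the UTF-8 default are assumptions for illustration, and a literal bytes value stands in for urllib.request.urlopen(aURL).read() so the snippet runs without network access.

```python
import os
import tempfile


def replace_in_page(raw: bytes, old: str, new: str, encoding: str = "utf-8") -> str:
    """Decode bytes as returned by read(), then do the replacement on text.

    The encoding default is an assumption; a real page may use another one.
    """
    text = raw.decode(encoding)  # bytes -> str, no b'...' prefix in the result
    return text.replace(old, new)


# Stand-in for urllib.request.urlopen(aURL).read(), which returns bytes.
raw = "<p>caf\u00e9 au lait</p>".encode("utf-8")
page = replace_in_page(raw, "lait", "chocolat")

# Text mode ('w') accepts str; passing the undecoded bytes would raise TypeError.
# An explicit encoding keeps non-ASCII characters intact on every platform.
path = os.path.join(tempfile.mkdtemp(), "url.html")
with open(path, "w", encoding="utf-8") as fil:
    fil.write(page)
```

The alternative is to skip decoding entirely, open the file with mode 'wb', and do the replacement with bytes arguments, which avoids having to know the page's encoding.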
I got struck by the same feature. In addition, the examples in the docs are currently wrong: at http://docs.python.org/dev/py3k/library/urllib.request.html#examples the output of f.read() is shown as a string instead of bytes. There I propose the change from

>>> import urllib.request
>>> f = urllib.request.urlopen('http://www.python.org/')
>>> print(f.read(100))
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<?xml-stylesheet href="./css/ht2html

to

>>> import urllib.request
>>> f = urllib.request.urlopen('http://www.python.org/')
>>> print(f.read(100).decode('utf-8'))
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtm

The other examples need to be corrected in a similar way. In the documentation of urllib.request.urlopen I propose to add a sentence (after the paragraph "This function returns a file-like object...") explaining that reading the object returns bytes that need to be decoded to a string.
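The proposed example hard-codes 'utf-8', which only works when the server actually sends UTF-8. As a hedged sketch (not part of the proposed doc change), the charset can instead be taken from the response's Content-Type header when one is declared. The helper name read_text and the fallback value are assumptions; a data: URL is used here only so the snippet runs offline.

```python
import urllib.request


def read_text(url: str, fallback: str = "utf-8") -> str:
    """Read a URL and decode it with the charset from its Content-Type header.

    Falls back to `fallback` (an assumption) when no charset is declared.
    """
    with urllib.request.urlopen(url) as f:
        raw = f.read()  # always bytes
        charset = f.headers.get_content_charset() or fallback
    return raw.decode(charset)


# A data: URL stands in for a real web page; %C3%A9 is UTF-8 for 'é'.
text = read_text("data:text/plain;charset=utf-8,caf%C3%A9")
print(text)
```

HTML pages may also declare their encoding in a meta tag rather than in the header; this sketch does not handle that case.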
Yeah, there is an example in the tutorial that was changed recently along the lines suggested. (http://docs.python.org/dev/py3k/tutorial/stdlib.html#internet-access)
Fixed in revision 80092 and merged into release31-maint in revision 80093. I am marking this as fixed and closed. If there are any similar issues at other places, we will address them as separate bugs.