classification
Title: calendar formatyearpage returns bytes, not str
Type:
Stage:
Components: Extension Modules
Versions: Python 3.0
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To:
Nosy List: georg.brandl, ggenellina, mnewman
Priority: normal
Keywords:

Created on 2009-01-17 17:56 by mnewman, last changed 2010-08-21 23:38 by georg.brandl. This issue is now closed.

Messages (4)
msg80030 - Author: Michael Newman (mnewman) Date: 2009-01-17 17:56
formatyearpage is returning "bytes", not "str"

Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import calendar
>>> calendar.HTMLCalendar().formatyearpage(2009)[0:50]
b'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE h'
>>> type(calendar.HTMLCalendar().formatyearpage(2009)[0:50])
<class 'bytes'>

# For the time being, to fix it I can use "decode"...
>>> calendar.HTMLCalendar().formatyearpage(2009).decode("utf-8")[0:50]
'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE h'
>>> type(calendar.HTMLCalendar().formatyearpage(2009).decode("utf-8")[0:50])
<class 'str'>
msg80081 - Author: Gabriel Genellina (ggenellina) Date: 2009-01-18 09:34
This is the expected behavior; that's why the function takes an
"encoding" argument. Because it returns a complete XML document, the
result must already be encoded. Other methods return only pieces of a
document, so str is fine for them. This should probably be explained
better in the documentation.
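
To illustrate the distinction, here is a minimal sketch for Python 3. The variable names (cal, page, fragment, text) are only illustrative; formatyearpage and formatyear are the existing HTMLCalendar methods.

```python
import calendar

cal = calendar.HTMLCalendar()

# formatyearpage builds a complete XML document and returns it encoded.
page = cal.formatyearpage(2009, encoding="utf-8")

# formatyear returns only the table for the year, as text.
fragment = cal.formatyear(2009)

print(type(page).__name__)      # bytes
print(type(fragment).__name__)  # str

# Decoding with the same encoding recovers the text of the whole page,
# which contains the fragment unchanged.
text = page.decode("utf-8")
print(fragment in text)
```

The decode step should use the same encoding that was passed to formatyearpage, since that is also the encoding named in the XML declaration.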
msg80107 - Author: Michael Newman (mnewman) Date: 2009-01-18 15:51
It seems to be working consistently (see UTF-16 extreme example below),
but I had expected it to act similarly to Python 2.6, which it does not.
I suppose this is due to the distinction now made between strings and
bytes in Python 3.0.

I was initially concerned that Python 3.0 was always just giving an
ASCII byte stream no matter what encoding was chosen (since you can't
tell the difference between ASCII and UTF-8 for the characters being
used in the example), but the UTF-16 example shows it's fine.

I agree that as long as the documentation in Python 3.X notes it will
return "bytes", then it's fine. Thanks for the clarification/confirmation.

Python 3.0 (r30:67507, Dec  3 2008, 20:14:27) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import calendar
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="utf-8")[0:50]
b'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE h'
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="ascii")[0:50]
b'<?xml version="1.0" encoding="ascii"?>\n<!DOCTYPE h'
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="utf-16")[0:50]
b'\xff\xfe<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"\x00 \x00e\x00n\x00c\x00o\x00'

Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import calendar
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="utf-8")[0:50]
'<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE h'
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="ascii")[0:50]
'<?xml version="1.0" encoding="ascii"?>\n<!DOCTYPE h'
>>> calendar.HTMLCalendar().formatyearpage(2009, encoding="utf-16")[0:50]
'\xff\xfe<\x00?\x00x\x00m\x00l\x00 \x00v\x00e\x00r\x00s\x00i\x00o\x00n\x00=\x00"\x001\x00.\x000\x00"\x00 \x00e\x00n\x00c\x00o\x00'
msg114608 - Author: Georg Brandl (georg.brandl) (Python committer) Date: 2010-08-21 23:38
Closing, this is working as expected.
History
Date User Action Args
2010-08-21 23:38:30 georg.brandl set status: open -> closed; nosy: + georg.brandl; messages: + msg114608; resolution: works for me
2009-01-18 15:52:01 mnewman set messages: + msg80107
2009-01-18 09:34:41 ggenellina set nosy: + ggenellina; messages: + msg80081
2009-01-17 17:56:54 mnewman create