Issue 22549: bug in accessing bytes, inconsistent with normal strings and python 2.7

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/66739

classification

Title:	bug in accessing bytes, inconsistent with normal strings and python 2.7
Type:	behavior	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.4, Python 2.7

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	kevinbhendricks, r.david.murray
Priority:	normal	Keywords:

Created on 2014-10-03 17:36 by kevinbhendricks, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 241	merged	marco.buttu, 2017-02-22 21:34

Messages (3)
msg228348	Author: Kevin Hendricks (kevinbhendricks)	Date: 2014-10-03 17:36
Hi,

I am working on porting my ebook code from Python 2.7 to work with both Python 2.7 and Python 3.4 and have found the following inconsistency I think is a bug ...

KevinsiMac:~ kbhend$ python3
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> o = '123456789'
>>> o[-3]
'7'
>>> type(o[-3])
<class 'str'>
>>> type(o)
<class 'str'>

the above is what I expected but under python 3 for bytes you get the following instead:

>>> o = b'123456789'
>>> o[-3]
55
>>> type(o[-3])
<class 'int'>
>>> type(o)
<class 'bytes'>

When I compare this to Python 2.7 for both bytestrings and unicode I see the expected behaviour.

Python 2.7.7 (v2.7.7:f89216059edf, May 31 2014, 12:53:48)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> o = '123456789'
>>> o[-3]
'7'
>>> type(o[-3])
<type 'str'>
>>> type(o)
<type 'str'>
>>> o = u'123456789'
>>> o[-3]
u'7'
>>> type(o[-3])
<type 'unicode'>
>>> type(o)
<type 'unicode'>

I would consider this a bug as it makes it much harder to write python code that works on both python 2.7 and python 3.4
msg228363	Author: R. David Murray (r.david.murray) *	Date: 2014-10-03 19:06
Agreed, but that is a design decision that was taken long ago (regretted by more than a few but defended by others). You can find a number of discussions of this by searching the python-dev archives, including some more recent discussions on possibilities for lessening the pain, but I don't remember if any of those turned into real proposals. For now, you can find some helpers in six, or you can write your code using slice notation (b'abc'[1:2] == b'b').
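The slice workaround mentioned above could be wrapped up roughly as follows. This is only a sketch: `byte_at` is a hypothetical helper name, not something from six or the standard library. It assumes the goal is to get a length-1 bytes object back for any valid index, including negative ones, on both Python 2.7 and Python 3.

```python
def byte_at(data, index):
    """Return the byte at `index` as a length-1 bytes object.

    Hypothetical helper (not part of six or the stdlib). It relies on
    slicing, which yields bytes on both Python 2.7 and Python 3,
    whereas plain indexing yields an int on Python 3.
    """
    if index < 0:
        # Normalize negative indexes so that data[index:index + 1]
        # does not collapse to an empty slice for index == -1.
        index += len(data)
    if not 0 <= index < len(data):
        raise IndexError("index out of range")
    return data[index:index + 1]


o = b'123456789'
print(byte_at(o, -3))       # a length-1 bytes object
print(ord(byte_at(o, -3)))  # the integer value of that byte
```

Note that a bare `data[-1:0]` would be empty, which is why the helper normalizes negative indexes before slicing.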
msg228385	Author: Kevin Hendricks (kevinbhendricks)	Date: 2014-10-03 21:20
Thanks for letting me know this was expected behaviour.

I see the same "issue" holds true while using:

for c in b'0123456789':
    print(ord(c))

I ended up using slices nearly everyplace. Still ran into iterator issues. Horrible hack really. I think I will spend some time reading the python dev archives to figure out how anyone could defend this approach.

FWIW, introducing a bytes class that works exactly like byte (non-unicode strings) in python 2.X but disallowing any automatic up-conversion to full unicode (like during concatenation), would have been a useful step.

I work on decoding binary formatted ebook files all of the time, and python 3's second class treatment of bytes makes no sense to me. Perfectly valid code can be written using only utf-8 and latin-1 encoded bytestrings with no need to upconvert to anything. It is practically impossible to support code like that in Python 3. Boggles the mind.

Thanks again for the fast response.

Kevin
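For the iteration case, one possible approach in the same spirit as the slicing workaround is a small generator that yields length-1 bytes objects instead of ints. `iter_single_bytes` is a hypothetical name used here for illustration. The sketch assumes the calling code wants to keep using `ord(c)` unchanged on both Python 2.7 and Python 3.

```python
def iter_single_bytes(data):
    """Yield each byte of `data` as a length-1 bytes object.

    Hypothetical helper: iterating a bytes object directly yields ints
    on Python 3, while slicing yields bytes on both 2.7 and 3.
    """
    for i in range(len(data)):
        yield data[i:i + 1]


# ord() accepts a length-1 bytes object, so this loop body does not
# need to change between Python 2.7 and Python 3.
for c in iter_single_bytes(b'0123456789'):
    print(ord(c))
```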

History
Date	User	Action	Args
2022-04-11 14:58:08	admin	set	github: 66739
2017-02-22 21:34:27	marco.buttu	set	pull_requests: + pull_request203
2014-10-03 21:20:15	kevinbhendricks	set	messages: + msg228385
2014-10-03 19:06:21	r.david.murray	set	status: open -> closed nosy: + r.david.murray messages: + msg228363 resolution: not a bug stage: resolved
2014-10-03 17:36:30	kevinbhendricks	create