Issue 17850: unicode_escape encoding fails for '\\Upsilon'


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/62050

classification

Title:	unicode_escape encoding fails for '\\Upsilon'
Type:	behavior	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 2.7

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	Edward.K..Ream, edreamleo, ezio.melotti, r.david.murray
Priority:	normal	Keywords:

Created on 2013-04-26 13:44 by Edward.K..Ream, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)
msg187852 - (view)	Author: Edward K. Ream (Edward.K..Ream)	Date: 2013-04-26 13:44
On both Windows and Linux the following fails on Python 2.7:

    s = '\\Upsilon'
    unicode(s,"unicode_escape")

    UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 0-7: end of string in escape sequence

BTW, the six.py package uses this call. If this call doesn't work, six is broken.
msg187853 - (view)	Author: Ezio Melotti (ezio.melotti) *	Date: 2013-04-26 13:47
This is not a bug: \U must be followed by exactly 8 hex digits, which together indicate a Unicode code point:

>>> '\\u0065'.decode('unicode_escape')
u'e'
>>> '\\U00000065'.decode('unicode_escape')
u'e'
>>> '\\Upsilon'.decode('unicode_escape')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'unicodeescape' codec can't decode bytes in position 0-7: end of string in escape sequence
>>> u'\Upsilon'
  File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-7: end of string in escape sequence
>>> u'\U00000065'
u'e'
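
For readers on Python 3, the same rule can be checked with a short sketch. It assumes that bytes.decode("unicode_escape") stands in for the Python 2 call unicode(s, "unicode_escape"); the helper name try_decode is hypothetical and not part of any library. The exact error text differs between Python versions, so the sketch only records that decoding failed.

```python
# Python 3 sketch (assumption: bytes.decode stands in for Python 2's
# unicode(s, "unicode_escape")). try_decode is a hypothetical helper.

def try_decode(raw: bytes) -> str:
    """Decode raw with unicode_escape; return an error marker on failure."""
    try:
        return raw.decode("unicode_escape")
    except UnicodeDecodeError as exc:
        # exc.reason holds the codec's explanation; wording varies by version.
        return "error: " + exc.reason

short_form = try_decode(b"\\u0065")      # \u followed by 4 hex digits
long_form = try_decode(b"\\U00000065")   # \U followed by 8 hex digits
bad_form = try_decode(b"\\Upsilon")      # \U followed by non-hex text

print(short_form)
print(long_form)
print(bad_form)
```

The first two calls decode to the same single character, while the third is rejected because "psilon" cannot be read as the hex digits that \U requires.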
msg187854 - (view)	Author: Edward K. Ream (Edward.K..Ream)	Date: 2013-04-26 13:51
Thanks for your quick reply. If this is not a bug, why does six define six.u as unicode(s,"unicode_escape") for all u constants??
msg187855 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2013-04-26 13:54
Because, as Ezio demonstrated, it produces the same result as using the 'u' prefix on the same string.
msg187858 - (view)	Author: Edward K Ream (edreamleo) *	Date: 2013-04-26 14:26
On Fri, Apr 26, 2013 at 8:51 AM, Edward K. Ream <report@bugs.python.org> wrote:

> If this is not a bug, why does six define six.u as
> unicode(s,"unicode_escape") for all u constants??

Oops. The following works::

    s = r'\\Upsilon'
    unicode(s,"unicode_escape")

My apologies for the noise.

Edward
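
Why the raw-string version works can be illustrated with a small Python 3 sketch, again assuming bytes.decode("unicode_escape") as the counterpart of the Python 2 call. With a raw literal, both backslashes reach the codec, which reads them as an escaped backslash and never sees a \U escape at all. The variable names are illustrative only.

```python
# Python 3 sketch (assumption: bytes.decode mirrors Python 2's
# unicode(s, "unicode_escape")). Variable names are illustrative.

# Raw bytes literal: two backslash characters followed by "Upsilon".
escaped = rb"\\Upsilon"

# The codec reads the pair of backslashes as one literal backslash,
# so the "U" that follows is ordinary text rather than the start of \U.
decoded = escaped.decode("unicode_escape")

print(len(escaped))   # input length, counting both backslashes
print(len(decoded))   # output length, with a single backslash
print(decoded)
```

The decoded text is one character shorter than the input because the doubled backslash collapses to a single one.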

History
Date	User	Action	Args
2022-04-11 14:57:44	admin	set	github: 62050
2013-04-26 14:26:16	edreamleo	set	nosy: + edreamleo messages: + msg187858
2013-04-26 13:54:43	r.david.murray	set	nosy: + r.david.murray messages: + msg187855
2013-04-26 13:51:11	Edward.K..Ream	set	messages: + msg187854
2013-04-26 13:47:57	ezio.melotti	set	status: open -> closed type: crash -> behavior nosy: + ezio.melotti messages: + msg187853 resolution: not a bug stage: resolved
2013-04-26 13:44:53	Edward.K..Ream	create