Message 277207 - Python tracker

Author	eryksun
Recipients	AndreyTomsk, eryksun, ezio.melotti, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
Date	2016-09-22.08:29:10
Message-id	<1474532950.35.0.255188406961.issue28246@psf.upfronthosting.co.za>
In-reply-to

Content
The default encoding on your system is Windows code page 1251, which is what built-in open() uses in text mode when no encoding is given. However, your file is encoded as UTF-8:

    >>> lines = open('ResourceStrings.rc', 'rb').read().splitlines()
    >>> print(*lines, sep='\n')
    b'\xef\xbb\xbf\xd0\x90 (cyrillic A)'
    b'\xd0\x98 (cyrillic I) <<< line read fails'
    b'\xd0\x91 (cyrillic B)'
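
As a rough sketch of why only the second line fails, the bytes shown above can be decoded line by line with the cp1251 codec. The helper name try_decode is made up for this illustration. Byte 0x98 has no character assigned in code page 1251, so the line containing it raises an error, while the other lines decode without error but to the wrong characters:

```python
# Raw bytes of the three lines, copied from the session above.
raw_lines = [
    b'\xef\xbb\xbf\xd0\x90 (cyrillic A)',
    b'\xd0\x98 (cyrillic I) <<< line read fails',
    b'\xd0\x91 (cyrillic B)',
]


def try_decode(data, encoding):
    """Hypothetical helper: return the decoded text, or the error raised."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


# Decode each line the way the default (cp1251) text mode would.
results = [try_decode(line, 'cp1251') for line in raw_lines]
for raw, result in zip(raw_lines, results):
    print(ascii(raw), '->', ascii(result))
```

Lines one and three come back as strings of mojibake, and line two comes back as a UnicodeDecodeError pointing at the 0x98 byte.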

It even has a UTF-8 BOM (i.e. b'\xef\xbb\xbf'). You need to pass the encoding to the built-in open():

    >>> print(open('ResourceStrings.rc', encoding='utf-8').read())
    А (cyrillic A)
    И (cyrillic I) <<< line read fails
    Б (cyrillic B)
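
One detail worth noting: with encoding='utf-8' the BOM is decoded as the character U+FEFF and stays at the start of the text. If that matters, the standard 'utf-8-sig' codec strips it. The following is a minimal sketch that recreates the file in a temporary directory; the '\n' line endings and the shortened second line are assumptions made for this example, not details of the original file:

```python
import os
import tempfile

# File content rebuilt from the bytes shown earlier (line endings assumed).
data = (b'\xef\xbb\xbf\xd0\x90 (cyrillic A)\n'
        b'\xd0\x98 (cyrillic I)\n'
        b'\xd0\x91 (cyrillic B)\n')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'ResourceStrings.rc')
    with open(path, 'wb') as f:
        f.write(data)

    # Plain UTF-8 keeps the BOM as a leading U+FEFF character.
    with open(path, encoding='utf-8') as f:
        text_utf8 = f.read()

    # 'utf-8-sig' decodes the same bytes but drops a leading BOM.
    with open(path, encoding='utf-8-sig') as f:
        text_sig = f.read()

print(ascii(text_utf8[:2]))
print(ascii(text_sig[:2]))
```

Either encoding reads all three lines. The only difference is whether the first string starts with the BOM character.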

History
Date	User	Action	Args
2016-09-22 08:29:10	eryksun	set	recipients: + eryksun, paul.moore, vstinner, tim.golden, ezio.melotti, zach.ware, steve.dower, AndreyTomsk
2016-09-22 08:29:10	eryksun	set	messageid: <1474532950.35.0.255188406961.issue28246@psf.upfronthosting.co.za>
2016-09-22 08:29:10	eryksun	link	issue28246 messages
2016-09-22 08:29:10	eryksun	create