Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in

#27959: Add 'oem' encoding

Can't Edit
Can't Publish+Mail
Start Review
2 years, 8 months ago by steve.dower
2 years, 8 months ago
pmoore, haypo, tim.golden, devnull_psf.upfronthosting.co.za, Zach Ware, steve.dower

Patch Set 1 #

Total comments: 4

Patch Set 2 #

Total comments: 8
Unified diffs Side-by-side diffs Delta from patch set Stats Patch
Include/unicodeobject.h View 1 1 chunk +1 line, -1 line 0 comments Download
Lib/encodings/aliases.py View 1 1 chunk +1 line, -0 lines 0 comments Download
Lib/encodings/__init__.py View 1 2 chunks +10 lines, -0 lines 2 comments Download
Lib/site.py View 1 2 chunks +0 lines, -16 lines 0 comments Download
Lib/test/test_codecs.py View 1 9 chunks +31 lines, -33 lines 4 comments Download
Modules/clinic/_codecsmodule.c.h View 1 4 chunks +80 lines, -1 line 0 comments Download
Modules/_codecsmodule.c View 1 3 chunks +36 lines, -0 lines 0 comments Download
Objects/unicodeobject.c View 1 1 chunk +6 lines, -0 lines 2 comments Download


Total messages: 4
Hum, it looks like Lib/encodings/oem.py is missing: copy/paste Lib/encodings/mbcs.py and replace mbcs with oem? :-) ...
2 years, 8 months ago #1
http://bugs.python.org/review/27959/diff/18349/Include/unicodeobject.h File Include/unicodeobject.h (right): http://bugs.python.org/review/27959/diff/18349/Include/unicodeobject.h#newcode1677 Include/unicodeobject.h:1677: PyAPI_FUNC(PyObject*) PyUnicode_DecodeOEM( On 2016/09/06 03:01:51, haypo wrote: > Hum, ...
2 years, 8 months ago #2
http://bugs.python.org/review/27959/diff/18363/Lib/encodings/__init__.py File Lib/encodings/__init__.py (right): http://bugs.python.org/review/27959/diff/18363/Lib/encodings/__init__.py#newcode161 Lib/encodings/__init__.py:161: return encodings.mbcs.getregentry() Again, I dislike calling getpreferredencoding() at each ...
2 years, 8 months ago #3
2 years, 8 months ago #4
File Lib/encodings/__init__.py (right):

Lib/encodings/__init__.py:161: return encodings.mbcs.getregentry()
On 2016/09/07 00:45:34, haypo wrote:
> Again, I dislike calling getpreferredencoding() at each search. It adds an
> overhead to all lookups. I prefer the current code in site.py, just move it
> here, no? (but the startswith('cp') test can be removed.)

The results are all cached, and this only does the call (which is just "cp%d" %
GetACP() ) when lookup is otherwise about to fail. The current aliasing code is
far less clean than this.

Encoding lookup is inherently a slow operation - we've already failed at least
one call to __import__ by the time we get here. Creating a string is trivial by

File Lib/test/test_codecs.py (right):

Lib/test/test_codecs.py:2654: transform_aliases["mbcs"] = ["dbcs", "ansi"]
On 2016/09/07 00:45:34, haypo wrote:
> This change doesn't make sense. mbcs is not a "transform" codec. Please revert
> this change.

Okay. How can I test that the aliases are correct and have not changed?

Lib/test/test_codecs.py:3138: _bootlocale.getpreferredencoding =
On 2016/09/07 00:45:34, haypo wrote:
> Why do you mock the ANSI code page?
> Why not directly testing the actual ANSI code page?
> ansi_code_page = locale.getpreferredencoding(False)
> codec = codecs.lookup(ansi_code_page)

I want to test that it's only returned when it matches the user's current CP,
otherwise we're changing the semantics from what was previously there. mbcs is
not a suitable alternative for all unknown codepages, only when it's our current

And many code pages are well defined (i.e. encodings.cp*), in which case we
won't be testing the fallback code against an otherwise unknown codepage.

File Objects/unicodeobject.c (right):

Objects/unicodeobject.c:7780: return PyUnicode_EncodeCodePage(CP_OEMCP, unicode,
On 2016/09/07 00:45:34, haypo wrote:
> This function is useless, please just remove it.

Sign in to reply to this message.

RSS Feeds Recent Issues | This issue
This is Rietveld 894c83f36cb7+