classification
Title: Codec naming scheme and aliasing support
Type: enhancement Stage:
Components: Unicode Versions:
process
Status: closed Resolution: remind
Dependencies: Superseder:
Assigned To: lemburg Nosy List: jhylton, lemburg
Priority: low Keywords:

Created on 2000-12-12 14:51 by lemburg, last changed 2007-08-23 19:27 by lemburg. This issue is now closed.

Messages (9)
msg53058 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2000-12-12 14:51
The docs should contain a note about the codec naming scheme,
the use of codec packages and how to address them in the
encoding name and some notes about the aliasing support
which is available for codecs which are found by the standard
codec search function in the encodings package.

Here's a starter (actually a posting to python-dev, but it has all
the needed details):
"""
I just wanted to inform you of a change I plan for the standard
encodings search function to enable better support for aliasing
of encoding names.

The current implementation caches the aliases returned from the
codec's .getaliases() function in the encodings lookup cache
rather than in the alias cache. As a consequence, the hyphen to
underscore mapping is not applied to the aliases. A codec would
have to return a list of all combinations of names with hyphens
and underscores in order to emulate the standard lookup 
behaviour.
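
For illustration, the hyphen to underscore mapping can be observed with
the codecs module of a current CPython. This is a minimal check under
that assumption, not a reproduction of the 2000 implementation; it only
relies on codecs.lookup() and the standard alias for Latin-1.

```python
import codecs

# Look up the same encoding with hyphens and with underscores.
hyphenated = codecs.lookup("iso-8859-1")
underscored = codecs.lookup("iso_8859_1")

# Both spellings are expected to resolve to one and the same codec.
same_codec = hyphenated.name == underscored.name
print(same_codec)
```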

I have a patch which fixes this and also ensures that aliases
cannot be overwritten by codecs which register at some later
point in time. This ensures that we won't run into situations
where a codec import suddenly overrides behaviour of previously
active codecs. [The patch was checked into CVS on 2000-12-12.]
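
The two properties described above (aliases are normalized, and the
first registration wins) might be sketched as follows. The names
normalize, add_aliases and alias_table are hypothetical and are not
taken from the encodings package; this is an illustration of the idea,
not the checked-in patch.

```python
def normalize(name):
    """Lower-case the name and map hyphens to underscores."""
    return name.lower().replace("-", "_")


def add_aliases(alias_table, codec_name, aliases):
    """Record aliases for codec_name without replacing existing entries."""
    for alias in aliases:
        # setdefault() keeps whichever codec claimed the alias first.
        alias_table.setdefault(normalize(alias), codec_name)


alias_table = {}
add_aliases(alias_table, "shift_jis", ["MS-Kanji", "sjis"])
# A codec registering later cannot take over an alias that is in use.
add_aliases(alias_table, "other_codec", ["ms_kanji"])

winner = alias_table[normalize("MS-Kanji")]
print(winner)
```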

I would also like to propose the use of a new naming scheme
for codecs which enables drop-in installation. As discussed
on the i18n-sig list, people would like to install codecs
without requiring users to call a codec registration function
or to touch site.py.

The standard search function in the encodings package has a
nice property (which I only noticed after the fact ;) which
allows using Python package names in the encoding names,
e.g. you can install a package 'japanese' and then access the
codecs in that package using 'japanese.shiftjis' without
having to bother registering a new codec search function
for the package -- the encodings package search function
will redirect the lookup to the 'japanese' package.

Using package names in the encoding name has several
advantages:
* you know where the codec comes from
* you can have multiple codecs for the same encoding
* drop-in installation without registration is possible
* the need for a non-default encoding package is visible in the
  source code
* you no longer need to drop new codecs into the Python
  standard lib

"""
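
As a rough sketch of the package-based lookup described in the posting,
a search function could treat a dotted encoding name as a module path
and ask that module for its codec entry. The function name
package_search is hypothetical, and the stdlib module encodings.utf_8
stands in for a third-party package such as 'japanese', which is not
assumed to be installed. Importing modules named by untrusted input is
risky, which is the concern raised when this request was closed.

```python
import importlib


def package_search(name):
    """Hypothetical search function: resolve 'package.codec' by import."""
    if "." not in name:
        return None  # plain names are left to the normal lookup
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Codec modules in the encodings package expose getregentry().
    getregentry = getattr(module, "getregentry", None)
    return getregentry() if getregentry is not None else None


info = package_search("encodings.utf_8")
print(info.name)
print(package_search("utf_8"))
```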
msg53059 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2001-08-16 10:48
This should probably become a PEP...

I'll look into this after I'm back from vacation on the 10.09.
msg53060 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2002-03-01 22:39
Is this a bug?  Or should you just write a PEP?
msg53061 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-06-19 02:27
No real activity here in more than 2 years.  Move it to PEP 42 
if you don't want to forget.
msg53062 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-06-19 02:33
No real activity here in more than 2 years.  Move it to PEP 42 
if you don't want to forget.
msg53063 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2003-06-19 02:33
No real activity here in more than 2 years.  Move it to PEP 42 
if you don't want to forget.
msg53064 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2003-06-19 08:28
PEP 42 wouldn't help. The functionality mentioned in the
above quote is all there, so this is just about writing an
informational PEP as guideline for codec authors.

I've set the request to pending and lowered the priority.
msg53065 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2003-06-19 08:35
Changed tracker type to "Feature Request" and state to open
again.
The "pending" state doesn't seem to remain listed on the
My.SF page.
msg55190 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2007-08-23 19:27
Closing this request as the encodings package search function should not
be used to import external codecs (this poses a security risk).
History
Date                 User     Action  Args
2007-08-23 19:27:24  lemburg  set     status: open -> closed; messages: + msg55190
2000-12-12 14:51:24  lemburg  create