Message79322
About pwd, we have 7 fields:
- username: the regex looks like « [a-zA-Z0-9_.@][a-zA-Z0-9_.@\/]*$? »,
so it's ASCII only
- password: ASCII only? On my Ubuntu, /etc/passwd uses "x" for all
passwords, and /etc/shadow uses an MD5 hash with a prefix
like "$1$x6vJEXyc$" (MD5 marker + salt)
- user identifier: integer (ASCII)
- main group identifier: integer (ASCII)
- GECOS: user text
- shell: filename
- home directory: filename
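As a rough illustration of these seven fields, the sketch below splits one
made-up /etc/passwd line on ":" and keeps everything as bytes. The sample
line and the helper name split_passwd_line() are hypothetical. Note that in
the file itself the home directory comes before the shell.

```python
# Hypothetical helper: split one raw /etc/passwd line into its 7 fields.
FIELD_NAMES = ("name", "passwd", "uid", "gid", "gecos", "dir", "shell")

def split_passwd_line(line: bytes) -> dict:
    parts = line.rstrip(b"\n").split(b":")
    if len(parts) != len(FIELD_NAMES):
        raise ValueError("expected 7 colon-separated fields")
    entry = dict(zip(FIELD_NAMES, parts))
    # uid and gid are ASCII integers; the other fields stay as bytes.
    entry["uid"] = int(entry["uid"])
    entry["gid"] = int(entry["gid"])
    return entry

# Made-up sample line; the GECOS field holds latin-1 encoded text.
sample = b"andre:x:1000:1000:Andr\xe9:/home/andre:/bin/sh\n"
entry = split_passwd_line(sample)
```

Nothing is decoded here, so the non-ASCII GECOS byte survives untouched
whatever the locale is.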
We can expect GECOS and filenames to be encoded in the "default system
locale" (e.g. latin-1 or UTF-8). A user is allowed to change their GECOS
field. If the user account uses a different locale and sets a non-ASCII
GECOS, decoding the string (to unicode) with the system encoding can fail.
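A minimal sketch of that failure, assuming a system whose default encoding
is UTF-8 and a user who wrote their GECOS from a latin-1 session (the name
is made up):

```python
# GECOS bytes as written by a user whose session uses latin-1.
gecos_bytes = "Andr\u00e9".encode("latin-1")  # b"Andr\xe9"

# Decoding with the (assumed) system encoding, UTF-8, is rejected.
try:
    gecos_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
```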
Your patch latin1.diff is wrong: the charset is neither always latin-1 nor
always utf-8; it depends on the system default charset. You should use
sys.getfilesystemencoding() or locale.getpreferredencoding() to get
the right encoding. If you used latin-1 as an automagic charset to carry
bytes as text, that's not the right solution: use the bytes type to get
real bytes (as you implemented with your get*b() functions).
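To make the point concrete, here is a small sketch: it queries the two
encodings mentioned above, then shows that decoding with latin-1 never
raises but silently produces wrong characters when the data is really
UTF-8. The sample name is made up.

```python
import locale
import sys

# The encoding depends on the system, not on a fixed choice.
fs_encoding = sys.getfilesystemencoding()
preferred = locale.getpreferredencoding()

# UTF-8 encoded GECOS, decoded with a hard-coded latin-1.
utf8_gecos = "Andr\u00e9".encode("utf-8")   # b"Andr\xc3\xa9"
as_latin1 = utf8_gecos.decode("latin-1")    # no error, but mojibake

# The original bytes can be recovered, so latin-1 only hides bytes in a str.
round_trip = as_latin1.encode("latin-1")
```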
The situation is similar to the bytes/unicode filename debate (see
issue #3187). I think we can assume that a correctly configured system
uses the same locale for all user accounts => use unicode. But for
compatibility with old systems mixing different locales, or new systems
with locale problems => use bytes.
The default should be unicode, but we need to be able to get all fields
as bytes. Example:
pwd.getpwnam(str) -> str fields (and integers for uid/gid)
pwd.getpwnamb(bytes) -> bytes fields (and integers for uid/gid)
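A rough sketch of what the bytes variant could return, not an
implementation of the proposal: PasswdEntry and to_bytes_entry() are
hypothetical names, the sample entry is made up, and using os.fsencode()
for the conversion is an assumption.

```python
import os
from typing import NamedTuple, Union

Text = Union[str, bytes]

class PasswdEntry(NamedTuple):
    """Hypothetical stand-in with the same field order as pwd.struct_passwd."""
    pw_name: Text
    pw_passwd: Text
    pw_uid: int
    pw_gid: int
    pw_gecos: Text
    pw_dir: Text
    pw_shell: Text

def to_bytes_entry(entry: PasswdEntry) -> PasswdEntry:
    # Encode every str field with the filesystem encoding; keep integers.
    return PasswdEntry(
        *(os.fsencode(f) if isinstance(f, str) else f for f in entry)
    )

# Made-up entry, ASCII only so the result does not depend on the locale.
str_entry = PasswdEntry("andre", "x", 1000, 1000, "Andre",
                        "/home/andre", "/bin/sh")
bytes_entry = to_bytes_entry(str_entry)
```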
We already have bytes/unicode functions using the "b" suffix:
os.getcwd()->str and os.getcwdb()->bytes.
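For comparison, the existing str/bytes pair for the current working
directory, os.getcwd() and os.getcwdb(), behaves like this:

```python
import os

cwd_text = os.getcwd()     # str
cwd_bytes = os.getcwdb()   # bytes, same path without decoding
```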
Note: The GECOS field problem was already reported in issue #3023 (by
baikie).
Date | User | Action | Args
2009-01-07 11:30:42 | vstinner | set | recipients: + vstinner, loewis, baikie
2009-01-07 11:30:41 | vstinner | set | messageid: <1231327841.3.0.949817677105.issue4859@psf.upfronthosting.co.za>
2009-01-07 11:30:40 | vstinner | link | issue4859 messages
2009-01-07 11:30:39 | vstinner | create |