
classification
Title: Unicode fix for test in tkFileDialog.py
Components: Tkinter
process
Status: closed
Resolution: accepted
Assigned To: loewis
Nosy List: ber, loewis
Priority: normal
Keywords: patch

Created on 2002-04-04 18:59 by ber, last changed 2022-04-10 16:05 by admin. This issue is now closed.

Files
File name Uploaded Description
python_cvs20020404_tkFileDialog_umlaut.patch ber, 2002-04-04 18:59 Patch for python cvs 20020404
Messages (4)
msg39464 - (view) Author: Bernhard Reiter (ber) (Python committer) Date: 2002-04-04 18:59
Patch is against current CVS from 20020404.
It also gives pointers to the problem described in
http://mail.python.org/pipermail/python-list/2001-June/048787.html



Python's open() uses Py_FileSystemDefaultEncoding, which is
NULL (bltinmodule.c) on most systems. Calling setlocale() sets it.
We therefore fixed the example to set the locale to the user
defaults. Now "enc" holds a useful encoding, so the example works
with non-ASCII characters in the filename, e.g. with umlauts in it.
It bombed on them before.

Traceback (most recent call last):
  File "tkFileDialog.py", line 105, in ?
    print "open", askopenfilename(filetypes=[("all filez", "*")])
UnicodeError: ASCII encoding error: ordinal not in range(128)
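
The traceback comes from implicitly encoding a non-ASCII filename with the ASCII codec. The following is a minimal sketch of the same failure, written for present-day Python 3 rather than the Python 2 of this report. The filename and the helper name encode_ascii are made up for illustration.

```python
# Hypothetical filename containing umlauts, as a file dialog might return it.
filename = "gr\u00fc\u00dfe.txt"

def encode_ascii(name):
    """Try to encode a filename as ASCII; return (ok, bytes_or_error)."""
    try:
        return True, name.encode("ascii")
    except UnicodeError as exc:  # UnicodeEncodeError is a subclass
        return False, exc

ok, result = encode_ascii(filename)
print(ok)  # False: the umlauts have no ASCII representation
```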

open() now works with the string directly.
encode(enc) is only needed for terminal output,
so we enhanced the example to show the two uses of
the returned filename string separately.
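
A rough sketch of the two uses, in Python 3 and not taken from the attached patch: the locale is set to the user defaults, an encoding is derived from it, and that encoding is applied only for terminal output. The helper names, the sample filename, the UTF-8 fallback and the "replace" error handler are assumptions added here for illustration.

```python
import locale

def user_encoding(default="utf-8"):
    """Adopt the user's default locale and return its encoding (sketch)."""
    try:
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        return default  # the locale named in the environment is unsupported
    return locale.getpreferredencoding(False) or default

def for_terminal(filename, enc):
    """Encode a filename for terminal output only.

    "replace" is an assumption of this sketch: unencodable characters
    become "?" instead of raising UnicodeError.
    """
    return filename.encode(enc, "replace")

enc = user_encoding()
name = "gr\u00fc\u00dfe.txt"  # hypothetical result of askopenfilename()
# Use 1: pass `name` unchanged to open().
# Use 2: encode only when writing bytes to the terminal.
print(for_terminal(name, enc))
```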

(It might be interesting to drop a note about this in
the right part of the user documentation.)

If you comment out the setlocale() call you can see that
open() fails, which illustrates what seems to be a design
flaw in Tk. Tk should be able to give you a string in exactly
the encoding in which the filesystem gave it to Tk.


4.4.2002
Bernhard <bernhard@intevation.de>
Bernhard <bh@intevation.de>
msg39465 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-04-04 20:16
I think this patch is not acceptable. If the application
wants to support non-ASCII file names, it must invoke
setlocale(); it is not the library's responsibility to make
this decision behind the application's back.

People question the validity of using CODESET in the file
system, so each developer needs to make a conscious decision.
BTW, how does Tcl come up with the names in the first place?
msg39466 - (view) Author: Bernhard Reiter (ber) (Python committer) Date: 2002-04-07 18:20
I agree with your analysis that the application has
to set the locale if it wants to support non-ASCII filenames.

This is why we fixed the _test_ code to demonstrate exactly
this. The code of the modules itself is untouched.
If you do not fix the _test_ code it will bomb on non-ascii
file names.

Our code also demonstrates that there might be a difference
in the file system encoding (suitable for open) and the
terminal encoding (suitable for printing).
msg39467 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2002-04-08 15:00
Sorry, I misinterpreted your patch at first.

I agree with your distinction of a file system encoding, and
a terminal encoding; I still hope to enhance Python to
expose an estimate of both - then leaving it to the
application to make use of either as appropriate (the file
system encoding would be used implicitly as is done today).
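
For readers on current Python 3, both estimates can be queried separately: sys.getfilesystemencoding() for file names and sys.stdout.encoding for the terminal. The sketch below only reports them. The helper name and the ASCII fallback for a stream without an encoding are assumptions of this example.

```python
import sys

def encoding_estimates():
    """Return Python's estimates of the file system and terminal encodings."""
    fs_enc = sys.getfilesystemencoding()
    # stdout may be replaced by an object without a usable .encoding
    term_enc = getattr(sys.stdout, "encoding", None) or "ascii"
    return {"filesystem": fs_enc, "terminal": term_enc}

estimates = encoding_estimates()
print("file system:", estimates["filesystem"])
print("terminal:   ", estimates["terminal"])
```

The two values can differ, which is why the file name is passed to open() as is and encoded only when it is printed.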

As for the flaw in Tk: it turns out that Tcl has a different
notion of the default encoding than Python - Tcl always uses
a locale-aware default encoding, whereas Python has a
system-wide fixed default encoding (usually ASCII). 

It is a good thing that Tkinter manages to represent file
names correctly (i.e. as Unicode strings) in most cases - if
you want to get the file name in the encoding in which the
file system gave it to you, you need to establish the value
of Tcl's "encoding system" command.
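
One way to read that value from Python, sketched here for Python 3 where the module is named tkinter: create a plain Tcl interpreter and evaluate the command. The helper name is made up, and returning None when Tcl is unavailable is an assumption of this sketch.

```python
try:
    import tkinter
except ImportError:  # Python built without Tcl/Tk support
    tkinter = None

def tcl_system_encoding():
    """Return the result of Tcl's "encoding system", or None if unavailable."""
    if tkinter is None:
        return None
    try:
        # tkinter.Tcl() creates an interpreter without opening a Tk window.
        return tkinter.Tcl().eval("encoding system")
    except tkinter.TclError:
        return None

print(tcl_system_encoding())
```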

Committed as tkFileDialog.py 1.7.
History
Date User Action Args
2022-04-10 16:05:11 admin set github: 36382
2002-04-04 18:59:22 ber create