classification
Title: msilib.Directory.make_short only handles file names with a single dot in them
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.0, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: loewis Nosy List: BreamoreBoy, amaury.forgeotdarc, atuining, brian.curtin, cgohlke, hnrqbaggio, janssen, loewis, markm, tim.golden
Priority: normal Keywords: patch

Created on 2007-09-07 15:38 by atuining, last changed 2011-03-27 08:20 by loewis. This issue is now closed.

Files
File name Uploaded Description
msilib.__init__.patch atuining, 2007-09-07 15:38
msilib.__init__.patch hnrqbaggio, 2009-05-03 00:55
msilib-2.patch amaury.forgeotdarc, 2009-05-04 16:46
msilib.make_short.patch markm, 2011-03-27 04:13 Patch for Make Short
Messages (13)
msg55736 - Author: Anthony Tuininga (atuining) * Date: 2007-09-07 15:38
Attached is a patch that fixes the handling of file names with 0 or 2 or
more dots in them.
msg86990 - Author: Henrique Baggio (hnrqbaggio) Date: 2009-05-03 00:34
Sorry, I don't know how to create a patch, but just change the line

parts = file.split(".") to parts = os.path.splitext(file)

and the problem is fixed.
msg86991 - Author: Henrique Baggio (hnrqbaggio) Date: 2009-05-03 00:55
I created a patch using the os.path.splitext function.
msg87138 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2009-05-04 16:46
The current patch is not correct, because os.path.splitext returns the
extension with the leading dot.
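
For readers following along, here is a small illustration of the difference between the two calls discussed here. It only uses str.split and os.path.splitext from the standard library; the variable names are mine and are not taken from any of the attached patches.

```python
import os.path

name = "foo.2.txt"

# str.split breaks at every dot, so a name with two dots gives three parts.
parts = name.split(".")

# os.path.splitext always returns exactly two strings, and the extension
# keeps its leading dot (or is empty when there is no extension).
root, ext = os.path.splitext(name)
no_ext_root, no_ext = os.path.splitext("README")

# Dropping the leading dot by hand gives the bare extension.
bare_ext = ext[1:]

print(parts)                 # ['foo', '2', 'txt']
print((root, ext))           # ('foo.2', '.txt')
print((no_ext_root, no_ext)) # ('README', '')
print(bare_ext)              # txt
```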

Here is another patch that simplifies the code (os.path.splitext is
guaranteed to return two strings).
It also adds the first unit test for msilib.

There is an unresolved issue: what is make_short('foo.2.txt') supposed
to return? FOO.2.TXT or FOO~1.TXT?
msg87145 - Author: Henrique Baggio (hnrqbaggio) Date: 2009-05-04 18:16
@Amaury,

Sorry, my mistake. I forgot that splitext returns a tuple. =/

About your question: if the file name has fewer than 8 characters, then
the function doesn't change it. Otherwise, it returns the name with 8 chars.

E.g., make_short('foo.2.txt') returns FOO.2.TXT
and make_short('foo.longer_name.txt') returns FOO.LO~1.TXT
msg104031 - Author: Bill Janssen (janssen) * (Python committer) Date: 2010-04-23 17:32
So what happens if the original file name is something like "foo~1.txt"?  Couldn't there be a name collision?
msg104035 - Author: Bill Janssen (janssen) * (Python committer) Date: 2010-04-23 18:05
Here's how Microsoft does it:  http://support.microsoft.com/kb/142982/en-us
msg104050 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2010-04-23 20:40
MSI short names must be 8.3 names. Microsoft has them in MSI in case the MSI file gets installed on an 8.3 system, or in case 8.3 applications want to access files and want to be sure they access the right one. So the actual numbering is completely irrelevant (since we only support systems with long file name support, and will always use long names to access the files). They still need to be in the 8.3 space, though, else Installer might be unhappy. So having two dots in the short name is incorrect.
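
To make the "8.3 space" constraint concrete, here is a minimal sketch of a shape check. The helper name looks_like_8_3 and the regular expression are my own assumptions and are not part of msilib. The check covers only the length and single-dot rules described above, not which characters are allowed.

```python
import re

# Hypothetical helper, not part of msilib: a base name of 1 to 8 characters,
# optionally followed by one dot and an extension of 1 to 3 characters.
_EIGHT_DOT_THREE = re.compile(r"[^.]{1,8}(\.[^.]{1,3})?")


def looks_like_8_3(name):
    """Return True if name has the 8.3 shape (lengths and at most one dot)."""
    return _EIGHT_DOT_THREE.fullmatch(name) is not None


for candidate in ("FOO.TXT", "FOO.2.TXT", "FOO2~1.TXT", "SOMELONGNAME.TXT"):
    print(candidate, looks_like_8_3(candidate))
```

Under this check FOO.2.TXT is rejected because it contains two dots, which matches the point made above.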
msg105577 - Author: Christoph Gohlke (cgohlke) Date: 2010-05-12 08:08
A slightly different patch is attached to issue7639, which generates short names more similar to Windows/NTFS:
 
http://bugs.python.org/file15898/msilib_make_short.diff

Here are some short names created with the msilib_make_short patch, which are identical to the short names created by the Windows NTFS file system:

foo.txt             ->  FOO.TXT
foo.2.txt           ->  FOO2~1.TXT
someLongName.txt    ->  SOMELO~1.TXT
someLongerName.txt  ->  SOMELO~2.TXT

For comparison, the msilib-2 patch generates these short names:

foo.txt             ->  FOO.TXT
foo.2.txt           ->  FOO.2.TXT    <- different from NTFS
someLongName.txt    ->  SOMELO~1.TXT
someLongerName.txt  ->  SOMELO~2.TXT
msg116775 - Author: Mark Lawrence (BreamoreBoy) * Date: 2010-09-18 13:24
@Brian/Tim do you have any input on this?  Also note that a similar patch exists on issue7639.
msg119374 - Author: Christoph Gohlke (cgohlke) Date: 2010-10-22 09:18
The revised patch for issue7639 now generates better short names for file names containing spaces, '+', and leading '.'.

http://bugs.python.org/file19334/msilib.diff

Test cases that could be added to MsilibTest.test_makeshort():

TEST          :  test
TES~1.T       :  t.e.s.t
TEST~1        :  .test
TESTTEST      :  testtest
TESTTE~1      :  .testtest
TESTTE~2      :  test test
TESTTE~3      :  test test test
AFILE~1.DOC   :  A file.doc
THISIS~1.TXT  :  This is a really long filename.123.456.789.txt
THISIS~1.789  :  This is a really long filename.123.456.7890
TEST__~1      :  test++
TE____~1      :  te++++++
TEST~1.__     :  test.++
TEST.123      :  test.123
TEST~1.123    :  test.1234
TEXT~1.123    :  text.1234
TESTTE~1.__   :  testtest.++
FOO.TXT       :  foo.txt
FOO2~1.TXT    :  foo.2.txt
SOMELO~1.TXT  :  someLongName.txt
SOMELO~2.TXT  :  someLongerName.txt
PY~15~1.~     :  py.~1.5.~
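
For illustration only, here is a rough sketch of a shortening scheme that reproduces several of the cases listed above. It is not the patch from issue7639. The function name make_short_sketch and the taken set (standing in for the names already used in one directory) are hypothetical. The character handling, which turns '+' into '_' and drops spaces, is inferred from the listed cases rather than taken from any patch.

```python
def make_short_sketch(name, taken):
    """Return an 8.3-style short name for name that is not yet in taken."""
    # Assumption inferred from the cases above: '+' becomes '_', spaces vanish.
    cleaned = name.replace("+", "_").replace(" ", "")
    parts = cleaned.split(".")
    if len(parts) > 1:
        # Everything before the last dot forms the base, with the dots removed.
        prefix = "".join(parts[:-1]).upper()
        suffix = parts[-1].upper()
        if not prefix:  # leading dot, as in ".test"
            prefix, suffix = suffix, ""
    else:
        prefix, suffix = cleaned.upper(), ""

    # The name is kept as-is (upper-cased) only if it already fits 8.3
    # and no character had to be altered.
    fits = (len(parts) < 3 and len(prefix) <= 8
            and len(suffix) <= 3 and cleaned == name)
    candidate = (prefix + "." + suffix if suffix else prefix) if fits else None

    if candidate is None or candidate in taken:
        prefix, suffix = prefix[:6], suffix[:3]
        n = 1
        while True:
            candidate = f"{prefix}~{n}" + (f".{suffix}" if suffix else "")
            if candidate not in taken:
                break
            n += 1
            if n in (10, 100, 1000):
                prefix = prefix[:-1]  # make room for the extra digit
    taken.add(candidate)
    return candidate


taken = set()
for long_name in ("foo.txt", "foo.2.txt", "someLongName.txt",
                  "someLongerName.txt", "test++", "A file.doc"):
    print(make_short_sketch(long_name, taken), ":", long_name)
```

Because the numeric tail depends on which names are already in the set, the same long name can map to different short names depending on the order of the calls.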
msg132288 - Author: Mark Mc Mahon (markm) * Date: 2011-03-27 04:13
I looked at the existing patches and noted that they come closer to how Windows generates short file names, but they still leave out some cases.

I believe the latest patch catches all cases.

from http://msdn.microsoft.com/en-us/library/aa368590(v=vs.85).aspx
    Short and long file names must not contain the following characters:
        slash (/) or (\)
        question mark (?)
        vertical bar (|)
        right angle bracket (>)
        left angle bracket (<)
        colon (:)
        asterisk (*)
        quotation mark (")

    In addition, short file names must not contain the following characters:
        plus sign (+)
        comma (,)
        semicolon (;)
        equals sign (=)
        left square bracket ([)
        right square bracket (])

    No space is allowed preceding the vertical bar (|) separator for the short file name/long file name syntax. Short file names may not include a space, although a long file name may. A space can exist after the separator only if the long file name of the file name begins with the space. No full-path syntax is allowed.
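
As a quick illustration of the two character lists quoted above, here is a small sketch that reports which characters of a name fall into them. The helper forbidden_chars and the two constants are hypothetical names of mine and are not part of msilib or of the attached patch.

```python
# Character sets transcribed from the MSDN excerpt quoted above.
FORBIDDEN_IN_ANY_NAME = set('/\\?|><:*"')
FORBIDDEN_IN_SHORT_NAME_ONLY = set("+,;=[]")


def forbidden_chars(name, short=True):
    """Return the sorted list of characters in name that are not allowed.

    With short=True the extra short-name restrictions apply, including
    the rule that short names may not contain a space.
    """
    banned = set(FORBIDDEN_IN_ANY_NAME)
    if short:
        banned |= FORBIDDEN_IN_SHORT_NAME_ONLY | {" "}
    return sorted(set(name) & banned)


print(forbidden_chars("test++"))                   # ['+']
print(forbidden_chars("A file.doc"))               # [' ']
print(forbidden_chars("A file.doc", short=False))  # []
```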

Though I wonder whether we really need to check for or replace the first set of characters above: none are allowed in any file name, so if they are there it is probably an error in how the function was called!

I also tested the speed of re.sub, a comprehension ("".join(c for c in ...)) and for loops, and the for loops were the fastest (for the small set of characters being replaced).
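
A comparison along these lines could be set up roughly as follows. This is only a sketch: the three function names are made up, the exact loop used in the patch is not shown in this message, and replacing every listed character with '_' is an assumption made to keep the three variants equivalent. Timings will vary by machine and Python version, so none are quoted here.

```python
import re
import timeit

SHORT_ONLY = "+,;=[]"  # the short-name-only characters from the list above
_PATTERN = re.compile("[" + re.escape(SHORT_ONLY) + "]")


def with_re(name):
    # One regular-expression substitution over the whole name.
    return _PATTERN.sub("_", name)


def with_join(name):
    # Generator expression joined back into a string.
    return "".join("_" if c in SHORT_ONLY else c for c in name)


def with_loop(name):
    # Plain for loop: one str.replace call per forbidden character.
    for c in SHORT_ONLY:
        name = name.replace(c, "_")
    return name


if __name__ == "__main__":
    sample = "te++++++.[x]"
    for fn in (with_re, with_join, with_loop):
        seconds = timeit.timeit(lambda: fn(sample), number=20_000)
        print(f"{fn.__name__}: {seconds:.4f}s")
```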

I am not patching make_id() - because I have added a patch for that to issue2694.

Note - The attached patch will probably not apply cleanly - as it pre-supposes that the patch (http://bugs.python.org/file21408/make_id_fix_and_test.patch) from issue2694 is applied first (especially for the tests)
msg132294 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2011-03-27 08:20
This is now fixed with Christoph Gohlke's patch in issue 7639. If anything remains to be done, please submit a new issue (rather than posting to this one).
History
Date                 User                Action  Args
2011-03-27 08:20:15  loewis              set     status: open -> closed; resolution: fixed; messages: + msg132294
2011-03-27 04:13:24  markm               set     files: + msilib.make_short.patch; nosy: + markm; messages: + msg132288
2010-10-22 09:19:00  cgohlke             set     messages: + msg119374
2010-09-18 13:24:56  BreamoreBoy         set     nosy: + tim.golden, brian.curtin, BreamoreBoy; messages: + msg116775
2010-05-12 08:08:32  cgohlke             set     nosy: + cgohlke; messages: + msg105577
2010-04-23 20:40:49  loewis              set     messages: + msg104050
2010-04-23 18:05:22  janssen             set     messages: + msg104035
2010-04-23 17:32:32  janssen             set     nosy: + janssen; messages: + msg104031
2009-05-04 18:16:24  hnrqbaggio          set     messages: + msg87145
2009-05-04 16:46:40  amaury.forgeotdarc  set     files: + msilib-2.patch; nosy: + amaury.forgeotdarc; messages: + msg87138
2009-05-03 00:55:50  hnrqbaggio          set     files: + msilib.__init__.patch; messages: + msg86991
2009-05-03 00:34:43  hnrqbaggio          set     nosy: + hnrqbaggio; messages: + msg86990
2009-04-07 04:04:46  ajaksu2             set     stage: test needed; type: behavior; versions: + Python 2.6, Python 3.0, - Python 2.5
2007-09-17 08:40:51  jafo                set     priority: normal
2007-09-07 17:06:08  loewis              set     keywords: + patch; assignee: loewis; nosy: + loewis
2007-09-07 15:38:40  atuining            create