Author wombat
Recipients Santiago.Romero, belopolsky, benjamin.peterson, cgwalters, dexen, doughellmann, eric.araujo, ezio.melotti, fperez, loewis, mark.dickinson, mcepl, nwerneck, orsenthil, r.david.murray, rhettinger, vstinner, wombat
Date 2011-09-15.19:52:00
Message-id <1316116321.43.0.883789811618.issue1170@psf.upfronthosting.co.za>
In-reply-to
Content
> That can be done programmatically using the unicodedata module.
> The regex module (that will hopefully be included in 3.3) is
> also able to match characters that belong to specific categories.

Ezio: Thanks. (This is new to me, actually.) Is this what you mean?
http://www.regular-expressions.info/unicode.html
For the purposes of patching shlex, should we use a regex instead of sets of characters (or strings) to test for membership in shlex.wordterminators? (Or should we create a different class member? Unfortunately, I guess shlex.wordchars has to remain some kind of container object to maintain backwards compatibility.)
Something like that would solve the problem nicely.
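
For illustration, here is a rough sketch of what a category-based membership test could look like using only the unicodedata module. It is not taken from the patch: the names WORD_CATEGORIES, is_word_char and split_words are hypothetical, the choice of categories (letters, marks, numbers, plus "_") is an assumption, and it assumes Python 3 strings. It only shows the idea of replacing a fixed character container with a test on the Unicode general category.

```python
import unicodedata

# Hypothetical choice: treat letters (L*), marks (M*) and numbers (N*)
# as word characters. This is an assumption, not shlex's actual rule.
WORD_CATEGORIES = ("L", "M", "N")


def is_word_char(ch, extra="_"):
    """Return True if the single character ch should count as part of a word."""
    # unicodedata.category() returns a two-letter code such as "Ll" or "Po";
    # the first letter is the major category.
    return ch in extra or unicodedata.category(ch)[0] in WORD_CATEGORIES


def split_words(text):
    """Group runs of word characters; any other character ends the current word."""
    words, current = [], []
    for ch in text:
        if is_word_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    print(split_words("naïve café, 東京 foo_bar"))
```

A predicate like this cannot be dropped into shlex.wordchars directly, since that attribute is expected to be a container, which is why a separate class member might be needed.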

> Andrew: Thanks for your contribution, but your patch cannot 
> go into 2.7, as we don’t add new features in stable versions

Eric: That's fine. I posted here only because this page is currently the top hit when searching for "shlex unicode". If you think it's appropriate to repost my message against Python 3.4, let me know. The issue I raised with shlex.wordchars applies to any version of Python. I'm not sure my solution is optimal. (I like the regex idea.)