
Author loewis
Recipients lemburg, loewis, nathanlmiles, rsc, terry.reedy, timehorse
Date 2008-11-28.21:33:40
Message-id <1227908021.62.0.877440749758.issue1693050@psf.upfronthosting.co.za>
Content
Unicode TR#18 defines \w as a shorthand for

\p{alpha}
\p{gc=Mark}
\p{digit}
\p{gc=Connector_Punctuation}

which would include all marks. We should recursively check whether we
follow the recommendation (e.g. \p{alpha} refers to all characters having
the Alphabetic derived core property, which is Lu+Ll+Lt+Lm+Lo+Nl +
Other_Alphabetic, where Other_Alphabetic is a selected list of
additional characters, all from Mn/Mc).
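
One rough way to run such a check is sketched below. It is only an approximation, not the re module's actual implementation: the names TR18_WORD, tr18_word_approx and disagreements are made up for illustration, and since unicodedata does not expose Other_Alphabetic, the sketch assumes (as stated above) that those characters are all marks and are therefore already covered by \p{gc=Mark}.

```python
import re
import unicodedata

# General categories approximating the TR#18 definition of \w quoted above.
# Assumption: Other_Alphabetic is not available through unicodedata, so
# \p{alpha} is modelled by its general-category part only; the marks it
# would add are picked up by the MARK set instead.
ALPHA = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}   # \p{alpha} without Other_Alphabetic
MARK = {"Mn", "Mc", "Me"}                      # \p{gc=Mark}
DIGIT = {"Nd"}                                 # \p{digit}
CONNECTOR = {"Pc"}                             # \p{gc=Connector_Punctuation}
TR18_WORD = ALPHA | MARK | DIGIT | CONNECTOR

WORD_RE = re.compile(r"\w")


def tr18_word_approx(ch):
    """Return True if ch falls in the approximated TR#18 \\w set."""
    return unicodedata.category(ch) in TR18_WORD


def disagreements(chars):
    """Return the characters where re's \\w and the approximation differ."""
    return [
        ch for ch in chars
        if tr18_word_approx(ch) != bool(WORD_RE.fullmatch(ch))
    ]
```

Calling disagreements() on a sample such as "a5_ \u0301\u203f" (a letter, a digit, an underscore, a space, a combining mark, and a non-ASCII connector) lists the characters on which the two definitions differ for the Python version in use.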
History
Date User Action Args
2008-11-28 21:33:41 loewis set recipients: + loewis, lemburg, terry.reedy, nathanlmiles, rsc, timehorse
2008-11-28 21:33:41 loewis set messageid: <1227908021.62.0.877440749758.issue1693050@psf.upfronthosting.co.za>
2008-11-28 21:33:40 loewis link issue1693050 messages
2008-11-28 21:33:40 loewis create