
Author vstinner
Recipients ezio.melotti, loewis, pitrou, python-dev, vstinner
Date 2011-11-21.13:32:32
Message-id <1321882353.25.0.546446284533.issue13441@psf.upfronthosting.co.za>
Content
> No, they should be rejected. Allowing them in some specific
> places might cause them to leak somewhere else and cause problems,
> so I'd rather stick with that range and reject all the chars
> >U+10FFFF everywhere.

That's why I added a (debug) check to reject them. I don't think that the UTF-8 encoder, for example, supports such characters. All functions assume that the maximum character is U+10FFFF.
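
To illustrate the kind of range check meant here, a minimal Python sketch follows. The names `MAX_CODE_POINT`, `is_valid_code_point` and `to_str_checked` are made up for this example and are not the actual CPython debug check, which lives in C.

```python
# Illustrative only: these helper names are hypothetical, not CPython's.
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def is_valid_code_point(value):
    """Return True if value fits in the Unicode range U+0000..U+10FFFF."""
    return 0 <= value <= MAX_CODE_POINT


def to_str_checked(values):
    """Build a str from integer code points, rejecting out-of-range values."""
    for value in values:
        if not is_valid_code_point(value):
            raise ValueError("code point out of range: %#x" % value)
    return "".join(map(chr, values))
```

For instance, `to_str_checked([0x61, 0x62])` gives `"ab"`, while a value such as `0x110000` raises `ValueError` instead of leaking into a string.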

If they should be rejected, one solution is to modify strxfrm() to return a list of integers (code points) instead of a string.
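
A rough sketch of that idea, in Python for brevity: the transformed result is kept as a list of integers, so values above U+10FFFF never have to be stored in a string, and lists still compare element by element like strings do. The function `sort_key_from_raw` and the sample values are hypothetical stand-ins for whatever the C-level transformation would return.

```python
# Hypothetical helper: keep the transformed key as integers, not as a str.
def sort_key_from_raw(raw_values):
    """Return the raw transformed values as a list usable as a sort key."""
    return list(raw_values)


# Made-up transformed values, including some above U+10FFFF.
raw = {
    "first": [0x110000, 0x1],
    "second": [0x110001],
}

# Lists of integers compare lexicographically, so they work with sorted().
ordered = sorted(raw, key=lambda name: sort_key_from_raw(raw[name]))
```

The trade-off is that callers expecting a string from strxfrm() would have to be adapted to the new return type.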