Issue 923697: SAX2 'property_encoding' feature not supported

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/40084

classification

Title:	SAX2 'property_encoding' feature not supported
Type:	enhancement	Stage:	needs patch
Components:	XML	Versions:	Python 3.1, Python 2.7

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	amaury.forgeotdarc, josephw
Priority:	low	Keywords:

Created on 2004-03-26 06:39 by josephw, last changed 2022-04-11 14:56 by admin.

Messages (5)
msg54124 - (view)	Author: Joseph Walton (josephw)	Date: 2004-03-26 06:39
I need to parse a byte stream as an XML document and, afterwards, access the same document as a Unicode string. I would prefer to rely on the parser's charset-determining logic, and the 'property_encoding' feature ("http://www.python.org/sax/properties/encoding") seems to offer exactly this information. However, the default Expat parser doesn't support this feature.

---
from xml.sax import make_parser, handler
from xml.sax.xmlreader import InputSource
from sys import stdin

p = make_parser()

# Should not fail. Should it return None, or UTF-8?
assert(p.getProperty(handler.property_encoding) == None)

source = InputSource()
source.setByteStream(stdin)
p.parse(source)

# Should now be the name of the actual encoding used
assert(p.getProperty(handler.property_encoding) != None)
---

This raises SAXNotRecognizedException. Is there another SAX parser I could use instead?
msg116563 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2010-09-16 15:46
The URL referenced in msg54124 gives a 404. It is also used as the property_encoding in the sax handler module. Could this be fixed in 3.2 or can this issue be closed?
msg116754 - (view)	Author: Joseph Walton (josephw)	Date: 2010-09-18 04:13
The behaviour is unchanged in Python 3.1 and the sample program still fails.
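The failure described here can be checked without reading from stdin. The following is a minimal sketch, not part of the original report; the helper name `supports_encoding_property` is hypothetical, and it only probes whether the parser returned by `make_parser()` recognizes the property at all.

```python
from xml.sax import (
    SAXNotRecognizedException,
    SAXNotSupportedException,
    handler,
    make_parser,
)


def supports_encoding_property(parser=None) -> bool:
    """Return True if the parser accepts a query for property_encoding.

    Hypothetical helper for illustration: it only asks for the property
    and reports whether the parser raised a SAX "not recognized" or
    "not supported" exception.
    """
    if parser is None:
        # Assumption: the default parser is whatever make_parser() picks,
        # which is normally the Expat-based reader.
        parser = make_parser()
    try:
        parser.getProperty(handler.property_encoding)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        return False
    return True


if __name__ == "__main__":
    print("property_encoding supported:", supports_encoding_property())
```

On an interpreter where the behaviour reported in this issue still holds, the helper would be expected to return False for the default parser.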
msg190048 - (view)	Author: Mark Lawrence (BreamoreBoy) *	Date: 2013-05-26 02:29
This is still an issue, so what URL should the sax handler module's property_encoding attribute be set to?
msg190111 - (view)	Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) *	Date: 2013-05-26 20:38
Mark, the "http://www.python.org/sax/properties/encoding" is not meant to be a web page. It's like an attribute name, but fully qualified so that attributes given by different organizations don't clash.

(There may be different usages of "encoding": is it the one set by the user, or the one determined by the parser? According to the Python docs, here it's both.)

Python's default Expat parser doesn't support this feature, so the present behavior is correct. Proper support should not be difficult to add, with an XmlDeclHandler.
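As a rough illustration of the XmlDeclHandler idea, the sketch below uses the low-level `xml.parsers.expat` module directly rather than patching the SAX reader. The function name `declared_encoding` is hypothetical, and the sketch only reports the encoding named in the XML declaration; it does not cover encodings that Expat infers from a byte order mark or that a caller sets explicitly, so it is not a full implementation of the property.

```python
from typing import Optional
from xml.parsers import expat


def declared_encoding(data: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, if any.

    Hypothetical helper: returns None when the document has no XML
    declaration or when the declaration omits the encoding attribute.
    """
    found = {}

    def on_xml_decl(version, encoding, standalone):
        # Expat calls this once, when it parses the <?xml ...?> declaration.
        found["encoding"] = encoding

    parser = expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    # Second argument True marks this as the final chunk of input.
    parser.Parse(data, True)
    return found.get("encoding")


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>caf\xe9</doc>'
    print(declared_encoding(doc))
```

A SAX-level fix would presumably install a similar handler inside the Expat reader and return the recorded value from getProperty; how that should interact with an encoding supplied by the user is left open by the discussion above.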

History
Date	User	Action	Args
2022-04-11 14:56:03	admin	set	github: 40084
2014-02-03 17:10:57	BreamoreBoy	set	nosy: - BreamoreBoy
2013-05-26 20:38:14	amaury.forgeotdarc	set	nosy: + amaury.forgeotdarc messages: + msg190111 stage: test needed -> needs patch
2013-05-26 02:29:14	BreamoreBoy	set	messages: + msg190048
2010-09-18 04:13:26	josephw	set	messages: + msg116754 versions: + Python 3.1
2010-09-16 15:46:11	BreamoreBoy	set	nosy: + BreamoreBoy messages: + msg116563
2009-02-14 11:34:04	ajaksu2	set	stage: test needed components: + XML, - None versions: + Python 2.7
2004-03-26 06:39:58	josephw	create