msg147378 - (view) |
Author: Nekmo (Nekmo) |
Date: 2011-11-09 22:17 |
Currently, the mapping of namespaces is global and can cause failures when multiple instances are used or when it is accessed from multiple threads. The variable is xml.etree.ElementTree._namespace_map. I ask that it be moved onto the xml.etree._Element instance.
|
msg147379 - (view) |
Author: Jesús Cea Avión (jcea) *  |
Date: 2011-11-09 22:28 |
Tagging this as targeting 3.3.
Nekmo, could you possibly post some code showing the problem?
|
msg147380 - (view) |
Author: Nekmo (Nekmo) |
Date: 2011-11-09 23:22 |
In my case, I have several clients, and they define the namespaces. I want to return the same namespace prefix that they gave me. For example, client "A" gives me this:
<house:iq xmlns:house="http://localhost/house" />
To name the namespace, I register it in the namespace map:
>>> import xml.etree.ElementTree as etree
>>> etree.register_namespace('house', 'http://localhost/house')
>>> etree._namespace_map
{'http://localhost/house': 'house',
'http://purl.org/dc/elements/1.1/': 'dc',
'http://schemas.xmlsoap.org/wsdl/': 'wsdl',
'http://www.w3.org/1999/02/22-rdf-syntax-ns#': 'rdf',
'http://www.w3.org/1999/xhtml': 'html',
'http://www.w3.org/2001/XMLSchema': 'xs',
'http://www.w3.org/2001/XMLSchema-instance': 'xsi',
'http://www.w3.org/XML/1998/namespace': 'xml'}
This preserves the namespace prefix:
>>> etree.tostring(etree.Element('{http://localhost/house}iq'))
b'<house:iq xmlns:house="http://localhost/house" />'
But if a client "B" uses a different prefix for the same namespace and both run in parallel, problems can occur:
<home:iq xmlns:home="http://localhost/house" />
>>> import xml.etree.ElementTree as etree
>>> etree.register_namespace('home', 'http://localhost/house')
>>> etree._namespace_map
{'http://localhost/house': 'home',
'http://purl.org/dc/elements/1.1/': 'dc',
'http://schemas.xmlsoap.org/wsdl/': 'wsdl',
'http://www.w3.org/1999/02/22-rdf-syntax-ns#': 'rdf',
'http://www.w3.org/1999/xhtml': 'html',
'http://www.w3.org/2001/XMLSchema': 'xs',
'http://www.w3.org/2001/XMLSchema-instance': 'xsi',
'http://www.w3.org/XML/1998/namespace': 'xml'}
Therefore, I ask that _namespace_map live on the etree._Element instance rather than being global.
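A minimal, self-contained sketch of the collision described above, using only the public register_namespace() and tostring() calls. It runs in a single thread; the point is that the second registration silently replaces the first for the whole process, so any other client serializing at that moment would pick up the wrong prefix.

```python
import xml.etree.ElementTree as ET

URI = "http://localhost/house"
elem = ET.Element("{%s}iq" % URI)

# Client "A" registers its preferred prefix and serializes.
ET.register_namespace("house", URI)
as_seen_by_a = ET.tostring(elem, encoding="unicode")

# Client "B" registers a different prefix for the same URI.
# The registry is module-level, so this replaces A's choice everywhere.
ET.register_namespace("home", URI)
as_seen_by_b = ET.tostring(elem, encoding="unicode")

print(as_seen_by_a)  # <house:iq xmlns:house="http://localhost/house" />
print(as_seen_by_b)  # <home:iq xmlns:home="http://localhost/house" />
```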
|
msg147415 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2011-11-11 00:23 |
This patch proposes an implementation of the feature.
>>> from xml.etree import ElementTree as ET
>>> ET.tostring(ET.Element('{http://localhost/house}iq'), encoding="unicode", namespaces={'http://localhost/house': 'home'})
'<home:iq xmlns:home="http://localhost/house" />'
|
msg147419 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2011-11-11 07:06 |
Florent, thanks for the notification.
Nekmo, note that you are misusing this feature. The _namespace_map is meant to provide "well known namespace prefixes" only, so that common namespaces end up using the "expected" prefix. This is also the reason why it maps namespaces to prefixes and not the other way round. It is not meant for temporarily assigning arbitrary prefixes to namespaces. That is the reason for it being a global option.
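To illustrate the "well known prefixes" behaviour, here is a short sketch. The XHTML namespace is one of the defaults listed in the map earlier in this thread; the second URI is a made-up one chosen only because nothing has registered it, so the serializer falls back to a generated prefix.

```python
import xml.etree.ElementTree as ET

# A namespace with a well-known prefix in the default registry.
known = ET.Element("{http://www.w3.org/1999/xhtml}p")
known_xml = ET.tostring(known, encoding="unicode")

# A hypothetical, unregistered namespace: the serializer invents a prefix.
unknown = ET.Element("{http://example.invalid/unregistered}p")
unknown_xml = ET.tostring(unknown, encoding="unicode")

print(known_xml)    # <html:p xmlns:html="http://www.w3.org/1999/xhtml" />
print(unknown_xml)  # <ns0:p xmlns:ns0="http://example.invalid/unregistered" />
```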
That being said, lxml.etree's Element factory takes an "nsmap" parameter that implements the feature you want. It's documented here:
http://lxml.de/tutorial.html#namespaces
Note that it maps prefixes to namespaces and not the other way round. This is because there is a corresponding "nsmap" property on Elements that provides the currently defined prefixes in the context of an Element. ElementTree itself does not (and cannot) support this property because it drops the prefixes during parsing. However, I would still request that any implementation of this parameter on the Element() factory be compatible across both libraries.
Also look for "nsmap" in the compatibility docs (appears in two sections):
http://lxml.de/compatibility.html
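The remark that ElementTree drops prefixes during parsing can be checked with a small round trip. This is only a sketch; the URI and the "g" prefix are hypothetical and assumed not to be registered with register_namespace().

```python
import xml.etree.ElementTree as ET

# Parse a document that declares its own prefix "g".
elem = ET.fromstring('<g:iq xmlns:g="http://localhost/garden" />')

# Only the namespace URI survives, folded into the tag name.
print(elem.tag)  # {http://localhost/garden}iq

# On output the original prefix is gone and a generated one is used.
round_tripped = ET.tostring(elem, encoding="unicode")
print(round_tripped)  # <ns0:iq xmlns:ns0="http://localhost/garden" />
```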
|
msg147422 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2011-11-11 08:38 |
Reading the proposed patch, I must agree that it makes more sense in ElementTree to support this as a serialiser feature. ET's tree model doesn't have a notion of prefixes, whereas it's native to lxml.etree.
Two major advantages of putting this into the serialiser are: 1) cET doesn't have to be modified, and 2) it does not require additional memory to store the nsmap reference on each Element. The latter by itself is a very valuable property, given that cET aims specifically at a low memory overhead.
I see a couple of drawbacks:
1) it only supports the case that namespaces are globally defined. The implementation cannot handle the case that local namespaces should only be defined in subtrees, or that prefixes are being reused. This is no real restriction because globally defined namespaces are usually just fine. It's more of an inconvenience in some cases, such as multi-namespace languages like SOAP or WSDL+XSD, where namespaces are commonly declared on the subtree where they start being used.
2) lxml.etree cannot support this because it keeps the prefixes in the tree nodes and uses them on serialisation. This cannot easily be overridden because the serialiser is part of libxml2.
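Drawback 1 reflects how the current ElementTree serializer already behaves: it collects every namespace used anywhere in the tree and declares them all on the root element. A small sketch with two made-up URIs:

```python
import xml.etree.ElementTree as ET

# The child uses a namespace that the root itself does not use.
root = ET.Element("{urn:example:a}root")
ET.SubElement(root, "{urn:example:b}item")

hoisted = ET.tostring(root, encoding="unicode")

# Both declarations end up on the root, not on the subtree that needs them.
print(hoisted)
# <ns0:root xmlns:ns0="urn:example:a" xmlns:ns1="urn:example:b"><ns1:item /></ns0:root>
```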
I didn't see in the patch how (or if?) the prefix redefinition case is handled. Given that prefixes are always defined globally, it would be nice if this only resulted in an error if two namespaces that are really used in the document map to the same prefix, not always when the namespace dict is redundant by itself.
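One way the suggested check could look, as a sketch only. It is not taken from the patch; both helper names are hypothetical, and the mapping is assumed to go from namespace URI to prefix, as in the tostring() example earlier in the thread. A prefix counts as a collision only when two namespaces actually used in the tree share it.

```python
import xml.etree.ElementTree as ET

def used_namespace_uris(root):
    """Collect the namespace URIs of all tags and attribute names in a tree."""
    uris = set()
    for elem in root.iter():
        for name in [elem.tag, *elem.keys()]:
            # Comments and processing instructions have non-string tags.
            if isinstance(name, str) and name.startswith("{"):
                uris.add(name[1:].partition("}")[0])
    return uris

def find_prefix_collisions(namespaces, used_uris):
    """Return {prefix: [uris]} for prefixes shared by more than one used URI."""
    by_prefix = {}
    for uri in used_uris:
        prefix = namespaces.get(uri)
        if prefix is not None:
            by_prefix.setdefault(prefix, []).append(uri)
    return {p: sorted(u) for p, u in by_prefix.items() if len(u) > 1}

# The mapping is redundant ("x" appears twice), but only urn:a is used.
namespaces = {"urn:a": "x", "urn:b": "x", "urn:c": "y"}
tree = ET.Element("{urn:a}root")
ET.SubElement(tree, "{urn:c}item")
harmless = find_prefix_collisions(namespaces, used_namespace_uris(tree))

# Once urn:b is used as well, "x" is genuinely ambiguous.
ET.SubElement(tree, "{urn:b}item")
conflict = find_prefix_collisions(namespaces, used_namespace_uris(tree))
```

A serializer built this way would raise an error only when the second result is non-empty.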
Also note that it's good to be explicit about the keyword arguments that a function accepts. It helps when help(tostring) tells you directly what you can pass in, instead of just printing "**kw".
|
msg147743 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2011-11-16 00:46 |
Thank you Stefan for the comments.
I've added the prefix collision detection, and removed the **kw argument.
(+ tests)
|
msg149133 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2011-12-09 22:11 |
Updated with documentation.
Thank you for the review.
I know this does not cover different namespaces in subtrees.
But that use case seems very specific, and the user could find other means to achieve it.
|
msg149143 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2011-12-10 07:04 |
Given that this is a major new feature for the serialiser in ElementTree, I think it's worth asking Fredrik for any comments.
|
msg149187 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2011-12-10 20:47 |
Of course it's better to have someone else review the patch.
However, in this case, I'm not sure it is a major feature.
BTW, I noticed that effbot is currently marked as an *inactive* maintainer
http://docs.python.org/devguide/experts.html#stdlib
If it is not an oversight, it means that this issue might wait "an extended period" before receiving a response.
|
msg164984 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-07-08 09:44 |
Do we merge the patch for 3.3?
I'm +1 on this (patch submitted 8 months ago, backward compatible and reviewed).
|
msg164991 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-07-08 10:15 |
Can this honestly be classified as a bugfix, though? If it's a feature, it will have to be postponed to 3.4.
|
msg164993 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2012-07-08 10:24 |
Looks like a new feature to me.
|
msg164996 - (view) |
Author: Florent Xicluna (flox) *  |
Date: 2012-07-08 10:27 |
Well, it fixes the behavior of ElementTree in some multi-threaded cases, provided you pass the namespace map as an argument of the serializer call.
The fix implements an optional argument for this use case.
As a side effect, it makes it easier to work with custom namespaces.
If the consensus is to wait for next version, I'm fine with that.
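Until such an argument exists, one possible workaround is to serialize access to the global registry with a lock. This is only a sketch: the helper name and the lock are hypothetical, and it protects only callers that go through the helper. Code that calls register_namespace() or tostring() directly can still interfere.

```python
import threading
import xml.etree.ElementTree as ET

# Hypothetical module-level lock guarding the global namespace registry.
_ns_lock = threading.Lock()

def tostring_with_prefix(elem, prefix, uri):
    """Register prefix for uri and serialize elem as one atomic step."""
    with _ns_lock:
        ET.register_namespace(prefix, uri)
        return ET.tostring(elem, encoding="unicode")

URI = "http://localhost/house"
results = {"house": set(), "home": set()}

def client(prefix):
    # Each client repeatedly serializes with its own preferred prefix.
    for _ in range(200):
        elem = ET.Element("{%s}iq" % URI)
        results[prefix].add(tostring_with_prefix(elem, prefix, URI))

threads = [threading.Thread(target=client, args=(p,)) for p in results]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each client should only ever see its own prefix in its results, at the cost of making serialization effectively single-threaded.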
|
msg165002 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2012-07-08 10:56 |
Florent, what you describe is exactly the definition of a new feature.
Users even have to change their code in order to make use of it.
|
msg165496 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-07-15 03:39 |
I'm changing the issue name to reflect the direction it's taken. Florent, once 3.3 is branched, could you please refresh the patch vs. head for 3.4 (don't forget the "what's new") and I'll review it for commit.
|
msg165497 - (view) |
Author: Eli Bendersky (eli.bendersky) *  |
Date: 2012-07-15 03:42 |
I'd also expand the doc of register_namespace to note what it should and shouldn't be used for (once this feature is added).
|
msg228422 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2014-10-04 01:07 |
This patch no longer applies to the tip of default. Whoever updates it should also address Eli's comment about expanding the register_namespace doc. I'm adding the 'easy' tag because Florent already did the hard work, and at this point it is just a patch update and doc change.
|
msg348625 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2019-07-29 11:38 |
This issue is 8 years old and already has 3 patches attached, so it is not newcomer-friendly. I am removing the "easy" keyword.
|
|
Date | User | Action | Args
2022-04-11 14:57:23 | admin | set | github: 57587
2019-07-29 11:38:53 | vstinner | set | keywords: - easy; nosy: + vstinner; messages: + msg348625
2014-10-04 01:07:03 | r.david.murray | set | versions: + Python 3.5, - Python 3.4; nosy: + r.david.murray; messages: + msg228422; keywords: + easy; stage: commit review -> needs patch
2012-07-15 03:42:41 | eli.bendersky | set | messages: + msg165497
2012-07-15 03:39:49 | eli.bendersky | set | messages: + msg165496; title: Change the variable "nsmap" from global to instance (xml.etree.ElementTree) -> ET: add custom namespaces to serialization methods
2012-07-08 10:56:28 | scoder | set | messages: + msg165002
2012-07-08 10:27:49 | flox | set | messages: + msg164996
2012-07-08 10:24:06 | scoder | set | messages: + msg164993; versions: + Python 3.4, - Python 3.3
2012-07-08 10:15:39 | eli.bendersky | set | messages: + msg164991
2012-07-08 09:44:33 | flox | set | nosy: + eli.bendersky; messages: + msg164984
2011-12-10 20:47:17 | flox | set | messages: + msg149187
2011-12-10 07:04:25 | scoder | set | messages: + msg149143
2011-12-10 07:03:06 | scoder | set | nosy: + effbot
2011-12-09 22:11:45 | flox | set | files: + issue13378_non_global_namespaces_v3.diff; messages: + msg149133; stage: patch review -> commit review
2011-11-16 00:46:44 | flox | set | files: + issue13378_non_global_namespaces_v2.diff; messages: + msg147743
2011-11-11 08:38:02 | scoder | set | messages: + msg147422
2011-11-11 07:06:18 | scoder | set | messages: + msg147419
2011-11-11 00:37:30 | flox | set | nosy: + scoder
2011-11-11 00:23:33 | flox | set | files: + issue13378_non_global_namespaces.diff; keywords: + patch; messages: + msg147415; stage: patch review
2011-11-09 23:22:49 | Nekmo | set | messages: + msg147380
2011-11-09 22:32:24 | flox | set | nosy: + flox
2011-11-09 22:28:35 | jcea | set | nosy: + jcea; messages: + msg147379; versions: + Python 3.3, - Python 3.2
2011-11-09 22:17:50 | Nekmo | create |