Author ajaksu2
Recipients ajaksu2, akuchling, loewis, pboddie, vdupras
Date 2008-02-17.23:58:08
Message-id <1203292691.0.0.266930258143.issue2124@psf.upfronthosting.co.za>
In-reply-to
Content
Martin, I agree that simply not resolving DTDs is an unreasonable
request (and said so in the blog post). But IMHO there are lots of
possible optimizations, and the most valuable would be those darn easy
for newcomers to understand and use.

In Python, a winning combo would be an arbitrary (and explicit) FS
"dtdcache" that people could use with a simple drop-in import (from a
third-party module?). Perhaps the cache lives in a pickled dictionary
with IDs, checksums and DTDs. It could also be a sqlite DB, if updating
the dict becomes problematic.
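
To make the pickled-dictionary idea concrete, here is a minimal sketch. The class name DTDCache, its methods and the on-disk layout are all made up for illustration; nothing like this exists in the std-lib. It keys entries by system ID and stores a SHA-256 checksum next to the DTD bytes, so a corrupted entry is treated as a miss.

```python
import hashlib
import os
import pickle


class DTDCache:
    """Hypothetical on-disk DTD cache backed by a pickled dictionary."""

    def __init__(self, path):
        self.path = path
        self.entries = {}  # system ID -> (SHA-256 hex digest, DTD bytes)
        if os.path.exists(path):
            with open(path, "rb") as fh:
                self.entries = pickle.load(fh)

    def get(self, system_id):
        """Return the cached DTD bytes, or None if absent or corrupted."""
        entry = self.entries.get(system_id)
        if entry is None:
            return None
        checksum, data = entry
        if hashlib.sha256(data).hexdigest() != checksum:
            return None  # checksum mismatch: treat as a cache miss
        return data

    def put(self, system_id, data):
        """Store DTD bytes with their checksum and rewrite the pickle file."""
        self.entries[system_id] = (hashlib.sha256(data).hexdigest(), data)
        tmp = self.path + ".tmp"
        with open(tmp, "wb") as fh:
            pickle.dump(self.entries, fh)
        os.replace(tmp, self.path)  # swap in the new file in one step
```

Rewriting the whole pickle on every put is exactly the "updating the dict becomes problematic" case where sqlite would be the better fit. Also, unpickling runs arbitrary code, so the cache file should only live somewhere the user trusts.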

In that scenario, AMK could save later W3C hits with:

from xml.sax import make_parser
# Hypothetical third-party module mirroring xml.sax.saxutils:
from dtdcache.sax.saxutils import prepare_input_source # <- dtdcache
parser = make_parser()
inp = prepare_input_source('file:file.xhtml', cache="/tmp/xmlcache")
parser.parse(inp)

It might be interesting to have read-only, force-write and read-write
modes. Not sure how to map that onto EntityResolver and DTD consumers
(I'm no XML user myself).
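
One way the three modes might map onto an EntityResolver, as a rough sketch only. CachingResolver, the mode strings, and the cache and fetch arguments are assumptions of mine; only EntityResolver and InputSource come from xml.sax. Here read-only never writes to the cache, read-write fills it on a miss, and force-write always refetches and overwrites.

```python
import io
from xml.sax.handler import EntityResolver
from xml.sax.xmlreader import InputSource


class CachingResolver(EntityResolver):
    """Hypothetical resolver that serves DTDs from a dict-like cache."""

    MODES = ("read-only", "read-write", "force-write")

    def __init__(self, cache, fetch, mode="read-write"):
        if mode not in self.MODES:
            raise ValueError("unknown mode: %r" % (mode,))
        self.cache = cache  # dict-like mapping: system ID -> DTD bytes
        self.fetch = fetch  # callable: system ID -> DTD bytes
        self.mode = mode

    def resolveEntity(self, publicId, systemId):
        # force-write ignores whatever is cached and always refetches.
        data = None if self.mode == "force-write" else self.cache.get(systemId)
        if data is None:
            data = self.fetch(systemId)
            if self.mode != "read-only":
                self.cache[systemId] = data  # read-only never stores
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(data))
        return source
```

It would be installed with parser.setEntityResolver(CachingResolver(cache, fetch)). Whether the parser actually consults the resolver for a given DTD depends on which external-entity features are enabled, so treat this as a starting point rather than a drop-in.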

Regarding the std-lib, I believe effective caching hooks for DTDs trump
implementing in-memory or sqlite/FS caches. IMNSHO, correct, accessible
support for catalogs shouldn't be the only change, as caching should give
better performance on both ends.