
classification
Title: ElementTree: wrong XML prolog for the utf-8-sig encoding
Type:
Stage:
Components: Library (Lib)
Versions: Python 3.11, Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eli.bendersky, prikryl, scoder
Priority: normal
Keywords:

Created on 2022-02-01 11:00 by prikryl, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked
PR 31043 open prikryl, 2022-02-01 11:00
Messages (1)
msg412247 - Author: Petr Prikryl (prikryl) Date: 2022-02-01 11:00
When an ElementTree object is written to a file and a BOM is needed, the 'utf-8-sig' encoding can be used for the purpose. However, the XML prolog then looks like...
    
    <?xml version='1.0' encoding='utf-8-sig'?>
    
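... and that encoding name in the prolog makes no sense: utf-8-sig is the name of a Python codec, not an encoding name an XML parser can be expected to recognize. Therefore, the proposed change declares utf-8 in the prolog when utf-8-sig is requested.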
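
The behavior described above can be reproduced with a short script. This is a minimal sketch, assuming a hypothetical one-element document written to an in-memory buffer instead of a file; the exact prolog printed depends on the Python version in use, so no expected output is given.

```python
import codecs
import io
import xml.etree.ElementTree as ET

# Hypothetical one-element document, used only for illustration.
tree = ET.ElementTree(ET.Element("root"))

# Serialize into memory with the BOM-producing codec.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8-sig")
data = buffer.getvalue()

# The codec is expected to emit the UTF-8 BOM first.
has_bom = data.startswith(codecs.BOM_UTF8)

# Decode with the same codec (this strips the BOM) and take the first line,
# which is the XML declaration whose encoding attribute is in question.
prolog = data.decode("utf-8-sig").splitlines()[0]

print(has_bom)
print(prolog)
```

On an affected interpreter, the second printed line should be the prolog quoted in this report.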

To fix this, the following two lines should be added to
`cpython/Lib/xml/etree/ElementTree.py`:

    elif enc_lower == "utf-8-sig":
        declared_encoding = "utf-8"

just above line 741, which reads:

    write("<?xml version='1.0' encoding='%s'?>\n" % (
           declared_encoding,))
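
The effect of the proposed mapping can be shown in isolation. The following is only a sketch: `prolog_for` is a hypothetical helper written for this illustration, not a function in `xml.etree`, and it reuses the variable names `enc_lower` and `declared_encoding` from the lines quoted above.

```python
def prolog_for(encoding: str) -> str:
    """Hypothetical helper: build the XML declaration for a requested encoding."""
    enc_lower = encoding.lower()
    declared_encoding = encoding
    if enc_lower == "utf-8-sig":
        # Proposed mapping: the BOM-producing codec is declared as plain UTF-8.
        declared_encoding = "utf-8"
    return "<?xml version='1.0' encoding='%s'?>\n" % (declared_encoding,)


print(prolog_for("utf-8-sig"), end="")
print(prolog_for("iso-8859-2"), end="")
```

Other encoding names pass through unchanged; only the declaration is affected, not the codec used to encode the output.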

I have already cloned the main branch, added the lines in my fork at `https://github.com/pepr/cpython.git`, and opened a pull request.

I have tested the functionality locally with `Python 3.10.2 (tags/v3.10.2:a58ebcc, Jan 17 2022, 14:12:15) [MSC v.1929 64 bit (AMD64)] on win32`
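
For comparison, the output the change aims for (a BOM followed by a prolog declaring utf-8) can be produced on an unpatched interpreter by writing the BOM by hand. This is a sketch under that assumption; `write_with_bom` is a hypothetical helper and not part of the pull request.

```python
import codecs
import io
import xml.etree.ElementTree as ET


def write_with_bom(tree, stream):
    """Hypothetical helper: emit the BOM manually, then serialize as plain UTF-8."""
    stream.write(codecs.BOM_UTF8)
    # xml_declaration=True forces the prolog, which is otherwise omitted for utf-8.
    tree.write(stream, encoding="utf-8", xml_declaration=True)


# Hypothetical one-element document, used only for illustration.
tree = ET.ElementTree(ET.Element("root"))
buffer = io.BytesIO()
write_with_bom(tree, buffer)
data = buffer.getvalue()

print(data)
```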
History
Date User Action Args
2022-04-11 14:59:55 admin set github: 90756
2022-02-01 16:40:13 ned.deily set nosy: + scoder, eli.bendersky; versions: + Python 3.11
2022-02-01 11:00:34 prikryl create