
classification
Title: ConfigParser does not parse utf-8 files with BOM bytes
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.3
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: Python3: guess text file charset using the BOM (issue 7651)
Assigned To: lukasz.langa
Nosy List: Sean.Wang, eric.araujo, lukasz.langa
Priority: normal
Keywords:

Created on 2012-03-15 02:24 by Sean.Wang, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (3)
msg155843 - Author: Sean Wang (Sean.Wang) Date: 2012-03-15 02:24
ConfigParser fails to parse a UTF-8 file that starts with the BOM bytes ('\xef\xbb\xbf');
it raises ConfigParser.MissingSectionHeaderError.

I think files in other encodings that start with a BOM would have the same problem, because the "SECTCRE" pattern does not account for a leading BOM.
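The failure described above can be reproduced with a short sketch. This is not the reporter's code: it assumes Python 3, where the module is named `configparser`, and it feeds the parser an already-decoded string in which the BOM survives as the character U+FEFF. The sample section and key names are made up for illustration.

```python
import configparser

# Text as it would look after decoding a BOM-prefixed file with plain "utf-8":
# the BOM is kept as the character U+FEFF in front of the section header.
text = "\ufeff[main]\nkey = value\n"

parser = configparser.ConfigParser()
try:
    parser.read_string(text)
    bom_rejected = False
except configparser.MissingSectionHeaderError:
    # The first line does not start with "[", so no section header is recognized.
    bom_rejected = True

print("BOM rejected:", bom_rejected)
```

The same text without the leading U+FEFF parses normally, which suggests the header match rather than the file content is what trips on the BOM.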

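My current workaround is:

import codecs, os
import ConfigParser  # Python 2 module name
cp = ConfigParser.ConfigParser()
cfgfile = os.path.join(curpath, 'config.cfg')  # curpath: directory containing config.cfg
cp.readfp(codecs.open(cfgfile, 'r', 'utf-8-sig'))  # 'utf-8-sig' strips the leading BOM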
msg156042 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-03-16 14:46
Could you paste the exact code that fails? In 3.2+ there is a read_something method that takes an encoding argument, so that should work for example.
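As a rough illustration of the suggestion above: in Python 3.2 and later, `ConfigParser.read()` accepts an `encoding` keyword, and passing `'utf-8-sig'` makes the decoder drop a leading BOM. The helper name `read_bom_config`, the temporary file, and the sample contents below are assumptions for the demo, not part of the original report.

```python
import codecs
import configparser
import os
import tempfile


def read_bom_config(path):
    """Parse an INI file that may start with a UTF-8 BOM (hypothetical helper)."""
    parser = configparser.ConfigParser()
    # 'utf-8-sig' decodes UTF-8 and silently skips a leading BOM if present.
    parser.read(path, encoding="utf-8-sig")
    return parser


# Build a throwaway BOM-prefixed config file for the demonstration.
fd, demo_path = tempfile.mkstemp(suffix=".cfg")
with os.fdopen(fd, "wb") as f:
    f.write(codecs.BOM_UTF8 + b"[main]\nkey = value\n")

try:
    demo = read_bom_config(demo_path)
    value = demo.get("main", "key")
finally:
    os.remove(demo_path)

print(value)
```

Because `'utf-8-sig'` also reads files without a BOM, the same call should work for both kinds of UTF-8 input.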
msg156079 - Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2012-03-16 20:23
What you considered a workaround is actually what you should be using when faced with BOM bytes. This is a broader issue in Python, not necessarily connected with ConfigParser or any other library. Also, this has already been reported here:

http://bugs.python.org/issue7519

For the UTF-8 BOM context please see:

http://bugs.python.org/issue7651

To solve the actual problem we should really do something about that last issue.

If you have any further questions, please ask. If not, I will close this issue.
History
Date                 User          Action  Args
2022-04-11 14:57:28  admin         set     github: 58519
2012-03-20 12:32:44  lukasz.langa  set     status: open -> closed
                                           resolution: duplicate
                                           stage: needs patch -> resolved
                                           superseder: Python3: guess text file charset using the BOM
                                           versions: - Python 2.7, Python 3.2
2012-03-16 20:23:32  lukasz.langa  set     assignee: lukasz.langa
                                           messages: + msg156079
2012-03-16 14:46:21  eric.araujo   set     nosy: + eric.araujo
                                           messages: + msg156042
2012-03-15 20:19:01  pitrou        set     nosy: + lukasz.langa
                                           stage: needs patch
                                           versions: + Python 3.2, Python 3.3
2012-03-15 02:24:51  Sean.Wang     create