
classification
Title: csv.DictReader, skipinitialspace does not ignore tabs
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.10
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: andre.lehmann, xtreak
Priority: normal
Keywords:

Created on 2019-01-24 10:28 by andre.lehmann, last changed 2022-04-11 14:59 by admin.

Files
File name Uploaded Description
conf.csv andre.lehmann, 2019-01-24 10:28
Messages (3)
msg334289 - Author: André Lehmann (andre.lehmann) Date: 2019-01-24 10:28
When using csv.DictReader, a dialect can be passed to change how the CSV file is interpreted.

The dialect has an option "skipinitialspace" which, according to the documentation (https://docs.python.org/3/library/csv.html), should ignore whitespace immediately following the delimiter.

Unfortunately, this only works for spaces and not for tabs, which are also whitespace.

See the following code snippet, applied to the attached file:

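import csv

# skipinitialspace=True is expected to drop whitespace that follows the delimiter
csv.register_dialect("comma_and_ws", skipinitialspace=True)

with open("conf.csv", "r", newline="") as csvfile:
    csv_dict_reader = csv.DictReader(csvfile, dialect="comma_and_ws")
    for line in csv_dict_reader:
        print(line)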

The second line of the output should not contain "\t" characters.
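
The behavior described above can also be reproduced without the attached file. The following is a minimal sketch that uses an in-memory string as a hypothetical stand-in for conf.csv (the column names and values are made up for illustration). It shows that a space after the delimiter is skipped while a tab is kept as part of the field.

```python
import csv
import io

# Hypothetical sample data standing in for the attached conf.csv:
# the header uses a space after the comma, the data row uses a tab.
sample = "name, value\nfoo,\tbar\n"

reader = csv.DictReader(io.StringIO(sample), skipinitialspace=True)
rows = list(reader)

# The space in the header is skipped, but the tab in the data row remains.
print(reader.fieldnames)
print(rows)
```

With the behavior reported in this issue, the value for "value" still starts with "\t".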
msg334290 - Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-01-24 10:59
https://bugs.python.org/issue21297#msg216907 seems to be related and has a patch. It treats whitespace as meaning only the space character (U+0020), so tabs are not skipped.

Current code, where only the space character is taken into account: https://github.com/python/cpython/blob/fd628cf5adaeee73eab579393cdff71c8f70cdf2/Modules/_csv.c#L621
msg334295 - Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2019-01-24 11:21
Sorry, I overlooked the patch. The issue reported here is the same as in issue21297, but as far as I can see from the discussion, that patch changes "whitespace" to "space" in the documentation instead of changing the behavior.
History
Date User Action Args
2022-04-11 14:59:10 admin set github: 79997
2021-03-25 22:59:07 iritkatriel set components: + Library (Lib); versions: + Python 3.10, - Python 3.5
2019-01-24 11:21:24 xtreak set messages: + msg334295
2019-01-24 10:59:29 xtreak set nosy: + xtreak; messages: + msg334290
2019-01-24 10:28:34 andre.lehmann create