classification
Title: fileinput.hook_compressed returning bytes from gz file
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.2, Python 3.1
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: amaury.forgeotdarc, mnewman
Priority: normal
Keywords: easy

Created on 2009-04-14 23:48 by mnewman, last changed 2010-07-21 14:23 by amaury.forgeotdarc.

Files
File name Uploaded Description
example.zip mnewman, 2009-04-14 23:47 ZIP file containing test example
Messages (2)
msg85978 - Author: Michael Newman (mnewman) Date: 2009-04-14 23:47
The attached ZIP file contains "test.bat" which runs "test.py" with
Python 2.6 and Python 3.0.

Python 2.6 behaves as expected (see "py26.out"), since it returns
strings from both "mike.txt" and "mike.txt.gz". However, the same test
with Python 3.0 returns bytes from "mike.txt.gz", as shown in "py30.out":
Output: Hello from Mike.
Output: This is the second line.
Output: Why did the robot cross the road?
Output: b'Hello from Mike.'
Output: b'This is the second line.'
Output: b'Why did the robot cross the road?'

For reference, I tested this on Python versions:
Python 2.6.1 (r261:67517, Dec  4 2008, 16:51:00) [MSC v.1500 32 bit
(Intel)] on win32
Python 3.0.1 (r301:69561, Feb 13 2009, 20:04:18) [MSC v.1500 32 bit
(Intel)] on win32
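
The difference reported above can be illustrated without the attached ZIP. The following is a minimal sketch, not the original "test.py" (whose contents are not shown here): it writes the same text to a plain file and to a gzip file in a temporary directory, then reads each back with the built-in open() in text mode and with gzip.open() in "rb" mode. The file names and sample lines are borrowed from the report; everything else is an assumption made for illustration.

```python
import gzip
import os
import tempfile

# Sample text borrowed from the report; the real test files are in example.zip.
text = "Hello from Mike.\nThis is the second line.\n"

with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "mike.txt")
    packed = plain + ".gz"

    # Write the same content to a plain file and to a gzip-compressed file.
    with open(plain, "w", encoding="ascii") as f:
        f.write(text)
    with gzip.open(packed, "wb") as f:
        f.write(text.encode("ascii"))

    # Text-mode open() yields str lines.
    with open(plain, "r", encoding="ascii") as f:
        plain_lines = [line.rstrip("\n") for line in f]

    # Binary-mode gzip.open() yields bytes lines.
    with gzip.open(packed, "rb") as f:
        packed_lines = [line.rstrip(b"\n") for line in f]

print(plain_lines)
print(packed_lines)
```

The first list holds str objects and the second holds bytes objects, which matches the mix of plain and b'...' lines in "py30.out".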
msg111063 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) (Python committer) Date: 2010-07-21 14:23
gzip.open() only implements the "rb" mode, and returns bytes.
fileinput could certainly wrap it with an io.TextIOWrapper.

Then the encoding issue arises.
fileinput.FileInput should grow an "encoding" parameter instead of always relying on the default encoding.
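
The wrapping idea can be tried from user code with a custom openhook. This is a sketch only, not a patch to fileinput: the name hook_compressed_text and its encoding default are hypothetical, it handles only ".gz" (not ".bz2"), and it assumes fileinput is used in its default text mode ("r").

```python
import fileinput
import gzip
import io
import os


def hook_compressed_text(filename, mode, encoding="utf-8"):
    """Hypothetical openhook: open .gz files as decoded text.

    The encoding default is an assumption; fileinput passes only
    filename and mode when no encoding is given to FileInput.
    """
    if os.path.splitext(filename)[1] == ".gz":
        # gzip.open() in "rb" mode yields bytes; TextIOWrapper decodes them.
        return io.TextIOWrapper(gzip.open(filename, "rb"), encoding=encoding)
    # Non-compressed files: ordinary text-mode open (assumes mode is "r").
    return open(filename, mode, encoding=encoding)


# Usage (file names are placeholders):
# with fileinput.FileInput(files=["mike.txt", "mike.txt.gz"],
#                          openhook=hook_compressed_text) as lines:
#     for line in lines:
#         print("Output:", line.rstrip("\n"))
```

Hard-coding the encoding in the hook is exactly the limitation noted above: without an "encoding" parameter on fileinput.FileInput, the caller has no way to choose it per invocation other than writing a different hook.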
History
Date User Action Args
2010-07-21 14:23:43 amaury.forgeotdarc set versions: - Python 2.7; nosy: + amaury.forgeotdarc; messages: + msg111063; keywords: + easy
2010-07-10 16:20:54 BreamoreBoy set stage: needs patch; versions: + Python 3.1, Python 2.7, Python 3.2, - Python 3.0
2009-04-14 23:48:01 mnewman create