Title: str.join() intercepts TypeError raised by iterator
Type: Stage:
Components: Library (Lib) Versions: Python 2.3
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: georg.brandl, kermode, paul.moore
Priority: normal Keywords:

Created on 2004-02-26 21:19 by kermode, last changed 2006-06-14 08:10 by georg.brandl. This issue is now closed.

Messages (3)
msg20139 - Author: Lenard Lindstrom (kermode) Date: 2004-02-26 21:19
If str.join() is passed an iterator and that iterator 
raises a TypeError, the exception is caught by the 
join method and replaced by its own TypeError 
exception. SyntaxError and IndexError exceptions are 
unaffected.


Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC 
v.1200 32 bit (Intel)] on win32
IDLE 1.0.2      
>>> def gen(n):
	if not isinstance(n, int):
		raise TypeError, "gen() TypeError"
	if n<0:
		raise IndexError, "gen() IndexError"
	for i in range(n):
		yield str(i)

>>> ''.join(gen(5))
'01234'
>>> ''.join(gen(-1))

Traceback (most recent call last):
  File "<pyshell#9>", line 1, in -toplevel-
  File "<pyshell#7>", line 5, in gen
    raise IndexError, "gen() IndexError"
IndexError: gen() IndexError
>>> ''.join(gen(None))

Traceback (most recent call last):
  File "<pyshell#10>", line 1, in -toplevel-
TypeError: sequence expected, generator found
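
For checking the report on a current interpreter, here is a minimal sketch that ports the generator above to Python 3 syntax. The helper name try_join is hypothetical and not part of the original report; it only records which exception type and message reach the caller.

```python
def gen(n):
    """Python 3 port of the generator from the report."""
    if not isinstance(n, int):
        raise TypeError("gen() TypeError")
    if n < 0:
        raise IndexError("gen() IndexError")
    for i in range(n):
        yield str(i)


def try_join(arg):
    """Hypothetical helper: return the joined string, or
    (exception type name, message) if joining fails."""
    try:
        return ''.join(gen(arg))
    except (TypeError, IndexError) as exc:
        return type(exc).__name__, str(exc)


if __name__ == "__main__":
    for arg in (5, -1, None):
        print(repr(arg), "->", try_join(arg))
```

If the last line reports the message "gen() TypeError", the iterator's exception reached the caller unchanged. A message such as "sequence expected, generator found" would mean join replaced it.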
msg20140 - Author: Paul Moore (paul.moore) * (Python committer) Date: 2004-06-05 19:13
Unicode objects do not have this behaviour. For example:

>>> u''.join(gen(None))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in gen
TypeError: gen() TypeError

The offending code is at line 1610 or so of stringobject.c.
The equivalent Unicode code starts at line 3955 of

The string code does a 2-pass approach to calculate the size
of the result, allocate space, and then build the value. The
Unicode version resizes as it goes along. This *may* be a
significant speed optimisation (on the assumption that
strings are more commonly used than Unicode objects), but I
can't test (no MSVC7 to build with).

If the speed issue is not significant, I'd recommend
rewriting the string code to use the same approach the
Unicode code uses. Otherwise, the documentation for str.join
should clarify these points:

1. The sequence being joined is materialised as a tuple
(PySequence_Fast) - this may have an impact on generators
which use a lot of memory.
2. TypeErrors produced by materialising the sequence being
joined will be caught and re-raised with a different message.
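
The difference between the two approaches can be modelled in pure Python. This is only an illustrative sketch, not the C code in stringobject.c: join_two_pass and join_incremental are hypothetical names, tuple() stands in for PySequence_Fast, and the re-raised message is copied from the traceback in the original report.

```python
def gen(n):
    """Python 3 port of the generator from the original report."""
    if not isinstance(n, int):
        raise TypeError("gen() TypeError")
    if n < 0:
        raise IndexError("gen() IndexError")
    for i in range(n):
        yield str(i)


def join_two_pass(sep, iterable):
    """Model of the string approach: materialise, size, then build."""
    try:
        # The whole input is held in memory at once (point 1).
        items = tuple(iterable)
    except TypeError:
        # Any TypeError is replaced, including one raised inside
        # the iterator itself (point 2).
        raise TypeError("sequence expected, %s found"
                        % type(iterable).__name__) from None
    # Pass 1: compute the size of the result.
    size = sum(len(s) for s in items) + len(sep) * max(len(items) - 1, 0)
    # Pass 2: fill a preallocated buffer of exactly that size.
    buf = [None] * size
    pos = 0
    for i, s in enumerate(items):
        if i:
            buf[pos:pos + len(sep)] = sep
            pos += len(sep)
        buf[pos:pos + len(s)] = s
        pos += len(s)
    # Stand-in for turning the filled buffer into a string.
    return ''.join(buf)


def join_incremental(sep, iterable):
    """Model of the Unicode approach: grow the result while iterating."""
    result = ''
    first = True
    for s in iterable:  # exceptions from the iterator propagate unchanged
        if not first:
            result += sep
        result += s
        first = False
    return result
```

With gen(None) as input, the first function reports "sequence expected, generator found" while the second lets "gen() TypeError" through, which mirrors the str and unicode tracebacks shown in this thread.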
msg20141 - Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-06-14 08:10
This is fixed on the trunk.
Date                 User     Action  Args
2004-02-26 21:19:49  kermode  create