classification
Title: Make the half-open range behaviour easier to teach
Type: enhancement Stage: resolved
Components: Versions: Python 3.8
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: mdk, ncoghlan, rhettinger, seluj78, serhiy.storchaka, steven.daprano
Priority: normal Keywords: patch

Created on 2018-11-09 16:25 by mdk, last changed 2018-11-21 08:39 by rhettinger. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 10436 closed mdk, 2018-11-09 16:28
Messages (22)
msg329531 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-09 16:25
This morning I was teaching Python (again and again), and again I was thinking we could do better about the representation of ranges.

Typically in the current repr of ranges we do not see that the end is excluded:

>>> range(10)
range(0, 10)

However it has the (little?) benefit of respecting the "repr gives valid Python".

I propose to change it to:

>>> range(10)
<range object [0, 1, ..., 8, 9]>
msg329532 - (view) Author: Jules Lasne (seluj78) * Date: 2018-11-09 16:36
Sounds like a great idea to me, since I never really understood how range worked.
msg329535 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-11-09 16:43
The current repr is awkward for teaching but does have the virtue of being able to round-trip.  When that is possible, it is what the language usually chooses.

FWIW, you can show ranges with print() and *-unpacking:

    >>> print(*range(1000, 2000, 100))
    1000 1100 1200 1300 1400 1500 1600 1700 1800 1900
msg329540 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-09 17:27
Or just

>>> *range(1000, 2000, 100),
(1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900)
msg329541 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-11-09 17:35
If possible, I prefer to get the repr in the form of a Python expression rather than a cryptic angle-bracket form. The former is often shorter, which is important if it is part of the repr of a more complex object. You can just copy, paste and edit it.
msg329565 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-09 23:03
Not everyone knows the '...' convention. At least according to Google's predictive search, if I type "what does three dots" I get common searches such as "what does three dots mean at the end of a sentence" and similar.

How does your proposed repr look for the edge-cases where there are fewer than five included values? e.g. range(0).
msg329566 - (view) Author: Jules Lasne (seluj78) * Date: 2018-11-09 23:12
As you can see in his PR (https://github.com/python/cpython/pull/10436), he added multiple display types based on the size of the range.

This is easily represented in the dumb_range_repr function: https://github.com/python/cpython/pull/10436/files#diff-95a46658bf7fed08423d060e8f9c1dc2R18

Or here is the C implementation: https://github.com/python/cpython/pull/10436/files#diff-5782f3fcbdfb176507359c3712c42655R597
msg329568 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-11-09 23:18
One other thought, since the current repr round-trips, it can be eval'd.  So changing it might break some code.
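
The round-trip property mentioned here can be checked directly. A minimal sketch using only built-ins (the variable names are illustrative):

```python
# The current repr of a range is itself a valid Python expression,
# so evaluating it rebuilds an equal range object.
r = range(1000, 2000, 100)
text = repr(r)
rebuilt = eval(text)

print(text)          # range(1000, 2000, 100)
print(rebuilt == r)  # True
```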
msg329569 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-09 23:35
With the proposed design, any two empty range objects have the same repr:

repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) etc.

Between this loss of information, and the loss of round-tripping through eval, I'm against this proposal. But I'd perhaps be in favour of it as the __str__ rather than __repr__, so that printing a range object displays in the proposed format.

By the way, the ``dumb_range_repr`` function in the PR could be simplified:

# untested
def dumb_range_repr(r):
    if len(r) < 5:
        return f"<range object {list(r)}>"
    else:
        return f"<range object [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"
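
For reference, a small sketch of what the simplified function above would print for a few inputs, including the empty ranges discussed earlier in this message. The function body is copied from the message; the sample inputs are chosen here for illustration:

```python
def dumb_range_repr(r):
    # Simplified version quoted in this message (marked "untested" there).
    if len(r) < 5:
        return f"<range object {list(r)}>"
    return f"<range object [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"

print(dumb_range_repr(range(10)))        # <range object [0, 1, ..., 8, 9]>
print(dumb_range_repr(range(2, 10, 2)))  # <range object [2, 4, 6, 8]>

# Every empty range collapses to the same text, whatever its arguments.
empties = [range(0), range(2, 2), range(1, 5, -1)]
distinct = {dumb_range_repr(r) for r in empties}
print(distinct)                          # {'<range object []>'}
```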
msg329671 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-11-11 08:55
I agree with Steven and Raymond on this one: changing __repr__ on ranges in a way that breaks round-tripping through eval would be problematic, especially as I'd expect that to be an issue in doctests as well.
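
The doctest concern can be illustrated with a sketch. The helper `evens_below` is hypothetical and exists only for this example; the doctest machinery is from the standard library:

```python
import doctest

def evens_below(n):
    """Hypothetical helper that returns a range.

    >>> evens_below(10)
    range(0, 10, 2)
    """
    return range(0, n, 2)

# Run the docstring example directly. It passes only while
# repr(range(...)) keeps its current constructor-like form.
test = doctest.DocTestParser().get_doctest(
    evens_below.__doc__, {"evens_below": evens_below}, "evens_below", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 1
```

If the repr changed to an angle-bracket form, the expected output in the docstring would no longer match and this example would be reported as a failure.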

However, I also like the idea of having easier access to a more list-like representation that shows the actual range endpoints, not just the inputs used to calculate them, and like Steven, I'm more comfortable with changing __str__ than I am with changing __repr__.

That would give:

>>> range(10)
range(0, 10)
>>> print(range(10))
<range object: [0, 1, ..., 8, 9]>
msg329704 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-12 00:28
I understand that we like round-tripping representations. I think we like them because in most cases they are immediately and unambiguously understandable by a Python developer.

It's the best representation as it's the one conveying the most information. But `range(0, 10)` conveys very little information: one may forget the "begin included, end excluded" rule (or even whether the second argument is the end or the length). I think the following is more useful:

>>> range(10)
<range object [0, 1, ..., 8, 9]>
>>> range(10, 2)
<range object []>
>>> range(2, 10)
<range object [2, 3, ..., 8, 9]>
>>> range(2, 10, 2)
<range object [2, 4, 6, 8]>
>>> 



@steven:

I don't think moving this to __str__ would help anyone: I've never seen any student try `str(range(10))` in the repl, they all naturally try the bare `range(10)` and they're all presented with un-informative information. If someone is there to teach them to try str, it would be better to teach list(range(10)) or *range(10).

As for repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) I do not consider this a bug, they are all strictly equivalent as being the empty range (when speaking of a mathematical object, maybe not the in-memory struct).
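
A quick sketch of the two views in play here: empty ranges compare equal, while the current repr still distinguishes them.

```python
empties = [range(0), range(2, 2), range(1, 5, -1)]

# Ranges compare as sequences (Python 3.3+), so all empty ranges are equal...
all_equal = all(r == range(0) for r in empties)
print(all_equal)  # True

# ...yet the current repr still records how each one was constructed.
reprs = [repr(r) for r in empties]
print(reprs)      # ['range(0, 0)', 'range(2, 2)', 'range(1, 5, -1)']
```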


@raymond:

I'm also not OK with teaching `*range(10)` during the first class. I personally go for `list(range(10))`, but I can only do so because I'm physically present when they ask why the information displayed is not what they expect. A lot of people are learning Python at home, and they're probably just lost when presented with the round-tripping representation.


I don't really agree that changing the repr could break code doing `eval(repr(range(10)))`, is it really something people do?


@nick:

I agree that changing the repr could break some doctests on functions returning ranges; on the other hand, I've never seen a function returning a range.
msg329707 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-12 01:20
> I've never seen any student try `str(range(10))` in the repl

I never suggested that students would try calling str() directly. That 
would be silly. They would use print(), as I'm sure many of them are 
already doing.

After 20+ years of using Python, I still sometimes use print in the 
interactive interpreter when I don't need to.

> they all naturally try the bare `range(10)` and they're all presented 
> with un-informative information.

It isn't un-informative information. It's very informative, and useful, 
but perhaps not the information you are trying to teach your students 
*at that moment*. But they will have potentially decades of use of 
Python, long after they have learned that range() is half-open, and the 
long <range object [start, start+1, ..., end-2, end-1]> form is no 
longer necessary, and perhaps even an annoyance.

(It certainly annoys *me*. The existing short form is usually better for 
my needs, and I think I'm more representative of the average Python 
coder over their career than beginners during their first few weeks.)

> As for repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) I 
> do not consider this a bug

I didn't say it was a bug. But it reduces the utility of the display, as 
you cannot tell the difference between any two empty range objects. And 
that can be important when trying to work out why your range object is 
unexpectedly empty.

> I don't really agree that changing the repr could break code doing 
> `eval(repr(range(10)))`, 

That's not something up for debate. Whether you "really agree" or not, 
it is a fact that your proposed repr is not legal Python code and 
therefore it will break code doing eval() on it.
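
A minimal sketch of that failure, assuming the format shown in the original proposal:

```python
# The proposed text is not a Python expression, so eval() cannot parse it.
proposed = "<range object [0, 1, ..., 8, 9]>"

try:
    eval(proposed)
except SyntaxError:
    outcome = "SyntaxError"
else:
    outcome = "evaluated"

print(outcome)  # SyntaxError
```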

> is it really something people do?

It is a backwards-incompatible change of behaviour, therefore we must 
assume it will break someone's code and treat it as a major change. 
That's not to say that we can't change the repr, but we don't do it 
lightly.

Personally, making a change that helps beginners during their first 
few weeks of learning the language, but inconveniences them for the 
remaining 95% of their career as a Python coder, does not sound like a 
good trade-off to me.

That's why I suggest that print(range_obj) is a good compromise. 
Lots of beginners already do this, and for those who don't, print is a 
good diagnostic tool which they should be taught early.

And yes, changing the __str__ is a backwards-incompatible change too, 
but the potential negative consequences are smaller and the work-arounds 
are easier.
msg330042 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2018-11-18 10:53
(Retitled the issue to better reflect the underlying feature request)

As Steven describes, there are enough problems with changing range.__repr__ that if that's the proposal, then the only possible answer is "No", and closing the issue.

However, changing range.__str__ (and hence print, f-strings, logging, and more) offers many of the same benefits, without most of the downsides (repr will still roundtrip through eval, doctests won't break, etc).

The only potential benefit that gets lost is at the REPL, where entering "range(10)" will still print "range(0, 10)", so you need to do "print(range(10))" to get the version that shows the endpoint values. For longer ranges, "print(range(100))" will still end up being a lot more user friendly than "print(list(range(100)))".
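
The split between __repr__ and __str__ can be sketched in pure Python. Since `range` cannot be subclassed, this uses a hypothetical wrapper class, `PreviewRange`, which is not part of the PR or of CPython; the output format follows the one suggested earlier in this thread:

```python
class PreviewRange:
    """Hypothetical wrapper used only to illustrate the repr/str split."""

    def __init__(self, *args):
        self._r = range(*args)

    def __repr__(self):
        # Unchanged constructor-style form (it names range, not this wrapper).
        return repr(self._r)

    def __str__(self):
        # Preview form, used by print(), str() and f-strings.
        r = self._r
        if len(r) < 5:
            return f"<range object: {list(r)}>"
        return f"<range object: [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"

r = PreviewRange(10)
print(repr(r))  # range(0, 10)
print(r)        # <range object: [0, 1, ..., 8, 9]>
print(f"{r}")   # <range object: [0, 1, ..., 8, 9]>
```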
msg330043 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-18 11:17
On the one hand, I'm OK with enhancing the __str__ of range, so I'll change my PR to do that.

It will not fix the issue, but let's not break backward compatibility.
msg330064 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-18 21:43
My first thought went to giving something really simple like:

>>> print(range(10))
0, 1, ..., 8, 9

But for the empty range it would give an empty string. It may make sense, but may also be surprising.

The other way would be to print [0, 1, ..., 8, 9], so the empty range gets [] instead of nothing.

I think I prefer the first way.
msg330065 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-18 21:57
On Sun, Nov 18, 2018 at 09:43:02PM +0000, Julien Palard wrote:

> My first thought went to giving something really simple like:
> 
> >>> print(range(10))
> 0, 1, ..., 8, 9

-1 

Surely that would be your *second* thought, since you already had a 
perfectly adequate first thought:

<range object [0, 1, ..., 8, 9]> 

is explicit about what kind of object we have. Remember, there will be 
times where people don't know they have a range object, and are printing 
it to find out what they have.

Let's just move that from __repr__ to __str__.

> But for the empty range it would give an empty string. It may make 
> sense, but may also be surprising.
> 
> The other way would be to print [0, 1, ..., 8, 9], so the empty range gets [] instead of nothing.

Certainly not. That looks like a list containing 1, 2, ellipsis, 8, 9, 
and will only increase confusion about the difference between lists and 
range objects.
msg330066 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-18 22:27
If I understand correctly, you'd like str(range(10)) to return "<range object [0, 1, ..., 8, 9]>"?

I'm really uncomfortable doing this: for me, __str__ is there to return an "informal or nicely printable string representation of an object", not a convoluted "<{type(object)} object ...>" notation.

I agree with you, the [0, 1, ..., 8, 9] notation is too confusing with the repr of a list, that's why I proposed the "0, 1, ..., 8, 9" which looks nice.
msg330082 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-19 09:26
On Sun, Nov 18, 2018 at 10:27:11PM +0000, Julien Palard wrote:
> 
> Julien Palard <julien+python@palard.fr> added the comment:
> 
> If I understand correctly, you'd like str(range(10)) to return "<range object [0, 1, ..., 8, 9]>"?

Exactly the same as you suggested for repr(range(10)) to return, so yes.

> I'm really uncomfortable doing this, for me __str__ is here to return 
> an "informal or nicely printable string representation of an object", 

I think that the output you suggested is an informal AND nicely 
printable string representation of the object. In what way do you think 
it fails?

It's an *informal* representation in the sense that it doesn't mimic the 
range constructor, you can't evaluate it, it isn't even legal Python 
syntax.

"Nicely printable" is a matter of taste, but I think its quite nice 
(just not suitable for use as the repr), and especially nice for the 
purpose of showing the kind of object we're dealing with, rather than 
just the values in it.

> not a convoluted "<{type(object)} object ...>" notation.

If this is too convoluted for str(), why is it suitable for beginners 
when it goes through repr() instead?

> I agree with you, the [0, 1, ..., 8, 9] notation is too confusing with 
> the repr of a list, that's why I proposed the "0, 1, ..., 8, 9" which 
> looks nice.

Except that it gives no clue that it is a range object, and fails for 
empty ranges.
msg330112 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-11-19 19:44
After more thought, I'm in agreement with the comments that the proposed __str__ revision is confusing.

After teaching another Python intro course last week, I'm now thinking that no change should be made.  There are other effective ways to teach half-open intervals (i.e. using slicing on strings is now my preferred way). 
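
One way the slicing approach might look in practice (a sketch; the sample string is arbitrary):

```python
s = "abcdefgh"

# s[2:5] includes index 2 and excludes index 5, just as range(2, 5) does.
piece = s[2:5]
print(piece)                           # cde
print([s[i] for i in range(2, 5)])     # ['c', 'd', 'e']

# Half-open pieces join without overlap or gap, and the length is stop - start.
print(s[:3] + s[3:] == s)              # True
print(len(piece) == 5 - 2)             # True
```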

Also, students need to learn about using list() with iterators.  This core skill comes up with generators, enumerate, zip, filter, etc. So, we just need to teach the skill earlier in the course than we did with Python 2.7.

I recommend that we just close this and resist the urge to create a new oddity that doesn't generalize well (i.e. most other iterators can't show a preview of the output without actually consuming some of their inputs).
msg330114 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2018-11-19 20:26
Hi Raymond,

I agree, there are other means of teaching half-open ranges, but I was more concerned about self-taught students facing the current range repr alone than about well-accompanied students.

I also agree, let's not change the current repr (for backward compatibility) and let's not change the current str (it won't help anyway), so I'm closing this.
msg330120 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2018-11-20 00:01
Raymond:
> I'm in agreement with the comments that the proposed __str__ revision is confusing.

In what way is it "confusing"?

I'm especially perplexed that Julien apparently thinks it is confusing when emitted by str(), but educational and useful when emitted by repr(). This makes no sense to me.

I think Julien's idea is a good one, just not for repr, and I don't think it is confusing at all.


Raymond:
> most other iterators can't show a preview of the output without actually consuming some of their inputs

`range` is not some arbitrary iterator, in fact it isn't an iterator at all:

py> r = range(10)
py> iter(r) is r
False


It is a sequence, like list and tuple, and like list and tuple it is perfectly capable of showing its content (in full or part) on demand. Other important built-in sequence types like strings, lists and tuples aren't hamstrung with the restriction not to do anything iterators can't do, there's no good reason for range objects to be given that restriction.
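
A short sketch of the sequence behaviour described here, contrasted with an actual iterator:

```python
r = range(2, 10, 2)

# Sequence operations work without consuming anything.
print(len(r), r[0], r[-1])   # 4 2 8
print(6 in r)                # True
print(r[1:3])                # range(4, 8, 2)
print(list(r) == list(r))    # True: iterating twice gives the same values

# An actual iterator, by contrast, is used up by the first pass.
it = iter(r)
first_pass = list(it)
second_pass = list(it)
print(first_pass, second_pass)  # [2, 4, 6, 8] []
```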


Julien:
> I'm closing this.

Not so hasty, please. Some of us think this is a worthwhile enhancement. You might have changed your mind, but the idea is bigger than you now :-)

I'm taking this discussion to Python-Ideas to see if there is community interest in this feature. If so, I'm going to reopen the issue.
msg330182 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-11-21 08:39
As a teacher, I think the proposal makes us worse off.  It is far easier and more useful at the interactive prompt to use list() rather than print() to show ranges:

    >>> list(range(10))
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> list(range(2, 10))
    [2, 3, 4, 5, 6, 7, 8, 9]
    >>> list(range(2, 10, 3))
    [2, 5, 8]

If you do the same thing with print(), it takes an additional character ("print" vs "list"), it creates a new source of confusion (str vs repr), and it doesn't generalize to other iterators like enumerate(), reversed(), and generators.
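
A sketch of how the same list() call carries over to the other iterators named above (the sample data is illustrative):

```python
letters = ["a", "b", "c"]

pairs = list(enumerate(letters))
backwards = list(reversed(letters))
squares = list(x * x for x in range(2, 5))

print(pairs)      # [(0, 'a'), (1, 'b'), (2, 'c')]
print(backwards)  # ['c', 'b', 'a']
print(squares)    # [4, 9, 16]
```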

Also, the various ideas listed for a possible new __str__ are all awkward or mysterious for some inputs (empty ranges, short ranges, etc).

FWIW, I teach this topic every week.  Presenting with list(range(...)) is less convenient than with the Python 2.7 version, but it works out just fine in practice and nicely sets the stage for covering set(iterable), tuple(iterable), dict.fromkeys(iterable), etc.

I'm opposed to this proposal because I think it will create more teaching difficulties than it solves.
History
Date                 User              Action  Args
2018-11-21 08:39:07  rhettinger        set     messages: + msg330182
2018-11-20 00:01:11  steven.daprano    set     messages: + msg330120
2018-11-19 20:26:33  mdk               set     status: open -> closed
                                               resolution: rejected
                                               messages: + msg330114
                                               stage: patch review -> resolved
2018-11-19 19:44:45  rhettinger        set     messages: + msg330112
2018-11-19 09:26:01  steven.daprano    set     messages: + msg330082
2018-11-18 22:27:11  mdk               set     messages: + msg330066
2018-11-18 21:57:08  steven.daprano    set     messages: + msg330065
2018-11-18 21:43:02  mdk               set     messages: + msg330064
2018-11-18 11:17:56  mdk               set     messages: + msg330043
2018-11-18 10:53:18  ncoghlan          set     messages: + msg330042
                                               title: Range repr could be better -> Make the half-open range behaviour easier to teach
2018-11-12 01:20:40  steven.daprano    set     messages: + msg329707
2018-11-12 00:28:44  mdk               set     messages: + msg329704
2018-11-11 08:55:32  ncoghlan          set     nosy: + ncoghlan
                                               messages: + msg329671
2018-11-09 23:35:57  steven.daprano    set     messages: + msg329569
2018-11-09 23:18:04  rhettinger        set     messages: + msg329568
2018-11-09 23:12:54  seluj78           set     messages: + msg329566
2018-11-09 23:03:54  steven.daprano    set     nosy: + steven.daprano
                                               messages: + msg329565
2018-11-09 17:35:28  serhiy.storchaka  set     messages: + msg329541
2018-11-09 17:27:31  serhiy.storchaka  set     nosy: + serhiy.storchaka
                                               messages: + msg329540
2018-11-09 16:43:12  rhettinger        set     nosy: + rhettinger
                                               messages: + msg329535
2018-11-09 16:36:17  seluj78           set     nosy: + seluj78
                                               messages: + msg329532
2018-11-09 16:28:55  mdk               set     keywords: + patch
                                               stage: patch review
                                               pull_requests: + pull_request9710
2018-11-09 16:25:51  mdk               create