Issue 35200: Make the half-open range behaviour easier to teach

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/79381

classification

Title:	Make the half-open range behaviour easier to teach
Type:	enhancement	Stage:	resolved
Components:		Versions:	Python 3.8

process

Status:	closed	Resolution:	rejected
Dependencies:		Superseder:
Assigned To:		Nosy List:	mdk, ncoghlan, rhettinger, seluj78, serhiy.storchaka, steven.daprano
Priority:	normal	Keywords:	patch

Created on 2018-11-09 16:25 by mdk, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL	Status	Linked	Edit
PR 10436	closed	mdk, 2018-11-09 16:28

Messages (22)
msg329531 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-09 16:25
This morning I was teaching Python (again and again), and again I was thinking we could do better about the representation of ranges.

Typically in the current repr of ranges we do not see that the end is excluded:

>>> range(10)
range(0, 10)

However it has the (little?) benefit of respecting the "repr gives valid Python".

I propose to change it to:

>>> range(10)
<range object [0, 1, ..., 8, 9]>
msg329532 - (view)	Author: Jules Lasne (seluj78) *	Date: 2018-11-09 16:36
Sounds like a great idea to me, since I never really understood how range worked.
msg329535 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2018-11-09 16:43
The current repr is awkward for teaching but does have the virtue of being able to round-trip. When that is possible, it is what the language usually chooses.

FWIW, you can show ranges with print() and *-unpacking:

>>> print(*range(1000, 2000, 100))
1000 1100 1200 1300 1400 1500 1600 1700 1800 1900
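To make the round-trip property concrete, here is a minimal sketch. The variable names are illustrative and do not come from the thread. It checks that the current repr evaluates back to an equal range and shows the star-unpacking display mentioned above.

```python
# Illustrative names only; none of these come from the issue itself.
r = range(1000, 2000, 100)

# The current repr is a valid Python expression...
text = repr(r)
print(text)  # range(1000, 2000, 100)

# ...so evaluating it rebuilds an equal object (the "round-trip").
rebuilt = eval(text)
print(rebuilt == r)  # True

# Star-unpacking hands each element to print() as its own argument.
print(*r)  # 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900
```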
msg329540 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-11-09 17:27
Or just

>>> *range(1000, 2000, 100),
(1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900)
msg329541 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2018-11-09 17:35
If possible, I prefer to get the repr in the form of a Python expression rather than a cryptic angle-bracket form. The former is often shorter, which is important if it is part of the repr of a more complex object. You can just copy, paste and edit it.
msg329565 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-09 23:03
Not everyone knows the '...' convention. At least according to Google's predictive search, if I type "what does three dots" I get common searches such as "what does three dots mean at the end of a sentence" and similar. How does your proposed repr look for the edge-cases where there are fewer than five included values? e.g. range(0).
msg329566 - (view)	Author: Jules Lasne (seluj78) *	Date: 2018-11-09 23:12
As you can see in his PR (https://github.com/python/cpython/pull/10436), he added multiple display types based on the size of the range. This is easily represented in the dumb_range_repr function: https://github.com/python/cpython/pull/10436/files#diff-95a46658bf7fed08423d060e8f9c1dc2R18 Or here is the C implementation: https://github.com/python/cpython/pull/10436/files#diff-5782f3fcbdfb176507359c3712c42655R597
msg329568 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2018-11-09 23:18
One other thought, since the current repr round-trips, it can be eval'd. So changing it might break some code.
msg329569 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-09 23:35
With the proposed design, any two empty range objects have the same repr:

repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) etc.

Between this loss of information, and the loss of round-tripping through eval, I'm against this proposal. But I'd perhaps be in favour of it as the __str__ rather than __repr__, so that printing a range object displays in the proposed format.

By the way, the ``dumb_range_repr`` function in the PR could be simplified:

# untested
def dumb_range_repr(r):
    if len(r) < 5:
        return f"<range object {list(r)}>"
    else:
        return f"<range object [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"
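The loss of information for empty ranges can be checked directly. The sketch below reuses the simplified function from this message (it is not the code in the PR) and compares the proposed display with the current repr for three empty ranges.

```python
def dumb_range_repr(r):
    """Simplified sketch from this message, not the implementation in the PR."""
    if len(r) < 5:
        return f"<range object {list(r)}>"
    return f"<range object [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"


empties = [range(0), range(2, 2), range(1, 5, -1)]

# Every empty range collapses to the same proposed display...
proposed = {dumb_range_repr(r) for r in empties}
print(proposed)  # {'<range object []>'}

# ...while the current repr keeps the three apart.
current = {repr(r) for r in empties}
print(len(current))  # 3

# Non-empty examples, for comparison.
print(dumb_range_repr(range(10)))        # <range object [0, 1, ..., 8, 9]>
print(dumb_range_repr(range(2, 10, 2)))  # <range object [2, 4, 6, 8]>
```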
msg329671 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2018-11-11 08:55
I agree with Steven and Raymond on this one: changing __repr__ on ranges in a way that breaks round-tripping through eval would be problematic, especially as I'd expect that to be an issue in doctests as well.

However, I also like the idea of having easier access to a more list-like representation that shows the actual range endpoints, not just the inputs used to calculate them, and like Steven, I'm more comfortable with changing __str__ than I am with changing __repr__.

That would give:

>>> range(10)
range(0, 10)
>>> print(range(10))
<range object: [0, 1, ..., 8, 9]>
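The doctest concern can be illustrated with a small sketch. The function `evens_below` is hypothetical and exists only for this example. Its docstring records the current repr as the expected output, so the example passes today and would stop matching if the repr text changed.

```python
import doctest


def evens_below(n):
    """Return the even numbers below n (hypothetical helper).

    >>> evens_below(10)
    range(0, 10, 2)
    """
    return range(0, n, 2)


# Parse the docstring into a DocTest and run its single example.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    evens_below.__doc__, {"evens_below": evens_below}, "evens_below", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

# The expected output is compared as text against the repr of the result.
print(results.failed, results.attempted)  # 0 1
```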
msg329704 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-12 00:28
I understand we like round-tripping representations. I think we like them because in most cases they are immediately and unambiguously understandable by a Python developer: it's the best representation as it's the one conveying the most information.

But `range(0, 10)` conveys very little information: one may forget the "begin included, end excluded" rule (or even whether the second argument is the end or the length). I think the following is more useful:

>>> range(10)
<range object [0, 1, ..., 8, 9]>
>>> range(10, 2)
<range object []>
>>> range(2, 10)
<range object [2, 3, ..., 8, 9]>
>>> range(2, 10, 2)
<range object [2, 4, 6, 8]>

@steven: I don't think moving this to __str__ would help someone: I've never seen any student try `str(range(10))` in the repl, they all naturally try the bare `range(10)` and they're all presented with un-informative information. If someone is there to teach them to try with str, better try with list(range(10)) or range(10).

As for repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) I do not consider this a bug, they are all strictly equivalent as being the empty range (when speaking of a mathematical object, maybe not the in-memory struct).

@raymond: I'm also not OK to teach `repr(10)` during the first class. I personally go for `list(range(10))`, but I can only because I'm physically available when they ask why the information displayed is not what they expect. A lot of people are learning Python at home and they're probably just lost when presented with the round-tripping representation.

I don't really agree that changing the repr could break code doing `eval(repr(range(10)))`, is it really something people do?

@nick: I agree changing the repr could break some doctests on functions returning ranges; on the other hand, I've never seen a function returning a range.
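The "mathematical object versus in-memory struct" distinction can be observed in Python 3 itself. The sketch below uses the three empty ranges from the discussion. They compare equal because ranges compare as sequences, yet each one still stores the arguments it was built from.

```python
empties = [range(0), range(2, 2), range(1, 5, -1)]

# Ranges compare as sequences, so all empty ranges are equal...
print(all(r == empties[0] for r in empties))  # True

# ...even though the stored start/stop/step values differ.
for r in empties:
    print(r.start, r.stop, r.step)
# 0 0 1
# 2 2 1
# 1 5 -1
```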
msg329707 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-12 01:20
> I've never seen any student try `str(range(10))` in the repl

I never suggested that students would try calling str() directly. That would be silly. They would use print(), as I'm sure many of them are already doing. After 20+ years of using Python, I still sometimes use print in the interactive interpreter when I don't need to.

> they all naturally try the bare `range(10)` and they're all presented
> with un-informative information.

It isn't un-informative information. It's very informative, and useful, but perhaps not the information you are trying to teach your students at that moment. But they will have potentially decades of use of Python, long after they have learned that range() is half-open, and the long

<range object [start, start+1, ..., end-2, end-1]>

form is no longer necessary, and perhaps even an annoyance. (It certainly annoys me. The existing short form is usually better for my needs, and I think I'm more representative of the average Python coder over their career than beginners during their first few weeks.)

> As for repr(range(0)) == repr(range(2, 2)) == repr(range(1, 5, -1)) I
> do not consider this a bug

I didn't say it was a bug. But it reduces the utility of the display, as you cannot tell the difference between any two empty range objects. And that can be important when trying to work out why your range object is unexpectedly empty.

> I don't really agree that changing the repr could break code doing
> `eval(repr(range(10)))`,

That's not something up for debate. Whether you "really agree" or not, it is a fact that your proposed repr is not legal Python code and therefore it will break code doing eval() on it.

> is it really something people do?

It is a backwards-incompatible change of behaviour, therefore we must assume it will break someone's code and treat it as a major change. That's not to say that we can't change the repr, but we don't do it lightly.

Personally, making a change that helps beginners during their first few weeks of learning the language but inconveniences them for the remaining 95% of their career as a Python coder does not sound like a good trade-off to me.

That's why I suggest that print(range_obj) is a good compromise. Lots of beginners already do this, and for those who don't, print is a good diagnostic tool which they should be taught early.

And yes, changing the __str__ is a backwards-incompatible change too, but the potential negative consequences are smaller and the work-arounds are easier.
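The claim that the proposed repr is not legal Python can be verified in a few lines. This sketch feeds the proposed text to eval() and records what happens. The variable names are illustrative.

```python
proposed = "<range object [0, 1, ..., 8, 9]>"

# The angle-bracket form cannot be parsed as an expression.
try:
    eval(proposed)
except SyntaxError:
    outcome = "SyntaxError"
else:
    outcome = "evaluated"
print(outcome)  # SyntaxError

# The current repr, by contrast, evaluates back to an equal range.
print(eval(repr(range(10))) == range(10))  # True
```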
msg330042 - (view)	Author: Nick Coghlan (ncoghlan) *	Date: 2018-11-18 10:53
(Retitled the issue to better reflect the underlying feature request)

As Steven describes, there are enough problems with changing range.__repr__ that if that's the proposal, then the only possible answer is "No", and closing the issue.

However, changing range.__str__ (and hence print, f-strings, logging, and more) offers many of the same benefits, without most of the downsides (repr will still roundtrip through eval, doctests won't break, etc).

The only potential benefit that gets lost is that entering "range(10)" at the REPL will still print "range(0, 10)", such that you need to do "print(range(10))" to get the version that shows the endpoint values.

For longer ranges, "print(range(100))" will still end up being a lot more user friendly than "print(list(range(100)))".
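The split between __repr__ and __str__ can be sketched without touching the interpreter. Because range cannot be subclassed, the example wraps one instead. The class name `PreviewRange` and the exact display text are assumptions for illustration, not what the PR implements.

```python
# range is not an acceptable base class, so composition is used below.
try:
    class _Sub(range):
        pass
    subclassable = True
except TypeError:
    subclassable = False
print(subclassable)  # False


class PreviewRange:
    """Hypothetical wrapper: range's own repr, plus a preview-style str."""

    def __init__(self, *args):
        self._r = range(*args)

    def __repr__(self):
        # Delegate to range, so this text still evaluates (to a plain range).
        return repr(self._r)

    def __str__(self):
        r = self._r
        if len(r) < 5:
            return f"<range object {list(r)}>"
        return f"<range object [{r[0]}, {r[1]}, ..., {r[-2]}, {r[-1]}]>"


p = PreviewRange(10)
print(repr(p))   # range(0, 10)
print(p)         # <range object [0, 1, ..., 8, 9]>
print(f"{p}")    # f-strings use __str__ too: <range object [0, 1, ..., 8, 9]>
```

With this split, the REPL echo is unchanged while print(), f-strings and logging pick up the preview, which is the trade-off described above.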
msg330043 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-18 11:17
On the one hand, I'm OK with enhancing the __str__ of range, so I'll change my PR for this. It will not fix the issue, but let's not break backward compatibility.
msg330064 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-18 21:43
My first thought went to giving something really simple like:

>>> print(range(10))
0, 1, ..., 8, 9

But for the empty range it would give an empty string. It may make sense, but may also be surprising.

The other way would be to print [0, 1, ..., 8, 9], so the empty range gets [] instead of nothing.

I think I prefer the first way.
msg330065 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-18 21:57
On Sun, Nov 18, 2018 at 09:43:02PM +0000, Julien Palard wrote:

> My first thought went to giving something really simple like:
>
> >>> print(range(10))
> 0, 1, ..., 8, 9

-1

Surely that would be your second thought, since you already had a perfectly adequate first thought:

<range object [0, 1, ..., 8, 9]>

is explicit about what kind of object we have. Remember, there will be times where people don't know they have a range object, and are printing it to find out what they have.

Let's just move that from __repr__ to __str__.

> But for the empty range it would give an empty string. It may make
> sense, but may also be surprising.
>
> The other way would be to print [0, 1, ..., 8, 9], so the empty range gets [] instead of nothing.

Certainly not. That looks like a list containing 0, 1, ellipsis, 8, 9, and will only increase confusion about the difference between lists and range objects.
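The objection to the bare bracketed form is easy to demonstrate, because that text is already a valid list literal in Python 3. A short sketch, with an illustrative variable name:

```python
# This is ordinary Python: a five-element list whose middle item is Ellipsis.
looks_like_a_range = [0, 1, ..., 8, 9]

print(len(looks_like_a_range))                # 5
print(looks_like_a_range[2] is Ellipsis)      # True
print(looks_like_a_range)                     # [0, 1, Ellipsis, 8, 9]
print(looks_like_a_range == list(range(10)))  # False
```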
msg330066 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-18 22:27
If I understand correctly, you'd like str(range(10)) to return "<range object [0, 1, ..., 8, 9]>"?

I'm really uncomfortable doing this, for me __str__ is here to return an "informal or nicely printable string representation of an object", not a convoluted "<{type(object)} object ...>" notation.

I agree with you, the [0, 1, ..., 8, 9] notation is too confusing with the repr of a list, that's why I proposed the "0, 1, ..., 8, 9" which looks nice.
msg330082 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-19 09:26
On Sun, Nov 18, 2018 at 10:27:11PM +0000, Julien Palard wrote:

> Julien Palard <julien+python@palard.fr> added the comment:
>
> If I understand correctly, you'd like str(range(10)) to return "<range object [0, 1, ..., 8, 9]>"?

Exactly the same as you suggested for repr(range(10)) to return, so yes.

> I'm really uncomfortable doing this, for me __str__ is here to return
> an "informal or nicely printable string representation of an object",

I think that the output you suggested is an informal AND nicely printable string representation of the object. In what way do you think it fails?

It's an informal representation in the sense that it doesn't mimic the range constructor, you can't evaluate it, it isn't even legal Python syntax.

"Nicely printable" is a matter of taste, but I think it's quite nice (just not suitable for use as the repr), and especially nice for the purpose of showing the kind of object we're dealing with, rather than just the values in it.

> not a convoluted "<{type(object)} object ...>" notation.

If this is too convoluted for str(), why is it suitable for beginners when it goes through repr() instead?

> I agree with you, the [0, 1, ..., 8, 9] notation is too confusing with
> the repr of a list, that's why I proposed the "0, 1, ..., 8, 9" which
> looks nice.

Except that it gives no clue that it is a range object, and fails for empty ranges.
msg330112 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2018-11-19 19:44
After more thought, I'm in agreement with the comments that the proposed __str__ revision is confusing.

After teaching another Python intro course last week, I'm now thinking that no change should be made. There are other effective ways to teach half-open intervals (i.e. using slicing on strings is now my preferred way). Also students need to learn about using list() with iterators. This core skill comes up with generators, enumerate, zip, filter, etc. So, we just need to teach the skill earlier in the course than we did with Python 2.7.

I recommend that we just close this and resist the urge to create a new oddity that doesn't generalize well (i.e. most other iterators can't show a preview of the output without actually consuming some of their inputs).
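One possible version of the slicing-based lesson is sketched below. The example string and variable names are assumptions, not taken from the thread. It shows the two properties that make half-open intervals convenient: adjacent pieces tile without overlap, and the length is simply stop minus start.

```python
word = "python"
cut = 2

# Slices that share a boundary split the string with no overlap or gap.
head, tail = word[:cut], word[cut:]
print(head, tail)           # py thon
print(head + tail == word)  # True

# The length of a half-open slice is stop - start.
start, stop = 1, 4
print(word[start:stop])                       # yth
print(len(word[start:stop]) == stop - start)  # True

# range() follows the same convention.
print(list(range(start, stop)))                 # [1, 2, 3]
print(len(range(start, stop)) == stop - start)  # True
print(list(range(0, cut)) + list(range(cut, 6)) == list(range(6)))  # True
```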
msg330114 - (view)	Author: Julien Palard (mdk) *	Date: 2018-11-19 20:26
Hi Raymond,

I agree, there exist other means of teaching half-open ranges, but I was more concerned about self-taught students, facing the current range repr alone, than about students who are well accompanied.

I also agree, let's not change the current repr (for backward compatibility) and let's not change the current str (it won't help anyway), so I'm closing this.
msg330120 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2018-11-20 00:01
Raymond:
> I'm in agreement with the comments that the proposed __str__ revision is confusing.

In what way is it "confusing"? I'm especially perplexed that Julien apparently thinks it is confusing when emitted by str(), but educational and useful when emitted by repr(). This makes no sense to me.

I think Julien's idea is a good one, just not for repr, and I don't think it is confusing at all.

Raymond:
> most other iterators can't show a preview of the output without actually consuming some of their inputs

`range` is not some arbitrary iterator, in fact it isn't an iterator at all:

py> r = range(10)
py> iter(r) is r
False

It is a sequence, like list and tuple, and like list and tuple it is perfectly capable of showing its content (in full or part) on demand. Other important built-in sequence types like strings, lists and tuples aren't hamstrung with the restriction not to do anything iterators can't do, there's no good reason for range objects to be given that restriction.

Julien:
> I'm closing this.

Not so hasty, please. Some of us think this is a worthwhile enhancement. You might have changed your mind, but the idea is bigger than you now :-)

I'm taking this discussion to Python-Ideas to see if there is community interest in this feature. If so, I'm going to reopen the issue.
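The sequence-versus-iterator distinction can be checked with the abstract base classes in collections.abc. This sketch contrasts a range with a generator over the same values. The names `r` and `gen` are illustrative.

```python
from collections.abc import Iterator, Sequence

r = range(10)

# A range is a sequence: it has a length, supports indexing, slicing and
# membership tests, and none of these consume it.
print(isinstance(r, Sequence))  # True
print(isinstance(r, Iterator))  # False
print(len(r), r[0], r[-1])      # 10 0 9
print(r[2:5])                   # range(2, 5)
print(7 in r)                   # True

# A generator is an iterator: iter() returns the object itself, and it has
# no length to inspect ahead of time.
gen = (n for n in range(10))
print(iter(gen) is gen)         # True
print(iter(r) is r)             # False
print(hasattr(gen, "__len__"))  # False
```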
msg330182 - (view)	Author: Raymond Hettinger (rhettinger) *	Date: 2018-11-21 08:39
As a teacher, I think the proposal makes us worse off. It is far easier and more useful at the interactive prompt to use list() rather than print() to show ranges:

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(2, 10))
[2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(2, 10, 3))
[2, 5, 8]

If you do the same thing with print(), it takes an additional character ("print" vs "list"), it creates a new source of confusion (str vs repr), and it doesn't generalize to other iterators like enumerate(), reversed(), and generators.

Also, the various ideas listed for a possible new __str__ are all awkward or mysterious for some inputs (empty ranges, short ranges, etc).

FWIW, I teach this topic every week. Presenting with list(range(...)) is less convenient than with the Python 2.7 version, but it works out just fine in practice and nicely sets the stage for covering set(iterable), tuple(iterable), dict.fromkeys(iterable), etc.

I'm opposed to this proposal because I think it will create more teaching difficulties than it solves.
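The point that list() generalises where a range-specific display would not can be sketched as follows. The sample data is an assumption chosen for brevity. The same call shows the contents of a range, enumerate(), reversed(), zip() and a generator expression.

```python
letters = "abc"

# Printing a lazy iterable shows an opaque object, not its items.
print(enumerate(letters))  # something like <enumerate object at 0x...>

# list() materialises any finite iterable, range included.
as_lists = {
    "range": list(range(2, 10, 3)),
    "enumerate": list(enumerate(letters)),
    "reversed": list(reversed(letters)),
    "zip": list(zip(letters, range(3))),
    "generator": list(n * n for n in range(4)),
}
for name, items in as_lists.items():
    print(name, items)

# The same pattern carries over to the other constructors mentioned above.
print(tuple(range(3)))             # (0, 1, 2)
print(set(range(3)) == {0, 1, 2})  # True
print(dict.fromkeys(range(3)))     # {0: None, 1: None, 2: None}
```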

History
Date	User	Action	Args
2022-04-11 14:59:07	admin	set	github: 79381
2018-11-21 08:39:07	rhettinger	set	messages: + msg330182
2018-11-20 00:01:11	steven.daprano	set	messages: + msg330120
2018-11-19 20:26:33	mdk	set	status: open -> closed resolution: rejected messages: + msg330114 stage: patch review -> resolved
2018-11-19 19:44:45	rhettinger	set	messages: + msg330112
2018-11-19 09:26:01	steven.daprano	set	messages: + msg330082
2018-11-18 22:27:11	mdk	set	messages: + msg330066
2018-11-18 21:57:08	steven.daprano	set	messages: + msg330065
2018-11-18 21:43:02	mdk	set	messages: + msg330064
2018-11-18 11:17:56	mdk	set	messages: + msg330043
2018-11-18 10:53:18	ncoghlan	set	messages: + msg330042 title: Range repr could be better -> Make the half-open range behaviour easier to teach
2018-11-12 01:20:40	steven.daprano	set	messages: + msg329707
2018-11-12 00:28:44	mdk	set	messages: + msg329704
2018-11-11 08:55:32	ncoghlan	set	nosy: + ncoghlan messages: + msg329671
2018-11-09 23:35:57	steven.daprano	set	messages: + msg329569
2018-11-09 23:18:04	rhettinger	set	messages: + msg329568
2018-11-09 23:12:54	seluj78	set	messages: + msg329566
2018-11-09 23:03:54	steven.daprano	set	nosy: + steven.daprano messages: + msg329565
2018-11-09 17:35:28	serhiy.storchaka	set	messages: + msg329541
2018-11-09 17:27:31	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg329540
2018-11-09 16:43:12	rhettinger	set	nosy: + rhettinger messages: + msg329535
2018-11-09 16:36:17	seluj78	set	nosy: + seluj78 messages: + msg329532
2018-11-09 16:28:55	mdk	set	keywords: + patch stage: patch review pull_requests: + pull_request9710
2018-11-09 16:25:51	mdk	create