classification
Title: Using constant for whence arg in seek()
Type: enhancement Stage: resolved
Components: Documentation, IO, Tests Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: pitrou Nosy List: aeros, benjamin.peterson, docs@python, miss-islington, pitrou, rhettinger, serhiy.storchaka, stutzbach
Priority: normal Keywords: patch

Created on 2019-07-20 04:00 by aeros, last changed 2019-09-15 12:25 by pitrou. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 16147 merged aeros, 2019-09-14 18:11
PR 16150 merged miss-islington, 2019-09-14 20:29
PR 16151 merged miss-islington, 2019-09-14 20:29
Messages (18)
msg348207 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-20 04:00
In Python 3.1, three constants were added to the IO module to be used for the whence argument of seek():

    SEEK_SET or 0 – start of the stream
    SEEK_CUR or 1 – current stream position
    SEEK_END or 2 – end of the stream

However, there are at least 102 occurrences across CPython where the integer values are used instead. This only includes instances when a value for the whence arg is explicitly specified. All of the occurrences can be easily found with the usage of git grep:

git grep -E "seek\(-?[0-9]+, [0-2]\)"

This doesn't affect functionality in any way, but it would result in significant readability improvement if these were all replaced with constants. The only requirement would be importing IO or manually specifying the value of constants. 

For simplicity, I would probably start with anything that directly involves IO. The largest number of occurrences are in Lib/test/test_io, so that probably would be the best place to begin. Also, this module already imports io anyways, so it would be a straightforward find and replace. 

It would also be quite useful to make this change in the documentation, in Doc/tutorial/inputoutput there are 4 occurrences. If anything, anyone less familiar with the IO module in general would be the most likely to benefit from this change so modifying the tutorial would likely be the most useful change for the majority of users.

As a single example of what I'm talking about, here's line 334 of Lib/test/test_io:

self.assertEqual(f.seek(-1, 1), 5)

I would propose changing it to:

self.assertEqual(f.seek(-1, io.SEEK_CUR), 5)
msg348216 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-07-20 16:30
I think it's okay to leave these as 0, 1, 2 because they have a long and well known tradition across multiple languages.  The named versions are just alternatives available for end-users.  The numbered forms also let us avoid a bunch of otherwise unnecessary imports (which affect start-up time and occasionally create load order bootstrapping issues).

Antoine, what do you think?
msg348222 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-20 21:14
>The named versions are just alternatives available for end-users

While it is true that the usage of 0,1,2 is more commonly used across the repository, the constants are used several times across Lib/_compression.py and Lib/_pyio.py. From my perspective, it looks like the locations which already used the integer values were just not updated rather than it being an intentional design decision.

However, if it is primarily targeted at the end-users, I would still recommend at least updating the tutorial and perhaps other IO documentation to use the constants. To any user unfamiliar with the integers, the constants convey their purpose far more explicitly, and I don't think anyone familiar with the integers would be confused by the constants. 

>The numbered forms also let us avoid a bunch of otherwise unnecessary imports (which affect >start-up time and occasionally create load order bootstrapping issues).

I can definitely understand this argument for anything that doesn't already import IO. It probably wouldn't be worthwhile to increase start times and introduce load order bugs. 

But for anything that already imports IO anyways, this seems like it would only serve to be a readability improvement without causing any change in functionality. This might not make a difference at all to developers who are used to using 0,1,2, but their purpose is not inherently obvious to anyone that isn't. 

This change would be in alignment with "Explicit is better than implicit". The purpose of the integers is defined only by implication based on tradition, whereas the constants explicitly define the purpose of the argument. 

A major advantage of Python over many existing languages is the emphasis on providing maximum readability. Sometimes functionality should be prioritized, such as in places where IO is not already imported. But in cases where there is no functionality cost, readability should be the priority. Even for non-public sections such as the tests, improved readability leads to easier maintenance and incorporation of new Python developers.
msg348223 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2019-07-20 23:10
This is Antoine's call.  I believe he made the original decision not to sweep through the standard library, changing it everywhere it occurred.  Perhaps this was performance reasons, perhaps the existing code was already clear enough, perhaps the value-add was too low.  For me personally, it is easier to remember 0 than the name of the constant.
msg348282 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-07-22 07:18
I would agree with changing the tutorial.  The rest doesn't need to be changed.
msg348283 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-07-22 07:22
AFAIK there is an open PR which sweeps through the standard library, changing seek() everywhere to use named constants.
msg348286 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-22 07:53
> I would agree with changing the tutorial.  The rest doesn't need to be changed.

Fair enough, that's understandable. I'll look into creating a PR for the tutorial to attach to this issue. Thanks for the feedback Antoine and Raymond. 

>AFAIK there is an open PR which sweeps through the standard library, changing seek() >everywhere to use named constants.

Even I would disagree with doing it through all of stdlib, especially with the current feedback. I was primarily advocating for using the constants in places where the io module was already being imported, since in that case, it wouldn't result in any additional overhead. Replacing it everywhere would require a significant number of additional io imports.
msg348480 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-26 07:34
Upon further consideration, I think that it would be best to leave the code examples as is in the "inputoutput" tutorial, but it still be worth mentioning that the constants exist as alternatives. This is due to adding the additional step of "import io" in order to be able to access the constants. The tutorial code examples should probably aim to not add additional extra lines that are not needed.

While I was working on a PR for updating the tutorial, I noticed that the second argument's name for seek() was  *from_what* instead of *whence*. Was this an older name for the argument that is now outdated or am I missing something?

Doc/tutorial: https://github.com/python/cpython/blob/master/Doc/tutorial/inputoutput.rst#methods-of-file-objects

Doc/library: https://docs.python.org/3/library/io.html#io.IOBase.seek
msg348485 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-07-26 09:18
I don't know. "whence" is the official name of the argument in the POSIX API:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html

Perhaps "from_what" is assumed to be more understandable by the average reader?

Also, os.lseek() uses "how":
https://docs.python.org/3/library/os.html#os.lseek
msg348520 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-26 19:33
> I don't know. "whence" is the official name of the argument in the POSIX API

> Perhaps "from_what" is assumed to be more understandable by the average reader?

From looking at the blame on GitHub, it looks like the use of the "from_what" argument in the tutorial has been there for more than 12 years, since the last commit was a massive move of the doc tree.  The documentation for the IO module was added exactly 12 years (which included the usage of the *whence* argument rather than *from_what*) ago by birkenfield. Based on that information, I think the most likely answer is that the argument used to be *from_what* in a much older version of Python. To conform to the posix standard, it was changed to *whence*, but the tutorial was never updated.

If it was simply never updated, I think that it would be better to change it to *whence*. The difference would be more likely to confuse new users of the language, if they were to start with the tutorial and later refer to the IO module documentation. 

Also the tutorial provides a fairly in-depth explanation of the purpose of the argument within seek(), so I don't think using "from_what" as the name makes its purpose any more clear to the users.
msg348521 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-07-26 19:36
Clarification: By "latest commit" should be "oldest commit" with regards to the oldest commit in GitHub's history of the section of the "inputouput" tutorial that used the *from_what* argument for seek().
msg352446 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-09-14 18:17
Created GH-16147 for replacing the *from_what* argument with *whence* in the IO tutorial. 

I would like to consider following up on this with another PR that adds the IO constants `SEEK_SET`, `SEEK_CUR`, and `SEEK_END` to the tutorial. Those constants would be particularly useful for new users of the language, and would likely make the tutorial easier to understand for those who don't have prior experience with using `seek()`.

However, I figured that replacing *from_what* with *whence* would be significantly less controversial and easier to review, which is why they will be in separate PRs.
msg352450 - (view) Author: miss-islington (miss-islington) Date: 2019-09-14 20:29
New changeset ff603f6c3d3dc0e9ea8c1c51ce907c4821f42c54 by Miss Islington (bot) (Kyle Stanley) in branch 'master':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/ff603f6c3d3dc0e9ea8c1c51ce907c4821f42c54
msg352453 - (view) Author: miss-islington (miss-islington) Date: 2019-09-14 20:47
New changeset 4a71df88cdba77c409ee70146dd6445b19267df4 by Miss Islington (bot) in branch '3.8':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/4a71df88cdba77c409ee70146dd6445b19267df4
msg352454 - (view) Author: miss-islington (miss-islington) Date: 2019-09-14 20:47
New changeset b9f932f9e2a170a8d39b3c17f5fabb0967839d85 by Miss Islington (bot) in branch '3.7':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/b9f932f9e2a170a8d39b3c17f5fabb0967839d85
msg352455 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-09-14 20:48
Thanks you Kyle!
msg352456 - (view) Author: Kyle Stanley (aeros) * (Python committer) Date: 2019-09-14 21:01
> Thanks you Kyle!

No problem, thanks for merging it Antoine!

What do you think of the followup PR to make use of the SEEK_* constants listed in the documentation? I think it would be useful to at least mention them in the tutorial, or even make use of them directly in the examples.
msg352475 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2019-09-15 12:25
> What do you think of the followup PR to make use of the SEEK_* constants listed in the documentation? I think it would be useful to at least mention them in the tutorial, or even make use of them directly in the examples.

Yes, I think it would be good to make use of them.
History
Date User Action Args
2019-09-15 12:25:40pitrousetmessages: + msg352475
2019-09-14 21:01:06aerossetmessages: + msg352456
2019-09-14 20:48:41pitrousetstatus: open -> closed
versions: + Python 3.7, Python 3.8
messages: + msg352455

resolution: fixed
stage: patch review -> resolved
2019-09-14 20:47:42miss-islingtonsetmessages: + msg352454
2019-09-14 20:47:34miss-islingtonsetmessages: + msg352453
2019-09-14 20:29:44miss-islingtonsetpull_requests: + pull_request15762
2019-09-14 20:29:38miss-islingtonsetnosy: + miss-islington
messages: + msg352450
2019-09-14 20:29:37miss-islingtonsetpull_requests: + pull_request15761
2019-09-14 18:17:57aerossetmessages: + msg352446
2019-09-14 18:11:02aerossetkeywords: + patch
stage: patch review
pull_requests: + pull_request15757
2019-07-26 19:36:59aerossetmessages: + msg348521
2019-07-26 19:33:38aerossetmessages: + msg348520
2019-07-26 09:18:18pitrousetmessages: + msg348485
2019-07-26 07:34:49aerossetmessages: + msg348480
2019-07-22 07:53:17aerossetmessages: + msg348286
2019-07-22 07:22:11serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg348283
2019-07-22 07:18:46pitrousetmessages: + msg348282
2019-07-20 23:10:35rhettingersetassignee: docs@python -> pitrou
messages: + msg348223
2019-07-20 21:14:09aerossetmessages: + msg348222
2019-07-20 16:30:01rhettingersetnosy: + pitrou, rhettinger
messages: + msg348216
2019-07-20 04:02:28aerossetassignee: docs@python

components: + Documentation, Tests
nosy: + docs@python
2019-07-20 04:00:23aeroscreate