msg348207 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-20 04:00 |
In Python 3.1, three constants were added to the IO module to be used for the whence argument of seek():
SEEK_SET or 0 – start of the stream
SEEK_CUR or 1 – current stream position
SEEK_END or 2 – end of the stream
However, there are at least 102 occurrences across CPython where the integer values are used instead. This only includes instances when a value for the whence arg is explicitly specified. All of the occurrences can be easily found with the usage of git grep:
git grep -E "seek\(-?[0-9]+, [0-2]\)"
This doesn't affect functionality in any way, but it would result in significant readability improvement if these were all replaced with constants. The only requirement would be importing IO or manually specifying the value of constants.
For simplicity, I would probably start with anything that directly involves IO. The largest number of occurrences are in Lib/test/test_io, so that probably would be the best place to begin. Also, this module already imports io anyways, so it would be a straightforward find and replace.
It would also be quite useful to make this change in the documentation, in Doc/tutorial/inputoutput there are 4 occurrences. If anything, anyone less familiar with the IO module in general would be the most likely to benefit from this change so modifying the tutorial would likely be the most useful change for the majority of users.
As a single example of what I'm talking about, here's line 334 of Lib/test/test_io:
self.assertEqual(f.seek(-1, 1), 5)
I would propose changing it to:
self.assertEqual(f.seek(-1, io.SEEK_CUR), 5)
|
msg348216 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2019-07-20 16:30 |
I think it's okay to leave these as 0, 1, 2 because they have a long and well known tradition across multiple languages. The named versions are just alternatives available for end-users. The numbered forms also let us avoid a bunch of otherwise unnecessary imports (which affect start-up time and occasionally create load order bootstrapping issues).
Antoine, what do you think?
|
msg348222 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-20 21:14 |
>The named versions are just alternatives available for end-users
While it is true that the usage of 0,1,2 is more commonly used across the repository, the constants are used several times across Lib/_compression.py and Lib/_pyio.py. From my perspective, it looks like the locations which already used the integer values were just not updated rather than it being an intentional design decision.
However, if it is primarily targeted at the end-users, I would still recommend at least updating the tutorial and perhaps other IO documentation to use the constants. To any user unfamiliar with the integers, the constants convey their purpose far more explicitly, and I don't think anyone familiar with the integers would be confused by the constants.
>The numbered forms also let us avoid a bunch of otherwise unnecessary imports (which affect >start-up time and occasionally create load order bootstrapping issues).
I can definitely understand this argument for anything that doesn't already import IO. It probably wouldn't be worthwhile to increase start times and introduce load order bugs.
But for anything that already imports IO anyways, this seems like it would only serve to be a readability improvement without causing any change in functionality. This might not make a difference at all to developers who are used to using 0,1,2, but their purpose is not inherently obvious to anyone that isn't.
This change would be in alignment with "Explicit is better than implicit". The purpose of the integers is defined only by implication based on tradition, whereas the constants explicitly define the purpose of the argument.
A major advantage of Python over many existing languages is the emphasis on providing maximum readability. Sometimes functionality should be prioritized, such as in places where IO is not already imported. But in cases where there is no functionality cost, readability should be the priority. Even for non-public sections such as the tests, improved readability leads to easier maintenance and incorporation of new Python developers.
|
msg348223 - (view) |
Author: Raymond Hettinger (rhettinger) * |
Date: 2019-07-20 23:10 |
This is Antoine's call. I believe he made the original decision not to sweep through the standard library, changing it everywhere it occurred. Perhaps this was performance reasons, perhaps the existing code was already clear enough, perhaps the value-add was too low. For me personally, it is easier to remember 0 than the name of the constant.
|
msg348282 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-07-22 07:18 |
I would agree with changing the tutorial. The rest doesn't need to be changed.
|
msg348283 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2019-07-22 07:22 |
AFAIK there is an open PR which sweeps through the standard library, changing seek() everywhere to use named constants.
|
msg348286 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-22 07:53 |
> I would agree with changing the tutorial. The rest doesn't need to be changed.
Fair enough, that's understandable. I'll look into creating a PR for the tutorial to attach to this issue. Thanks for the feedback Antoine and Raymond.
>AFAIK there is an open PR which sweeps through the standard library, changing seek() >everywhere to use named constants.
Even I would disagree with doing it through all of stdlib, especially with the current feedback. I was primarily advocating for using the constants in places where the io module was already being imported, since in that case, it wouldn't result in any additional overhead. Replacing it everywhere would require a significant number of additional io imports.
|
msg348480 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-26 07:34 |
Upon further consideration, I think that it would be best to leave the code examples as is in the "inputoutput" tutorial, but it still be worth mentioning that the constants exist as alternatives. This is due to adding the additional step of "import io" in order to be able to access the constants. The tutorial code examples should probably aim to not add additional extra lines that are not needed.
While I was working on a PR for updating the tutorial, I noticed that the second argument's name for seek() was *from_what* instead of *whence*. Was this an older name for the argument that is now outdated or am I missing something?
Doc/tutorial: https://github.com/python/cpython/blob/master/Doc/tutorial/inputoutput.rst#methods-of-file-objects
Doc/library: https://docs.python.org/3/library/io.html#io.IOBase.seek
|
msg348485 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-07-26 09:18 |
I don't know. "whence" is the official name of the argument in the POSIX API:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/lseek.html
Perhaps "from_what" is assumed to be more understandable by the average reader?
Also, os.lseek() uses "how":
https://docs.python.org/3/library/os.html#os.lseek
|
msg348520 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-26 19:33 |
> I don't know. "whence" is the official name of the argument in the POSIX API
> Perhaps "from_what" is assumed to be more understandable by the average reader?
From looking at the blame on GitHub, it looks like the use of the "from_what" argument in the tutorial has been there for more than 12 years, since the last commit was a massive move of the doc tree. The documentation for the IO module was added exactly 12 years (which included the usage of the *whence* argument rather than *from_what*) ago by birkenfield. Based on that information, I think the most likely answer is that the argument used to be *from_what* in a much older version of Python. To conform to the posix standard, it was changed to *whence*, but the tutorial was never updated.
If it was simply never updated, I think that it would be better to change it to *whence*. The difference would be more likely to confuse new users of the language, if they were to start with the tutorial and later refer to the IO module documentation.
Also the tutorial provides a fairly in-depth explanation of the purpose of the argument within seek(), so I don't think using "from_what" as the name makes its purpose any more clear to the users.
|
msg348521 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-07-26 19:36 |
Clarification: By "latest commit" should be "oldest commit" with regards to the oldest commit in GitHub's history of the section of the "inputouput" tutorial that used the *from_what* argument for seek().
|
msg352446 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-09-14 18:17 |
Created GH-16147 for replacing the *from_what* argument with *whence* in the IO tutorial.
I would like to consider following up on this with another PR that adds the IO constants `SEEK_SET`, `SEEK_CUR`, and `SEEK_END` to the tutorial. Those constants would be particularly useful for new users of the language, and would likely make the tutorial easier to understand for those who don't have prior experience with using `seek()`.
However, I figured that replacing *from_what* with *whence* would be significantly less controversial and easier to review, which is why they will be in separate PRs.
|
msg352450 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-09-14 20:29 |
New changeset ff603f6c3d3dc0e9ea8c1c51ce907c4821f42c54 by Miss Islington (bot) (Kyle Stanley) in branch 'master':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/ff603f6c3d3dc0e9ea8c1c51ce907c4821f42c54
|
msg352453 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-09-14 20:47 |
New changeset 4a71df88cdba77c409ee70146dd6445b19267df4 by Miss Islington (bot) in branch '3.8':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/4a71df88cdba77c409ee70146dd6445b19267df4
|
msg352454 - (view) |
Author: miss-islington (miss-islington) |
Date: 2019-09-14 20:47 |
New changeset b9f932f9e2a170a8d39b3c17f5fabb0967839d85 by Miss Islington (bot) in branch '3.7':
bpo-37635: Update arg name for seek() in IO tutorial (GH-16147)
https://github.com/python/cpython/commit/b9f932f9e2a170a8d39b3c17f5fabb0967839d85
|
msg352455 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-09-14 20:48 |
Thanks you Kyle!
|
msg352456 - (view) |
Author: Kyle Stanley (aeros) * |
Date: 2019-09-14 21:01 |
> Thanks you Kyle!
No problem, thanks for merging it Antoine!
What do you think of the followup PR to make use of the SEEK_* constants listed in the documentation? I think it would be useful to at least mention them in the tutorial, or even make use of them directly in the examples.
|
msg352475 - (view) |
Author: Antoine Pitrou (pitrou) * |
Date: 2019-09-15 12:25 |
> What do you think of the followup PR to make use of the SEEK_* constants listed in the documentation? I think it would be useful to at least mention them in the tutorial, or even make use of them directly in the examples.
Yes, I think it would be good to make use of them.
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:18 | admin | set | github: 81816 |
2019-09-15 12:25:40 | pitrou | set | messages:
+ msg352475 |
2019-09-14 21:01:06 | aeros | set | messages:
+ msg352456 |
2019-09-14 20:48:41 | pitrou | set | status: open -> closed versions:
+ Python 3.7, Python 3.8 messages:
+ msg352455
resolution: fixed stage: patch review -> resolved |
2019-09-14 20:47:42 | miss-islington | set | messages:
+ msg352454 |
2019-09-14 20:47:34 | miss-islington | set | messages:
+ msg352453 |
2019-09-14 20:29:44 | miss-islington | set | pull_requests:
+ pull_request15762 |
2019-09-14 20:29:38 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg352450
|
2019-09-14 20:29:37 | miss-islington | set | pull_requests:
+ pull_request15761 |
2019-09-14 18:17:57 | aeros | set | messages:
+ msg352446 |
2019-09-14 18:11:02 | aeros | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request15757 |
2019-07-26 19:36:59 | aeros | set | messages:
+ msg348521 |
2019-07-26 19:33:38 | aeros | set | messages:
+ msg348520 |
2019-07-26 09:18:18 | pitrou | set | messages:
+ msg348485 |
2019-07-26 07:34:49 | aeros | set | messages:
+ msg348480 |
2019-07-22 07:53:17 | aeros | set | messages:
+ msg348286 |
2019-07-22 07:22:11 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg348283
|
2019-07-22 07:18:46 | pitrou | set | messages:
+ msg348282 |
2019-07-20 23:10:35 | rhettinger | set | assignee: docs@python -> pitrou messages:
+ msg348223 |
2019-07-20 21:14:09 | aeros | set | messages:
+ msg348222 |
2019-07-20 16:30:01 | rhettinger | set | nosy:
+ pitrou, rhettinger messages:
+ msg348216
|
2019-07-20 04:02:28 | aeros | set | assignee: docs@python
components:
+ Documentation, Tests nosy:
+ docs@python |
2019-07-20 04:00:23 | aeros | create | |