classification
Title: Improving Lib Doc Sequence Types Section
Type: enhancement
Stage: committed/rejected
Components: Documentation
Versions: Python 3.4, Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: ncoghlan
Nosy List: anasofiapaixao, dcbbcd, docs@python, eric.araujo, ezio.melotti, georg.brandl, ncoghlan, python-dev, rhettinger, terry.reedy
Priority: normal
Keywords: patch

Created on 2009-01-16 23:43 by terry.reedy, last changed 2012-08-20 07:14 by python-dev. This issue is now closed.

Files
File name | Uploaded | Description
stdtypes.html | ncoghlan, 2012-01-24 13:30 | First cut - split into 3 sections, new Sequence Types section updated
0a49f6382467.diff | tshepang, 2012-02-13 17:17 |
Repositories containing patches
https://bitbucket.org/ncoghlan/cpython_sandbox#seq_docs_update
Messages (30)
msg79988 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2009-01-16 23:43
Issues and suggestions for Python Standard Library / Built-in Types /
"Sequence Types — str, bytes, bytearray, list, tuple, range"

1. Put subsections in the same order as in the title and main section. 
In particular, move bytes/bytearray subsection up to follow string
subsection and move range subsection down to bottom of this grouping.

2. String paragraph (the second) ends with the rather wordy sentence
"In addition to the functionality described here, there are also
string-specific methods described in the String Methods section."
where 'String Methods' is a forward link to that subsection.
Add similar possibly less wordy sentence-links for other types.

In particular, end next (byte/bytearray) paragraph with something like
"For specific methods, see String Methods and Bytes and Byte Array
Methods. For bytearrays, also see Mutable Sequence Types."

End the list/tuple paragraph after the Warning with
"For list methods, see Mutable Sequence Types."

After the following range paragraph, the following could be added:
"For more, see Range Type."
However, there is almost nothing more said (perhaps there was before
range objects were stripped down), so I suggest deleting that subsection
and adding anything more that is not duplication to the end of the
beginning section's range paragraph.  If tuples do not need their own
section, range needs one even less.

3. Bytes and Byte Array Method subsection correctly says that bytes and
bytearrays do not have (senseless) .encode but neglects to document the
corresponding inverse .decode method (while it does mention the
specialized .fromhex decoding method).

Also add .isdecimal, .isnumeric, .isprintable, and .maketrans to the
list of exceptions in the first sentence. (Based on dir(str), dir(bytes)
in 3.0)

4. I see three problems with the current documentation of count and
index methods.
a)  They are documented under both String Methods and Mutable Sequences.
 They do not really belong in the latter, which lists "additional
operations that allow in-place modification of the object", because they
do not mutate.
b) Tuples do not have their own section, but (unlike range objects) do
have a couple of methods: count and index.  Being neither string-like
nor mutable, their having methods is undocumented.
c) Bytearrays, on the other hand, are both string-like and mutable.  So
they are (mis)documented as having two slightly different versions of
these methods. (They actually use the string-like definition, of course.)

Consequently, the definitions of count and index in the Mutable Sequence
subsection are not mutable sequence definitions but are really
list/tuple definitions.  So I suggest one of two variations:
A) In the main section, add the list/tuple version of .count() and
.index() to the table of common sequence operations with a footnote
either explaining the difference for the string group or referring to
String Methods.
B) In the main section, add both versions to the table with footnotes
explaining which is which.

The count/index/tuple doc issue has come up more than once on c.l.p.
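
The two definitions of count and index described in item 4 can be compared directly. The following is a minimal sketch, assuming a current Python 3 interpreter (details may have differed in 3.0); the sample values and variable names are illustrative only.

```python
# Tuples are immutable and not string-like, yet they provide count() and index().
point = (1, 2, 1, 3)
tuple_count = point.count(1)   # number of elements equal to 1
tuple_index = point.index(3)   # position of the first element equal to 3

# Lists use the same element-based definition as tuples.
items = [1, 2, 1, 3]
list_count = items.count(1)

# Strings use a substring-based definition instead.
text = "banana"
str_count = text.count("an")   # non-overlapping occurrences of "an"
str_index = text.index("na")   # start position of the first "na"

# bytearray is mutable, but follows the string-like (subsequence) definition.
data = bytearray(b"banana")
bytearray_count = data.count(b"an")

print(tuple_count, tuple_index, list_count, str_count, str_index, bytearray_count)
```

The element-based form compares whole items, while the string-like form searches for a contiguous subsequence, which is why one shared description does not fit both groups.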
msg114849 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-08-24 23:09
I’m interested in making a series of patches corresponding to your suggestions, unless you or someone else want to do them.

I’m assigning to myself so that I don’t forget (I won’t have time for a couple weeks), if someone wants to do it as an easy first patch (Terry did most of the work :), it’s okay, just remove the assignment from me.
msg114859 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-08-25 00:48
Please go ahead. I will gladly review anything you do.
msg130531 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-03-10 23:19
This is maybe out of the scope of this issue, but I would like to see all the basic data types on a single page of their own.  The current page[0] has some sections about data types mixed with sections about operations, comparisons, and other things, followed by less-"used" types.  The page also contains a lot of information and it's not easy to browse (42 screens on a 24" monitor).

Ideally the structure should be something like:

1. True, False, None
2. int, float(, long, complex)
3. str, unicode, list, tuple(, bytearray, buffer, xrange)
4. dict
5. set(, frozenset)

(where the types in () are considered less important -- so maybe described in detail later or in another page).  The page can list common operations for each group and their methods, but leaving things like the string formatting operations to another page/section.


[0]: http://docs.python.org/library/stdtypes.html
msg130534 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-03-10 23:34
I have started learning .rst, so I hope to work on this in the not too distant future.

Ezio -- I have also noticed that some chapters are too long to be easily scrolled around in (unittest is another), and either need an index at the top (like with built-in functions) or separate files (or both)
msg130535 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-03-10 23:57
The advantage of having one big page is that you can ctrl+f easily without having to go back and forth between different pages.
On the other hand, the page is not easy to browse (especially on small screens, mobile devices, old/slow pcs).

In this case I don't think that splitting the page is a problem, because the page contains information about different and fairly unrelated things.

With pages like unittest or logging it is not so easy to split, because while working with them you might need to use several different functions/methods/classes, and having their docs on two or more pages would be annoying.  (FWIW I've been working a lot on the unittest doc to make it more "compact" and easier to browse, but there's still work to do.  We have also been considering making one page for unittest "users" that explains how to write tests and use the assert methods, and another for unittest "developers" that explains how to write test runners, suites and more advanced stuff.)

BTW .rst is really easy, and if you are not sure about something just try to build the doc with "make html" and see if it complains and if the resulting page looks OK.  Also see http://docs.python.org/documenting/index.html.
msg134953 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-05-02 07:06
See also #11975 and #11976.
msg136102 - (view) Author: Ana Sofia Paixão (anasofiapaixao) Date: 2011-05-16 15:12
I was taking a look into the possibility of splitting this page into several pages, and wondered: could the contents of the Comparisons and the Boolean operations sections just be merged into Python Reference / Expressions, and then deleted from this page altogether? They are not even data types but operators, after all.
msg136108 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-05-16 15:38
I think it should be OK.  The stdtypes page could then mention type-specific behavior in the types' sections (e.g. <, <=, >=, > for sets) and link to the language reference for the general behavior.
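
As an illustration of the type-specific behavior mentioned here, the ordering comparisons on sets test subset and superset relationships rather than magnitude. This is a small sketch with made-up sample sets, not text from the documentation.

```python
small = {1, 2}
large = {1, 2, 3}
other = {4}

proper_subset = small < large          # True: every element of small is in large, and they differ
subset_of_self = small <= small        # True: a set is a subset of itself
proper_subset_of_self = small < small  # False: a proper subset must differ
superset = large >= small              # True: large contains every element of small

# The ordering is only partial: disjoint sets are neither smaller, larger, nor equal.
unrelated = (small < other, small > other, small == other)

print(proper_subset, subset_of_self, proper_subset_of_self, superset, unrelated)
```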
msg143293 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-09-01 01:40
Bringing a suggestion over from #12874, I think it may be worth splitting the current "Sequence Types" section into 3 pieces that all appear in the top level table of contents for the library reference:

4.6 Sequence Types - list, tuple, range
4.7 Text Sequence Type - str
4.8 Binary Data Sequence Types - bytes, bytearray, memoryview
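
One way to see why these groups can share a common description and still deserve separate sections is to check them against the sequence abstract base classes. The sketch below assumes Python 3.3 or later (for collections.abc); memoryview is left out, and the sample values are arbitrary.

```python
from collections.abc import MutableSequence, Sequence

samples = {
    "list": [1, 2],
    "tuple": (1, 2),
    "range": range(2),
    "str": "ab",
    "bytes": b"ab",
    "bytearray": bytearray(b"ab"),
}

# Every type in the proposed sections supports the common sequence operations.
is_sequence = {name: isinstance(value, Sequence) for name, value in samples.items()}

# Only list and bytearray also support in-place modification.
is_mutable = {name: isinstance(value, MutableSequence) for name, value in samples.items()}

# The text and binary groups differ in what indexing returns.
text_item = "ab"[0]     # a length-1 str
binary_item = b"ab"[0]  # an int in range(256)

print(is_sequence)
print(is_mutable)
print(repr(text_item), repr(binary_item))
```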
msg151774 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-22 13:17
Éric, are you still planning to work on this? Otherwise I'll make a first pass at doing the split into 3 sections (as per my earlier comment) and implementing some of Terry's suggestions.

Linked Hg repo is a 2.7 based feature branch where I'll be publishing my changes as I make them.
msg151777 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-01-22 15:33
Éric is without Internet till the end of the month, so I think it's OK if you go ahead and start working on this.
msg151802 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-01-23 08:27
+1 for splitting.
msg151805 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012-01-23 09:28
+1 for Nick's suggested breakout:

4.6 Sequence Types - list, tuple, range
4.7 Text Sequence Type - str
4.8 Binary Data Sequence Types - bytes, bytearray, memoryview
msg151893 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-24 10:56
I realised that the lack of a clear binary/text distinction would make it messy to do the split docs in 2.7, so I made a new branch based on 3.2 instead (link to repo updated accordingly).
msg151905 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-24 13:30
Pushed an initial cut to my sandbox branch. Built HTML is attached so you can get a general idea of how it looks (links, etc, obviously won't work).

So far, I have made the split into 3 sections and updated the new (shorter) Sequence Types section.

That section now has 6 subsections:
- Common Sequence Operations
- Immutable Sequence Operations (very short, just mentions hash support)
- Mutable Sequence Operations
- Lists
- Tuples
- Ranges

I haven't really touched the Text and Binary sections as yet - the only changes there are things that I copied down before removing them from the updated Sequence Types section.
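
The hash support that distinguishes the immutable sequence types can be shown in a few lines. This is a minimal sketch assuming a current Python 3 interpreter; the names are illustrative.

```python
coords = (1, 2)

# An immutable sequence of hashable items can be used as a dictionary key.
lookup = {coords: "stored under a tuple key"}
found = lookup[(1, 2)]

# A mutable sequence such as a list does not support hash().
try:
    hash([1, 2])
    list_hashable = True
except TypeError:
    list_hashable = False

# Equal tuples hash equally, which is what makes the lookup above work.
tuple_hash_consistent = hash(coords) == hash((1, 2))

print(found, list_hashable, tuple_hash_consistent)
```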
msg151910 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-24 14:09
Note: without the Python docs CSS to create the sidebar, the internal table of contents appears at the *bottom* of the rendered page.

Really, reviewing this sensibly is probably going to require building the docs locally after using hg pull to retrieve the changes from my sandbox.
msg152097 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-27 14:20
Branch status update:
- Text Sequence Types section updated to reflect the new structure
- changed the prose that describes the relationship between printf-style formatting and the str.format method (deliberately removing the implication that the former is in any real danger of disappearing - it's simply not practical for us to seriously contemplate killing it off)
- in the top level index, I split the old "String Services" section into "Text Processing Services" and "Binary Data Services". The latter contains 'struct' and 'codecs', the former contains everything else that used to be in String Services. The index pages for the two sections do cross-reference modules in the other section a bit (Text Processing includes a pointer directly to the codecs module, Binary Data includes pointers to both re and difflib). The real driver for this change was that "struct" has no place in a "String Services" section in Py3k. Since "codecs" could really have gone in either section, I mainly moved it to the binary section so that 'struct' wasn't the only module in there.

Major remaining update is to the Binary Sequence Types section (since I haven't really reviewed that at all after rearranging things).
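
The reasoning for moving 'struct' out of a string-oriented section can be illustrated briefly: in Python 3 it produces and consumes bytes, while 'codecs' sits on the boundary between text and binary data. The format string and values below are arbitrary examples.

```python
import codecs
import struct

# struct packs Python values into a bytes object: here a little-endian
# unsigned short followed by an unsigned byte.
packed = struct.pack("<HB", 258, 7)
unpacked = struct.unpack("<HB", packed)

# codecs converts between the two worlds: str in, bytes out, and back again.
encoded = codecs.encode("café", "utf-8")
decoded = codecs.decode(encoded, "utf-8")

print(packed, unpacked, encoded, decoded)
```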
msg152098 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-27 14:33
One other thing the branch doesn't currently sort out is the official signature of count() and index().

In 3.2, for *all* of str, bytes, bytearray, tuple, list, range, the index() method takes the optional start:stop parameters.

collections.Sequence.index(), OTOH, does not.

count() splits the field more evenly: str, bytes, bytearray accept the extra parameters, but list, tuple, range and collections.Sequence only support counting values in the whole sequence.
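
The difference in signatures can be probed with a short script. This sketch assumes a current CPython 3 interpreter and covers only str, bytes, bytearray, list and tuple; range and the abstract base class are left out because their signatures may differ between versions. The helper accepts_extra_count_args is a hypothetical name introduced for illustration.

```python
values = [1, 2, 3, 1]

# index() with an optional start (and stop) on element-based and string-like types.
list_index = values.index(1, 1)
tuple_index = tuple(values).index(1, 1, 4)
str_index = "abca".index("a", 1)
bytes_index = b"abca".index(b"a", 1)

# count() with optional start/stop is accepted by the string-like types.
str_count = "abca".count("a", 1)
bytes_count = b"abca".count(b"a", 1)
bytearray_count = bytearray(b"abca").count(b"a", 0, 3)


def accepts_extra_count_args(seq, value):
    """Return True if seq.count() accepts a start argument (hypothetical helper)."""
    try:
        seq.count(value, 0)
    except TypeError:
        return False
    return True


list_count_extra = accepts_extra_count_args(values, 1)
tuple_count_extra = accepts_extra_count_args(tuple(values), 1)
str_count_extra = accepts_extra_count_args("abca", "a")

print(list_index, tuple_index, str_index, bytes_index)
print(str_count, bytes_count, bytearray_count)
print(list_count_extra, tuple_count_extra, str_count_extra)
```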
msg152100 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-01-27 15:12
Have you considered/planned to rework a bit the beginning of the page too?
(Technically the issue is about the Sequence types section, but the whole page could be improved.)
IMHO the sections about Truth value testing, Boolean operations, and Comparison are out of place there, and True/False/None should be described instead.
The idea is that a developer new to Python should be able to come to this page, take a look at the headers and figure out what the main types are (what you did so far is already a good step in the right direction).
msg152143 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-28 00:22
Yeah, the basic layout of this entire section has been in place for a *long* time (http://docs.python.org/release/1.4/lib/node4.html#SECTION00310000000000000000)

Some aspects haven't really aged all that well, as people have made minimalist changes to document new features without necessarily stepping back to see if the overall structure still makes sense.

However, rather than dumping one massive patch on python-checkins, I think it makes sense to try to tackle it by section (i.e. sequences + related changes for now, then look at mappings, sets, truth values, comparisons and numbers separately).

One thing I do plan to do is a quick scan for places that reference into the sequence types section to see if they should be adjusted (e.g. see if there's some duplication in the language reference that could be reduced, or cross-references in the glossary to add or update)
msg152153 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-28 10:05
One other point... the branch is actually now relative to default, not 3.2. While that was due to a merging mistake on my part, it also means I can legitimately ignore the narrow/wide build distinction in the section on strings.
msg152160 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-28 12:05
I finished off the binary data section, so the first draft of the update is now complete in the bitbucket repo.
msg152240 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012-01-29 15:44
> One other point... the branch is actually now relative to default, not
> 3.2. While that was due to a merging mistake on my part, it also means
> I can legitimately ignore the narrow/wide build distinction in the
> section on strings.

So will this go on 3.3 only or are you planning to push it on 3.2(/2.7) too?
msg152277 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-29 22:55
Trying to make this change in 2.7 would actually be a bit of a nightmare - how do you cleanly split documentation of the binary data and text processing sequence types when "str" is used for both?

The change would be *mostly* feasible in 3.2 (that's why I started my branch from there), but there are still some sharp edges that go away in 3.3 (mainly the narrow/wide Unicode split).

So unless anyone is really keen to see the update in 3.2, my current plan is to leave the maintenance versions alone and only update it for 3.3. Going that way also provides better opportunities for post-checkin feedback from folks that aren't set up to build the docs themselves (rebuilding the docs is fairly straightforward on *nix, but Terry tells me that using Windows complicates that process quite a bit).
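
The clean text/binary boundary that makes the split workable in Python 3 can be sketched as follows. The example assumes Python 3.3 or later, where the narrow/wide build distinction no longer applies; the sample strings are arbitrary.

```python
text = "naïve"

# Encoding goes from text to binary data, decoding goes back.
data = text.encode("utf-8")
round_trip = data.decode("utf-8")

# Each type only offers the conversion that makes sense for it.
str_has_decode = hasattr(str, "decode")
bytes_has_encode = hasattr(bytes, "encode")

# Mixing the two kinds of sequence is an error rather than an implicit conversion.
try:
    text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# On 3.3+ a code point outside the Basic Multilingual Plane has length 1 on every build.
astral_length = len("\U0001F600")

print(len(text), len(data), round_trip, str_has_decode, bytes_has_encode,
      mixing_allowed, astral_length)
```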
msg152307 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012-01-30 06:44
I agree with 3.3 only. This might not be ready for 3.2.3 anyway, depending on how soon hash patch is ready, and if not, it becomes a somewhat moot point as new people should then download 3.3.0 instead of 3.2.4 next August.
msg152308 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2012-01-30 06:51
ISTM that not doing this will make maintenance harder.  For 2.7 I agree that there is no clear boundary to make, but 3.2 should be split up as well to ease merging of updates.
msg152312 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-01-30 08:08
Good point, without doing the split in both, any doc merges in this section will be a nightmare. OK, with the caveat that the initial 3.2 version may gloss over some issues that no longer apply in 3.3 (specifically the narrow/wide split), I'll make a new branch in the sandbox so the changes will be once again based on 3.2.
msg153187 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2012-02-12 08:42
Just noting that this has slipped a bit down my Python to-do list (there are other things I want to focus on before the first 3.3 alpha).

I'll get back to it at some point, but if someone wants to take my branch and run with it in the meantime, please feel free.
msg168630 - (view) Author: Roundup Robot (python-dev) Date: 2012-08-20 07:14
New changeset 463f52d20314 by Nick Coghlan in branch 'default':
Close #4966: revamp the sequence docs in order to better explain the state of modern Python
http://hg.python.org/cpython/rev/463f52d20314
History
Date | User | Action | Args
2012-08-20 07:14:23 | python-dev | set | status: open -> closed; nosy: + python-dev; messages: + msg168630; resolution: fixed; stage: needs patch -> committed/rejected
2012-08-20 07:00:14 | ncoghlan | set | assignee: ncoghlan; versions: - Python 3.2
2012-08-09 13:22:17 | ezio.melotti | set | versions: + Python 3.4
2012-02-13 17:17:29 | tshepang | set | files: + 0a49f6382467.diff; keywords: + patch
2012-02-12 08:42:53 | ncoghlan | set | assignee: ncoghlan -> (no value); messages: + msg153187
2012-01-30 08:08:00 | ncoghlan | set | messages: + msg152312
2012-01-30 06:51:21 | georg.brandl | set | messages: + msg152308
2012-01-30 06:44:38 | terry.reedy | set | messages: + msg152307
2012-01-29 22:55:24 | ncoghlan | set | messages: + msg152277
2012-01-29 15:44:55 | ezio.melotti | set | messages: + msg152240
2012-01-28 12:05:02 | ncoghlan | set | messages: + msg152160
2012-01-28 10:05:08 | ncoghlan | set | messages: + msg152153
2012-01-28 00:22:41 | ncoghlan | set | messages: + msg152143
2012-01-27 15:12:22 | ezio.melotti | set | messages: + msg152100
2012-01-27 14:33:47 | ncoghlan | set | messages: + msg152098
2012-01-27 14:20:19 | ncoghlan | set | messages: + msg152097
2012-01-24 14:09:07 | ncoghlan | set | messages: + msg151910
2012-01-24 13:30:22 | ncoghlan | set | files: + stdtypes.html; messages: + msg151905
2012-01-24 10:56:27 | ncoghlan | set | assignee: eric.araujo -> ncoghlan; messages: + msg151893
2012-01-23 09:28:58 | rhettinger | set | nosy: + rhettinger; messages: + msg151805
2012-01-23 08:27:49 | georg.brandl | set | messages: + msg151802
2012-01-22 15:33:09 | ezio.melotti | set | messages: + msg151777
2012-01-22 13:17:18 | ncoghlan | set | hgrepos: + hgrepo106; messages: + msg151774
2011-11-25 04:44:45 | ezio.melotti | set | versions: - Python 3.1
2011-09-01 01:40:45 | ncoghlan | set | nosy: + ncoghlan; messages: + msg143293
2011-05-16 15:38:33 | ezio.melotti | set | messages: + msg136108
2011-05-16 15:12:19 | anasofiapaixao | set | nosy: + anasofiapaixao; messages: + msg136102
2011-05-02 07:06:49 | ezio.melotti | set | messages: + msg134953
2011-03-10 23:57:30 | ezio.melotti | set | nosy: georg.brandl, terry.reedy, dcbbcd, ezio.melotti, eric.araujo, docs@python; messages: + msg130535
2011-03-10 23:34:28 | terry.reedy | set | nosy: georg.brandl, terry.reedy, dcbbcd, ezio.melotti, eric.araujo, docs@python; messages: + msg130534
2011-03-10 23:19:48 | ezio.melotti | set | nosy: georg.brandl, terry.reedy, dcbbcd, ezio.melotti, eric.araujo, docs@python; messages: + msg130531
2011-03-09 02:58:11 | terry.reedy | set | nosy: georg.brandl, terry.reedy, dcbbcd, ezio.melotti, eric.araujo, docs@python; versions: + Python 3.3
2010-08-25 00:48:12 | terry.reedy | set | messages: + msg114859
2010-08-24 23:09:18 | eric.araujo | set | assignee: docs@python -> eric.araujo; messages: + msg114849; nosy: + eric.araujo
2010-07-10 19:28:49 | terry.reedy | set | versions: + Python 3.1
2010-07-10 12:16:38 | BreamoreBoy | set | assignee: georg.brandl -> docs@python; nosy: + docs@python; versions: + Python 3.2, - Python 3.0, Python 3.1
2009-06-06 19:54:06 | ezio.melotti | set | priority: normal; nosy: + ezio.melotti; type: enhancement; stage: needs patch
2009-06-05 12:56:26 | dcbbcd | set | nosy: + dcbbcd
2009-01-16 23:43:37 | terry.reedy | create |