Improving Lib Doc Sequence Types Section #49216

terryjreedy · 2009-01-16T23:43:38Z

BPO	4966
Nosy	@birkenfeld, @rhettinger, @terryjreedy, @ncoghlan, @ezio-melotti, @merwok
Files	stdtypes.html: First cut - split into 3 sections, new Sequence Types section updated; 0a49f6382467.diff

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

GitHub fields:

assignee = 'https://github.com/ncoghlan'
closed_at = <Date 2012-08-20.07:14:23.346>
created_at = <Date 2009-01-16.23:43:37.613>
labels = ['type-feature', 'docs']
title = 'Improving Lib Doc Sequence Types Section'
updated_at = <Date 2012-08-20.07:14:23.344>
user = 'https://github.com/terryjreedy'

bugs.python.org fields:

activity = <Date 2012-08-20.07:14:23.344>
actor = 'python-dev'
assignee = 'ncoghlan'
closed = True
closed_date = <Date 2012-08-20.07:14:23.346>
closer = 'python-dev'
components = ['Documentation']
creation = <Date 2009-01-16.23:43:37.613>
creator = 'terry.reedy'
dependencies = []
files = ['24314', '24511']
hgrepos = ['106']
issue_num = 4966
keywords = ['patch']
message_count = 30.0
messages = ['79988', '114849', '114859', '130531', '130534', '130535', '134953', '136102', '136108', '143293', '151774', '151777', '151802', '151805', '151893', '151905', '151910', '152097', '152098', '152100', '152143', '152153', '152160', '152240', '152277', '152307', '152308', '152312', '153187', '168630']
nosy_count = 10.0
nosy_names = ['georg.brandl', 'rhettinger', 'terry.reedy', 'dcbbcd', 'ncoghlan', 'ezio.melotti', 'eric.araujo', 'docs@python', 'python-dev', 'anasofiapaixao']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue4966'
versions = ['Python 3.3', 'Python 3.4']

terryjreedy · 2009-01-16T23:43:34Z

Issues and suggestions for Python Standard Library / Built-in Types /
"Sequence Types — str, bytes, bytearray, list, tuple, range"

Put subsections in the same order as in the title and main section.
In particular, move bytes/bytearray subsection up to follow string
subsection and move range subsection down to bottom of this grouping.
String paragraph (the second) ends with the rather wordy sentence
"In addition to the functionality described here, there are also
string-specific methods described in the String Methods section."
where 'String Methods' is a forward link to that subsection.
Add similar possibly less wordy sentence-links for other types.

In particular, end the next (bytes/bytearray) paragraph with something like
"For specific methods, see String Methods and Bytes and Byte Array
Methods. For bytearrays, also see Mutable Sequence Types."

End the list/tuple paragraph after the Warning with
"For list methods, see Mutable Sequence Types."

After the following range paragraph, the following could be added:
"For more, see Range Type."
However, there is almost nothing more said (perhaps there was before
range objects were stripped down), so I suggest deleting that subsection
and adding anything more that is not duplication to the end of the
beginning section's range paragraph. If tuples do not need their own
section, range needs one even less.

Bytes and Byte Array Method subsection correctly says that bytes and
bytearrays do not have (senseless) .encode but neglects to document the
corresponding inverse .decode method (while it does mention the
specialized .fromhex decoding method).
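To illustrate the asymmetry described above, here is a minimal sketch, assuming a current Python 3 interpreter (the sample string is made up for the example). It shows that bytes objects offer .decode and .fromhex but have no .encode:

```python
# Encoding goes from str to bytes; decoding is the inverse, from bytes to str.
data = "naïve".encode("utf-8")
text = data.decode("utf-8")

# bytearray supports the same decode method as bytes.
text_from_bytearray = bytearray(data).decode("utf-8")

# bytes has no encode method (there is nothing sensible to encode to).
bytes_has_encode = hasattr(bytes, "encode")

# fromhex is a specialized constructor that decodes a hex string into bytes.
from_hex = bytes.fromhex("6e61")

print(text, text_from_bytearray, bytes_has_encode, from_hex)
```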

Also add .isdecimal, .isnumeric, .isprintable, and .maketrans to the
list of exceptions in the first sentence. (Based on dir(str), dir(bytes)
in 3.0)
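The comparison of dir(str) and dir(bytes) mentioned above can be reproduced with a short sketch like the following. The exact lists depend on the Python version (the ones in this report came from 3.0), so treat the printed output as version-specific:

```python
# Public method names only; dunder names are filtered out for readability.
str_names = {name for name in dir(str) if not name.startswith("_")}
bytes_names = {name for name in dir(bytes) if not name.startswith("_")}

# Methods that exist on str but not on bytes, and the reverse.
str_only = sorted(str_names - bytes_names)
bytes_only = sorted(bytes_names - str_names)

print("str only:  ", str_only)
print("bytes only:", bytes_only)
```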

I see three problems with the current documentation of count and
index methods.
a) They are documented under both String Methods and Mutable Sequences.
They do not really belong in the latter, which lists "additional
operations that allow in-place modification of the object", because they
do not mutate.
b) Tuples do not have their own section, but (unlike range objects) do
have a couple of methods: count and index. Being neither string-like
nor mutable, their having methods is undocumented.
c) Bytearrays, on the other hand, are both string-like and mutable. So
they are (mis)documented as having two slightly different versions of
these methods. (They actually use the string-like definition, of course.)
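A small sketch of points b) and c), assuming a current Python 3 interpreter and made-up sample values: tuples do have count and index, and bytearrays use the string-like (subsequence) versions rather than the list-like (single element) ones.

```python
# Tuples are immutable and not string-like, yet they have both methods.
t = (1, 2, 2, 3)
tuple_count = t.count(2)   # number of elements equal to 2
tuple_index = t.index(3)   # position of the first element equal to 3

# Lists match whole elements: [1, 2] is found only as a nested element.
lst = [1, 2, [1, 2]]
list_count = lst.count([1, 2])

# Bytearrays match subsequences, the same way str and bytes do.
ba = bytearray(b"abcabc")
bytearray_count = ba.count(b"bc")
bytearray_index = ba.index(b"ca")

print(tuple_count, tuple_index, list_count, bytearray_count, bytearray_index)
```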

Consequently, the definitions of count and index in the Mutable Sequence
subsection are not mutable sequence definitions but are really
list/tuple definitions. So I suggest one of two variations:
A) In the main section, add the list/tuple version of .count() and
.index() to the table of common sequence operations with a footnote
either explaining the difference for the string group or referring to
String Methods.
B) In the main section, add both versions to the table with footnotes
explaining which is which.

The count/index/tuple doc issue has come up more than once on c.l.p.

merwok · 2010-08-24T23:09:18Z

I’m interested in making a series of patches corresponding to your suggestions, unless you or someone else wants to do them.

I’m assigning this to myself so that I don’t forget (I won’t have time for a couple of weeks). If someone wants to do it as an easy first patch (Terry did most of the work :), that’s okay; just remove the assignment from me.

terryjreedy · 2010-08-25T00:48:12Z

Please go ahead. I will gladly review anything you do.

ezio-melotti · 2011-03-10T23:19:48Z

This is maybe out of the scope of this issue, but I would like to see all the basic data types on a single page of their own. The current page has some sections about data types mixed with sections about operations, comparisons, and other things, followed by less-"used" types. The page also contains a lot of information and is not easy to browse (42 screens on a 24" monitor).

Ideally the structure should be something like:

True, False, None
int, float(, long, complex)
str, unicode, list, tuple(, bytearray, buffer, xrange)
dict
set(, frozenset)

(where the types in () are considered less important -- so maybe described in detail later or in another page). The page can list common operations for each group and their methods, but leaving things like the string formatting operations to another page/section.

terryjreedy · 2011-03-10T23:34:28Z

I have started learning .rst, so I hope to work on this in the not too distant future.

Ezio -- I have also noticed that some chapters are too long to be easily scrolled around in (unittest is another), and either need an index at the top (like with built-in functions) or separate files (or both).

ezio-melotti · 2011-03-10T23:57:30Z

The advantage of having one big page is that you can ctrl+f easily without having to go back and forth between different pages.
On the other hand, the page is not easy to browse (especially on small screens, mobile devices, old/slow PCs).

In this case I don't think that splitting the page is a problem, because the page contains information about different and fairly unrelated things.

With pages like unittest or logging it is not so easy to split, because while working with them you might need to use several different functions/methods/classes, and having their docs on two or more pages would be annoying. (FWIW I've been working a lot on the unittest doc to make it more "compact" and easier to browse, but there's still work to do. We have also been considering making one page for unittest "users" that explains how to write tests and use the assert methods, and another for unittest "developers" that explains how to write test runners, suites and more advanced stuff.)

BTW .rst is really easy, and if you are not sure about something just try to build the doc with "make html" and see if it complains and if the resulting page looks OK. Also see http://docs.python.org/documenting/index.html.

ezio-melotti · 2011-05-02T07:06:50Z

See also bpo-11975 and bpo-11976.

anasofiapaixao · 2011-05-16T15:12:20Z

I was taking a look into the possibility of splitting this page into several pages, and wondered: could the contents of the Comparisons and the Boolean operations sections just be merged into Python Reference / Expressions, and then deleted from this page altogether? They are not even data types but operators, after all.

ezio-melotti · 2011-05-16T15:38:33Z

I think it should be OK. The stdtypes page could then mention type-specific behavior in the types' sections (e.g. <, <=, >=, > for sets) and link to the language reference for the general behavior.
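As an example of the kind of type-specific behavior meant here, the following sketch (with made-up sample sets) shows that the ordering operators on sets test subset and superset relationships, which gives only a partial order:

```python
a = {1, 2}
b = {1, 2, 3}
c = {4}

subset = a <= b              # a is a subset of b
proper_subset = a < b        # a is a subset of b and a != b
not_proper_of_self = a < a   # a set is never a proper subset of itself
superset = b >= a            # b is a superset of a

# Disjoint sets are not ordered relative to each other in either direction.
unordered = (a < c, a > c, a == c)

print(subset, proper_subset, not_proper_of_self, superset, unordered)
```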

ncoghlan · 2011-09-01T01:40:45Z

Bringing a suggestion over from bpo-12874, I think it may be worth splitting the current "Sequence Types" section into 3 pieces that all appear in the top level table of contents for the library reference:

4.6 Sequence Types - list, tuple, range
4.7 Text Sequence Type - str
4.8 Binary Data Sequence Types - bytes, bytearray, memoryview

ncoghlan · 2012-01-22T13:17:18Z

Éric, are you still planning to work on this? Otherwise I'll make a first pass at doing the split into 3 sections (as per my earlier comment) and implementing some of Terry's suggestions.

Linked Hg repo is a 2.7 based feature branch where I'll be publishing my changes as I make them.

ezio-melotti · 2012-01-22T15:33:09Z

Éric is without Internet till the end of the month, so I think it's OK if you go ahead and start working on this.

birkenfeld · 2012-01-23T08:27:50Z

+1 for splitting.

rhettinger · 2012-01-23T09:28:59Z

+1 for Nick's suggested breakout:

4.6 Sequence Types - list, tuple, range
4.7 Text Sequence Type - str
4.8 Binary Data Sequence Types - bytes, bytearray, memoryview

ncoghlan · 2012-01-24T10:56:27Z

I realised that the lack of a clear binary/text distinction would make it messy to do the split docs in 2.7, so I made a new branch based on 3.2 instead (link to repo updated accordingly).

ncoghlan · 2012-01-24T13:30:01Z

Pushed an initial cut to my sandbox branch. Built HTML is attached so you can get a general idea of how it looks (links, etc, obviously won't work).

So far, I have made the split into 3 sections and updated the new (shorter) Sequence Types section.

That section now has 6 subsections:

Common Sequence Operations
Immutable Sequence Operations (very short, just mentions hash support)
Mutable Sequence Operations
Lists
Tuples
Ranges
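The "hash support" point under Immutable Sequence Operations can be illustrated with a short sketch (sample values are made up): immutable sequences such as tuples and ranges can be hashed and used as dict keys, while mutable lists cannot.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

tuple_ok = is_hashable((1, 2))
range_ok = is_hashable(range(3))
list_ok = is_hashable([1, 2])

# A tuple works as a dictionary key because it is hashable.
lookup = {(1, 2): "point"}
value = lookup[(1, 2)]

print(tuple_ok, range_ok, list_ok, value)
```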

I haven't really touched the Text and Binary sections as yet - the only changes there are things that I copied down before removing them from the updated Sequence Types section.

ncoghlan · 2012-01-24T14:09:07Z

Note: without the Python docs CSS to create the sidebar, the internal table of contents appears at the *bottom* of the rendered page.

Really, reviewing this sensibly is probably going to require building the docs locally after using hg pull to retrieve the changes from my sandbox.

ncoghlan · 2012-01-27T14:20:19Z

Branch status update:

Text Sequence Types section updated to reflect the new structure
changed the prose that describes the relationship between printf-style formatting and the str.format method (deliberately removing the implication that the former is in any real danger of disappearing - it's simply not practical for us to seriously contemplate killing it off)
in the top level index, I split the old "String Services" section into "Text Processing Services" and "Binary Data Services". The latter contains 'struct' and 'codecs', the former contains everything else that used to be in String Services. The index pages for the two sections do cross-reference modules in the other section a bit (Text Processing includes a pointer directly to the codecs module, Binary Data includes pointers to both re and difflib). The real driver for this change was that "struct" has no place in a "String Services" section in Py3k. Since "codecs" could really have gone in either section, I mainly moved it to the binary section so that 'struct' wasn't the only module in there.
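A brief sketch of why 'struct' sits on the binary side and why 'codecs' straddles both, assuming Python 3 and arbitrary sample values: struct packs values into bytes, while codecs converts between str and bytes.

```python
import codecs
import struct

# struct works purely with binary data: values in, bytes out (and back).
# "<HB" means little-endian, one unsigned 16-bit and one unsigned 8-bit value.
packed = struct.pack("<HB", 258, 7)
unpacked = struct.unpack("<HB", packed)

# codecs bridges the two worlds: text in, bytes out, and the reverse.
encoded = codecs.encode("text", "utf-8")
decoded = codecs.decode(encoded, "utf-8")

print(packed, unpacked, encoded, decoded)
```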

Major remaining update is to the Binary Sequence Types section (since I haven't really reviewed that at all after rearranging things).

ncoghlan · 2012-01-27T14:33:47Z

One other thing the branch doesn't currently sort out is the official signature of count() and index().

In 3.2, for *all* of str, bytes, bytearray, tuple, list, range, the index() method takes the optional start:stop parameters.

collections.Sequence.index(), OTOH, does not.

count() splits the field more evenly: str, bytes, bytearray accept the extra parameters, but list, tuple, range and collections.Sequence only support counting values in the whole sequence.
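The difference can be probed with a sketch like the one below. It covers only str, bytes, bytearray, tuple and list (range and the collections ABC are left out), uses made-up sample data, and reflects behavior on a current Python 3 interpreter, which may differ in detail from 3.2:

```python
samples = [
    ("abcabc", "c"),
    (b"abcabc", b"c"),
    (bytearray(b"abcabc"), b"c"),
    (tuple("abcabc"), "c"),
    (list("abcabc"), "c"),
]

def count_accepts_range(seq, needle):
    """Return True if seq.count() accepts optional start/stop arguments."""
    try:
        seq.count(needle, 0, 3)
    except TypeError:
        return False
    return True

# index() accepts start/stop on all five types; searching [3, 6) finds index 5.
index_results = {type(s).__name__: s.index(n, 3, 6) for s, n in samples}

# count() accepts start/stop only on the string-like types.
count_support = {type(s).__name__: count_accepts_range(s, n) for s, n in samples}

print(index_results)
print(count_support)
```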

ezio-melotti · 2012-01-27T15:12:22Z

Have you considered/planned reworking the beginning of the page a bit too?
(Technically the issue is about the Sequence types section, but the whole page could be improved.)
IMHO the sections about Truth value testing, Boolean operations, and Comparison are out of place there, and True/False/None should be described instead.
The idea is that a developer new to Python should be able to come to this page, take a look at the headers and figure out what the main types are (what you did so far is already a good step in the right direction).

ncoghlan · 2012-01-28T00:22:42Z

Yeah, the basic layout of this entire section has been in place for a *long* time (http://docs.python.org/release/1.4/lib/node4.html#SECTION00310000000000000000).

Some aspects haven't really aged all that well, as people have made minimalist changes to document new features without necessarily stepping back to see if the overall structure still makes sense.

However, rather than dumping one massive patch on python-checkins, I think it makes sense to try to tackle it by section (i.e. sequences + related changes for now, then look at mappings, sets, truth values, comparisons and numbers separately).

One thing I do plan to do is a quick scan for places that reference into the sequence types section to see if they should be adjusted (e.g. see if there's some duplication in the language reference that could be reduced, or cross-references in the glossary to add or update)

ncoghlan · 2012-01-28T10:05:08Z

One other point... the branch is actually now relative to default, not 3.2. While that was due to a merging mistake on my part, it also means I can legitimately ignore the narrow/wide build distinction in the section on strings.

ncoghlan · 2012-01-28T12:05:02Z

I finished off the binary data section, so the first draft of the update is now complete in the bitbucket repo.

ezio-melotti · 2012-01-29T15:44:56Z

> One other point... the branch is actually now relative to default, not
> 3.2. While that was due to a merging mistake on my part, it also means
> I can legitimately ignore the narrow/wide build distinction in the
> section on strings.

So will this go on 3.3 only or are you planning to push it on 3.2(/2.7) too?

ncoghlan · 2012-01-29T22:55:24Z

Trying to make this change in 2.7 would actually be a bit of a nightmare - how do you cleanly split documentation of the binary data and text processing sequence types when "str" is used for both?

The change would be *mostly* feasible in 3.2 (that's why I started my branch from there), but there are still some sharp edges that go away in 3.3 (mainly the narrow/wide Unicode split).
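For readers less familiar with the two issues mentioned here, a minimal sketch assuming Python 3.3 or later (sample strings are made up): text and binary data are distinct types that cannot be mixed, and a character outside the Basic Multilingual Plane has length 1 regardless of build (on narrow builds before 3.3 it was reported as 2).

```python
text = "caf\u00e9"
data = text.encode("utf-8")

# Indexing str yields a 1-character str; indexing bytes yields an int.
first_char = text[0]
first_byte = data[0]

# Mixing the two types is an error rather than an implicit conversion.
try:
    text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# A non-BMP code point counts as one character on 3.3+ (no narrow/wide split).
astral = "\U0001F40D"
astral_length = len(astral)
astral_utf8_length = len(astral.encode("utf-8"))

print(first_char, first_byte, mixing_allowed, astral_length, astral_utf8_length)
```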

So unless anyone is really keen to see the update in 3.2, my current plan is to leave the maintenance versions alone and only update it for 3.3. Going that way also provides better opportunities for post-checkin feedback from folks that aren't set up to build the docs themselves (rebuilding the docs is fairly straightforward on *nix, but Terry tells me that using Windows complicates that process quite a bit).

terryjreedy · 2012-01-30T06:44:38Z

I agree with 3.3 only. This might not be ready for 3.2.3 anyway, depending on how soon the hash patch is ready, and if not, it becomes a somewhat moot point, as new people should then download 3.3.0 instead of 3.2.4 next August.

birkenfeld · 2012-01-30T06:51:21Z

ISTM that not doing this will make maintenance harder. For 2.7 I agree that there is no clear boundary to make, but 3.2 should be split up as well to ease merging of updates.

ncoghlan · 2012-01-30T08:08:00Z

Good point, without doing the split in both, any doc merges in this section will be a nightmare. OK, with the caveat that the initial 3.2 version may gloss over some issues that no longer apply in 3.3 (specifically the narrow/wide split), I'll make a new branch in the sandbox so the changes will be once again based on 3.2.

ncoghlan · 2012-02-12T08:42:53Z

Just noting that this has slipped a bit down my Python to-do list (there are other things I want to focus on before the first 3.3 alpha).

I'll get back to it at some point, but if someone wants to take my branch and run with it in the meantime, please feel free.

python-dev · 2012-08-20T07:14:23Z

New changeset 463f52d20314 by Nick Coghlan in branch 'default':
Close bpo-4966: revamp the sequence docs in order to better explain the state of modern Python
http://hg.python.org/cpython/rev/463f52d20314

terryjreedy assigned birkenfeld Jan 16, 2009

terryjreedy added the docs Documentation in the Doc dir label Jan 16, 2009

ezio-melotti added the type-feature A feature request or enhancement label Jun 6, 2009

BreamoreBoy mannequin assigned docspython and unassigned birkenfeld Jul 10, 2010

merwok assigned merwok and unassigned docspython Aug 24, 2010

ncoghlan assigned ncoghlan and unassigned merwok Jan 24, 2012

ncoghlan removed their assignment Feb 12, 2012

ncoghlan self-assigned this Aug 20, 2012

python-dev mannequin closed this as completed Aug 20, 2012

ezio-melotti transferred this issue from another repository Apr 10, 2022
