classification
Title: base64 docs refers to strings instead of bytes
Type: behavior Stage: needs patch
Components: Documentation Versions: Python 3.1, Python 3.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: Dmitry.Jemerov, JingCheng.LIU, docs@python, eric.araujo, ezio.melotti, georg.brandl, orsenthil, pitrou, r.david.murray, stutzbach, terry.reedy
Priority: normal Keywords: patch

Created on 2010-09-01 03:44 by JingCheng.LIU, last changed 2010-10-18 11:47 by r.david.murray. This issue is now closed.

Messages (19)
msg115287 - (view) Author: JingCheng LIU (JingCheng.LIU) Date: 2010-09-01 03:44
http://docs.python.org/py3k/library/base64.html?highlight=base64
The examples given don't work.
msg115289 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2010-09-01 11:56
The example can be fixed by placing a "b" before the two string literals.

However, pretty much the whole document refers to "strings" and should refer to "byte sequences" or the "bytes" type.

I thought there were automated tests that exercised the documentation examples.  Am I wrong about that?
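
For readers following along, here is a minimal sketch of the bytes/str distinction being discussed. It assumes a current Python 3 interpreter; the variable names and the choice of UTF-8 are illustrative only and are not taken from the docs under review.

```python
import base64

text = "data to be encoded"  # a str, as the old doc example used

# Passing a str to b64encode raises TypeError on Python 3.
try:
    base64.b64encode(text)
    str_rejected = False
except TypeError:
    str_rejected = True

# Encoding the text to bytes first (UTF-8 is an assumption here) works.
encoded = base64.b64encode(text.encode("utf-8"))

# b64decode returns bytes, which must be decoded to get a str back.
decoded = base64.b64decode(encoded).decode("utf-8")

print(str_rejected, encoded, decoded)
```

The result of b64encode is itself a bytes object, which is why the doc text needs to talk about bytes on both sides of the call.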
msg115403 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-09-02 20:20
See issue 4769, which would partially fix this problem if implemented correctly.  The docs should still be given a thorough review to use the appropriate bytes/string language.

There is a way to do automated testing of the code in the docs, but nobody has done the work to automate it yet.  Sphinx only recently got 3.x support, before which it wasn't even possible.  To run it by hand, do 'make doctest' in the Doc subdirectory of the checkout...the first step will be to fix all the failures....
msg115506 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-09-03 20:55
PATCH
Specifically, in section 17.6. base64..., near bottom, example should be

>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
b'data to be encoded'

with the first and third 'b' prefixes added.

I confirmed that doctest works with above.

I am a bit puzzled by the comment about Sphinx and 3.x, as doctest just needs a plain ASCII file and does not care how it was produced. (I used the browser's Save As text file function.) However, that is moot now.
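
A rough sketch of how such a check can be reproduced without saving a file: the corrected example is held in a string and fed to the standard doctest machinery. The string name and the test name "base64_example" are made up for illustration; only the doctest calls themselves are real API.

```python
import doctest

# The corrected example from this message, kept as plain text.
EXAMPLE = """
>>> import base64
>>> encoded = base64.b64encode(b'data to be encoded')
>>> encoded
b'ZGF0YSB0byBiZSBlbmNvZGVk'
>>> data = base64.b64decode(encoded)
>>> data
b'data to be encoded'
"""

# Parse the text into a DocTest object with an empty namespace.
parser = doctest.DocTestParser()
test = parser.get_doctest(EXAMPLE, {}, "base64_example", None, 0)

# Run it; the result carries the failed and attempted counts.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

For a file saved from the browser, doctest.testfile() on the saved path should do the same job.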
msg115513 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-09-03 21:47
As an experiment, I ran doctest on 17.2 json saved as .txt. See #9767.
4 failures: 2 obvious doc faults, 2 unclear to me.
There were 2 similar doc faults in non-interactive code examples, so doctest is not enough to catch all bad code.

We clearly need to do this for the entire doc, preferably before 3.2 is released. A master issue is the wrong format, at least by itself. What I think is needed is a repository doc like Misc/maintainers.rst, call it Misc/doctests or Misc/docdoctests. It should have a line for each doc section with current status (blank for unchecked, n/a for no interactive example, issue number for fixes in progress). A master issue could then be a place where non-committers can report changes that committers could apply. What do you others think?
msg115520 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 22:29
It seems there are three ways of testing the docs:

1) ./python -m doctest Doc/library/json.rst
2) make doctest (a.k.a. sphinx-build -b doctest)
3) http://sphinx.pocoo.org/ext/doctest.html

Manually running 1) or 2) and fixing things seems okay for a first step, and when everything is fixed, we could add automation to prevent regressions. Georg is in the best place to say how to do it (through a thin test_docs.py integration layer between doctest and regrtest, with sphinx.ext.doctest, or by editing the test target of the makefile to run make doctest -C Doc).
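
One possible shape for the "thin integration layer" idea, sketched under assumptions: a helper that wraps a doc file in doctest.DocFileSuite and runs it through unittest. The helper name run_doc_file and the temporary sample file are hypothetical; a real test_docs.py would point at files under Doc/ and would need to cope with Sphinx-specific markup.

```python
import doctest
import os
import tempfile
import unittest

# A tiny stand-in for a documentation source file.
SAMPLE_RST = """\
Example
=======

>>> import base64
>>> base64.b64encode(b'data to be encoded')
b'ZGF0YSB0byBiZSBlbmNvZGVk'
"""


def run_doc_file(path):
    """Run the interactive examples in one doc file as a unittest suite."""
    suite = doctest.DocFileSuite(path, module_relative=False)
    result = unittest.TestResult()
    suite.run(result)
    return result


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.rst")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write(SAMPLE_RST)
    result = run_doc_file(sample_path)

print(result.testsRun, result.wasSuccessful())
```

Because the suite is an ordinary unittest suite, it could in principle be picked up by regrtest like any other test module.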
msg115521 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 22:30
Note also that some docs (turtle) require running Tcl, which may be unwanted on headless machines like buildbots.
msg115526 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-09-03 22:45
Generally: +1 on making sure examples in the docs are up to date.  If someone wants to do the tedious work of making sure that a "make doctest" succeeds, I'm all for it, it may involve adding a few (in HTML output invisible) testsetup blocks.

Eric: I'm not sure what the difference between your methods 2 and 3 is :)

As Terry already mentioned, far from all example code is covered by that, and I don't think there's anything we can do about it.  We can of course try converting more doc examples to doctest format, however:

a) for longer examples and especially function/class definitions this really hurts readability and usability (think copy/paste)

b) many examples are also not easily runnable (Tk is a good example, many more can be found in those for network services)
msg115529 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-09-03 22:51
> Generally: +1 on making sure examples in the docs are up to date.  If
> someone wants to do the tedious work of making sure that a "make
> doctest" succeeds, I'm all for it, it may involve adding a few (in
> HTML output invisible) testsetup blocks.

I'm not sure that's a good idea. It may add a lot of spurious imports
which only make the examples longer and less readable.
msg115530 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 22:54
2) works without changing anything, 3) requires using specific directives IIUC.
msg115532 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-09-03 22:57
>> Generally: +1 on making sure examples in the docs are up to date.  If
>> someone wants to do the tedious work of making sure that a "make
>> doctest" succeeds, I'm all for it, it may involve adding a few (in
>> HTML output invisible) testsetup blocks.
> 
> I'm not sure that's a good idea. It may add a lot of spurious imports
> which only make the examples longer and less readable.

That's why I said to use "testsetup" directives -- they are not visible in the HTML/PDF/... output, but used when running the tests.
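
To illustrate the idea of setup that is run but not shown, here is an analogy using plain doctest rather than Sphinx itself: the visible example omits the import, and the test harness supplies it through the globs namespace. This is only a sketch of the concept; the helper name run_example is made up, and Sphinx's testsetup directive is configured in the reST source, not through this API.

```python
import base64
import doctest

# What the reader would see: no import line.
VISIBLE = """
>>> base64.b64decode(b'ZGF0YQ==')
b'data'
"""


def run_example(globs):
    """Run the visible example with the given hidden namespace."""
    test = doctest.DocTestParser().get_doctest(
        VISIBLE, globs, "hidden_setup", None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report text; only the counts matter here.
    return runner.run(test, out=lambda text: None)


without_setup = run_example({})               # base64 undefined: fails
with_setup = run_example({"base64": base64})  # hidden setup: passes
print(without_setup.failed, with_setup.failed)
```

The example stays short for the reader while still being checkable.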
msg115533 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-09-03 22:57
> 2) works without changing anything, 3) requires using specific directives IIUC.

No.  The doctest extension is what "make doctest" calls.
msg115535 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-09-03 23:00
Thanks for clearing this misunderstanding of mine.
msg115777 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010-09-07 15:27
I hope the trivial 2-byte fix does not get lost in the general issue.
msg118938 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-10-17 11:37
Fixed in r85642.
msg118984 - (view) Author: Daniel Stutzbach (stutzbach) (Python committer) Date: 2010-10-17 21:08
That fixes the example code, but what about the numerous passages that read "strings" and should read "byte sequences", "bytes", or similar?
msg118990 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-17 23:13
I reviewed the doc and tightened up the wording (which was already mostly correct) in r85672.  Also fixed one typo and changed it to consistently use 'byte string' (rather than 'bytestring' which was used in one or two places).
msg119004 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-10-18 10:45
On Fri, Sep 03, 2010 at 10:57:17PM +0000, Georg Brandl wrote:
> That's why I said to use "testsetup" directives -- they are not
> visible in the HTML/PDF/... output, but used when running the tests.

Do you already have such a directive in Sphinx? I think it would be a
good idea to have the doctests succeed, and to have the examples in the docs
working 'directly out of text'.
msg119009 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-18 11:47
Yes, Georg mentioned the directive because it exists :)

See the turtle docs for some examples, I think.  I seem to remember using it when I made those doctests pass on 2.7 (warning: it writes weird stuff on your screen :)
History
Date User Action Args
2010-10-18 11:47:14 r.david.murray set messages: + msg119009
2010-10-18 10:45:19 orsenthil set messages: + msg119004
2010-10-17 23:13:42 r.david.murray set messages: + msg118990
2010-10-17 21:08:35 stutzbach set messages: + msg118984
2010-10-17 11:37:23 georg.brandl set status: open -> closed; resolution: accepted -> fixed; dependencies: - b64decode should accept strings or bytes; messages: + msg118938
2010-09-07 15:27:00 terry.reedy set messages: + msg115777
2010-09-03 23:00:34 eric.araujo set messages: + msg115535
2010-09-03 22:57:47 georg.brandl set messages: + msg115533
2010-09-03 22:57:15 georg.brandl set messages: + msg115532
2010-09-03 22:54:28 eric.araujo set messages: + msg115530
2010-09-03 22:51:24 pitrou set messages: + msg115529
2010-09-03 22:45:22 georg.brandl set messages: + msg115526
2010-09-03 22:30:54 eric.araujo set messages: + msg115521
2010-09-03 22:29:12 eric.araujo set nosy: + eric.araujo; messages: + msg115520
2010-09-03 21:47:02 terry.reedy set nosy: + georg.brandl; messages: + msg115513
2010-09-03 20:55:33 terry.reedy set keywords: + patch; nosy: + terry.reedy; messages: + msg115506
2010-09-02 20:20:50 r.david.murray set nosy: + r.david.murray; dependencies: + b64decode should accept strings or bytes; messages: + msg115403
2010-09-01 11:56:47 stutzbach set versions: + Python 3.2; nosy: + stutzbach; title: base64 encoding takes in bytes rather than string. -> base64 docs refers to strings instead of bytes; messages: + msg115289; resolution: accepted; stage: needs patch
2010-09-01 03:44:24 JingCheng.LIU create