This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Pydoc interactive browsing enhancement
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: belopolsky, eric.araujo, georg.brandl, ncoghlan, ping, r.liebscher, ron_adam, vstinner
Priority: normal Keywords: buildbot

Created on 2008-02-02 23:05 by ron_adam, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue2001_ncoghlan_cleanup.diff ncoghlan, 2010-11-07 14:18 Additional cleanup before compatibility issues were noted
issue2001_b.diff ron_adam, 2010-11-15 17:55 New enhanced browsing, deprecated gui(), serve(), and '-g' option.
pydoc.png belopolsky, 2010-11-18 05:44
issue2001_c.diff belopolsky, 2010-11-19 15:05
issue2001_d.diff ron_adam, 2010-11-19 17:59 Many small fix's and removal of urllib parse and tests.
issue2001_e.diff ron_adam, 2010-11-19 21:25 add with statements and reads _pydoc.css file.
issue2001_f.diff ron_adam, 2010-11-21 07:18 Has unittests now.
issue_2001_f+1.diff eric.araujo, 2010-11-28 01:42
issue_2001_g.diff eric.araujo, 2010-11-28 01:43
Messages (62)
msg62012 - (view) Author: Ron Adam (ron_adam) * Date: 2008-02-02 23:05
This patch removes the gui tk control panel and replaces it with a
navigation bar on the served web pages.

This offers a nicer user experience because one no longer needs to jump
back and forth between windows.

The navbar supports getting specific modules, searching modules, and
returning to the main module index. I believe the file source view is
safer also.

Possible issues...
 
+ Restarting the server without ending it causes an error do to the port
already being used.  I think this is not new.

+ There may be some brakage if other applications depend on the tk
interface, but I don't think any do.  Or at least none that I know of.

+ I haven't tested this on windows.  (It works well on Ubuntu 7.10)
msg62351 - (view) Author: Ron Adam (ron_adam) * Date: 2008-02-13 07:02
Added a topics and keywords index choices to the navbar.

This gives the web interface the same functionality as the cli interface.

Fixed the -p option which I had missed.
msg62497 - (view) Author: Ron Adam (ron_adam) * Date: 2008-02-17 17:19
Remade the diff with correct directory name so it patches correctly.

Is there a way to add the patch keyword?
msg72216 - (view) Author: Ron Adam (ron_adam) * Date: 2008-08-31 22:38
New patch to replace outdated patch due to other changes to pydoc.

The easies way to try this out is to:

>> import pydoc
>> pydoc.gui()

Try it before and after the patch.  I think most people will prefer the
patch.
msg100283 - (view) Author: Ron Adam (ron_adam) * Date: 2010-03-02 01:29
Updated the patch to work with 3.2 and removed earlier outdated patches.
msg100284 - (view) Author: Ron Adam (ron_adam) * Date: 2010-03-02 01:37
I think this is ready for a review and feedback.
msg100778 - (view) Author: Ron Adam (ron_adam) * Date: 2010-03-10 05:13
Missed a buffer write in the gettopic() method. Fixed.
Plus some minor doc string changes.

Can someone change the stage to "patch review".  I can't do that myself.

Or is there something else I need to do first?
msg111305 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-23 12:50
The patch seems clean to me.  Applied patch to unit test and ran it, tests failed, then applied patch to module, tests passed.  Also tried import pydoc;pydoc.gui() from the command line, the output looked fine to me.  Tested on Windows Vista with Firefox.  Could someone commit this please.
msg111370 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-23 18:09
Thank You for the review Mark. It's very much appreciated.

I took another look at it and decided to offer another patch that moves the html/text server to the http package where the rest of the server stuff is.

I also corrected the example in it. Everything still works the same as when mark tested it.

This makes the diff file easier to read and tell what's going on.
msg111371 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-23 18:31
Ron,

Can you add a Misc/NEWS entry summarizing your change?  Also, please check if any changes need to be made to ReST documentation, Doc/library/pydoc.rst .


Ka-Ping,

Do you want to hold on to this, you can I take it over?
msg111379 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-23 20:07
Here's the new patch with the Misc/NEWS and pydoc.rst additions added to it.

I'm not sure if local_text_server is the best name for the server module. In pydoc it's a local server, but it may not be limited to that use. I've also considered text_server, text_html_server, or simple_text_server.
msg111380 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-23 20:10
On Fri, Jul 23, 2010 at 4:08 PM, Ron Adam <report@bugs.python.org> wrote:
..
> Here's the new patch with the Misc/NEWS and pydoc.rst additions added to it.

s/Romoved/Removed/
msg111381 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-23 20:15
s/navagation/navigation/

Please spell-check your changes.
msg111382 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-23 20:17
Sorry, will do...
msg111384 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-23 20:29
+:program:`pydoc` :option:`-g` will start the server and additionally open a web
+browser to a module index page.  Each served page has a navagation bar at the
+top where you can 'get' help on a individual item, 'find' all modules with a
+keyword in thier synopsis line, and goto indexes for 'modules', 'topics' and
+'keywords'.

I am not sure I like the fact that the browser is started automatically.   Please bring this up on python-dev.  This may be an opportunity to rethink pydoc command line switches.  For example, -p and -g are currently exclusive, but it would make sense for -g to start server on the port specified by -p.
msg111385 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-23 20:35
Ron,

Your latest patch does not work for me:


$ ./python.exe -m pydoc -g
Traceback (most recent call last):
  File "/Users/sasha/Work/python-svn/py3k-commit/Lib/runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/Users/sasha/Work/python-svn/py3k-commit/Lib/runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "/Users/sasha/Work/python-svn/py3k-commit/Lib/pydoc.py", line 2315, in <module>
    if __name__ == '__main__': cli()
  File "/Users/sasha/Work/python-svn/py3k-commit/Lib/pydoc.py", line 2259, in cli
    gui()
  File "/Users/sasha/Work/python-svn/py3k-commit/Lib/pydoc.py", line 2222, in gui
    serverthread = http.text_server.startserver(url_handler, port)
AttributeError: 'module' object has no attribute 'text_server'
msg111387 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-23 21:10
Ok, spell, check and attribute error corrected.

I agree on the -p / -g issue.

I'll bring this up on python dev.


Thanks for the reviews and feedback.  It really helps.
msg111505 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-24 21:53
Hmm.  Still no luck


$ ./python.exe -m pydoc -g
Server ready at: http://localhost:7464/
Segmentation fault


The Segmentation fault happens after I enter pydoc in the search window and press the "Search!" button.

BTW, please remove !'s from the labels.
msg111506 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-07-24 22:00
OK, the crash is due to issue9319, but as far as I understand, the faulty code is in the error handling branch, so there must be a bug in your code as well.
msg111510 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-24 22:49
Ok, on the "!" marks.

The Segmentation fault exits python and isn't catchable as far as I know. It's happens in the compiled tokenize.c file. The python side error detection doesn't get a chance to catch it.

The problem is present without the patch applied and when using the console help so it wasn't anything I did.  I could use a hack to bypass that particular file, but I don't think that is the correct answer to this particular problem.  Better to get that issue resolved first.
msg112008 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-29 21:09
Link to the discussion on the python-dev new group.

Subject: [isssue 2001] Pydoc enhancement patch questions
http://permalink.gmane.org/gmane.comp.python.devel/115474
msg112090 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-30 17:36
New diff file.

Removed the '-g' option and added a '-b' option.

Using the '-g' option will now bring up pydoc options help.

Added a simple server command prompt with 'b' and 'q' choices
to open a browser and quit the server.

Allow the '-p' option to be use along with the  '-b' option.

Catch the error when the port is already in use and print a nice error instead of a traceback with an exception.

The port number now defaults to port 0 and uses a random unused port.

Changed the name of the server to better describe what it does to
http._text_server_thread.py.  Used a leading underscore on the name to
indicate it is a private module and give it time to mature a bit more
before making it public.

Rewrote the _text_server_thread.py example in its __doc__ string so
it passes a doctest.
msg112093 - (view) Author: Ron Adam (ron_adam) * Date: 2010-07-30 17:42
I also put in a temporary fix to skip the test file that was causing it to crash when doing a search.  It's marked as such and can be removed once the bug is fixed.
msg114326 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-19 06:10
I've closed #902061 as a duplicate of this, but please keep in mind msg75324 from that issue.
msg119713 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-10-27 13:50
Unassigning from ping given the lack of comments - I should be able to have a look at this in time for beta 1
msg120238 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-02 16:17
Nick, I can update the patch and move the server back into pydoc.py if that will help you get this into 3.2 beta.

I can also changed the docstrings of the new parts to # comments.
msg120262 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-02 22:02
On Wed, Nov 3, 2010 at 2:17 AM, Ron Adam <report@bugs.python.org> wrote:
>
> Ron Adam <ron_adam@users.sourceforge.net> added the comment:
>
> Nick, I can update the patch and move the server back into pydoc.py if that will help you get this into 3.2 beta.

Yep, probably the best option 3.2 - then make a new issue about making
those public in the standard library (since they aren't really
pydoc-specific).

> I can also changed the docstrings of the new parts to # comments.

Just mark the various names with an underscore so people know the
current names aren't final. There's no harm in leaving the docstrings
in place if you do that.
msg120310 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-03 06:26
Here you go Nick.  One file with Underscores for the new class's and functions.  Where there was some overlap in names, like where some of the older server class's were reused, but don't have exactly the same behavior, I started those with underscores also.

This should make it easier for you to review and/or make adjustments where it's needed.
msg120681 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-07 14:17
I'd actually started typing out the command to commit this before it finally clicked that the patch changes public APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality without going through a deprecation period first.

I've attached my version of the patch which includes some additional documentation cleanups and a tweak to test_pyclbr (since the pydoc changes broke that test through no fault of their own).

However, the public (albeit undocumented) nature of the APIs implementing the old Tk GUI means I'm not comfortable committing the patch in a form that simply drops them without going through a deprecation period first.

So, the way forward from here:
1. The good news is beta 1 has been pushed back to December 4 for other reasons, so there's still more time to work on this
2. The gui() function should still open the Tkinter GUI, and the -g option should be retained with its old functionality. Invoking this function should trigger DeprecationWarning.
3. A serve() function to start the web server component should be added back in
4. The new behaviour of opening the web client can be provided as a "browse()" function (that accepts the port number the server is listening on as an argument).
msg120692 - (view) Author: René Liebscher (r.liebscher) Date: 2010-11-07 16:06
What about http://bugs.python.org/issue2001#msg114326 ?
msg120723 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-08 09:41
> What about http://bugs.python.org/issue2001#msg114326 ?

Thanks for the reminder.


To Nick:

>However, the public (albeit undocumented) nature of the APIs >implementing the old Tk GUI means I'm not comfortable committing the >patch in a form that simply drops them without going through a >deprecation period first.

I think it would be ok sense this is a 'user' interface rather than a programming interface, but it won't hurt to ask specifically about this on python dev.  I got the impression that pydoc is considered a 'user' tool like idle, that just happens to live in lib, so the rules aren't considered to be quite as strict as it would be if it was a module meant to be used by other modules or programs.


>3. A serve() function to start the web server component should be >added back in

There is a _startserver() function.  The leading  underscore can be removed, or it can be renamed to serve.  A serve() function could also just call _startserver().


>2. The gui() function should still open the Tkinter GUI,
>and the -g option should be retained with its old functionality. >Invoking this function should trigger DeprecationWarning.
>
>4. The new behaviour of opening the web client can be provided
>as a "browse()" function (that accepts the port number the
>server is listening on as an argument).


After the patch the gui() function starts the server with a server text command window, and opens the browser to the correct page.  If you close the browser, you can use 'b' at the server prompt to reopen the browser at the address it is serving.

It sounds like you want "serve()" to just start the server. and "browse()" to just start the browser.  The gui() function does both in one step.

What would be the equivalent new way of doing both in one step, if gui() retains the old tk behavior?

Cheers, Ron
msg120731 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-08 11:40
Yeah, I'll ask for feedback on python-dev regarding the API breakage.

If we decide not to break the existing API, I'd suggest the following:

- keep both the old serve() and the old gui() (with DeprecationWarning added to both) (I'd forgotten your point in the previous python-dev thread that gui() would need changes to work with the new server, which isn't worth the hassle)

- use _startserver(port=0) and _browse(port, *, start=False) for the new components (with the -b option invoking _browse with start=True)
msg120840 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-09 04:40
As per python-dev discussion, we'll keep the old server and GUI implementations intact (but deprecated). The -g command line option will start the old implementation, -p and -b will start the new one.

The APIs to activate the new implementation will start with a leading underscore and can be whatever makes the most sense for what you're doing (the example signatures in my last message are one possibility, but don't take them as mandatory).
msg121180 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-14 07:26
Ok, here is the latest patch for review.  "issue2001_a.diff'

I restored the pydoc.py file and then put most of the new code in these two functions,

  _startserver(urlhandler, port)
  _browse(port=0, *, open_browser=True)

This creates a bettor organized file, and reduces the number of names with leading underscores.  As far as I know you can't import things that are located inside a function.

I still need to depreciate the '-g' option and the gui() function along with the old server parts.

Is there a guide on how to depreciate things?
msg121181 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-14 07:40
Just call warnings.warn with an appropriate message, a category of DeprecationWarning and a stacklevel of 2 (so the warning will refer to the function's caller rather than to the pydoc code).

It's basically the example from the warnings.warn docs.
msg121237 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-15 17:55
This should be done or very close to done.

The -g option, gui(), and serve() functions are deprecated.

The new features are browse(port, *, open_browser=True), and a '-b' option.  The '-p port' option does browse(port=port, open_browser=False), so _startserver() does not need to be part of the API.  If anyone wants to access the server directly, then we can discuss making it it's own module, or a submodule in a package.

Because we deprecated the gui() function, I figured we need to make browse() public.  The only reason it would need to be private is if we want a different name or signature.  (Any thoughts?)

I left Lib/urllib/parse.py, Lib/test/test_pyclbr.py, and Lib/test/test_urlparse.py alone.  I presumed you fixed bugs in them that needed to be fixed anyway.
msg121395 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-18 00:35
I just noticed I used "depreciated" in place of "deprecated" in one of the doc strings.  I can upload a new patch with that fixed.

Before I do that, is there any thing else I can do?

Do you agree that the browse function should be public?  
If not, what do we tell people to use instead of gui(), sense that will be deprecated?
msg121418 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-18 02:15
Ron,

I added a header to the text documentation clarifying that pydoc-generated documentation is not authoritative. See issue 10446.  I did add it to the HTML page because it was not obvious where to put it and I knew that you are changing layout anyways.  Please consider including the same text somewhere in HTML pages.
msg121435 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-18 04:04
Review time!  Please use rietveld for big patches in the future.

I had started with a list of remarks in same order than the code and minor remarks grouped at the end, but I see now that all my remarks are minor, since I have found no real code problem (I don’t know pydoc as well as you :)  If my message is hard to follow, I’ll redo it on rietveld or directly update your patch (I’m nitpicky, so I can volunteer to share the work).  Thanks for your continued effort and high-quality result!

* Misc/NEWS: Your entry has to be proper reST.

* pydoc.rst:
+documentation pages.  (The :option:`-g` option is depreciated.)
:option: marks an option for python itself.  Use ```` instead.  See #9312 for more info.  (Also fix the typo, which appears twice in the diff.)

+:program:`pydoc` ``-b``
:program:`pydoc -b` works.

“Each served page has a navigation bar at the top where you can 'get' help on an individual item, 'search' all modules with a keyword in their synopsis line, and goto indexes for 'modules', 'topics' and 'keywords'.”
What do the apostrophes mean?

* urllib/parse.py, test_urlparse.py: Unrelated changes?

* pydoc.py
+import sys, imp, os, re, inspect, builtins, pkgutil, time, warnings
Why not take the opportunity to put each import on its own line, for source and diff readability?

+            return 'no documentation found for %s' % repr(topic), ''
You can simplify that to “'... %r' % topic”.  (I’m assuming topic is never a tuple.)

+        if type(target) is type(''):
I see you’re following the style of the rest of the file.  Modernizing that to isinstance is probably out of scope for this patch.  (Have I said that the HTML is horrible too?)

+    msg = """The pydoc.serve() function is deprecated."""
Such messages usually don’t start with a capital nor end with a period.

+    msg = """The pydoc.gui() function and "pydoc -g" option are depreciated,
+    use "pydoc.browse() function and "pydoc -b" option instead."""
Won’t this have strange leading indentation on the second line?  I’d recommend printing two lines that wrap under 80 characters:

    msg = ('the pydoc.gui() function and "pydoc -g" option are deprecated,'
           'use pydoc.browse() and "pydoc -b" instead')


+            #...    webbrowser.open(serverthread.url)
+            #True
I don’t like commented-out example code.  I prefer it to work or not be there :)

“it's” is not a possessive, you want “its”.

+    import http.server
Aren’t function-level imports frowned upon?

+            if self.path.endswith('.css'):
+                content_type = 'text/css'
+            else:
+                content_type = 'text/html'
It’s good practice to add the “charset” parameter to specify the character encoding.

+        if url[-5:] == '.html': url = url[:-5]
Please always put two statements on two lines.

+            except IOError as value:
You don’t need to get the instance if you don’t use it in the following block (IOW, remove “ as value”).

+            fp.close()
Please use the with statement.

+        server_help_msg = """Python Version: %s
“version” need not be capitalized IMO.

-""" % (cmd, os.sep, cmd, cmd, cmd, cmd, os.sep))
+""" % (cmd, os.sep, cmd, cmd, cmd, cmd, cmd, os.sep))
Feel free to replace all those %s by %(cmd)s or {cmd} for readability.


* Assorted nitpicks, which are not blocking:
- “command line” needs an hyphen when used as an adjective (in whatsnew/3.2.rst)
- a→an in “a HTTP server” (pydoc.py)
- Use two spaces: “local machine. Port” (twice); “page. Use”.
- “To determine what the client is asking for check the URL”: I’d add a comma after “for”.
- You can remove the parentheses around “The "-g" option is deprecated” and put it on the preceding line.
- Feel free to sprinkle a few more blanklines here and there, especially before class and function definition.

> Do you agree that the browse function should be public?
I’ll leave that to Nick.
msg121442 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-18 05:44
I am attaching pydoc.png screenshot that shows how the new navigation bar is rendered in my browser.  It looks a little bit busy and I don't like (get)/(search) buttons jumping below the text boxes when the browser window is is not wide enough.
msg121443 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-18 05:56
Shouldn't tests for new features added to Lib/test/test_pydoc.py?
msg121445 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-18 06:54
Thanks for the review Éric!  The more eyes on this the better it will be.

I'm not familiar with rietveld yet.  But no time like the present to get started.  Here's the link.

   http://codereview.appspot.com/3151042/


I didn't play around with the html too much. Mostly I just wanted to get it to work for now.  I plan on adding support for style sheets, so a lot of the html code will be updated at that time.

Alex, the version info is just what sys.version returns, that was just easy to do, we can probably trim that down a bit.

The "Index of Modules" can be shortened to "Index" or "Modules".

I can have the get, and search box's always be two lines.  Something like...

Python 3.2a4/rev#     Get    [________]    Index
Platform              Search [________]    Topics : Keywords
msg121447 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-18 07:37
Sense these features reuse other parts of pydoc, they are are covered to some degree by the existing tests.  

An easy test would be to just start the server and then shut it down after a short timeout.  Better than nothing.

I'll try reading and writing directly to the socket and working up some tests from that.  I don't suppose there's something like that already in the test suite I can copy?
msg121516 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-19 14:21
On Thu, Nov 18, 2010 at 2:37 AM, Ron Adam <report@bugs.python.org> wrote:
..
> I'll try reading and writing directly to the socket and working up some tests from that.
> I don't suppose there's something like that already in the test suite I can copy?

I believe you can find relevant code in test/test_httpservers.py.
What I had in mind was simpler: test_pydoc already checks html output
for a module, but this test did not fail after I applied your patch.
There should be something in the tests that checks that the new
navigation bar is generated correctly.
msg121518 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-19 14:25
issue2001_b.diff patch includes changes to urllib.  Is this intentional?  Is it a bug fix, a feature?  There is no mention in the NEWS file.  If these changes are needed for pydoc enhancements, I would like to separate them in its own issue and commit separately.
msg121529 - (view) Author: Alexander Belopolsky (belopolsky) * (Python committer) Date: 2010-11-19 15:05
issue2001_c.diff is the same as issue2001_b.diff, but without urlparse changes and with minor modifications to pydoc.rst resolving a conflict with a recent commit.  I have also uploaded the same patch to rietveld:

http://codereview.appspot.com/3187042

Does anyone know how to properly subscribe report@bugs.python.org to the Rietveld issue so that we see reviews here?
msg121541 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-19 15:40
Gah, I accidentally generated a diff that included some unrelated changes to urrlib (and its tests) for a different issue I had been working on, and Ron's subsequent patch picked them up. I then misinterpreted "left them alone" to mean "didn't include them in the regenerated patch". Sorry about that - taking those changes out was the right thing to do.

The test_pyclbr change was necessary at the time (the new code isn't particularly class browser friendly, so the tests referencing pydoc started failing), but it shouldn't be needed now that the server and client implementation details are once again hidden inside function scopes.
msg121557 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-19 17:59
Here is the patch in the current state which includes the changes in issue2001_c.diff as well as most of the changes Éric suggested.

Still to do:
  * Use the with statement in several places to ensure closing.
  * Add tests for the server.

I did try to make the header a bit less cluttered, but I'm not completely happy with it yet, but I think it will be good enough for now.  Fixing the HTML probably should be a separate issue where we can add a style sheet at the same time.  Lets get this patch in first.

This is also reitveld:  
    
    http://codereview.appspot.com/3151042/
msg121570 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-19 21:25
I added an empty _pydoc.css file.  The server does read it and you'll be able to play around with it, but don't expect it to be pretty if you do until the rest of the html is updated.

Should I put that in the pydoc_data?

It just needs tests now, which I'll be working on as time permits over the next few days.  (And any other minor details we find.)
msg121873 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-21 07:18
Here is the latest patch with tests.

In order to test the html pages I separated out the URL handler.  So now we have three new functions.

    pydoc._start_server(urlhandler, port)
    pydoc._url_handler(url, content_type="text/html")
    pydoc.browse(port=0, *, open_browser=True)

A css file:   pydoc_data/_pydoc.css

The tests in test_pydoc.py, test that the server can be started and shutdown without an error.  And test that the _url_handler() will send back html pages with the correct title for all the different type of requests that it can handle.

I think this is ready for a final review now.
msg122238 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-23 19:35
I am reviewing this and making some edits to the patch.  Will post this week.
msg122597 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-28 01:42
First, thanks for your work on this.  I have some feature requests for pydoc too and I’m hoping to work with you in the future :)

I found it more efficient (and fair) to make a new diff instead of listing all nitpicks and have you do the changes.  issue_2001_g.diff applies cleanly in py3k.  I fixed style issues, removed some docstrings in private classes that were in my eyes unhelpful (the purpose of the class being clear from its name of from a bit of knowledge about http.server), and added the possibility to quit the server with EOF or Ctrl-C.

There were a number of trailing spaces in your files.  Suggestions: better editor settings, use of hg diff with the color extension, make patchcheck (using Python’s makefile).

I have removed the overly long entry from NEWS, I suggest it is used as commit message instead.  Here it is, corrected:

“Improve Pydoc interactive browsing (#2001).  Patch by Ron Adam.

* A ``-b`` option to start an enhanced browsing session.
* Allow ``-b`` and ``-p`` options to be used together.
* Specifying port 0 will pick an arbitrary unused socket port.
* A new ``browse()`` function to start the new server and browser.
* Show Python version information in the header.
* A *Get* field which takes the same input as the ``help()`` function.
* A *Search* field which replaces the Tkinter search box.
* Lnks to *Module Index*, *Topics*, and *Keywords*.
* Improved source file viewing.
* An ``HTMLDoc.filelink()`` method.
* The ``-g`` option and the ``gui()`` and ``serve()`` functions are
deprecated.”

issue_2001_f+1.diff shows only the changes I made on top of yours (+ some changes from the py3k branch).  Please ask any question you may have about them.


Now, those style issues were only the easy parts.  Remaining issues (I mean they’re issues to me, you or Nick may disagree):

1) _start_server has a bit of documentation in its docstring, but it’s a private function.  Also, two lines are commented so that “make doctest” does not run them, but that is non-obvious and breaks the flow of the text.  I recommend doing a combination of those things, at your call: Move those docs to a public function docstring and/or reST doc; move those docs to a comment; don’t use the doctest as test but add a unit test; remove most of the docstring.

2) Functions in methods in classes in functions don’t look great to me.  I understand pydoc tries not to import things before it knows it needs them, and it’s also a way of keeping things private, but I think flat is better (more readable) than nested.

3) bltinlink is defined six times.  Global (private) function please? :)

4) The HTML has two html elements, two doctypes, an invalid clear attribute on a div element.

5) Last but not least, the deprecations.  A few developers had a chat with Raymond Hettinger on IRC yesterday about the cost of deprecations and removals.  His defended that their harm (forcing people to change their code) was bigger than their usefulness (catering to esthetic tastes of core developers).  How about we arrange the docs and docstrings in order to make the new ways prominent and deprecate the old ones, without any actual code deprecation?
msg122603 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-28 02:20
Regarding question 1 (_start_server docstring):

I have no problem with docstrings on private functions in general. It just means they're for the benefit of developers of the module itself rather than users of the module. However, in this case, I would move the "example use" out of the docstring and into a comment following the docstring.

Regarding questions 2 & 5 (code structure and deprecations):

This whole patch would have been a *lot* easier if the server and GUI implementations in pydoc had been properly private from the start. A look at the docs makes it fairly obvious that these are meant to be implementation details of the command line invocation, but the way they were written meant they were exposed as a public standard library API. As a result, we have to jump through a lot of hoops to replace them with the new back end.

Since we don't want to maintain the old GUI client or the associated web server backend, the deprecations are needed so we can delete them entirely in 3.3. The situations Raymond is talking about are cases where we come up with a "better" API, but an old API is popular and not really all that flawed within its original scope (e.g. getopt vs optparse vs argparse). In such cases, deprecation gains us little and causes a lot of hassle. However, in a case like this, where people shouldn't have been using the old API anyway and there's a mountain of code we want to get rid of, deprecating the associated API is the right thing to do.

The nested code structure is a reaction to the concern that caused all the additional difficulty - by squirreling all the implementation details away inside a small number of functions that are named with leading underscores, we eliminate the temptation to rely on the current location of these classes. For 3.3, I'd like to pursue Ron's idea of pulling the text server out and placing it in http.server for general use, with pydoc then retrieving it from there.

Regarding questions 3 & 4 (repetition of bltinlink and HTML dodginess):

Those two sound like issues that should be fixed.
msg122637 - (view) Author: Ron Adam (ron_adam) * Date: 2010-11-28 06:11
Thanks for the review and style edits Éric.  I think it's a much better patch with the changes and suggestions from you, Nick, and  Alexander.

I'll check my white space settings.  Thanks for noticing it.

As Nick points out, parts of the patch was written with the idea of having the text server as a separate module.  So I made the example in it runnable as a doctest, as if it was going to be a public module.  As Nick also suggests, the server example could be changed to a comment, and it could be condensed quite a bit by removing the command line doctest formatting/tutorial style and replacing it with a nice single example as it would be seen in a file.

Originally I was  hopping to get a complete rewrite into python 3.0. Then it was suggested I try for 2.6.  While doing that, I went way overboard with making it have plug in html formatters and converters. (Some people suggested they wanted that). In the end, I found that these parts in this patch, (and some additional stuff), can be used without the complete rewrite.  And these parts really help make pydoc much more usable.  Not the document generating parts.  Maybe we can move some of those parts to a pydoc_lib.py file at some point, and have an API for creating document generators rather than try to have pydoc do everything it self.

As you noticed, I used an html style that is similar to what was in the other parts of pydoc.  And yes, it's not pretty.  As Nick points out, "those things should be fixed", and I agree.  But I feel it should be a separate patch.  That will make it easier to review and keep the html code changes separate from any functional changes.  Hopefully, we can update the html so it makes good use of the style sheet at that time as well.  ie... a lot of the html code will probably be changed in the near future, so don't be too nit-picky about it right now.

Other things that I hope to add later, are to have more references in the topics and keywords be links.  That one is fairly easy. The function to do that is already in pydoc.  Add line numbers and color output to the file view page.  And eventually look into detecting and using reST to enhance both the html and text output in docstrings, topics, and keywords.

As for the imports that arn't at the top of the module, I followed the usage that was in pydoc.  It seemed to me that there was probably a conscious reason for why it done that way.

I can't find anything I disagree with. 

   Ron
msg123215 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-03 09:33
Committed in r86962.

Thanks to all involved in getting this from initial submission to final checkin :)

Any further changes (including addressing Éric's last few comments) can be undertaken as separate issues.
msg123216 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-03 09:34
Oh, I also fixed in issue in both the old and new server code that switched warnings off globally rather than merely for the operation it was trying to disable warnings for.
msg123227 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2010-12-03 11:12
The html_getfile() function is Unix-specific: it constructs paths like

path = os.sep + path.replace('%20', ' ')

Consequently, its test fails on Windows:

http://www.python.org/dev/buildbot/all/builders/x86%20XP-4%203.x/builds/3703/steps/test/logs/stdio
msg123231 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-12-03 12:30
Nick, you seem to have forgotten to svn add the CSS file.
msg123251 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-03 14:36
Added the missing CSS file in r86971. Hopefully that will make the buildbots a little happier.
msg123254 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-03 16:13
r86975 should fix the problem with Windows paths.

I also found an issue where many of the emitted HTML pages contained two HTML header sections. The tests were just written in a way that they only looked at the second of these header sections, while my browser only looked at the first.

The same commit adjusts the tests to look at the first emitted header section and changes the flow in the URL handler to only ever invoke html.page() once per request.
msg123255 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-12-03 16:14
Oh, that would be Éric's point 4 that I fixed. OK, no need to create a new issue for that one then :)
msg389453 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-03-24 13:47
The "getfile" feature has a directory traversal vulnerability and so I propose to remove the feature: see bpo-42988.
History
Date User Action Args
2022-04-11 14:56:30adminsetgithub: 46285
2021-03-24 13:47:05vstinnersetnosy: + vstinner
messages: + msg389453
2010-12-03 16:14:30ncoghlansetmessages: + msg123255
2010-12-03 16:13:11ncoghlansetstatus: open -> closed

messages: + msg123254
2010-12-03 14:36:10ncoghlansetmessages: + msg123251
2010-12-03 12:30:05eric.araujosetmessages: + msg123231
2010-12-03 11:27:31giampaolo.rodolasetnosy: - giampaolo.rodola
2010-12-03 11:12:23georg.brandlsetstatus: closed -> open
keywords: + buildbot, - patch
messages: + msg123227
2010-12-03 09:34:44ncoghlansetmessages: + msg123216
2010-12-03 09:33:27ncoghlansetstatus: open -> closed
resolution: accepted
messages: + msg123215

stage: patch review -> resolved
2010-11-28 06:11:35ron_adamsetmessages: + msg122637
2010-11-28 02:20:28ncoghlansetmessages: + msg122603
2010-11-28 01:43:23eric.araujosetfiles: + issue_2001_g.diff
2010-11-28 01:42:57eric.araujosetfiles: + issue_2001_f+1.diff

messages: + msg122597
2010-11-23 19:35:38eric.araujosetnosy: + giampaolo.rodola
messages: + msg122238
2010-11-21 07:18:21ron_adamsetfiles: + issue2001_f.diff

messages: + msg121873
2010-11-19 21:25:35ron_adamsetfiles: + issue2001_e.diff

messages: + msg121570
2010-11-19 17:59:49ron_adamsetfiles: + issue2001_d.diff

messages: + msg121557
2010-11-19 15:40:11ncoghlansetmessages: + msg121541
2010-11-19 15:05:56belopolskysetfiles: + issue2001_c.diff

messages: + msg121529
2010-11-19 14:31:36giampaolo.rodolasetnosy: - giampaolo.rodola
2010-11-19 14:25:25belopolskysetmessages: + msg121518
2010-11-19 14:21:47belopolskysetmessages: + msg121516
2010-11-18 07:37:26ron_adamsetmessages: + msg121447
2010-11-18 06:54:14ron_adamsetmessages: + msg121445
2010-11-18 05:56:10belopolskysetmessages: + msg121443
2010-11-18 05:45:32sridsetnosy: - srid
2010-11-18 05:44:36belopolskysetfiles: + pydoc.png

messages: + msg121442
2010-11-18 04:04:29eric.araujosetnosy: - BreamoreBoy
messages: + msg121435
2010-11-18 02:15:22belopolskysetmessages: + msg121418
2010-11-18 00:35:48ron_adamsetmessages: + msg121395
2010-11-15 17:55:47ron_adamsetfiles: - pydoc_r86133.diff
2010-11-15 17:55:28ron_adamsetfiles: - issue2001_a.diff
2010-11-15 17:55:01ron_adamsetfiles: + issue2001_b.diff

messages: + msg121237
2010-11-14 07:40:10ncoghlansetmessages: + msg121181
2010-11-14 07:26:39ron_adamsetfiles: + issue2001_a.diff

messages: + msg121180
2010-11-09 04:40:45ncoghlansetmessages: + msg120840
2010-11-08 11:40:06ncoghlansetmessages: + msg120731
2010-11-08 09:41:08ron_adamsetmessages: + msg120723
2010-11-07 16:07:00r.liebschersetmessages: + msg120692
2010-11-07 14:18:35ncoghlansetfiles: + issue2001_ncoghlan_cleanup.diff
2010-11-07 14:17:25ncoghlansetmessages: + msg120681
2010-11-03 06:26:46ron_adamsetfiles: + pydoc_r86133.diff

messages: + msg120310
2010-11-03 06:20:20ron_adamsetfiles: - pydoc_server4.diff
2010-11-03 06:20:15ron_adamsetfiles: - pydoc_server3.diff
2010-11-03 06:20:10ron_adamsetfiles: - pydoc_gui.diff
2010-11-02 22:02:49ncoghlansetmessages: + msg120262
2010-11-02 16:17:56ron_adamsetmessages: + msg120238
2010-10-27 13:50:14ncoghlansetassignee: ping -> ncoghlan
messages: + msg119713
2010-08-19 07:15:53r.liebschersetnosy: + r.liebscher
2010-08-19 06:10:11BreamoreBoysetmessages: + msg114326
2010-07-30 17:42:30ron_adamsetmessages: + msg112093
2010-07-30 17:36:37ron_adamsetfiles: + pydoc_server4.diff

messages: + msg112090
2010-07-29 21:09:31ron_adamsetmessages: + msg112008
2010-07-29 20:58:03ron_adamsetfiles: - pydoc_server2.diff
2010-07-29 20:57:56ron_adamsetfiles: - pydoc_server.diff
2010-07-24 22:49:44ron_adamsetmessages: + msg111510
2010-07-24 22:00:03belopolskysetmessages: + msg111506
2010-07-24 21:53:58belopolskysetmessages: + msg111505
2010-07-23 21:10:48ron_adamsetfiles: + pydoc_server3.diff

messages: + msg111387
2010-07-23 20:35:03belopolskysetresolution: accepted -> (no value)
messages: + msg111385
stage: commit review -> patch review
2010-07-23 20:29:54belopolskysetmessages: + msg111384
2010-07-23 20:17:35ron_adamsetmessages: + msg111382
2010-07-23 20:15:11belopolskysetmessages: + msg111381
2010-07-23 20:10:48belopolskysetmessages: + msg111380
2010-07-23 20:08:00ron_adamsetfiles: + pydoc_server2.diff

messages: + msg111379
2010-07-23 18:31:03belopolskysetresolution: accepted

messages: + msg111371
nosy: + belopolsky
2010-07-23 18:09:27ron_adamsetfiles: + pydoc_server.diff

messages: + msg111370
2010-07-23 12:50:47BreamoreBoysetnosy: + BreamoreBoy

messages: + msg111305
stage: patch review -> commit review
2010-03-12 04:09:06benjamin.petersonsetstage: patch review
2010-03-11 21:52:12sridsetnosy: + srid
2010-03-10 05:13:50ron_adamsetfiles: + pydoc_gui.diff

messages: + msg100778
2010-03-10 05:05:15ron_adamsetfiles: - pydoc_gui.diff
2010-03-02 12:12:41ncoghlansetnosy: + ncoghlan
2010-03-02 01:37:57ron_adamsetmessages: + msg100284
2010-03-02 01:36:59ron_adamsetnosy: + georg.brandl
2010-03-02 01:29:51ron_adamsetfiles: + pydoc_gui.diff

messages: + msg100283
versions: + Python 3.2, - Python 2.6
2010-03-02 01:27:14ron_adamsetfiles: - pydoc_gui_.diff
2010-02-16 05:40:06eric.araujosetnosy: + eric.araujo
2008-08-31 22:39:07ron_adamsetfiles: - pydocnotk.diff
2008-08-31 22:39:01ron_adamsetfiles: - pydocnotk.diff
2008-08-31 22:38:55ron_adamsetfiles: - pydocnotk.diff
2008-08-31 22:38:47ron_adamsetfiles: + pydoc_gui_.diff
messages: + msg72216
2008-03-20 03:29:59jafosetpriority: normal
assignee: ping
nosy: + ping
2008-02-26 19:06:13akuchlingsetkeywords: + patch
2008-02-17 17:19:58ron_adamsetfiles: + pydocnotk.diff
messages: + msg62497
2008-02-13 07:02:25ron_adamsetfiles: + pydocnotk.diff
messages: + msg62351
2008-02-04 16:06:01ron_adamsetversions: + Python 2.6
2008-02-03 20:25:40giampaolo.rodolasetnosy: + giampaolo.rodola
2008-02-02 23:05:52ron_adamcreate