documentation bug: HTMLParser needs to document unknown_decl #48124

Closed
freyley mannequin opened this issue Sep 15, 2008 · 7 comments
Labels
docs Documentation in the Doc dir type-bug An unexpected behavior, bug, or error

Comments

@freyley
Mannequin

freyley mannequin commented Sep 15, 2008

BPO 3874
Nosy @birkenfeld, @terryjreedy, @kimbongnam
PRs
  • bpo-39874: Creation heappush of maxheap version #18805
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-07-29.13:38:55.249>
    created_at = <Date 2008-09-15.21:52:28.957>
    labels = ['type-bug', 'docs']
    title = 'documentation bug: HTMLParser needs to document unknown_decl'
    updated_at = <Date 2020-03-06.10:23:14.752>
    user = 'https://bugs.python.org/freyley'

    bugs.python.org fields:

    activity = <Date 2020-03-06.10:23:14.752>
    actor = 'vbnmzx1'
    assignee = 'docs@python'
    closed = True
    closed_date = <Date 2010-07-29.13:38:55.249>
    closer = 'georg.brandl'
    components = ['Documentation']
    creation = <Date 2008-09-15.21:52:28.957>
    creator = 'freyley'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 3874
    keywords = ['patch']
    message_count = 7.0
    messages = ['73282', '107972', '108052', '108068', '111707', '111714', '111922']
    nosy_count = 5.0
    nosy_names = ['georg.brandl', 'terry.reedy', 'freyley', 'docs@python', 'vbnmzx1']
    pr_nums = ['18805']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue3874'
    versions = ['Python 2.6', 'Python 3.1', 'Python 2.7', 'Python 3.2']

    @freyley
    Mannequin Author

    freyley mannequin commented Sep 15, 2008

    The unknown_decl function is critical to dealing with MS Office
    generated HTML files, but it is not documented. The default
    behavior of the function is to raise an error, which is reasonable, but it should
    be stated in the documentation.

    @freyley freyley mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Sep 15, 2008
    @terryjreedy
    Member

    Documentation issues should be component: documentation rather than library. When submitting one, please at least indicate the module or class concerned. I have never heard of 'unknown_decl' function.

    Preferably, indicate the specific section you want modified, by version, number and name. Best is to submit a suggested text to be inserted. You may know better than most issue reviewers what should be said. Someone else will add markup and possibly edit.

    If you respond, this will be reopened. Please indicate whether this issue applies to 3.x and add 3.1 and 3.2 if it does.

    @terryjreedy terryjreedy added docs Documentation in the Doc dir and removed stdlib Python modules in the Lib dir labels Jun 17, 2010
    @freyley
    Mannequin Author

    freyley mannequin commented Jun 17, 2010

    On Wed, Jun 16, 2010 at 5:55 PM, Terry J. Reedy <report@bugs.python.org> wrote:

    > Terry J. Reedy <tjreedy@udel.edu> added the comment:
    >
    > Documentation issues should be component: documentation rather than library. When submitting one, please at least indicate the module or class concerned. I have never heard of 'unknown_decl' function.

    It's your bug tracker. This sort of statement that says that I should
    know exactly how you want bugs reported only serves to tell people
    like me not to even try. In addition, it's inaccurate in this case, as
    the title of the bug is that HTMLParser, which is a module in the
    standard library, needs a function documented.

    HTMLParser runs over HTML and calls internal functions when certain
    events occur. unknown_decl is called when an unknown declaration is
    found, and by default, it throws an exception. Thus, to correctly use
    HTMLParser, when subclassing it, you need to override unknown_decl if
    there are any unknown declarations in your HTML (or if you think there
    might be).
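To make the override described above concrete, here is a minimal sketch written against Python 3's `html.parser` (this report concerns the older Python 2 `HTMLParser` module). The class name `OfficeHTMLParser` and its `declarations` and `text` attributes are assumptions for illustration, not part of the library. In current Python 3 the base `unknown_decl` does nothing rather than raising, and exactly which constructs are routed to `unknown_decl` may differ between Python versions, so the sketch makes no claim about what ends up in `declarations` after `feed()`.

```python
from html.parser import HTMLParser


class OfficeHTMLParser(HTMLParser):
    """Hypothetical subclass that tolerates unknown declarations."""

    def __init__(self):
        super().__init__()
        self.declarations = []  # contents of declarations passed to unknown_decl
        self.text = []          # character data seen between tags

    def unknown_decl(self, data):
        # Record the declaration instead of treating it as an error.
        self.declarations.append(data)

    def handle_data(self, data):
        self.text.append(data)


# MS Office style conditional markup around a list number.
parser = OfficeHTMLParser()
parser.feed("<p><![if !supportLists]>1.<![endif]>Item</p>")
parser.close()
print("text:", "".join(parser.text))
print("declarations:", parser.declarations)
```

Collecting the declarations rather than discarding them keeps the choice open: a caller can ignore them, log them, or inspect them later.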

    > Preferably, indicate the specific section you want modified, by version, number and name. Best is to submit a suggested text to be inserted. You may know better than most issue reviewers what should be said. Someone else will add markup and possibly edit.

    It's been almost 2 years since I submitted this bug. I don't know if
    it applies to Python 3, and at this point I find it difficult to care.

    Thanks,

    Jeff

    @terryjreedy
    Member

    I understand that getting no response to a submission is not pleasant. I do not like it either. That is partly why I have started reviewing old issues. In the past couple of weeks, I have gotten two old orphaned patches applied by updating the headers, reading the patch, and adding a first-response approval message that got the attention of someone with code-commit privileges. I hope you agree that late is better than never.

    I just discovered the nosy-count box on the search page. 351 open issues with a nosy count of 1 (which means no response unless someone responded and then removed themself) is too many. We need more issue reviewers.

    As to your message: this is *our* tracker, not my tracker. My participation is as much voluntary as yours. I hope you do not really give up on improving Python and its documentation.

    I did not expect that you *should* have known submission details. That is why I tried to inform you. In particular, when an issue is marked as 'documentation', it is automatically assigned to 'docs@python', a pseudo-user standing in for people who handle doc revisions. Now they will see this issue, whereas they would not have before.

    Please excuse me for not remembering the title as I responded to the message. It is best if message text stands alone. Again, I hope you would agree that a somewhat ignorant response may be better than none.

    In order for the doc maintainers to add an entry, someone knowledgeable must write it. Your paragraph of explanation is a start, but more editing is needed.

    Looking at dir(html.parser.HTMLParser) and help(...), I see that there are several public internal methods. Some have doc strings that show up with help(), some do not. I think all should. Some are defined on HTMLParser and some inherited from the undocumented (I believe) _markupbase.ParserBase.

    I see that there are also several (completely undocumented except for dir()) private ('_xyz') internal methods. This implies to me that the public internal methods were made public rather than private because there might be reason to override them. If so, perhaps there should be a new subsection on public internal methods to explain what is what with them. What do you think? Document just one, some, or all?
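One way to run the survey described above is sketched here with `inspect`; the helper name `public_methods` is an assumption for illustration. It lists each public method reachable from `html.parser.HTMLParser`, the class in the MRO that defines it, and whether that definition carries its own docstring.

```python
import inspect
from html.parser import HTMLParser


def public_methods(cls):
    """Return (name, defining class name, has own docstring) per public method."""
    rows = []
    for name, func in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and dunder methods
        # First class in the MRO whose own namespace defines the name.
        owner = next((k.__name__ for k in cls.__mro__ if name in vars(k)), "?")
        rows.append((name, owner, bool(func.__doc__)))
    return rows


for name, owner, documented in public_methods(HTMLParser):
    print(f"{name:28} {owner:12} {'docstring' if documented else 'no docstring'}")
```

The helper checks `func.__doc__` directly rather than `inspect.getdoc()`, since the latter can fall back to a docstring inherited from a base class and would hide methods that lack one of their own.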

    @freyley
    Mannequin Author

    freyley mannequin commented Jul 27, 2010

    On Thu, Jun 17, 2010 at 3:30 PM, Terry J. Reedy <report@bugs.python.org> wrote:

    > In order for the doc maintainers to add an entry, someone knowledgeable must write it. Your paragraph of explanation is a start, but more editing is needed.
    >
    > Looking at dir(html.parser.HTMLParser) and help(...), I see that there are several public internal methods. Some have doc strings that show up with help(), some do not. I think all should. Some are defined on HTMLParser and some inherited from the undocumented (I believe) _markupbase.ParserBase.
    >
    > I see that there are also several (completely undocumented except for dir()) private ('_xyz') internal methods. This implies to me that the public internal methods were made public rather than private because there might be reason to override them. If so, perhaps there should be a new subsection on public internal methods to explain what is what with them. What do you think? Document just one, some, or all?

    Terry,

    I'm looking at the HTMLParser code, and I only see unknown_decl as a
    method in there that is: a) not marked as internal or doing a lot, b)
    not documented. There are a number of methods which should probably be
    refactored to be _methodname rather than methodname, but that's beyond
    the scope of this report.

    HTMLParser.unknown_decl(data)
    Method called when an unrecognized SGML declaration is read by the
    parser. The data parameter will be the entire contents of the
    declaration inside the <!...> markup. It is sometimes useful for a
    derived class to override it; the base class implementation raises an
    HTMLParseError.

    There may be other undocumented methods showing up, but if so they're
    part of a parent class.

    Thanks,

    Jeff

    @terryjreedy
    Member

    OK, your recommendation is to add one entry with the text suggested in the message. Given the name, the text seems reasonable. I will leave it to a doc person to format and apply.

    @birkenfeld
    Member

    Applied with some tweaks in r83223. Thanks Jeff and Terry!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022