classification
Title: table header in output of difflib.HtmlDiff.make_table is not escaped and can be rendered as code in the browser
Type: security Stage: resolved
Components: Documentation Versions: Python 3.8, Python 3.7, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, larry, mdk, miss-islington, ned.deily, serhiy.storchaka, xtreak
Priority: normal Keywords: patch, patch, patch

Created on 2018-12-28 09:18 by xtreak, last changed 2019-09-22 06:30 by xtreak. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 11341 merged xtreak, 2018-12-28 09:44
PR 11341 merged xtreak, 2018-12-28 09:44
PR 11341 merged xtreak, 2018-12-28 09:44
PR 11353 closed miss-islington, 2018-12-29 08:53
PR 11353 closed miss-islington, 2018-12-29 08:53
PR 11354 closed miss-islington, 2018-12-29 08:53
PR 11354 closed miss-islington, 2018-12-29 08:53
PR 11356 merged serhiy.storchaka, 2018-12-29 15:42
PR 11356 merged serhiy.storchaka, 2018-12-29 15:42
PR 11356 merged serhiy.storchaka, 2018-12-29 15:42
PR 11439 merged xtreak, 2019-01-05 19:51
PR 11439 merged xtreak, 2019-01-05 19:51
PR 11439 merged xtreak, 2019-01-05 19:52
PR 15922 merged miss-islington, 2019-09-11 11:21
Messages (12)
msg332648 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2018-12-28 09:18
HtmlDiff.make_table takes fromdesc and todesc that are not escaped causing problems while rendering html when they contain tags like fromdesc="<from>", todesc="<to>". There is no validation for them to be filenames so they could be arbitrary strings. Since contents of the table are escaped I think it's good to escape headers too since they might lead to the browser to execute the headers as code and potential XSS. I don't think it's worthy of adding security type so I am adding behavior. Feel free to change the type if needed.

I could see no test failures on applying my patch and I will push a PR with a test.

Current output : (<from> and <to> are not escaped in the output)

$ python3 -c 'import difflib; print(difflib.HtmlDiff().make_table(["<a> hello </a>"], ["<b> hello </b>"], fromdesc="<from>", todesc="<to>"))'

    <table class="diff" id="difflib_chg_to0__top"
           cellspacing="0" cellpadding="0" rules="groups" >
        <colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
        <colgroup></colgroup> <colgroup></colgroup> <colgroup></colgroup>
        <thead><tr><th class="diff_next"><br /></th><th colspan="2" class="diff_header"><from></th><th class="diff_next"><br /></th><th colspan="2" class="diff_header"><to></th></tr></thead>
        <tbody>
            <tr><td class="diff_next" id="difflib_chg_to0__0"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="from0_1">1</td><td nowrap="nowrap">&lt;<span class="diff_chg">a</span>&gt;&nbsp;hello&nbsp;&lt;/<span class="diff_chg">a</span>&gt;</td><td class="diff_next"><a href="#difflib_chg_to0__top">t</a></td><td class="diff_header" id="to0_1">1</td><td nowrap="nowrap">&lt;<span class="diff_chg">b</span>&gt;&nbsp;hello&nbsp;&lt;/<span class="diff_chg">b</span>&gt;</td></tr>
        </tbody>
    </table>
msg332707 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-29 08:53
New changeset 78de01198b047347abc5e458851bb12c48429e24 by Serhiy Storchaka (Xtreak) in branch 'master':
bpo-35603: Escape table header of make_table output that can cause potential XSS. (GH-11341)
https://github.com/python/cpython/commit/78de01198b047347abc5e458851bb12c48429e24
msg332708 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-29 08:56
Is 2.7 affected?

This looks like potentially security issue, so it may be worth to backport the fix to 3.6 and 3.5.
msg332709 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2018-12-29 09:11
Yes, 2.7 is also affected at https://github.com/python/cpython/blob/befe3f7afdc5279b320af88a9e57f682c0172599/Lib/difflib.py#L2001 . I think it was missed during the initial addition in issue914575 and no one would be using it as a feature to have custom HTML directly injected in table header using fromdesc and todesc. So I hope this won't be breaking anyone on 2.7 and even with that headers can still be customized with CSS. Also as an additional note HtmlDiff.make_file internally uses make_table to construct full HTML file and that is also fixed. Maybe if one of the original authors of difflib would like to confirm it then it would be great.
msg332722 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-12-29 15:41
After some thoughts, I'm no longer sure that this is a correct change. The user can use this for highlighting the part of the column header, adding links or images, displaying complex multiline headers. I suspect that this change will break some existing code that uses difflib.HtmlDiff.make_table() properly, passing escaped strings.

I think now that this change should be reverted, and we should document instead that "fromdesc" and "todesc" are interpreted as an HTML text, and user data should be properly escaped for using in headers.
msg332724 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2018-12-29 16:09
Thanks Serhiy for the input. I initially thought this should be escaped since content was escaped and the same for header since user input taken directly could result in XSS. Maybe someone might using this undocumented feature intentionally that might not be worth breaking.

I will make a PR for this to be noted in docs that the parameters are interpreted as HTML.
msg332872 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-01-02 12:49
New changeset 830ddc74c495ac1a5c03172a31006074967571a3 by Serhiy Storchaka in branch 'master':
Revert "bpo-35603: Escape table header of make_table output that can cause potential XSS. (GH-11341)" (GH-11356)
https://github.com/python/cpython/commit/830ddc74c495ac1a5c03172a31006074967571a3
msg333069 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2019-01-05 19:19
Yes, please make a documentation PR. Since this behavior can cause a security hole in the user code, make this note visually attractive (maybe use the "note" directive).
msg333072 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-01-05 19:56
I have added a doc note under make_file docs which has docs for fromdesc and todesc. make_table refers to make_file for fromdesc and todesc docs. I have added a screenshot of the rendering using note and warning directive. I feel note directive is sufficient for this. Placement and wording suggestions are welcome.
msg351827 - (view) Author: Julien Palard (mdk) * (Python committer) Date: 2019-09-11 11:21
New changeset c78dae8d2b890d487e428dce00c7f600612cce7b by Julien Palard (Xtreak) in branch 'master':
bpo-35603: Add a note on difflib table header interpreted as HTML (GH-11439)
https://github.com/python/cpython/commit/c78dae8d2b890d487e428dce00c7f600612cce7b
msg351839 - (view) Author: miss-islington (miss-islington) Date: 2019-09-11 12:24
New changeset 44e36e80456dabaeb59c6e2a93e0c1322bfeb179 by Miss Islington (bot) in branch '3.8':
bpo-35603: Add a note on difflib table header interpreted as HTML (GH-11439)
https://github.com/python/cpython/commit/44e36e80456dabaeb59c6e2a93e0c1322bfeb179
msg352962 - (view) Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2019-09-22 06:30
Closing this as the docs are merged with the initial patch to escape header reverted. Thanks all for the reviews.
History
Date User Action Args
2019-09-22 06:30:45xtreaksetstatus: open -> closed
messages: + msg352962

keywords: patch, patch, patch
resolution: fixed
stage: patch review -> resolved
2019-09-11 12:24:57miss-islingtonsetnosy: + miss-islington
messages: + msg351839
2019-09-11 11:21:43miss-islingtonsetpull_requests: + pull_request15563
2019-09-11 11:21:33mdksetnosy: + mdk
messages: + msg351827
2019-01-05 19:56:40xtreaksetkeywords: patch, patch, patch

messages: + msg333072
2019-01-05 19:52:13xtreaksetstage: needs patch -> patch review
pull_requests: + pull_request10880
2019-01-05 19:52:05xtreaksetstage: needs patch -> needs patch
pull_requests: + pull_request10879
2019-01-05 19:51:57xtreaksetstage: needs patch -> needs patch
pull_requests: + pull_request10878
2019-01-05 19:19:38serhiy.storchakasetkeywords: patch, patch, patch

messages: + msg333069
stage: patch review -> needs patch
2019-01-02 12:52:04serhiy.storchakasetversions: + Python 2.7, - Python 3.5, Python 3.6
nosy: + docs@python

assignee: docs@python
components: + Documentation, - Library (Lib)
keywords: patch, patch, patch
2019-01-02 12:49:28serhiy.storchakasetmessages: + msg332872
2018-12-29 16:09:24xtreaksetkeywords: patch, patch, patch

messages: + msg332724
2018-12-29 15:42:32serhiy.storchakasetpull_requests: + pull_request10669
2018-12-29 15:42:22serhiy.storchakasetpull_requests: + pull_request10668
2018-12-29 15:42:13serhiy.storchakasetpull_requests: + pull_request10667
2018-12-29 15:41:34serhiy.storchakasetkeywords: patch, patch, patch

messages: + msg332722
2018-12-29 09:11:11xtreaksetkeywords: patch, patch, patch

messages: + msg332709
2018-12-29 08:56:18serhiy.storchakasetversions: + Python 3.5, Python 3.6
nosy: + larry, ned.deily

messages: + msg332708

keywords: patch, patch, patch
type: behavior -> security
2018-12-29 08:53:47miss-islingtonsetpull_requests: + pull_request10662
2018-12-29 08:53:44miss-islingtonsetpull_requests: + pull_request10661
2018-12-29 08:53:39miss-islingtonsetpull_requests: + pull_request10660
2018-12-29 08:53:35miss-islingtonsetpull_requests: + pull_request10659
2018-12-29 08:53:19serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg332707
2018-12-28 09:44:37xtreaksetkeywords: + patch
stage: patch review
pull_requests: + pull_request10626
2018-12-28 09:44:33xtreaksetkeywords: + patch
stage: (no value)
pull_requests: + pull_request10625
2018-12-28 09:44:30xtreaksetkeywords: + patch
stage: (no value)
pull_requests: + pull_request10624
2018-12-28 09:18:00xtreakcreate