classification
Title: datetime: Tests for potential crashes due to non-UTF-8-encodable strings
Type: security Stage: resolved
Components: Extension Modules Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: belopolsky, izbyshev, miss-islington, p-ganssle, serhiy.storchaka, taleinat, vstinner
Priority: normal Keywords: patch

Created on 2018-08-23 17:41 by izbyshev, last changed 2018-11-27 16:17 by izbyshev. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 8878 merged izbyshev, 2018-08-23 17:49
PR 8850 closed izbyshev, 2018-08-23 17:53
PR 10049 merged miss-islington, 2018-10-23 06:36
Messages (8)
msg323964 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-08-23 17:41
This is a follow-up of #34454. The 'datetime' extension module attempts to encode input strings into UTF-8 in several places, which requires special care because some valid Python strings (for example, those containing lone surrogates) can't be represented in UTF-8. It makes sense to add more tests for methods dealing with strings.

Note that my PR doesn't attempt to deal with #34481. In cases where behavior differs between the C and Python datetime implementations, the tests check only for the absence of crashes.
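
For illustration, here is a minimal sketch of the kind of check described above. It is not the code from the PR: the helper names (is_utf8_encodable, survives) and the chosen inputs are assumptions made for this example. It shows that a lone surrogate is a valid str with no UTF-8 encoding, then calls several string-accepting datetime methods with it. Any Python-level exception is treated as acceptable, since the point is only that the interpreter does not crash.

```python
from datetime import datetime, timedelta, timezone

# A lone surrogate is a valid Python str but has no strict UTF-8 encoding.
LONE_SURROGATE = "\ud800"


def is_utf8_encodable(s):
    """Return True if s can be encoded as strict UTF-8 (hypothetical helper)."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


def survives(func, *args):
    """Call func(*args) and ignore any Python-level exception.

    Hypothetical helper: a hard crash (e.g. a segfault) would kill the
    process before this function could return.
    """
    try:
        func(*args)
    except Exception:
        pass
    return True


moment = datetime(2018, 8, 23, 17, 41)

# Each entry exercises a method that takes or produces a string.
checks = {
    "strftime": survives(moment.strftime, "%Y" + LONE_SURROGATE),
    "__format__": survives(format, moment, LONE_SURROGATE),
    "isoformat": survives(moment.isoformat, LONE_SURROGATE),
    "fromisoformat": survives(
        datetime.fromisoformat, "2018-08-23" + LONE_SURROGATE + "17:41"
    ),
    "strptime": survives(datetime.strptime, LONE_SURROGATE, "%Y"),
    # The timezone is built inside the lambda so that a failure there is
    # also caught by survives().
    "tzname": survives(
        lambda: moment.replace(
            tzinfo=timezone(timedelta(0), "tz" + LONE_SURROGATE)
        ).strftime("%Z")
    ),
}

print(is_utf8_encodable("abc"), is_utf8_encodable(LONE_SURROGATE))
print(checks)
```

Whether a given call raises or returns a value may differ between the C and pure-Python implementations and between Python versions, which is why the sketch does not assert on results.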
msg324201 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2018-08-27 21:54
Somewhat related: #6697.

Turns out there are already some tests here for this, specifically for the C version only: https://github.com/python/cpython/blob/master/Lib/test/datetimetester.py#L3328
msg324264 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-08-28 17:09
Yes, I've referenced the relevant message from #6697 in #34454.

The specific test you've referenced should be changed after #34481 is fixed -- it highlights inconsistency between C and Python implementations.
msg328282 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2018-10-23 06:36
New changeset 3b0047d8e982b10b34ab05fd207b7d513cc1188a by Tal Einat (Alexey Izbyshev) in branch 'master':
bpo-34482: test datetime classes' handling of non-UTF-8-encodable strings (GH-8878)
https://github.com/python/cpython/commit/3b0047d8e982b10b34ab05fd207b7d513cc1188a
msg328288 - (view) Author: miss-islington (miss-islington) Date: 2018-10-23 07:04
New changeset 313e5015d258778737bff766a8ccf997a0cc20c7 by Miss Islington (bot) in branch '3.7':
bpo-34482: test datetime classes' handling of non-UTF-8-encodable strings (GH-8878)
https://github.com/python/cpython/commit/313e5015d258778737bff766a8ccf997a0cc20c7
msg328289 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2018-10-23 07:06
Thanks for the PR, Alexey!
msg330484 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-11-27 00:25
It seems like this issue introduced a memory leak: bpo-35322.
msg330535 - (view) Author: Alexey Izbyshev (izbyshev) * (Python triager) Date: 2018-11-27 16:17
The added test exposed a leak in unicode_encode_locale(). See msg330534.
History
Date User Action Args
2018-11-27 16:17:11 izbyshev set messages: + msg330535
2018-11-27 00:25:05 vstinner set nosy: + vstinner
messages: + msg330484
2018-10-23 07:06:16 taleinat set status: open -> closed
versions: - Python 3.6
messages: + msg328289
resolution: fixed
stage: patch review -> resolved
2018-10-23 07:04:27 miss-islington set nosy: + miss-islington
messages: + msg328288
2018-10-23 06:36:21 miss-islington set pull_requests: + pull_request9387
2018-10-23 06:36:12 taleinat set messages: + msg328282
2018-08-28 17:09:07 izbyshev set messages: + msg324264
2018-08-27 21:54:55 p-ganssle set messages: + msg324201
2018-08-23 17:53:12 izbyshev set pull_requests: + pull_request8355
2018-08-23 17:49:49 izbyshev set keywords: + patch
stage: patch review
pull_requests: + pull_request8353
2018-08-23 17:41:50 izbyshev create