Issue 30717: Add unicode grapheme cluster break algorithm

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/74902

classification

Title:	Add unicode grapheme cluster break algorithm
Type:	enhancement	Stage:	patch review
Components:	Interpreter Core, Unicode	Versions:	Python 3.7

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	Bert JW Regeer, Guillaume Sanchez, Manishearth, Socob, _savage, benjamin.peterson, bianjp, ezio.melotti, jwilk, lemburg, loewis, mcepl, methane, mrabarnett, p-ganssle, r.david.murray, scoder, serhiy.storchaka, steven.daprano, terry.reedy, vstinner, xiang.zhang
Priority:	normal	Keywords:

Created on 2017-06-20 19:15 by Guillaume Sanchez, last changed 2022-04-11 14:58 by admin.

Pull Requests
URL	Status	Linked	Edit
PR 2673	open	python-dev, 2017-07-11 23:14

Messages (27)
msg296478 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-06-20 19:15
"a⃑".center(width=5, fillchar=".") produces '..a⃑.' instead of '..a⃑..' The reason is that "a⃑" is composed of two code points (2 UCS4 chars), one 'a' and one combining code point "above arrow". str.center() counts the size of the string and fills it both sides with `fillchar` until the size reaches `width`. However, this size is certainly intended to be the number of characters and not the number of code points. The correct way to count characters is to use the grapheme clustering algorithm from UAX TR29. Turns out I implemented this myself already, and might do the PR if asked so, with a little help to make the C <-> Python glue. Thanks for your time.
msg296479 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-06-20 19:27
Obviously, I'm talking about str.center() but all functions needing a count of graphemes are then not totally correct. I can fix that and add the corresponding function, or an iterator over graphemes, or whatever seems right :)
msg296503 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-06-21 00:06
I don't think graphemes is the right term here. Graphemes are language dependent, for instance "ǆ" may be considered a grapheme in Croatian. https://en.wikipedia.org/wiki/D%C5%BE http://www.unicode.org/glossary/#grapheme I believe you are referring to combining characters: http://www.unicode.org/faq/char_combmark.html It is unfortunate that Python's string methods are naive about combining characters, and just count code points, but I'm not sure what the alternative is. For example the human reader may be surprised that these give two different results: py> len("naïve") 5 py> len("naïve") 6 I'm not sure if the effect will survive copying and pasting, but the first string uses U+00EF LATIN SMALL LETTER I WITH DIAERESIS while the second uses: U+0069 LATIN SMALL LETTER I + U+0308 COMBINING DIAERESIS And check out this surprising result: py> "xïoz"[::-1] 'zöix' It seems to me that it would be great if Python was fully aware of combining characters, its not so great if it is naïve, but it would be simply terrible if only a few methods were aware and the rest naïve. I don't have a good solution to this, but perhaps an iterator over (base character + combining marks) would be a good first step. Something like this? import unicodedata def chars(string): accum = [] for c in string: cat = unicodedata.category(c) if cat == 'Mn': accum.append(c) else: if accum: yield accum accum = [] accum.append(c) if accum: yield accum
msg296504 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-06-21 00:55
Thanks for all those interesting cases you brought here! I didn't think of that at all! I'm using the word "grapheme" as per the definition given in UAX TR29 which is not language/locale dependant [1]. This annex is very specific and precise about where to break "grapheme cluster" aka "when does a character starts and ends". Sadly, it's a bit more complex than just accumulating based on the Combining property. This annex gives a set of rules to implement, based on Grapheme_Cluster_Break property, and while those rules may naively be implemented as comparing adjacent pairs of code points, this is wrong and can be correctly and efficiently implemented as an automaton. My code [2] passes all tests from GraphemeBreakTests.txt (provided by Unicode). We can definitely do a generator like you propose, or rather do it in the C layer to gain more efficiency and coherence since the other string / Unicode operations are in the C layer (upper, lower, casefold, etc) Let me know what you guys think, what (and if) I should contribute :) [1] http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries [2] https://github.com/Vermeille/batriz/blob/master/src/str/grapheme_iterator.h#L31
msg296505 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-06-21 01:34
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries talks about grapheme clusters, not "graphemes" alone, and it seems clear to me that they are language dependent. For example, it says: The Unicode Standard provides default algorithms for determining grapheme cluster boundaries, with two variants: legacy grapheme clusters and extended grapheme clusters. The most appropriate variant depends on the language and operation involved. ... These algorithms can be adapted to produce tailored grapheme clusters for specific locales... Nevertheless, even just a basic API to either the legacy grapheme cluster or the extended grapheme cluster algorithms would be a good start. Can I suggest that the unicodedata module might be the right place for it? And thank you for volunteering to do the work on this!
msg297488 - (view)	Author: R. David Murray (r.david.murray) *	Date: 2017-07-01 17:48
See also issue 12568.
msg298190 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-07-11 23:43
Hello to all of you, sorry for the delay. Been busy. I added the base code needed to built the grapheme cluster break algorithm. We now have the GraphemeBreakProperty available via unicodedata.grapheme_cluster_break() Can you check that the implementation correctly fits the design? I was not sure about adding that prop to unicodedata_db ou unicodectype_db, tbh. If it's all correct, I'll move forward with the automaton and the grapheme cluster breaking algorithm. Thanks!
msg298321 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-07-13 23:42
Hello, I implemented unicodedata.break_graphemes() that returns an iterators that spits consecutive graphemes. This is a "test" implementation meant to see what doesn't fits Python's style and design, to discuss naming and implementation details. https://github.com/python/cpython/pull/2673 Thanks for your time and interest
msg298325 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-07-14 00:36
Thank you, but I cannot review your C code. Can you start by telling us what the two functions: unicodedata.grapheme_cluster_break() unicodedata.break_graphemes() take as arguments, and what they return? If we were to call help(function), what would we see?
msg298326 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-07-14 00:43
Hello Steven! Thanks for your reactivity! unicodedata.grapheme_cluster_break() takes a unicode code point as an argument and return its GraphemeBreakProperty as a string. Possible values are listed here: http://www.unicode.org/reports/tr29/#CR help(unicodedata.grapheme_cluster_break) says: grapheme_cluster_break(chr, /) Returns the GraphemeBreakProperty assigned to the character chr as string. ==== unicodedata.break_graphemes() takes a unicode string as argument and returns an GraphemeClusterIterator that spits consecutive graphemes clusters. help(unicodedata.break_graphemes) says: break_graphemes(unistr, /) Returns an iterator to iterate over grapheme clusters in unistr. It uses extended grapheme cluster rules from TR29. Is there anything else you would like to know? Don't hesitate to ask :) Thank you for your time!
msg298336 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2017-07-14 04:25
I think it at least plausible that we should add implementations of some of the unicode standard's algorithms. Victor and Serhiy, as two of the active core devs most involved with unicode issues, what do you think?
msg299661 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-08-02 15:45
Hi, Are you guys still interested? I haven't heard from you in a while
msg299699 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-08-03 11:21
Issue18406 is closed as a duplicate of this issue. There are useful links in issue18406. In particular see a proto-PEP of Unicode Indexing Helper Module: http://mail.python.org/pipermail/python-dev/2001-July/015938.html I agreed that providing grapheme iterator would be useful. But it would be useful to provide also word and sentence iterators. Should iterators provide just substrings or their positions? I think emitting a pair (pos, substring) would be more useful. It is easier to create an iterator of substrings from the iterator of pairs than opposite. Alternatively an iterator could emit slice objects. Or special objects similar to re match objects.
msg299703 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-08-03 12:34
Thanks for your consideration. I'm currently fixing what's been asked in the reviews. > But it would be useful to provide also word and sentence iterators. I'll gladly do that as well! > I think emitting a pair (pos, substring) would be more useful. That means emitting a pair like ((start, end), substr) ? Is it pythonic to return a structure like this? For what it's worth, I don't like it, but I definitely understand the value of it. I'd prefer having two versions. One returning indexes, the other returning substrings. But... > Alternatively an iterator could emit slice objects. I really like that. Do we have a clear common agreement or preference on any option?
msg299706 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-08-03 13:05
I have a few criticism to do against that proto-PEP http://mail.python.org/pipermail/python-dev/2001-July/015938.html In particular, the fact that all those functions return an index prevents any state keeping. That's a problem because: > next_<indextype>(u, index) -> integer As you've seen it, in grapheme clustering (as well as words and line breaking), we have to have an automaton to decide on the breaking point. Which means that starting at an arbitrary index is not possible. > prev_<indextype>(u, index) -> integer Is it really necessary? It means implementing the same logic to go backward. In our current case, we'd need a backward grapheme cluster break automaton too. > <indextype>_start(u, index) -> integer > <indextype>_end(u, index) -> integer Not doable in O(1) for the same reason as next_<indextype>(). We need a context, and the code point itself cannot give enough information to know if it's the start/end of a given indextype.
msg299707 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2017-08-03 13:30
On Thu, Aug 03, 2017 at 11:21:38AM +0000, Serhiy Storchaka wrote: > Should iterators provide just substrings or their positions? [...] I think we're breaking new ground here and I'm not sure what the right API should be. Should we follow Perl 6? https://docs.perl6.org/type/Str Go has a "norm" package for dealing with normalised "characters" (graphemes). https://blog.golang.org/normalization http://godoc.org/golang.org/x/text/unicode/norm Are my comments unacceptible scope-creep? We've gone from talking about a grapheme cluster break algorithm to me talking about Perl6 and Go which have rich string APIs based on graphemes. I'm not even sure of the best place for this: - unicodedata - string - a new module? I don't think unicodedata is the right place -- that should be for data and processing of individual unicode code points, not string handling, and it shouldn't become a grab-bag of random unrelated functions just because they have something to do with Unicode. Can we mark this as having a Provisional API to give us time to decide on the best API before locking it in permanently? https://www.python.org/dev/peps/pep-0411/ I'm reluctant to say this, because it's a lot more work, but maybe this is complicated enough that we should go through a PEP.
msg299708 - (view)	Author: Marc-Andre Lemburg (lemburg) *	Date: 2017-08-03 14:11
On 03.08.2017 15:05, Guillaume Sanchez wrote: > > Guillaume Sanchez added the comment: > > I have a few criticism to do against that proto-PEP > > http://mail.python.org/pipermail/python-dev/2001-July/015938.html > > In particular, the fact that all those functions return an index prevents any state keeping. If you want state keeping for iterating over multiple <indextype> parts of the string, you can use an iterator. The APIs were inspired by the standard string.find() APIs, that's why they work on indexes and don't return Unicode strings. As such, they serve a different use case than an iterator. With the APIs, scanning would always start at the given index in the string and move forward/backward to the start of the next <indextype>.
msg299731 - (view)	Author: Stefan Behnel (scoder) *	Date: 2017-08-04 06:25
Wouldn't this be a typical case where we'd expect a module to evolve and gain usage on PyPI first, before adding it to the stdlib? Searching for "grapheme" in PyPI gives some results for me. Even if they do not cover what this ticket asks for, they might give inspiration for a suitable API design. And I'm probably missing other related packages by lack of a better search term. https://pypi.python.org/pypi?%3Aaction=search&term=grapheme
msg299734 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2017-08-04 06:46
The well known library for Unicode support in C++ and Java is ICU (International Components for Unicode). There is a Python wrapper [1]. This is a large complex library that covers many aspects of Unicode support. It's interface looks rather Javaic than Pythonic. Some parts of it already are covered by other parts of the stdlib (the str class, the codecs and locale modules). [1] https://pypi.python.org/pypi/PyICU/
msg299848 - (view)	Author: Guillaume Sanchez (Guillaume Sanchez) *	Date: 2017-08-07 13:47
> I don't think unicodedata is the right place I do agree with that. A new module sounds good, would it be a problem if that module would contain very few functions at first? > Can we mark this as having a Provisional API to give us time to decide on the best API before locking it in permanently? I'm not sure it's my call to make, but I would gladly consider that option. > we should go through a PEP. Why not. I may need a bit of guidance though. > If you want state keeping for iterating over multiple <indextype> parts of the string, you can use an iterator. Sure thing. It just wasn't specified like this in the proto-PEP. > The APIs were inspired by the standard string.find() APIs, that's why they work on indexes and don't return Unicode strings. As such, they serve a different use case than an iterator. I personally like having a generator returning slice objects, as suggested above. What would be some good objections to this? > Wouldn't this be a typical case where we'd expect a module to evolve and gain usage on PyPI first, before adding it to the stdlib? [...] they might give inspiration for a suitable API design I'll give it a look. > The well known library for Unicode support in C++ and Java is ICU Yes. I clearly don't want this PR to be interpreted as "we're needing ICU". ICU provides much much more than what I'm willing to provide. My goal here is just to "fix" how the python's standard library iterates over characters. Noting more, nothing less. One might think that splitlines() should be "fixed" too, and there is clearly matter to discuss here. Same for words splitting. However, I do not intend to bring normalization, which you already have, collations, locale dependant upcasing or lowercasing, etc. We might need a wheel, but we don't have to take the whole truck. How do we discuss all of this? Who's in charge of making decisions? How long should we debate? That's my first time contributing to Python and I'm new to all of that. Thanks for your time.
msg312035 - (view)	Author: Inada Naoki (methane) *	Date: 2018-02-12 09:52
We missed 3.7 train. I'm sorry about I couldn't review it. But I have many shine features I want in 3.7 and I have no time to review all. Especially, I need to understand tr29. It was hard job to me. I think publishing this (and any other functions relating to unicode) to PyPI is better than waiting 3.8. It make possible to discuss API design with working code, and make it "battle tested" before adding it to standard library.
msg359397 - (view)	Author: Manish Goregaokar (Manishearth) *	Date: 2020-01-06 03:21
Hi, Unicodey person here, I'm involved in Unicode itself and also maintain an implementation of this particular spec[1]. So, firstly, > "a⃑".center(width=5, fillchar=".") If you're trying to do terminal width stuff, extended grapheme clusters will not solve the problem for you. There is no algorithm specified in Unicode that does this, because this is font dependent. Extended grapheme clusters are better than code points for this, however, and will not ever produce worse results. It's fine to expose this, but it's worth adding caveats. Also, yes, please do not expose a direct indexing function. Aside from almost all Unicode algorithms being streaming algorithms and thus inefficient to index directly, needing to directly index a grapheme cluster is almost always a sign that you are making a mistake. > Yes. I clearly don't want this PR to be interpreted as "we're needing ICU". ICU provides much much more than what I'm willing to provide. My goal here is just to "fix" how the python's standard library iterates over characters. Noting more, nothing less. I think it would be a mistake to make the stdlib use this for most notions of what a "character" is, as I said this notion is also inaccurate. Having an iterator library somewhere that you can use and compose is great, changing the internal workings of string operations would be a major change, and not entirely productive. There's only one language I can think of that uses extended grapheme clusters as its default notion of "character": Swift. Swift is largely designed for UI stuff, and it makes sense in this context. This is also baked in very deeply to the language (e.g. their Character class is a thin wrapper around String, since grapheme clusters can be arbitrarily large). You'd need a pretty major paradigm shift for python to make a similar change, and it doesn't make as much sense for python in the first place. Starting off with a library published to pypi makes sense to me. [1]: https://github.com/unicode-rs/unicode-segmentation/
msg359398 - (view)	Author: Manish Goregaokar (Manishearth) *	Date: 2020-01-06 03:25
Oh, also, if y'all are fine with binding to Rust (through a C ABI) I'd love to help y'all use unicode-segmentation, which is much less work that pulling in ICU. Otherwise if y'all have implementation questions I can answer them. This spec is kinda tricky to implement efficiently, but it's not super hard.
msg359408 - (view)	Author: Steven D'Aprano (steven.daprano) *	Date: 2020-01-06 08:44
> I think it would be a mistake to make the stdlib use this for most > notions of what a "character" is, as I said this notion is also > inaccurate. Having an iterator library somewhere that you can use and > compose is great, changing the internal workings of string operations > would be a major change, and not entirely productive. Agreed. I won't pretend to be able to predict what Python 5.0 will bring wink but there's too much history around the "code point = character" notion for the language to change now. If the language can expose a grapheme iterator, then people can experiment with grapheme-based APIs in libraries. (By grapheme I mean "extended grapheme cluster", but that's a mouthful. Sorry linguists.) What do you think of these as a set of grapheme primitives? (1) is_grapheme_break(string, i) Return True if a grapheme break would occur before string[i]. (2) graphemes(string, start=0, end=len(string)) Iterate over graphemes in string[start:end]. (3) graphemes_reversed(string, start=0, end=len(string)) Iterate over graphemes in reverse order. I think is_grapheme_break would be enough for people to implement their own versions of graphemes and graphemes_reversed. Here's an untested version: def graphemes(string, start, end): cluster = [] for i in range(start, end): c = string[i] if is_grapheme_break(string, i): if i != start: # don't yield the empty cluster at Start Of Text yield ''.join(cluster) cluster = [c] else: cluster.append(c) if cluster: yield ''.join(cluster) Regarding is_grapheme_break, if I understand the note here: https://www.unicode.org/reports/tr29/#Testing one never needs to look at more than two adjacent code points to tell whether or not a grapheme break will occur between them, so this ought to be pretty efficient. At worst, it needs to look at string[i-1] and string[i], if they exist.
msg359409 - (view)	Author: Manish Goregaokar (Manishearth) *	Date: 2020-01-06 09:13
> one never needs to look at more than two adjacent code points to tell whether or not a grapheme break will occur between them, so this ought to be pretty efficient. That note is outdated (and has been outdated since Unicode 9). The regional indicator rules (GB12 and GB13) and the emoji rule (GB11) require arbitrary lookbehind (though thankfully not arbitrary lookahead). I think the ideal API surface is an iterator and nothing else. Everything else can be derived from the iterator. It's theoretically possible to expose an is_grapheme_break that's faster than just iterating -- look at the code in unicode-segmentation's _reverse_ iterator to see how -- but it's going to be tricky to get right. Building the iterator on top of is_grapheme_break is not a good idea.
msg359450 - (view)	Author: Paul Ganssle (p-ganssle) *	Date: 2020-01-06 18:52
> Oh, also, if y'all are fine with binding to Rust (through a C ABI) I'd love to help y'all use unicode-segmentation, which is much less work that pulling in ICU. Otherwise if y'all have implementation questions I can answer them. This spec is kinda tricky to implement efficiently, but it's not super hard. Is the idea here that we'd take on a new dependency on the compiled `unicode-segmentation` binary, rather than adding Rust into our build system? Does `unicode-segmentation` support all platforms that CPython supports? I was under the impression that Rust requires llvm and llvm doesn't necessarily have the same support matrix as CPython (I'd love to be corrected if I'm wrong on this). (Note: I don't actually know what the process is for taking on new dependencies like this, just trying to point at one possible stumbling block.)
msg359496 - (view)	Author: Manish Goregaokar (Manishearth) *	Date: 2020-01-07 08:20
> Does `unicode-segmentation` support all platforms that CPython supports? It's no-std, so it supports everything the base Rust compiler supports (which is basically everything llvm supports). And yeah, if there's something that doesn't match with the support matrix this isn't going to work. However, I suggested this more for the potential PyPI package. If you're working this into CPython you'd have to figure out how best to include Rust stuff in your build system, which seems like a giant chunk of scope creep :) For including in CPython I'd suggest looking through unicode-segmentation and writing a C version of it. We use a python script[1] to generate the data tables, this might be something y'all can use. Swift's UAX 29 implementation is also quite interesting, however it's baked in deeply to the language so it's less useful as a starting point. [1]: https://github.com/unicode-rs/unicode-segmentation/blob/master/scripts/unicode.py

History
Date	User	Action	Args
2022-04-11 14:58:47	admin	set	github: 74902
2021-06-29 16:33:48	jwilk	set	nosy: + jwilk
2020-01-07 08:20:10	Manishearth	set	messages: + msg359496
2020-01-06 18:52:06	p-ganssle	set	messages: + msg359450
2020-01-06 09:13:05	Manishearth	set	messages: + msg359409
2020-01-06 08:44:47	steven.daprano	set	messages: + msg359408
2020-01-06 03:25:02	Manishearth	set	messages: + msg359398
2020-01-06 03:21:29	Manishearth	set	nosy: + Manishearth messages: + msg359397
2019-02-20 00:47:32	Bert JW Regeer	set	nosy: + Bert JW Regeer
2019-02-19 22:53:00	_savage	set	nosy: + _savage
2018-09-09 17:47:32	mcepl	set	nosy: + mcepl
2018-08-23 21:01:13	p-ganssle	set	nosy: + p-ganssle
2018-08-20 03:51:21	xiang.zhang	set	nosy: + xiang.zhang
2018-08-18 08:09:28	bianjp	set	nosy: + bianjp
2018-02-12 09:52:06	methane	set	nosy: + methane messages: + msg312035
2017-08-07 13:47:49	Guillaume Sanchez	set	messages: + msg299848
2017-08-04 06:46:38	serhiy.storchaka	set	messages: + msg299734
2017-08-04 06:25:20	scoder	set	nosy: + scoder messages: + msg299731
2017-08-03 14:11:17	lemburg	set	messages: + msg299708
2017-08-03 13:30:56	steven.daprano	set	messages: + msg299707
2017-08-03 13:05:32	Guillaume Sanchez	set	messages: + msg299706
2017-08-03 12:34:04	Guillaume Sanchez	set	messages: + msg299703
2017-08-03 11:21:37	serhiy.storchaka	set	nosy: + mrabarnett messages: + msg299699
2017-08-03 11:07:26	serhiy.storchaka	link	issue18406 superseder
2017-08-03 11:03:59	serhiy.storchaka	set	nosy: + lemburg, loewis, benjamin.peterson, ezio.melotti stage: needs patch -> patch review components: + Unicode title: str.center() is not unicode aware -> Add unicode grapheme cluster break algorithm
2017-08-02 15:45:54	Guillaume Sanchez	set	messages: + msg299661
2017-07-24 02:18:38	Socob	set	nosy: + Socob
2017-07-15 11:06:10	christian.heimes	set	assignee: christian.heimes -> components: + Interpreter Core, - Tests, Tkinter, SSL nosy: - christian.heimes
2017-07-14 04:25:14	terry.reedy	set	nosy: + terry.reedy, vstinner, serhiy.storchaka messages: + msg298336
2017-07-14 00:43:36	Guillaume Sanchez	set	nosy: + christian.heimes messages: + msg298326 assignee: christian.heimes components: + Tests, Tkinter, SSL, - Library (Lib)
2017-07-14 00:36:45	steven.daprano	set	messages: + msg298325
2017-07-13 23:42:08	Guillaume Sanchez	set	messages: + msg298321
2017-07-11 23:43:00	Guillaume Sanchez	set	messages: + msg298190
2017-07-11 23:14:06	python-dev	set	pull_requests: + pull_request2741
2017-07-01 17:48:23	r.david.murray	set	nosy: + r.david.murray messages: + msg297488
2017-06-21 03:53:00	Mariatta	set	type: enhancement stage: needs patch
2017-06-21 01:34:08	steven.daprano	set	messages: + msg296505
2017-06-21 00:55:41	Guillaume Sanchez	set	messages: + msg296504
2017-06-21 00:06:50	steven.daprano	set	nosy: + steven.daprano messages: + msg296503
2017-06-20 19:27:45	Guillaume Sanchez	set	messages: + msg296479
2017-06-20 19:15:22	Guillaume Sanchez	create