Issue5381
Created on 2009-02-27 08:37 by rhettinger, last changed 2009-03-29 22:37 by bob.ippolito.
| File name | Uploaded | Description |
| json_hook.diff | rhettinger, 2009-02-27 08:37 | proof-of-concept patch: object_pair_hook() |
| json_hook.diff | rhettinger, 2009-03-18 03:55 | pairs hook patch with tests and docs |
msg82825 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-02-27 08:37 |
|
If PEP372 goes through, Python is going to gain an ordered dict soon.
The json module's encoder works well with it:
>>> items = [('one', 1), ('two', 2), ('three',3), ('four',4), ('five',5)]
>>> json.dumps(OrderedDict(items))
'{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'
But the decoder doesn't fare so well. The existing object_hook for the
decoder passes in a dictionary instead of a list of pairs. So, all the
ordering information is lost:
>>> jtext = '{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'
>>> json.loads(jtext, object_hook=OrderedDict)
OrderedDict({u'four': 4, u'three': 3, u'five': 5, u'two': 2, u'one': 1})
A solution is to provide an alternate hook that emits a sequence of
pairs. If present, that hook should run instead of object_hook. A
rough proof-of-concept patch is attached.
FWIW, sample ordered dict code is at:
http://code.activestate.com/recipes/576669/
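
For readers following along, the hook proposed here can be sketched as below. This is a minimal illustration, assuming the keyword ended up as object_pairs_hook (the name used later in this thread) and using collections.OrderedDict rather than the recipe linked above. The helper as_pairs is hypothetical and exists only to show what the hook receives.

```python
import json
from collections import OrderedDict

jtext = '{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'

# The pairs hook receives a list of (key, value) tuples in document order,
# so an OrderedDict built from it keeps the original ordering.
ordered = json.loads(jtext, object_pairs_hook=OrderedDict)
print(list(ordered))  # ['one', 'two', 'three', 'four', 'five']


def as_pairs(pairs):
    # Hypothetical helper: hand back the raw pairs unchanged for inspection.
    return pairs


# When both hooks are supplied, the pairs hook runs instead of object_hook.
result = json.loads(jtext, object_hook=dict, object_pairs_hook=as_pairs)
print(result[:2])  # [('one', 1), ('two', 2)]
```

Passing a list of pairs rather than a dict leaves the choice of container to the caller, which is why the same hook works for OrderedDict or any other pair-consuming constructor.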
|
|
msg82860 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-02-27 18:48 |
|
Why? According to RFC 4627 (emphasis mine):
An object is an *unordered* collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.
|
|
msg82864 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-02-27 19:59 |
|
Same reason as for config files and yaml files. Sometimes those files
represent human-edited input, and if a machine re-edits, filters, or
copies, it is nice to keep the original order (though it may make no
semantic difference to the computer).
For example, jsonrpc method invocations are done with objects having
three properties (method, params, id). The machine doesn't care about
the order of the properties but a human reader prefers the order listed:
--> {"method": "postMessage", "params": ["Hello all!"], "id": 99}
<-- {"result": 1, "error": null, "id": 99}
If you're testing a program that filters json data (like a typical xml
task), it is nice to write out data in the same order received (failing
to do that is a common complaint about misdesigned xml filters):
--> [{"title": "awk", "author":"aho", "isbn":"123456789X"},
{"title": "taocp", "author":"knuth", "isbn":"987654321X"}]
<-- [{"title": "awk", "author":"aho"},
{"title": "taocp", "author":"knuth"}]
Semantically, those entries can be scrambled; however, someone reading
the filtered result desires that the input and output visually
correspond as much as possible. An object_pairs_hook makes this possible.
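
As a rough illustration of the round trip described above, the sketch below decodes the JSON-RPC request with a pairs hook and re-encodes it. It assumes the object_pairs_hook keyword and collections.OrderedDict; the variable names are made up for the example.

```python
import json
from collections import OrderedDict

request = '{"method": "postMessage", "params": ["Hello all!"], "id": 99}'

# Decode with the pairs hook so the key order survives, then re-encode.
decoded = json.loads(request, object_pairs_hook=OrderedDict)
echoed = json.dumps(decoded)

# The re-encoded text lists method, params, id in the order received.
print(echoed == request)  # True
```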
|
|
msg82865 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-02-27 20:11 |
|
FWIW, here's the intended code for the filter in the last post:
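import json
from collections import OrderedDict

# infile and outfile are already-open file objects
books = json.load(infile, object_pairs_hook=OrderedDict)
for book in books:
    del book['isbn']
json.dump(books, outfile)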
|
|
msg82870 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-02-27 20:48 |
|
Fair enough, but the patch isn't usable because the decoder was rewritten
in a later version of simplejson. There's another issue with a patch to
backport those changes into Python, http://bugs.python.org/issue4136, or you
could just use the simplejson source here: http://code.google.com/p/simplejson/
|
|
msg82872 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-02-27 20:57 |
|
Thanks. I'll write up a patch against
http://code.google.com/p/simplejson/ and assign it back to you for review.
|
|
msg82885 - (view) |
Author: Armin Ronacher (aronacher) |
Date: 2009-02-27 23:38 |
|
Motivation:
Yes, JSON says it's unordered. However, hashes in Ruby have been ordered
since 1.9, and they have been ordered from the very beginning in JavaScript and PHP.
|
|
msg83164 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-03-04 23:39 |
|
After enhancing namedtuple and ConfigParser, I found a simpler approach
that doesn't involve extending the API. The simple way is to use
ordered dictionaries directly.
With a small tweak to OD's repr, it is fully substitutable for a dict
without changing any client code or doctests (the OD loses its own
eval/repr order-preserving roundtrip, but that is what json already gives now).
See attached patch.
|
|
msg83165 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-03-04 23:46 |
|
Unfortunately this is a patch for the old json lib... the new one has a C
API and an entirely different method of parsing documents (for performance
reasons).
|
|
msg83166 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-03-05 00:15 |
|
When do you expect the new C version to go in? I'm looking forward to it.
|
|
msg83170 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-03-05 00:29 |
|
Whenever someone applies the patch for http://bugs.python.org/issue4136 --
I don't know when that will happen.
|
|
msg83733 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-03-18 03:55 |
|
Bob, would you please take a look at the attached patch?
|
|
msg83819 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-03-19 18:56 |
|
This patch looks good to me. My only comment is that it mixes tabs
and spaces in the C code in a file that previously had no tabs.
|
|
msg83820 - (view) |
Author: Raymond Hettinger (rhettinger) |
Date: 2009-03-19 19:19 |
|
Thanks for looking at this.
Fixed the tab/space issue.
Committed in r70471
|
|
msg84441 - (view) |
Author: Bob Ippolito (bob.ippolito) |
Date: 2009-03-29 22:37 |
|
I fixed two problems with this that didn't show up in the test suite: this
feature didn't work in load(), and there was a problem with the pure Python
code path because the Python scanner needed a small change. Unfortunately,
I'm not sure how best to test the pure Python code path with Python's test
suite, but I ran across it when backporting to simplejson.
r70702
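
Both code paths mentioned above can be exercised roughly as follows. This is only a sketch: it assumes the object_pairs_hook keyword on json.load() and JSONDecoder, and it relies on a CPython implementation detail (replacing the decoder's scan_once with json.scanner.py_make_scanner) to force the pure Python scanner. That swap is not a documented API and is not necessarily how the test suite handles it.

```python
import io
import json
import json.scanner
from collections import OrderedDict

jtext = '{"one": 1, "two": 2, "three": 3}'

# load() reads from a file-like object and forwards the hook to the decoder.
from_file = json.load(io.StringIO(jtext), object_pairs_hook=OrderedDict)
print(list(from_file))  # ['one', 'two', 'three']

# Assumption: swapping in py_make_scanner bypasses the C scanner, so the
# pure Python scanner runs even when the C accelerator is available.
decoder = json.JSONDecoder(object_pairs_hook=OrderedDict)
decoder.scan_once = json.scanner.py_make_scanner(decoder)
from_python = decoder.decode(jtext)
print(list(from_python))  # ['one', 'two', 'three']
```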
|
|
| Date | User | Action | Args |
| 2009-03-29 22:37:55 | bob.ippolito | set | messages: + msg84441 |
| 2009-03-22 05:51:36 | cheeaun | set | nosy: + cheeaun |
| 2009-03-19 19:19:28 | rhettinger | set | status: open -> closed; resolution: accepted; messages: + msg83820 |
| 2009-03-19 18:56:55 | bob.ippolito | set | messages: + msg83819 |
| 2009-03-18 03:55:07 | rhettinger | set | priority: normal -> high; assignee: rhettinger -> bob.ippolito; messages: + msg83733; files: + json_hook.diff |
| 2009-03-18 02:01:21 | rhettinger | set | files: - json_ordered.diff |
| 2009-03-05 04:06:04 | rhettinger | set | title: json need object_pairs_hook -> json needs object_pairs_hook |
| 2009-03-05 00:29:21 | bob.ippolito | set | messages: + msg83170 |
| 2009-03-05 00:15:07 | rhettinger | set | messages: + msg83166 |
| 2009-03-04 23:46:19 | bob.ippolito | set | messages: + msg83165 |
| 2009-03-04 23:39:56 | rhettinger | set | files: + json_ordered.diff; messages: + msg83164 |
| 2009-02-27 23:38:55 | aronacher | set | nosy: + aronacher; messages: + msg82885 |
| 2009-02-27 20:57:26 | rhettinger | set | assignee: bob.ippolito -> rhettinger; messages: + msg82872 |
| 2009-02-27 20:48:23 | bob.ippolito | set | resolution: invalid -> (no value); messages: + msg82870 |
| 2009-02-27 20:11:16 | rhettinger | set | messages: + msg82865 |
| 2009-02-27 19:59:12 | rhettinger | set | messages: + msg82864 |
| 2009-02-27 18:48:08 | bob.ippolito | set | resolution: invalid; messages: + msg82860 |
| 2009-02-27 08:37:54 | rhettinger | create | |