Message350567
This is still broken. With pandas as popular as it is, more and more people are likely to hit it. Can we fix this?
At the very least, the error message needs to be made much more specific.
I have created a dictionary containing pandas stats.
```
def summary_stats(s):
    """
    Calculate summary statistics for a series or list, s
    returns a dictionary
    """
    stats = {
        'count': 0,
        'max': 0,
        'min': 0,
        'mean': 0,
        'median': 0,
        'mode': 0,
        'std': 0,
        'z': (0,0)
    }
    stats['count'] = s.count()
    stats['max'] = s.max()
    stats['min'] = s.min()
    stats['mean'] = round(s.mean(),3)
    stats['median'] = s.median()
    stats['mode'] = s.mode()[0]
    stats['std'] = round(s.std(),3)
    std3 = 3* stats['std']
    low_z = round(stats['mean'] - (std3),3)
    high_z = round(stats['mean'] + (std3),3)
    stats['z'] = (low_z, high_z)
    return stats
```
Apparently, pandas (sometimes) returns numpy ints and numpy floats.
Here's a piece of the dictionary:
```
{'count': 597,
'max': 0.95,
'min': 0.01,
'mean': 0.585,
'median': 0.58,
'mode': 0.59,
'std': 0.122,
'z': (0.219, 0.951)}
```
It looks fine, but when I try to dump the dict to JSON:
```
with open('Data/station_stats.json', 'w') as fp:
    json.dump(station_stats, fp)
```
I get this error:
```
TypeError: Object of type int64 is not JSON serializable
```
**Much searching** led me to discover that I apparently have numpy ints, which I have confirmed:
```
for key, value in station_stats['657']['Fluorescence'].items():
    print(key, value, type(value))
count 3183 <class 'numpy.int64'>
max 2.8 <class 'float'>
min 0.02 <class 'float'>
mean 0.323 <class 'float'>
median 0.28 <class 'float'>
mode 0.24 <class 'numpy.float64'>
std 0.194 <class 'float'>
z (-0.259, 0.905) <class 'tuple'>
```
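For anyone hitting the same thing, one possible workaround is the `default` hook of `json.dump`, which is called only for objects the encoder cannot serialize on its own. The sketch below assumes the offending values expose an `.item()` method that returns the equivalent Python builtin, as numpy scalars do. The names `to_builtin` and `FakeInt64` are hypothetical; `FakeInt64` is only a stand-in so the snippet runs without numpy installed. It also seems to explain the output above: `numpy.float64` is a subclass of Python's `float`, so the `mode` value serializes, while `numpy.int64` is not a subclass of `int` on Python 3, so `count` does not.

```python
import json


def to_builtin(obj):
    """Fallback for json.dump: convert numpy-style scalars to builtins.

    The json module calls this only for objects it cannot serialize itself.
    Assumption: such objects expose .item(), as numpy scalars do.
    """
    item = getattr(obj, "item", None)
    if callable(item):
        return item()
    # Anything else is still an error, reported the same way json does.
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


class FakeInt64:
    """Hypothetical stand-in for numpy.int64 so the sketch runs without numpy."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return int(self._value)


stats = {"count": FakeInt64(3183), "max": 2.8, "z": (-0.259, 0.905)}
encoded = json.dumps(stats, default=to_builtin)
print(encoded)
```

With the real dictionary, the call would presumably become `json.dump(station_stats, fp, default=to_builtin)`. This only works around the problem on the caller's side; it does not change what the `json` module itself accepts.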
#### Problem description
pandas statistics sometimes return numpy numeric types (for example `numpy.int64`) instead of Python builtins.
numpy ints are not supported by `json.dump`.
#### Expected Output
I expect ints, floats, strings, ... to be JSON serializable.
<details>
INSTALLED VERSIONS
------------------
commit : None
python : 3.7.3.final.0
python-bits : 64
OS : Darwin
OS-release : 15.6.0
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.0
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.12
pytest : 5.0.1
hypothesis : None
sphinx : 2.1.2
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : 1.0.1
pymysql : 0.9.3
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.7.0
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.1.0
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.0
sqlalchemy : 1.3.5
tables : 3.5.2
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8
</details>

| Date | User | Action | Args |
|---|---|---|---|
| 2019-08-26 20:49:43 | vlbrown | set | recipients: + vlbrown, pitrou, r.david.murray, njs, Eli_B, serhiy.storchaka, thomas-arildsen, Amit Feller |
| 2019-08-26 20:49:43 | vlbrown | set | messageid: <1566852583.58.0.852592338962.issue24313@roundup.psfhosted.org> |
| 2019-08-26 20:49:43 | vlbrown | link | issue24313 messages |
| 2019-08-26 20:49:42 | vlbrown | create | |