bug in dataframe.to_json causing segfault with 0.16.0 #9781

Closed
nevermindewe opened this issue Apr 1, 2015 · 6 comments · Fixed by #9805
Labels
Bug IO JSON read_json, to_json, json_normalize
Milestone

Comments

@nevermindewe

Calling to_json on an apparently empty DataFrame may segfault with 0.16.0.
The same DataFrame works with 0.15.2.

A git bisect shows that the problem first appeared in commit a67bef4.

segfault:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fd42aae8735 in PdBlock_iterBegin (_obj=0x7fd423d22410, tc=0x7fff1bf57310) at pandas/src/ujson/python/objToJSON.c:1057

(gdb) list
1052 // init a dedicated context for this column
1053 NpyArr_iterBegin(obj, tc);
1054 npyarr = GET_TC(tc)->npyarr;
1055
1056 // set the dataptr to our desired column and initialise
1057 npyarr->dataptr += npyarr->stride * idx;
1058 NpyArr_iterNext(obj, tc);
1059 GET_TC(tc)->itemValue = NULL;
1060 ((PyObjectEncoder*) tc->encoder)->npyCtxtPassthru = NULL;
1061

The following test code and ASCII pickle file reproduce the behavior:

import pickle

with open("p.pkl", "rb") as f:  # close the file once the frame is loaded
    p = pickle.load(f)
p.to_json("/tmp/foo.json", date_format='iso', orient='records')  # segfaults on 0.16.0

Below is the pickled DataFrame that causes the problem.
(I have been unable to construct another DataFrame that exhibits this behavior.)

ccopy_reg
_reconstructor
p0
(cpandas.core.frame
DataFrame
p1
c__builtin__
object
p2
Ntp3
Rp4
g0
(cpandas.core.internals
BlockManager
p5
g2
Ntp6
Rp7
((lp8
cpandas.core.index
_new_Index
p9
(cpandas.core.index
Index
p10
(dp11
S'data'
p12
cnumpy.core.multiarray
_reconstruct
p13
(cnumpy
ndarray
p14
(I0
tp15
S'b'
p16
tp17
Rp18
(I1
(I5
tp19
cnumpy
dtype
p20
(S'O8'
p21
I0
I1
tp22
Rp23
(I3
S'|'
p24
NNNI-1
I-1
I63
tp25
bI00
(lp26
S'symbol'
p27
aS'side'
p28
aS'shares'
p29
aS'entry_price'
p30
aS'entry_time'
p31
atp32
bsS'name'
p33
Nstp34
Rp35
ag9
(cpandas.core.index
Int64Index
p36
(dp37
g12
g13
(g14
(I0
tp38
g16
tp39
Rp40
(I1
(I0
tp41
g20
(S'i8'
p42
I0
I1
tp43
Rp44
(I3
S'<'
p45
NNNI-1
I-1
I0
tp46
bI00
S''
p47
tp48
bsg33
Nstp49
Rp50
a(lp51
g13
(g14
(I0
tp52
g16
tp53
Rp54
(I1
(I3
I0
tp55
g44
I00
g47
tp56
bag13
(g14
(I0
tp57
g16
tp58
Rp59
(I1
(I2
I0
tp60
g23
I00
(lp61
tp62
ba(lp63
g9
(g10
(dp64
g12
g13
(g14
(I0
tp65
g16
tp66
Rp67
(I1
(I3
tp68
g23
I00
(lp69
g28
ag29
ag31
atp70
bsg33
Nstp71
Rp72
ag9
(g10
(dp73
g12
g13
(g14
(I0
tp74
g16
tp75
Rp76
(I1
(I2
tp77
g23
I00
(lp78
g27
ag30
atp79
bsg33
Nstp80
Rp81
a(dp82
S'0.14.1'
p83
(dp84
S'axes'
p85
g8
sS'blocks'
p86
(lp87
(dp88
S'mgr_locs'
p89
g13
(g14
(I0
tp90
g16
tp91
Rp92
(I1
(I3
tp93
g44
I00
S'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00'
p94
tp95
bsS'values'
p96
g54
sa(dp97
g89
c__builtin__
slice
p98
(I0
I6
I3
tp99
Rp100
sg96
g59
sasstp101
bb.
@jreback
Contributor

jreback commented Apr 2, 2015

Please show the original frame and its programmatic construction. We need to establish that this is not a result of something else.

@jreback
Contributor

jreback commented Apr 2, 2015

cc @Komnomnomnom

@jreback jreback added Bug IO JSON read_json, to_json, json_normalize labels Apr 2, 2015
@nevermindewe
Author

I don't think I can show you its programmatic construction.
The code that does the construction is IP encumbered.
I'll check and see if I can get permission, but it is unlikely.

In [6]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 0 entries
Data columns (total 5 columns):
symbol 0 non-null object
side 0 non-null int64
shares 0 non-null int64
entry_price 0 non-null object
entry_time 0 non-null int64
dtypes: int64(3), object(2)
memory usage: 0.0+ bytes

I have not been able to construct a dataframe on my own that exhibits this problem.

@jreback
Contributor

jreback commented Apr 2, 2015

Here's a repro. Multi-dtype but empty columns.

In [1]: df = DataFrame({'A' : Series([],dtype='int64'), 'B' : Series([],dtype='object')})

In [2]: df.to_json()
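
The repro above can also be written as a standalone script. This is only a sketch: the explicit imports, the variable names, and the extra orient='records' call (which mirrors the call in the original report) are additions, not part of the session above. On 0.16.0 the to_json calls are expected to crash the interpreter; on a version with the fix they should return JSON strings.

```python
import json

import pandas as pd

# Two empty columns with different dtypes, so the frame holds more than
# one internal block even though it has no rows.
df = pd.DataFrame({
    'A': pd.Series([], dtype='int64'),
    'B': pd.Series([], dtype='object'),
})

# Default orient, as in the session above.
as_columns = df.to_json()

# orient='records' with ISO dates, as in the original report.
as_records = df.to_json(date_format='iso', orient='records')

# If both calls returned, check that the results are valid JSON.
print(json.loads(as_columns))
print(json.loads(as_records))
```

Running the script under gdb (gdb --args python script.py) is one way to get a backtrace like the one in the report if the crash still occurs.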

@jreback jreback added this to the 0.16.1 milestone Apr 2, 2015
@nevermindewe
Author

Excellent! Thanks!

@shoyer
Member

shoyer commented Apr 3, 2015

Fixed by #9805.

@shoyer shoyer closed this as completed Apr 3, 2015

3 participants