cannot output csv with IntervalIndex #28210
Comments
Thanks for the report. I can confirm this behavior on master:
In [2]: df = pd.DataFrame({'a': [1, 2, 3]}, index=pd.interval_range(0, 3))
In [3]: df.to_csv()
---------------------------------------------------------------------------
TypeError: Argument 'data_index' has incorrect type (expected numpy.ndarray, got list)
A temporary workaround is to cast the index to string prior to writing the output:
In [4]: df.index = df.index.astype(str)
In [5]: df.to_csv()
Out[5]: ',a\n"(0, 1]",1\n"(1, 2]",2\n"(2, 3]",3\n'
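The workaround above overwrites `df.index` in place. A minimal sketch of the same idea that leaves the original frame untouched, assuming it is acceptable to make a copy before writing (the names `out` and `csv_text` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=pd.interval_range(0, 3))

# Work on a copy so that df keeps its IntervalIndex.
out = df.copy()
out.index = out.index.astype(str)

# With no path argument, to_csv returns the CSV text as a string.
csv_text = out.to_csv()
print(csv_text)
```

After this runs, `df.index` is still an IntervalIndex and only `out` carries the string labels.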
In the example
is the expected output
or is that just a temporary workaround to avoid an error?
I think the string representation would be fine as the expected output. If a user wants a non-string output they can break the IntervalIndex out into |
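One possible non-string representation, assumed here purely for illustration, is to write the interval bounds as separate numeric columns. The column names `left` and `right` are hypothetical, and the closed side is not stored in the output, so it would have to be recorded separately if it matters.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]}, index=pd.interval_range(0, 3))

# Drop the IntervalIndex and add one plain column per bound.
flat = df.reset_index(drop=True)
flat.insert(0, "left", df.index.left.to_numpy())
flat.insert(1, "right", df.index.right.to_numpy())

# The closed side ('right' for interval_range by default) is not written out.
closed = df.index.closed

csv_text = flat.to_csv(index=False)
print(csv_text)
```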
Yes, that should be the expected output, or at the very least it is consistent with previous versions where this was working:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.23.4'
In [2]: df = pd.DataFrame({'a': [1, 2, 3]}, index=pd.interval_range(0, 3))
In [3]: df.to_csv()
Out[3]: ',a\n"(0, 1]",1\n"(1, 2]",2\n"(2, 3]",3\n'
Note that my example produces string output because I didn't pass anything for
I searched everywhere for how to read a CSV file with an IntervalIndex. Eventually I found a solution, so I want to share it here.
import pandas as pd

def to_interval(istr):
    # Parse a string such as "(0, 1]" into a pd.Interval.
    c_left = istr[0] == '['
    c_right = istr[-1] == ']'
    closed = {(True, False): 'left',
              (False, True): 'right',
              (True, True): 'both',
              (False, False): 'neither'
              }[c_left, c_right]
    left, right = map(float, istr[1:-1].split(','))
    return pd.Interval(left, right, closed)

# the IntervalIndex is the first column
df = pd.read_csv('data.csv', index_col=0, converters={0: to_interval})
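To check the parsing end to end, here is a small round-trip sketch. It differs from the snippet above in two ways that are assumptions of this example, not part of the original solution: it uses an in-memory `io.StringIO` buffer instead of `data.csv`, and it reads the index as plain strings and then builds the `pd.IntervalIndex` explicitly rather than relying on `converters`.

```python
import io

import pandas as pd


def to_interval(istr):
    """Parse a string such as "(0, 1]" into a pd.Interval."""
    closed = {
        (True, False): "left",
        (False, True): "right",
        (True, True): "both",
        (False, False): "neither",
    }[istr[0] == "[", istr[-1] == "]"]
    left, right = map(float, istr[1:-1].split(","))
    return pd.Interval(left, right, closed)


# Write a frame using the string-cast workaround from earlier in the thread.
original = pd.DataFrame({"a": [1, 2, 3]}, index=pd.interval_range(0, 3))
as_text = original.copy()
as_text.index = as_text.index.astype(str)
buffer = io.StringIO(as_text.to_csv())

# Read the first column back as string labels, then convert each label.
restored = pd.read_csv(buffer, index_col=0)
restored.index = pd.IntervalIndex([to_interval(s) for s in restored.index])

print(restored)
```

Because the bounds are parsed with `float`, the restored intervals have floating-point endpoints even though the original ones were integers.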
Code Sample, a copy-pastable example if possible
Using pd.interval_range:
Using pd.IntervalIndex.from_arrays:
Problem description
Expected Output
Output of pd.show_versions()