Supplying an xarray Dataset to DataFrame constructor breaks #12353
Comments
you have to call this never worked before. |
oh, this inherits from |
TBC, |
ok #12356 fixes for DataFrame, can you give me examples that should work for |
Great! Thanks. First, Series: https://github.com/pydata/pandas/blob/master/pandas/core/series.py#L166
In [12]: series = pd.Series(range(10))
In [13]: series
Out[13]:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
dtype: int64
In [14]: xr.Dataset(series)
Out[14]:
<xarray.Dataset>
Dimensions: ()
Coordinates:
*empty*
Data variables:
0 int64 0
1 int64 1
2 int64 2
3 int64 3
4 int64 4
5 int64 5
6 int64 6
7 int64 7
8 int64 8
9 int64 9
In [15]: pd.Series(xr.Dataset(series))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-15-60e12a391f83> in <module>()
----> 1 pd.Series(xr.Dataset(series))
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
223 else:
224 data = _sanitize_array(data, index, dtype, copy,
--> 225 raise_cast_failure=True)
226
227 data = SingleBlockManager(data, index, fastpath=True)
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
2855
2856 # scalar like
-> 2857 if subarr.ndim == 0:
2858 if isinstance(data, list): # pragma: no cover
2859 subarr = np.array(data, dtype=object)
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/xarray/core/common.py in __getattr__(self, name)
135 return source[name]
136 raise AttributeError("%r object has no attribute %r" %
--> 137 (type(self).__name__, name))
138
139 def __setattr__(self, name, value):
AttributeError: 'Dataset' object has no attribute 'ndim' |
Panel: https://github.com/pydata/pandas/blob/master/pandas/core/panel.py#L160
In [18]: xr.Dataset(pd.Panel(pd.np.random.rand(5,4,3)))
Out[18]:
<xarray.Dataset>
Dimensions: (dim_0: 4, dim_1: 3)
Coordinates:
* dim_0 (dim_0) int64 0 1 2 3
* dim_1 (dim_1) int64 0 1 2
Data variables:
0 (dim_0, dim_1) float64 0.8917 0.4159 0.6102 0.2616 0.2068 ...
1 (dim_0, dim_1) float64 0.4132 0.7464 0.6103 0.7006 0.8255 0.63 ...
2 (dim_0, dim_1) float64 0.7507 0.8742 0.1039 0.2819 0.06264 ...
3 (dim_0, dim_1) float64 0.3035 0.3156 0.8926 0.0023 0.05565 ...
4 (dim_0, dim_1) float64 0.6555 0.8872 0.04457 0.7503 0.8936 ...
In [19]: pd.Panel(xr.Dataset(pd.Panel(pd.np.random.rand(5,4,3))))
---------------------------------------------------------------------------
PandasError Traceback (most recent call last)
<ipython-input-19-ce579b9d3522> in <module>()
----> 1 pd.Panel(xr.Dataset(pd.Panel(pd.np.random.rand(5,4,3))))
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/panel.py in __init__(self, data, items, major_axis, minor_axis, copy, dtype)
133 copy=False, dtype=None):
134 self._init_data(data=data, items=items, major_axis=major_axis,
--> 135 minor_axis=minor_axis, copy=copy, dtype=dtype)
136
137 def _init_data(self, data, copy, dtype, **kwargs):
/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/panel.py in _init_data(self, data, copy, dtype, **kwargs)
173 copy = False
174 else: # pragma: no cover
--> 175 raise PandasError('Panel constructor not properly called!')
176
177 NDFrame.__init__(self, mgr, axes=axes, copy=copy, dtype=dtype)
PandasError: Panel constructor not properly called! |
the
|
This goes a bit deeper than I thought, but bear with me:
So my proposed changes are to do both of those - does that seem reasonable? ref pydata/xarray#740 on going the other direction |
pls update when you can |
The Mapping stuff is easy, I can do that. That covers
|
what is a dict with 0th dims? |
In [42]: ds = xr.Dataset(dict(zip(list('abcde'), range(4))))
In [43]: ds
Out[43]:
<xarray.Dataset>
Dimensions: ()
Coordinates:
*empty*
Data variables:
b int64 1
d int64 3
c int64 2
a int64 0
In [44]: list(dict(ds).values())[0]
Out[44]:
<xarray.DataArray 'b' ()>
array(1)
In [45]: list(dict(ds).values())[0].ndim
Out[45]: 0
Each of |
@jreback As discussed, we're using Should pandas objects be dict-like? I think that may break some code (i.e. in a Series constructor, supplying a Series is not like supplying a dictionary, given metadata). I think ideally pandas objects should be dict-like (and potentially inherit from Do you agree? |
no. a |
it's not strictly necessary to inherit from |
OK great. I may have to fix some more issues then, where
Can you clarify? That we check for |
no I think
|
As covered here: #12056, you do need to inherit from Other collections.abc objects, like I think it would be good to have that inheritance - do you agree? |
yes I don't see a problem here |
It would be an API change though. Once you inherit from |
actually that's a good point |
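To illustrate why inheriting from an abstract base class can count as an API change, here is a minimal sketch. It assumes the base class under discussion is `collections.abc.Mapping`; the classes `LabeledValues` and `LabeledValuesMapping` and the function `classify` are hypothetical stand-ins, not pandas code. Downstream code that branches on `isinstance(obj, Mapping)` starts treating the object as a dict once the inheritance is added, and the object also gains mixin methods such as `keys()` and `get()`.

```python
from collections.abc import Mapping


class LabeledValues:
    """Hypothetical series-like container: labels plus values, no Mapping base."""

    def __init__(self, labels, values):
        self._data = dict(zip(labels, values))

    def __getitem__(self, label):
        return self._data[label]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class LabeledValuesMapping(LabeledValues, Mapping):
    """Same container, now also inheriting from Mapping (the assumed change)."""


def classify(obj):
    """Hypothetical downstream code that branches on Mapping."""
    return "dict-like" if isinstance(obj, Mapping) else "other"


plain = LabeledValues("ab", [1, 2])
mapped = LabeledValuesMapping("ab", [1, 2])

print(classify(plain))    # takes the non-Mapping branch
print(classify(mapped))   # takes the Mapping branch
print(hasattr(plain, "keys"), hasattr(mapped, "keys"))
```

The two containers hold identical data, yet existing callers would handle them differently, which is the sense in which adding the base class changes observable behaviour.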
@MaximilianR @jreback status of this issue? |
#12400 is half-finished; I'll try and finish it up in the next couple of weeks |
In sanitize_array we check for a |
Agreed, since a |
I think this is because it looks for dict rather than 'dict-like' or Mapping: https://github.com/pydata/pandas/blob/master/pandas/core/frame.py#L222
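The difference between checking for `dict` and checking for `Mapping` can be sketched in plain Python. This does not reproduce the pandas constructor: `ReadOnlyStore`, `columns_strict`, and `columns_dict_like` are hypothetical names used only to show how a dict-like object that is not a `dict` subclass falls through a strict `isinstance(data, dict)` branch.

```python
from collections.abc import Mapping


class ReadOnlyStore(Mapping):
    """Hypothetical dict-like container that is not a dict subclass."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def columns_strict(data):
    """Stand-in for a constructor branch that only recognises real dicts."""
    if isinstance(data, dict):
        return list(data.keys())
    raise TypeError("constructor not properly called")


def columns_dict_like(data):
    """Same branch, but accepting anything that implements Mapping."""
    if isinstance(data, Mapping):
        return list(data.keys())
    raise TypeError("constructor not properly called")


store = ReadOnlyStore({"a": [1, 2], "b": [3, 4]})

print(columns_dict_like(store))  # the Mapping check accepts the store

try:
    columns_strict(store)
except TypeError as exc:
    print("strict check rejected the store:", exc)
```

A plain `dict` passes both checks, so widening the test to `Mapping` only changes the outcome for dict-like objects that are not `dict` subclasses.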