ExtensionArray series.align(frame) not working #20576

Closed
jorisvandenbossche opened this issue Apr 2, 2018 · 0 comments
Labels
ExtensionArray Extending pandas with custom dtypes or arrays.
@jorisvandenbossche (Member)

Aligning a Series that holds ExtensionArray data with a DataFrame raises a TypeError when the two indexes differ, i.e. when there is actually something to align:

In [27]: from pandas.tests.extension.decimal.array import DecimalArray, make_data

In [28]: dec_arr = DecimalArray(make_data()[:3])

In [29]: s = pd.Series(dec_arr)

In [30]: s
Out[30]: 
0    0.29242561210243966929311909552779980003833770...
1    0.34798224977276304148432473084540106356143951...
2    0.04963128775050906771326708621927537024021148...
dtype: decimal

In [31]: frame = pd.DataFrame({'col': np.arange(3)})

In [32]: s.align(frame)
Out[32]: 
(0    0.29242561210243966929311909552779980003833770...
 1    0.34798224977276304148432473084540106356143951...
 2    0.04963128775050906771326708621927537024021148...
 dtype: decimal,    col
 0    0
 1    1
 2    2)

In [33]: frame = pd.DataFrame({'col': np.arange(4)})

In [34]: s.align(frame)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-94a43a65d337> in <module>()
----> 1 s.align(frame)

/home/joris/scipy/pandas/pandas/core/series.py in align(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis, broadcast_axis)
   3231                                          fill_value=fill_value, method=method,
   3232                                          limit=limit, fill_axis=fill_axis,
-> 3233                                          broadcast_axis=broadcast_axis)
   3234 
   3235     def rename(self, index=None, **kwargs):

/home/joris/scipy/pandas/pandas/core/generic.py in align(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis, broadcast_axis)
   7174                                      copy=copy, fill_value=fill_value,
   7175                                      method=method, limit=limit,
-> 7176                                      fill_axis=fill_axis)
   7177         elif isinstance(other, Series):
   7178             return self._align_series(other, join=join, axis=axis, level=level,

/home/joris/scipy/pandas/pandas/core/generic.py in _align_frame(self, other, join, axis, level, copy, fill_value, method, limit, fill_axis)
   7210         left = self._reindex_with_indexers(reindexers, copy=copy,
   7211                                            fill_value=fill_value,
-> 7212                                            allow_dups=True)
   7213         # other must be always DataFrame
   7214         right = other._reindex_with_indexers({0: [join_index, iridx],

/home/joris/scipy/pandas/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
   3812                                                 fill_value=fill_value,
   3813                                                 allow_dups=allow_dups,
-> 3814                                                 copy=copy)
   3815 
   3816         if copy and new_data is self._data:

/home/joris/scipy/pandas/pandas/core/internals.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
   4377         if axis == 0:
   4378             new_blocks = self._slice_take_blocks_ax0(indexer,
-> 4379                                                      fill_tuple=(fill_value,))
   4380         else:
   4381             new_blocks = [blk.take_nd(indexer, axis=axis, fill_tuple=(

/home/joris/scipy/pandas/pandas/core/internals.py in _slice_take_blocks_ax0(self, slice_or_indexer, fill_tuple)
   4411             elif not allow_fill or self.ndim == 1:
   4412                 if allow_fill and fill_tuple[0] is None:
-> 4413                     _, fill_value = maybe_promote(blk.dtype)
   4414                     fill_tuple = (fill_value, )
   4415 

/home/joris/scipy/pandas/pandas/core/dtypes/cast.py in maybe_promote(dtype, fill_value)
    334     elif is_datetimetz(dtype):
    335         pass
--> 336     elif issubclass(np.dtype(dtype).type, string_types):
    337         dtype = np.object_
    338 

TypeError: data type not understood
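
The traceback ends in maybe_promote, where np.dtype(dtype) is called on the extension dtype. NumPy cannot interpret an arbitrary dtype object, so the call raises TypeError. Below is a minimal sketch of that failure mode and of one possible guard. FakeDecimalDtype and numpy_scalar_type_or_none are hypothetical names used for illustration only; neither is pandas API, and this is not necessarily how pandas resolved the issue.

```python
import decimal

import numpy as np


class FakeDecimalDtype:
    """Hypothetical stand-in for an extension dtype; not a pandas class."""

    name = "decimal"
    type = decimal.Decimal


def numpy_scalar_type_or_none(dtype):
    """Return the NumPy scalar type for ``dtype``.

    Returns None when NumPy cannot interpret ``dtype`` at all, which is
    the situation the unguarded ``np.dtype(dtype)`` call runs into.
    """
    try:
        return np.dtype(dtype).type
    except TypeError:
        # NumPy does not know this object as a data type.
        return None


# A NumPy-native dtype string resolves to a NumPy scalar type.
print(numpy_scalar_type_or_none("int64"))

# The stand-in extension dtype is not understood, so the guard yields None.
print(numpy_scalar_type_or_none(FakeDecimalDtype()))
```

A caller could treat the None result as "not a NumPy dtype" and skip the NumPy-specific promotion branch instead of failing.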

The other alignment combinations (frame with series, series with series) work correctly:

In [36]: frame.align(s, axis=0)
Out[36]: 
(   col
 0    0
 1    1
 2    2
 3    3, 0    0.29242561210243966929311909552779980003833770...
 1    0.34798224977276304148432473084540106356143951...
 2    0.04963128775050906771326708621927537024021148...
 3                                                  NaN
 dtype: decimal)

In [37]: s.align(frame['col'])
Out[37]: 
(0    0.29242561210243966929311909552779980003833770...
 1    0.34798224977276304148432473084540106356143951...
 2    0.04963128775050906771326708621927537024021148...
 3                                                  NaN
 dtype: decimal, 0    0
 1    1
 2    2
 3    3
 Name: col, dtype: int64)
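
For reference, the result expected from s.align(frame) would presumably mirror the two working cases above: both sides end up on the union of the two indexes, and positions missing on one side are filled. The following is a pure-Python sketch of that outer-join behaviour, using plain dicts as stand-ins for indexed data. outer_align is a hypothetical helper, not pandas API, and it fills with None where pandas would use a missing-value marker such as NaN.

```python
def outer_align(left, right, fill_value=None):
    """Align two label -> value mappings on the union of their labels.

    Labels missing from one side are filled with ``fill_value``.
    """
    index = sorted(set(left) | set(right))
    left_aligned = {label: left.get(label, fill_value) for label in index}
    right_aligned = {label: right.get(label, fill_value) for label in index}
    return left_aligned, right_aligned


# Three values on the left, four rows on the right, as in the report.
series_like = {0: "a", 1: "b", 2: "c"}
column_like = {0: 0, 1: 1, 2: 2, 3: 3}

aligned_left, aligned_right = outer_align(series_like, column_like)
print(aligned_left)
print(aligned_right)
```

The left side gains label 3 with the fill value, which corresponds to the NaN row in the working frame.align(s, axis=0) output.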