Skip to content

BUG: Transpose fails on DataFrame with timestamps having timezone #17539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pratapvardhan opened this issue Sep 15, 2017 · 4 comments
Closed
Labels
Datetime Datetime data dtype Duplicate Report Duplicate issue or pull request Reshaping Concat, Merge/Join, Stack/Unstack, Explode Timezones Timezone data dtype

Comments

@pratapvardhan
Copy link
Contributor

pratapvardhan commented Sep 15, 2017

Code

In [33]: df = pd.DataFrame({'date1': [Timestamp('2017-05-25 14:02:23'), NaT],
             'date2': [Timestamp('2017-05-25 14:34:43'), Timestamp('2017-05-16 19:37:43')]})

In [34]: df2 = df.apply(lambda x: x.dt.tz_localize('UTC'), axis=0)

In [35]: df3 = df2.assign(date1=df2.date2)

In [36]: df
Out[36]:
                date1               date2
0 2017-05-25 14:02:23 2017-05-25 14:34:43
1                 NaT 2017-05-16 19:37:43

In [37]: df2
Out[37]:
                      date1                     date2
0 2017-05-25 14:02:23+00:00 2017-05-25 14:34:43+00:00
1                       NaT 2017-05-16 19:37:43+00:00

In [38]: df3
Out[38]:
                      date1                     date2
0 2017-05-25 14:34:43+00:00 2017-05-25 14:34:43+00:00
1 2017-05-16 19:37:43+00:00 2017-05-16 19:37:43+00:00

In [39]: df.dtypes
Out[39]:
date1    datetime64[ns]
date2    datetime64[ns]
dtype: object

In [40]: df2.dtypes
Out[40]:
date1    datetime64[ns, UTC]
date2    datetime64[ns, UTC]
dtype: object

In [41]: df3.dtypes
Out[41]:
date1    datetime64[ns, UTC]
date2    datetime64[ns, UTC]
dtype: object

In [42]: df.T
Out[42]:
                        0                   1
date1 2017-05-25 14:02:23                 NaT
date2 2017-05-25 14:34:43 2017-05-16 19:37:43

In [43]: df2.T
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-43-7d2317a1beae> in <module>()
----> 1 df2.T

e:\github\pandas\pandas\core\frame.pyc in transpose(self, *args, **kwargs)
   1876         """Transpose index and columns"""
   1877         nv.validate_transpose(args, dict())
-> 1878         return super(DataFrame, self).transpose(1, 0, **kwargs)
   1879
   1880     T = property(transpose)

e:\github\pandas\pandas\core\generic.pyc in transpose(self, *args, **kwargs)
    600
    601         nv.validate_transpose_for_generic(self, kwargs)
--> 602         return self._constructor(new_values, **new_axes).__finalize__(self)
    603
    604     def swapaxes(self, axis1, axis2, copy=True):

e:\github\pandas\pandas\core\frame.pyc in __init__(self, data, index, columns, dtype, copy)
    350             else:
    351                 mgr = self._init_ndarray(data, index, columns, dtype=dtype,
--> 352                                          copy=copy)
    353         elif isinstance(data, (list, types.GeneratorType)):
    354             if isinstance(data, types.GeneratorType):

e:\github\pandas\pandas\core\frame.pyc in _init_ndarray(self, values, index, columns, dtype, copy)
    522             values = maybe_infer_to_datetimelike(values)
    523
--> 524         return create_block_manager_from_blocks([values], [columns, index])
    525
    526     @property

e:\github\pandas\pandas\core\internals.pyc in create_block_manager_from_blocks(blocks, axes)
   4378                                      placement=slice(0, len(axes[0])))]
   4379
-> 4380         mgr = BlockManager(blocks, axes)
   4381         mgr._consolidate_inplace()
   4382         return mgr

e:\github\pandas\pandas\core\internals.pyc in __init__(self, blocks, axes, do_integrity_check, fastpath)
   2880                     raise AssertionError('Number of Block dimensions (%d) '
   2881                                          'must equal number of axes (%d)' %
-> 2882                                          (block.ndim, self.ndim))
   2883
   2884         if do_integrity_check:

AssertionError: Number of Block dimensions (1) must equal number of axes (2)

Problem description

Transpose on dataframe with timestamps columns work. But, if columns consist of timestamps with time-zone, it fails.

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 2.7.12.final.0 python-bits: 64 OS: Windows OS-release: 7 machine: AMD64 processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel byteorder: little LC_ALL: None LANG: None LOCALE: None.None

pandas: 0.21.0.dev+413.g7f93d2d
pytest: 3.2.0
pip: 9.0.1
setuptools: 36.2.7
Cython: 0.24.1
numpy: 1.12.1
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: 0.999999999
sqlalchemy: 1.0.13
pymysql: 0.7.9.None
psycopg2: 2.7.3.1 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@pratapvardhan
Copy link
Contributor Author

pratapvardhan commented Sep 15, 2017

Another case, where error indicates ValueError: the 'axes' parameter is not supported in the pandas implementation of transpose(), this operation works fine on non-timezone set timestamp columns.

In [64]: pd.DataFrame(pd.date_range(pd.to_datetime('now'), periods=2, freq='D'))
Out[64]:
                    0
0 2017-09-15 10:49:12
1 2017-09-16 10:49:12

In [65]: pd.DataFrame(pd.date_range(pd.to_datetime('now'), periods=2, freq='D', tz='UTC'))
Out[65]:
                          0
0 2017-09-15 10:49:14+00:00
1 2017-09-16 10:49:14+00:00

In [66]: pd.DataFrame(pd.date_range(pd.to_datetime('now'), periods=2, freq='D')).T
Out[66]:
                    0                   1
0 2017-09-15 10:49:16 2017-09-16 10:49:16

In [67]: pd.DataFrame(pd.date_range(pd.to_datetime('now'), periods=2, freq='D', tz='UTC')).T
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-3ba6caaa2c38> in <module>()
----> 1 pd.DataFrame(pd.date_range(pd.to_datetime('now'), periods=2, freq='D', tz='UTC')).T

e:\github\pandas\pandas\core\frame.pyc in transpose(self, *args, **kwargs)
   1876         """Transpose index and columns"""
   1877         nv.validate_transpose(args, dict())
-> 1878         return super(DataFrame, self).transpose(1, 0, **kwargs)
   1879
   1880     T = property(transpose)

e:\github\pandas\pandas\core\generic.pyc in transpose(self, *args, **kwargs)
    595         new_axes = self._construct_axes_dict_from(self, [self._get_axis(x)
    596                                                          for x in axes_names])
--> 597         new_values = self.values.transpose(axes_numbers)
    598         if kwargs.pop('copy', None) or (len(args) and args[-1]):
    599             new_values = new_values.copy()

e:\github\pandas\pandas\core\base.pyc in transpose(self, *args, **kwargs)
    782     def transpose(self, *args, **kwargs):
    783         """ return the transpose, which is by definition self """
--> 784         nv.validate_transpose(args, kwargs)
    785         return self
    786

e:\github\pandas\pandas\compat\numpy\function.pyc in __call__(self, args, kwargs, fname, max_fname_arg_count, method)
     52                 validate_args_and_kwargs(fname, args, kwargs,
     53                                          max_fname_arg_count,
---> 54                                          self.defaults)
     55             else:
     56                 raise ValueError("invalid validation method "

e:\github\pandas\pandas\util\_validators.pyc in validate_args_and_kwargs(fname, args, kwargs, max_fname_arg_count, compat_args)
    215
    216     kwargs.update(args_dict)
--> 217     validate_kwargs(fname, kwargs, compat_args)
    218
    219

e:\github\pandas\pandas\util\_validators.pyc in validate_kwargs(fname, kwargs, compat_args)
    154     kwds = kwargs.copy()
    155     _check_for_invalid_keys(fname, kwargs, compat_args)
--> 156     _check_for_default_values(fname, kwds, compat_args)
    157
    158

e:\github\pandas\pandas\util\_validators.pyc in _check_for_default_values(fname, arg_val_dict, compat_args)
     66                               "supported in the pandas "
     67                               "implementation of {fname}()".
---> 68                               format(fname=fname, arg=key)))
     69
     70

ValueError: the 'axes' parameter is not supported in the pandas implementation of transpose()

@jreback
Copy link
Contributor

jreback commented Sep 15, 2017

this is a manifestation and duplicate of #13287

@jreback jreback closed this as completed Sep 15, 2017
@jreback
Copy link
Contributor

jreback commented Sep 15, 2017

pull requests to fix are welcome

@jreback jreback added Duplicate Report Duplicate issue or pull request Reshaping Concat, Merge/Join, Stack/Unstack, Explode Datetime Datetime data dtype Timezones Timezone data dtype labels Sep 15, 2017
@jreback jreback added this to the No action milestone Sep 15, 2017
@pratapvardhan
Copy link
Contributor Author

Ah, thanks, apologies didn't search enough. Is that issue also causing max(axis=1) on tz-aware timestamps to goof up?

In [108]: pd.DataFrame(pd.date_range('2014', periods=2, freq='D')).max(axis=1)
Out[108]:
0   2014-01-01
1   2014-01-02
dtype: datetime64[ns]

In [109]: pd.DataFrame(pd.date_range('2014', periods=2, freq='D', tz='UTC')).max(axis=1)
Out[109]:
0   NaN
1   NaN
dtype: float64

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Duplicate Report Duplicate issue or pull request Reshaping Concat, Merge/Join, Stack/Unstack, Explode Timezones Timezone data dtype
Projects
None yet
Development

No branches or pull requests

2 participants