Inconsistent Behavior of min on Timestamps w/ and w/o timezones #5967

Closed
cancan101 opened this issue Jan 16, 2014 · 4 comments
Labels
Bug; Dtype Conversions (unexpected or buggy dtype conversions); Numeric Operations (arithmetic, comparison, and logical operations); Timezones (timezone data dtype)

Comments

@cancan101
Contributor

Related to #4147.

Without a tz, min skips NaT:

In [46]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1")]).min()
Out[46]: Timestamp('2013-01-01 00:00:00', tz=None)

With a tz, however, min returns NaT:

In [48]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")]).min()
Out[48]: NaT

Strangely, max works:

In [60]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")]).max()
Out[60]: Timestamp('2013-01-01 00:00:00-0500', tz='US/Eastern')
@jreback
Contributor

jreback commented Jan 16, 2014

This has the same cause as #4147: these Series are object dtype.
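
To make the object-dtype point concrete, here is a minimal sketch. It assumes a recent pandas version; forcing dtype=object only mimics the inference described in this thread and is not what current pandas infers by default. The conversion through pd.to_datetime is one possible way to get a datetime dtype back, not necessarily the fix applied in pandas itself.

```python
import pandas as pd

# Force object dtype to mimic how the mixed NaT / tz-aware Series
# was stored when this issue was reported (assumption for illustration).
obj = pd.Series(
    [pd.NaT, pd.Timestamp("2013-01-01", tz="US/Eastern")],
    dtype=object,
)
print(obj.dtype)  # object

# Converting to a real datetime dtype lets reductions treat NaT as missing.
converted = pd.to_datetime(obj)
print(converted.dtype)
print(converted.min())
```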

@jreback
Contributor

jreback commented Jan 16, 2014

Related: the string representation of this kind of mixed Series is wrong (NaT is displayed as NaN):

In [7]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")])
Out[7]: 
0                          NaN
1    2013-01-01 00:00:00-05:00
dtype: object

In [8]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")]).iloc[0]
Out[8]: NaT

In [9]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")]).iloc[1]
Out[9]: Timestamp('2013-01-01 00:00:00-0500', tz='US/Eastern')

@cancan101
Contributor Author

@jreback The NaT formatting issue you mention for object arrays is caused by GenericArrayFormatter accepting only one value for na_rep, which defaults to NaN. When formatting, that value is used as follows:

            if self.na_rep is not None and lib.checknull(x):
                if x is None:
                    return 'None'
                return self.na_rep

One option would be to add another argument nat_rep.
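
A rough sketch of what a separate nat_rep could look like. The function format_value and its nat_rep parameter are hypothetical names for illustration only; this is not the GenericArrayFormatter code and not an existing pandas API.

```python
import pandas as pd

def format_value(x, na_rep="NaN", nat_rep="NaT"):
    """Hypothetical helper: format one element of an object array."""
    if x is None:
        return "None"      # same special case as the snippet above
    if x is pd.NaT:
        return nat_rep     # NaT gets its own representation
    if pd.isna(x):
        return na_rep      # other missing values, e.g. float NaN
    return str(x)

values = [pd.NaT, float("nan"), None,
          pd.Timestamp("2013-01-01", tz="US/Eastern")]
formatted = [format_value(v) for v in values]
print(formatted)
```

Checking for NaT before the generic null check matters here, because pd.isna also returns True for NaT.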

jreback modified the milestones: 0.15.0, 0.14.0 (Feb 18, 2014)
jreback modified the milestones: 0.16.0, Next Major Release (Mar 1, 2015)
@jorisvandenbossche
Member

In the meantime, this works as expected:

In [9]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")])
Out[9]: 
0                         NaT
1   2013-01-01 00:00:00-05:00
dtype: datetime64[ns, US/Eastern]

In [10]: pd.Series([pd.NaT, pd.Timestamp("2013-1-1", tz="US/Eastern")]).min()
Out[10]: Timestamp('2013-01-01 00:00:00-0500', tz='US/Eastern')

Probably because timezone-aware datetime is now a proper dtype and no longer object.
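
A small check along these lines, assuming a pandas version with the tz-aware datetime dtype, can confirm that min and max now behave the same with and without a tz:

```python
import pandas as pd

naive = pd.Series([pd.NaT, pd.Timestamp("2013-01-01")])
aware = pd.Series([pd.NaT, pd.Timestamp("2013-01-01", tz="US/Eastern")])

# The tz-aware Series should be inferred as a datetime dtype, not object.
print(aware.dtype)
is_tz_dtype = isinstance(aware.dtype, pd.DatetimeTZDtype)

# Reductions skip NaT by default (skipna=True) in both cases.
naive_min = naive.min()
aware_min = aware.min()
aware_max = aware.max()
print(naive_min, aware_min, aware_max)
```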

jorisvandenbossche modified the milestones: No action, Next Major Release (Nov 11, 2016)