Fix bug in contains when looking up a string in a non-monotonic datet… #13574

Closed
wants to merge 45 commits into from

@tjader tjader commented Jul 6, 2016

…ime index and the object in question is first in the index.
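A minimal sketch of the scenario described in the title, assuming a current pandas installation; the variable names `first_found` and `later_found` are illustrative and do not come from the PR. It checks a string key against a non-monotonic `DatetimeIndex`, once for the first element and once for a later one.

```python
import pandas as pd

# Non-monotonic index; '2015-01-03' is its first element (position 0).
non_monotonic = pd.to_datetime(
    ['2015-01-03', '2015-01-01', '2015-01-04', '2015-01-05', '2015-01-02']
)

# The lookup this PR targets: a string key matching the first element.
first_found = '2015-01-03' in non_monotonic

# For comparison: a string key matching a later element.
later_found = '2015-01-01' in non_monotonic

print(first_found, later_found)
```

On a pandas version without the bug, both checks should agree and report the value as present.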

@sinhrks sinhrks added Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves labels Jul 6, 2016
@@ -721,6 +721,17 @@ def test_fillna_datetime64(self):
dtype=object)
self.assert_index_equal(idx.fillna('x'), exp)

def test_contains(self):
Member

Can you also add tests for TimedeltaIndex and PeriodIndex? (It looks like they work.)

Contributor

You just need to move this to the Datetimelike class, which tests all datetimelike indexes.

Contributor

Further, let's add tests with Timestamp and datetime as well (and a case where the value is not present). Finally, add a NaT to the index being searched, as well as tests that check for it with `in`.
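The cases requested in this comment could be sketched roughly as follows. This is an illustration under assumptions, not the test that was added to pandas: the names `present_keys`, `missing_keys`, and `idx_with_nat` are hypothetical, and the sketch covers `DatetimeIndex` only.

```python
from datetime import datetime

import pandas as pd

dates = ['2015-01-03', '2015-01-01', '2015-01-04', '2015-01-05', '2015-01-02']
idx = pd.to_datetime(dates)  # non-monotonic DatetimeIndex

# The same date expressed as a string, a Timestamp, and a datetime.
present_keys = ['2015-01-03', pd.Timestamp('2015-01-03'), datetime(2015, 1, 3)]
# A date that does not occur in the index, in the same three forms.
missing_keys = ['2015-01-06', pd.Timestamp('2015-01-06'), datetime(2015, 1, 6)]

present_results = [key in idx for key in present_keys]
missing_results = [key in idx for key in missing_keys]

# An index that itself contains NaT, checked with the `in` operator.
idx_with_nat = idx.insert(0, pd.NaT)
nat_found = pd.NaT in idx_with_nat

print(present_results, missing_results, nat_found)
```

Keeping the present and missing keys in parallel lists makes it easy to see that every key type is exercised for both outcomes.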

closes pandas-dev#12839

Author: adneu <[email protected]>

Closes pandas-dev#13316 from adneu/12839 and squashes the following commits:

16f5cd3 [adneu] Name change
ac1851a [adneu] Added docstrings/comments, and new tests.
4d73cbf [adneu] Updated tests
9b75df4 [adneu] BUG: Groupby.nth includes group key inconsistently pandas-dev#12839
#GH13572
dates = ['2015-01-03', '2015-01-01', '2015-01-04', '2015-01-05', '2015-01-02']
monotonic = pd.to_datetime(sorted(dates))
non_monotonic = pd.to_datetime(['2015-01-03', '2015-01-01', '2015-01-04', '2015-01-05', '2015-01-02'])
Contributor

You can reuse dates here in the to_datetime call.
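One way to read this suggestion, as a sketch rather than the final committed code: build both indexes from the single `dates` list so the literal is not repeated.

```python
import pandas as pd

dates = ['2015-01-03', '2015-01-01', '2015-01-04', '2015-01-05', '2015-01-02']

# Sorted copy gives the monotonic index; the list as written gives the
# non-monotonic one, so the date literals appear only once.
monotonic = pd.to_datetime(sorted(dates))
non_monotonic = pd.to_datetime(dates)

print(monotonic.is_monotonic_increasing, non_monotonic.is_monotonic_increasing)
```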

Author

How is NaT supposed to work with period indices?
The only NaT-like object I can successfully test with is this one

>>> pd.Period('NaT', freq='D') in pd.period_range('2015-01-01', periods=5, freq='D').insert(0, pd.NaT)
True

Each of pd.NaT, None, float('nan') and np.nan produces False, although they all work on DatetimeIndex objects that contain NaT.

Member

Thanks for confirming. Please leave out the other NaT-likes (pd.NaT, None, float('nan'), np.nan) for now, because PeriodIndex does not fully support them yet (I'll cover that in #12759, but a PR is appreciated).

Author

@tjader tjader Jul 7, 2016

Hmm, TimedeltaIndex objects behave oddly as well: checking for NaT-like objects always returns True.

>>> None in pd.to_timedelta(range(5), unit='d') + pd.offsets.Hour(1)
True

Looks like pandas.index.TimedeltaEngine.get_loc doesn't handle NaT at all. It's probably possible to patch it on a higher level, but instinctively I feel that is where it should be handled.
Any thoughts?

Author

@sinhrks Thanks for clarifying. I've added #13582 for you.

Author

I've taken out the timedelta issues as well, and created a new PR #13603.

jorisvandenbossche and others added 5 commits July 8, 2016 17:08
 - [x] closes pandas-dev#10690
 - [x] tests added / passed
 - [x] passes ``git diff upstream/master | flake8 --diff``
 - [x] whatsnew entry

The Datetime64Formatter class did not accept a `formatter` argument, so
custom formatters passed in through `df.to_string` or `df.to_html`
were silently ignored.

Author: Haleemur Ali <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Joris Van den Bossche <[email protected]>

Closes pandas-dev#13567 from haleemur/fix/dt64_outputformat and squashes the following commits:

8d84283 [Haleemur Ali] fix bug in Datetime64Formatter, which affected custom date formatted output for df.to_string, df.to_html methods
@codecov-io
codecov-io commented Jul 9, 2016

Current coverage is 84.31%

Merging #13574 into master will decrease coverage by 0.03%

@@             master     #13574   diff @@
==========================================
  Files           138        138          
  Lines         51126      51157    +31   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43122      43132    +10   
- Misses         8004       8025    +21   
  Partials          0          0          

Powered by Codecov. Last updated by d38ee27...3c202b1

sinhrks and others added 8 commits July 10, 2016 17:02
closes pandas-dev#13078

Author: sinhrks <[email protected]>

Closes pandas-dev#13581 from sinhrks/dti_period_error and squashes the following commits:

c957541 [sinhrks] BUG: DatetimeIndex - Period shows ununderstandable error
Title is self-explanatory.  Closes pandas-dev#13352.

Author: gfyoung <[email protected]>

Closes pandas-dev#13425 from gfyoung/to-numeric-enhance and squashes the following commits:

4758dcc [gfyoung] ENH: add 'downcast' to pd.to_numeric
Remove workaround added in pandas-dev#353.

Author: sinhrks <[email protected]>

Closes pandas-dev#13606 from sinhrks/ops_radd_cln and squashes the following commits:

d873aad [sinhrks] CLN: remove radd workaround
closes pandas-dev#12160

Author: sinhrks <[email protected]>

Closes pandas-dev#13593 from sinhrks/depr_timestamp_offset and squashes the following commits:

c7749d5 [sinhrks] DEPR: rename Timestamp.offset to .freq
Author: sinhrks <[email protected]>

Closes pandas-dev#13583 from sinhrks/tzlocal and squashes the following commits:

93f59a3 [sinhrks] BUG: DTI doesnt handle tzlocal properly
@jorisvandenbossche jorisvandenbossche added this to the 0.19.0 milestone Jul 11, 2016
jorisvandenbossche and others added 7 commits July 11, 2016 17:02
* TST: Clean up tests of DataFrame.sort_{index,values}

* Factor out Series sorting tests to own file.

* Delegate deprecated sort() and order() to their own tests.

Before this commit, the `Series.sort_values()` tests relied on deprecated
`Series.sort()` and `Series.order()` as the source of truth. However
they both merely called `Series.sort_values()` under the hood.

This commit consolidates the core test logic against `.sort_values()`
directly, while `.sort()` and `.order()` merely check for equivalence
with `.sort_values()`.

Also removes some no-op assertions that had rotted from the old days of
`sort()`/`order()`.

* Remove 'by' docstring from Series.sort_values

* Document defaults for optional sorting args

* Move more sort_values, sort_index tests to be together.

* Add test for Series.sort_index(sort_remaining=True)

* Improve `sort_values` tests when multiple `by`s

Duplicates values in the test DataFrame are necessary
to fully test this feature.

* PEP8 cleanup

* Annotate tests with GH issue

* Fix indentation - docstring string replacement
Author: sinhrks <[email protected]>

Closes pandas-dev#13624 from sinhrks/timedelta_comp and squashes the following commits:

856df95 [sinhrks] BUG: Invalid Timedelta op may raise ValueError
Author: sinhrks <[email protected]>

Closes pandas-dev#13605 from sinhrks/ops_cln2 and squashes the following commits:

729997b [sinhrks] CLN: Cleanup ops.py
Follows up from pandas-dev#8486 in 0.15.0 by removing outtype in DataFrame.to_dict()
This commit suppresses these warnings

warning: comparison of constant -1 with expression\
of type 'PANDAS_DATETIMEUNIT' is always true\
[-Wtautological-constant-out-of-range-compare]

Author: yui-knk <[email protected]>

Closes pandas-dev#13607 from yui-knk/fix_c_warning and squashes the following commits:

e9eee1d [yui-knk] CLN: Fix compile time warnings
jorisvandenbossche and others added 14 commits July 13, 2016 12:31
closes pandas-dev#12503

Author: Jeff Reback <[email protected]>

Closes pandas-dev#13147 from jreback/types and squashes the following commits:

244649a [Jeff Reback] CLN: reorg type inference & introspection
Author: gfyoung <[email protected]>

Closes pandas-dev#13012 from gfyoung/categorical-reshape-validate and squashes the following commits:

3ad161d [gfyoung] API: Prevent invalid arguments to Categorical.reshape
Author: yui-knk <[email protected]>

Closes pandas-dev#13643 from yui-knk/warning2 and squashes the following commits:

ee3a4fb [yui-knk] CLN: Fix compile time warnings
* CLN: fix params list

* Fix issue in asv.conf.json for win32+other environment

Fix mistaken exclusion of virtualenv or existing:same on win32 in the config.

Credits: @pv

* CLN: remove DataMatrix

* ASV: fix exclusion of tables package for non-conda environments
follow-up for pandas-dev#13593

Author: sinhrks <[email protected]>

Closes pandas-dev#13610 from sinhrks/depr_timestamp_offset2 and squashes the following commits:

28f8d41 [sinhrks] TST: add tests for Timestamp.toordinal
 - [x] tests added / passed
 - [x] passes ``git diff upstream/master | flake8 --diff``

Rebased version of pandas-dev#10229, which was
[actually not](https://github.com/pandas-dev/pull/10229#issuecomment-131470116)
fixed by pandas-dev#10199.

Nothing particularly relevant; I just wanted to delete this branch locally and
noticed it still applies. You can judge what to do with it.

Author: Pietro Battiston <[email protected]>

Closes pandas-dev#13594 from toobaz/fix_checkunique and squashes the following commits:

a63bd12 [Pietro Battiston] CLN: Initialization coincides with mapping, hence with uniqueness check
closes pandas-dev#12759
closes pandas-dev#13582

Author: sinhrks <[email protected]>

Closes pandas-dev#13609 from sinhrks/period_nat and squashes the following commits:

9305c36 [sinhrks] COMPAT: Period(NaT) now returns pd.NaT
…nt64

closes pandas-dev#13646

Author: Jeff Reback <[email protected]>

Closes pandas-dev#13661 from jreback/foo and squashes the following commits:

e26f9bf [Jeff Reback] BUG: construction of Series with integers on windows not defaulting to int64
Deprecated back in `0.15.0` and therefore long overdue.  Closes pandas-dev#8376.

Author: gfyoung <[email protected]>

Closes pandas-dev#13612 from gfyoung/categorical-levels-remove and squashes the following commits:

f1254df [gfyoung] MAINT: Relocated backwards compat categorical pickle tests
f3321cb [gfyoung] CLN: Removed levels attribute from Categorical
@jreback
Contributor

jreback commented Jul 15, 2016

Needs a rebase. Some of these tests may now be passing; #13609 fixed lots of things.

@tjader
Author

tjader commented Jul 16, 2016

Hmm, I think I made a mistake while rebasing. It looks like I pulled every single change into this pull request. I'll close this one and make a new one.

@tjader tjader closed this Jul 16, 2016
@tjader tjader deleted the bugfixes branch July 16, 2016 23:04
Development

Successfully merging this pull request may close these issues.

Not possible to find first element in non-monotonic DateTimeIndex if string is used