-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
allowing datetime and timedelta datatype in pd cut bins #14798
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The change is currently WIP, will add tests and other change, @jorisvandenbossche |
Current coverage is 84.64% (diff: 88.88%)@@ master #14798 diff @@
==========================================
Files 144 144
Lines 51021 51030 +9
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43188 43196 +8
- Misses 7833 7834 +1
Partials 0 0
|
@@ -313,6 +313,18 @@ def test_datetime_cut(self): | |||
result, bins = cut(data, 3, retbins=True) | |||
tm.assert_series_equal(Series(result), expected) | |||
|
|||
def test_datetime_bin(self): | |||
data = [np.datetime64('2012-12-13'), np.datetime64('2012-12-15')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue number as a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback i dont think there is an open issue for this change, @jorisvandenbossche had proposed this change when i was making changes to cut to allow datetime and timedelta data types
@@ -313,6 +313,18 @@ def test_datetime_cut(self): | |||
result, bins = cut(data, 3, retbins=True) | |||
tm.assert_series_equal(Series(result), expected) | |||
|
|||
def test_datetime_bin(self): | |||
data = [np.datetime64('2012-12-13'), np.datetime64('2012-12-15')] | |||
bins = [np.datetime64('2012-12-12'), np.datetime64('2012-12-14'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use pd.Timestamp(....)
instead of direct np.datetime64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually you prob want to test with datetime.datetime
, Timestamp
, np.datetime64
(but just put them in a loop something like
data = ['2012-12-12', '2012-12-14']
for conv in [Timestamp(x).to_pydatetime, Timestamp, np.datetime64]:
bins = [ conv(v) for v in data ]
also test
bins = pd.to_datetime(data)
these should all work, because internally you need to wrap a Timestamp
converter around each of fhe bins (if dtype==M8) or Timedelta
if dtype==m8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback made the changes, i added a new method in tile.py to handle the time type bins
@aileronajay can you update this? |
@jorisvandenbossche i am caught up with some stuff right now, i should be able to make these changes next week |
c4c7e9f
to
82bffa1
Compare
thanks! |
xref pandas-dev#14714, follow-on to pandas-dev#14737 Author: Ajay Saxena <[email protected]> Closes pandas-dev#14798 from aileronajay/cut_timetype_bin and squashes the following commits: 82bffa1 [Ajay Saxena] added method for time type bins in pd cut and modified tests ac919cf [Ajay Saxena] added test for datetime bin type 355e569 [Ajay Saxena] allowing datetime and timedelta datatype in pd cut bins
xref #14714, follow-on to #14737