Skip to content

ENH: add end and end_day origin for resample #38408

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 22, 2020
Merged

ENH: add end and end_day origin for resample #38408

merged 16 commits into from
Dec 22, 2020

Conversation

GYHHAHA
Copy link
Contributor

@GYHHAHA GYHHAHA commented Dec 11, 2020

@GYHHAHA
Copy link
Contributor Author

GYHHAHA commented Dec 11, 2020

I have revert the backward arg and only add the end and end_day.
Could u take a look at this? Thanks! @jreback

@@ -292,6 +292,7 @@ Other enhancements
- Improved error reporting for subsetting columns of a :class:`.DataFrameGroupBy` with ``axis=1`` (:issue:`37725`)
- Implement method ``cross`` for :meth:`DataFrame.merge` and :meth:`DataFrame.join` (:issue:`5401`)
- When :func:`read_csv/sas/json` are called with ``chuncksize``/``iterator`` they can be used in a ``with`` statement as they return context-managers (:issue:`38225`)
- Added ``end`` and ``end_day`` options for ``origin`` in :meth:`DataFrame.resample` (:issue:`37804`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this will make it in for 1.2 - might need to move to 1.3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback keep this in 1.2 or remove to 1.3 whatsnew?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that this should target 1.3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -611,3 +611,76 @@ def test_resample_agg_readonly():

result = rs.agg("min")
tm.assert_series_equal(result, expected)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to parametrize these tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

ts = Series(np.arange(len(rng)) * 3, index=rng)

res = ts.resample("17min", origin="end").sum().astype("int64")
data = [0, 18, 27, 63]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd stick that in the constructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mroeschke said constructing ts in each test was better.

self.origin = origin
else:
try:
self.origin = Timestamp(origin)
except Exception as e:
raise ValueError(
"'origin' should be equal to 'epoch', 'start', 'start_day' or "
"'origin' should be equal to 'epoch', 'start', 'start_day', "
"'end' or 'end_day' "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"'end', 'end_day', or "

@@ -93,6 +94,11 @@ class Grouper:

.. versionadded:: 1.1.0

- 'end': `origin` is the last value of the timeseries
- 'end_day': `origin` is the ceiling midnight of the last day
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment/question here.

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two comments otherwise LGTM

@GYHHAHA
Copy link
Contributor Author

GYHHAHA commented Dec 12, 2020

Do u think it's worthwhile to explain more about these two backward resample method in ENH of whatsnew (maybe two examples added)?

@mroeschke
Copy link
Member

mroeschke commented Dec 12, 2020

Probably good to add to timeseries.rst#use-origin-or-offset-to-adjust-the-start-of-the-bins explaining the new options

@GYHHAHA
Copy link
Contributor Author

GYHHAHA commented Dec 12, 2020

ok

@jreback jreback added Enhancement Resample resample method labels Dec 13, 2020
@jreback jreback added this to the 1.3 milestone Dec 13, 2020
@GYHHAHA GYHHAHA requested a review from jreback December 14, 2020 07:41
@GYHHAHA
Copy link
Contributor Author

GYHHAHA commented Dec 14, 2020

failures unrelated

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comment, pls merge master and ping on green.

cc @mroeschke if any other comments

@@ -8027,7 +8027,8 @@ def resample(
level : str or int, optional
For a MultiIndex, level (name or number) to use for
resampling. `level` must be datetime-like.
origin : {'epoch', 'start', 'start_day'}, Timestamp or str, default 'start_day'
origin : {'epoch', 'start', 'start_day', 'end', 'end_day'}, Timestamp \
or str, default 'start_day'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you have to indent this 2nd line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here has a line-continuation at the end of 1st line, will this break the display if I indent the 'or str, default "start_day"' with 4 spaces?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah saw that but we don't usually use continuation like that

it will render if you indent on 2nd line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@GYHHAHA
Copy link
Contributor Author

GYHHAHA commented Dec 22, 2020

Now I change it to the following format.

origin : {{'epoch', 'start', 'start_day', 'end', 'end_day'}}, Timestamp
    or str, default 'start_day'
    The timestamp on which to adjust the grouping. The timezone of origin
    must match the timezone of the index.

Will this be wrongly rendered to this?

origin : {{'epoch', 'start', 'start_day', 'end', 'end_day'}}, Timestamp
or str, default 'start_day' The timestamp on which to adjust the grouping. The timezone of origin must match the timezone of the index.

Instead of:

origin : {{'epoch', 'start', 'start_day', 'end', 'end_day'}}, Timestamp or str, default 'start_day'
The timestamp on which to adjust the grouping. The timezone of origin must match the timezone of the index.

@jreback

@GYHHAHA GYHHAHA requested a review from jreback December 22, 2020 05:06
@jreback jreback merged commit 15b5946 into pandas-dev:master Dec 22, 2020
@jreback
Copy link
Contributor

jreback commented Dec 22, 2020

thanks @GYHHAHA

@GYHHAHA GYHHAHA deleted the add-end branch December 22, 2020 14:07
luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Add 'end' option in resample's origin
4 participants