Skip to content

DOC: Update DateOffset intro in timeseries.rst #23385

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Dec 1, 2018

Conversation

mroeschke
Copy link
Member

Updated timeseries.rst introduction to DateOffsets

  • Explain relationship with Timedelta
  • Clarify explanation of rollforward/back
  • Bring all the offsets and their aliases to the front

Misc fixes:

  • Period dtypes are preserved in Series & DataFrames
  • Rendering for timezone nonexistent examples

@mroeschke
Copy link
Member Author

screen shot 2018-10-27 at 3 52 43 pm

screen shot 2018-10-27 at 3 52 57 pm

screen shot 2018-10-27 at 3 53 10 pm

@mroeschke
Copy link
Member Author

@datapythonista if you could have a look

@codecov
Copy link

codecov bot commented Oct 27, 2018

Codecov Report

Merging #23385 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23385   +/-   ##
=======================================
  Coverage   92.24%   92.24%           
=======================================
  Files         161      161           
  Lines       51318    51318           
=======================================
  Hits        47339    47339           
  Misses       3979     3979
Flag Coverage Δ
#multiple 90.63% <ø> (ø) ⬆️
#single 42.31% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/common.py 94.37% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe52d9f...1f159b1. Read the comment docs.

@gfyoung gfyoung added Timedelta Timedelta data type 2/3 Compat Docs and removed 2/3 Compat labels Oct 27, 2018
@mroeschke mroeschke added the Frequency DateOffsets label Oct 28, 2018
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, added some comments.

It's probably worth that @jbrockmendel takes a look too.

@@ -8,6 +8,7 @@
import numpy as np
import pandas as pd
from pandas import offsets
from pandas.tseries.offsets import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd avoid importing star even in the documentation.

:class:`Series` and :class:`DataFrame` have extended data type support and functionality for ``datetime`` and ``timedelta``
data when the time data is used as data itself. The ``Period`` and ``DateOffset`` data will be stored as ``object`` data.
:class:`Series` and :class:`DataFrame` have extended data type support and functionality for ``datetime``, ``timedelta``
and ``Period`` data when the time data is used as data itself. The ``DateOffset`` data will be stored as ``object`` data.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the "when the time data is used as data itself" not very clear

Subclasses of ``DateOffset`` define the ``apply`` function which dictates
custom date increment logic, such as adding business days:
These frequency strings map to a :class:`DateOffset` object and its subclasses. A :class:`DateOffset`
is similar to a :class:`Timedelta` that represents a duration of time but follows specific calendar duration rules.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an example (explaining, not code) would be helpful here. What calendar duration rules means can not be so obvious. Just saying that in one case we are adding 24 hours, and in the other we are going to the same time of the next day, even if that can be 23 or 25 hours, would make this much clearer I think.

@@ -968,6 +964,7 @@ particular day of the week:

.. ipython:: python

d = datetime(2008, 8, 18, 9, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer to import datetime, and then use datetime.datetime(...). I think it's more explicit, and better to import modules directly.

"""DateOffset increments between business days"""
def apply(self, other):
...
The basic :class:`DateOffset` acts similar to ``dateutil.relativedelta`` that shifts a date time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a link to dateutil docs for this

It's definitely worth exploring the ``pandas.tseries.offsets`` module and the
various docstrings for the classes.
# This particular day contains a day light savings time transition
ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this section on DST a bit lower, as you really should explain DateOffsets first

@mroeschke
Copy link
Member Author

Updated based on the review.

Also change the page's from datetime import datetime, timedelta, time to import datetime and offsets are called like pd.offsets.* instead of them getting start imported from the top.

:class:`Series` and :class:`DataFrame` have extended data type support and functionality for ``datetime`` and ``timedelta``
data when the time data is used as data itself. The ``Period`` and ``DateOffset`` data will be stored as ``object`` data.
:class:`Series` and :class:`DataFrame` have extended data type support and functionality for ``datetime``, ``timedelta``
and ``Period`` data when passed into those constructors. ``DateOffset``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May need to be careful about this, as Period only supports a subset of DateOffsets, which has caused confusion in the past.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. This section is more describing what dtypes are supported by Series and DataFrames (i.e. Period dtype is supported but not DateOffset (since it doesnt really have its own data type))

@jreback
Copy link
Contributor

jreback commented Oct 30, 2018

can you post a rendering of this?

@mroeschke
Copy link
Member Author

screen shot 2018-10-30 at 9 57 07 am

screen shot 2018-10-30 at 9 57 14 am

screen shot 2018-10-30 at 9 57 24 am

screen shot 2018-10-30 at 9 57 34 am

screen shot 2018-10-30 at 9 57 40 am

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

is similar to a :class:`Timedelta` that represents a duration of time but follows specific calendar duration rules.
For example, a :class:`Timedelta` day will always increment ``datetimes`` by 24 hours, while a :class:`DateOffset` day
will increment ``datetimes`` to the same time the next day whether a day represents 23, 24 or 25 hours due to daylight
savings time. However, the following date offsets behave like :class:`Timedelta` and respect absolute time:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would explain the logic of this list: all offsets of an hour or smaller

(and maybe just put them in running text instead of a list)

@jreback
Copy link
Contributor

jreback commented Oct 31, 2018

lgtm.

@jreback jreback added this to the 0.24.0 milestone Oct 31, 2018
@jreback
Copy link
Contributor

jreback commented Nov 3, 2018

over to you @jorisvandenbossche

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some small comments

# Bring the date to the closest offset date (Monday)
offset.rollforward(ts)
# Date is brought to the closest offset date first and then the hour is added
ts + offset
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't directly understand this difference between rollforward and addition based on this example. Would it be worth expanding more on it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this example is fairly clear when rendered. Feel free to let me know what part isn't totally clear

screen shot 2018-11-05 at 5 35 11 pm


.. ipython:: python

ts = pd.Timestamp('2014-01-01 09:00')
day = Day()
day = pd.offsets.Day()
day.apply(ts)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply has not yet been explained (it was there before above, but in a part you deleted)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added this in an earlier section.

increment will be applied multiple times.
* It has :meth:`~pandas.DateOffset.rollforward` and
:meth:`~pandas.DateOffset.rollback` methods for moving a date forward or
backward to the next or previous "offset date".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling this summary list had some value as well. Should we keep it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I demonstrated the first two points in examples above and I think I clarified rollback and rollforward a lot better in its own section.

@mroeschke
Copy link
Member Author

@jorisvandenbossche any additional comments?

@jreback
Copy link
Contributor

jreback commented Nov 23, 2018

@jorisvandenbossche

@datapythonista
Copy link
Member

@jorisvandenbossche do you mind checking if your comments have been addressed and this can be merged? We're making other changes to timeseries.rst and would be nice to avoid conflicts as much as possible.

@mroeschke
Copy link
Member Author

I'll push this in tomorrow if there are no other comments. I can always follow up after merging this PR.

@datapythonista datapythonista merged commit b45eb26 into pandas-dev:master Dec 1, 2018
@datapythonista
Copy link
Member

thanks @mroeschke

@mroeschke mroeschke deleted the offset_intro branch December 2, 2018 02:09
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Frequency DateOffsets Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants