DOC: Add documentation for freq='infer' option of DatetimeIndex and TimedeltaIndex constructors #21128

Closed
jschendel opened this issue May 18, 2018 · 5 comments · Fixed by #21566

Comments

@jschendel
Member

There currently doesn't appear to be any documentation on passing freq='infer' to the DatetimeIndex or TimedeltaIndex constructors; I only became aware of this option from reading over the code. It would be nice to document this feature in the docstrings for DatetimeIndex and TimedeltaIndex, and possibly in timeseries.rst and timedeltas.rst too.

Essentially, this allows users to set the frequency of the index as the inferred frequency upon creation:

In [2]: pd.DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], freq='infer')
Out[2]: DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq='2D')

Note that the frequency is not automatically inferred if nothing is passed to the freq parameter:

In [3]: pd.DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'])
Out[3]: DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq=None)

Similar example with TimedeltaIndex:

In [4]: pd.TimedeltaIndex(['0 days', '10 days', '20 days'], freq='infer')
Out[4]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq='10D')

In [5]: pd.TimedeltaIndex(['0 days', '10 days', '20 days'])
Out[5]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq=None)
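
The same comparison can be run as a standalone script outside IPython. This is only a minimal sketch: the variable names are illustrative, and the `inferred_freq` check is added to show that the spacing can still be detected when `freq` is left unset.

```python
import pandas as pd

dates = ['2018-01-01', '2018-01-03', '2018-01-05']

# freq='infer' sets the index frequency from the spacing of the values.
inferred = pd.DatetimeIndex(dates, freq='infer')
# Omitting freq leaves it unset, even though the values are evenly spaced.
plain = pd.DatetimeIndex(dates)

print(inferred.freqstr)      # 2D
print(plain.freq)            # None
# The spacing can still be queried on demand without being stored as freq.
print(plain.inferred_freq)   # 2D

# Same idea for TimedeltaIndex.
deltas = pd.TimedeltaIndex(['0 days', '10 days', '20 days'], freq='infer')
print(deltas.freqstr)        # 10D
```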

@kirakrishnan

kirakrishnan commented May 20, 2018

Hi jschendel, I am a newbie to open source. Can I work on this?

@jschendel
Member Author

@kirakrishnan : Sure, go for it!

@kirakrishnan

Thanks @jschendel. This is my understanding; please tell me if I am wrong.

Currently, the documentation for freq in class DatetimeIndex (pandas/core/indexes/datetimes.py) is:
freq : string or pandas offset object, optional
    One of pandas date offset strings or corresponding objects

And the documentation for freq in class TimedeltaIndex (pandas/core/indexes/timedeltas.py) is:
freq: a frequency for the index, optional

I need to update the documentation this way. For DatetimeIndex:
freq : 'infer', string or pandas offset object, optional
    'infer' allows users to set the frequency of the index as the inferred frequency upon creation
    One of pandas date offset strings or corresponding objects

For TimedeltaIndex:
freq: 'infer'
    'infer' allows users to set the frequency of the index as the inferred frequency upon creation

I would add this example under "Conversions" in timedeltas.rst:
In [4]: pd.TimedeltaIndex(['0 days', '10 days', '20 days'], freq='infer')
Out[4]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq='10D')

In [5]: pd.TimedeltaIndex(['0 days', '10 days', '20 days'])
Out[5]: TimedeltaIndex(['0 days', '10 days', '20 days'], dtype='timedelta64[ns]', freq=None)

And add this example under "Generating Ranges of Timestamps" in timeseries.rst:
In [2]: pd.DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], freq='infer')
Out[2]: DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq='2D')

In [3]: pd.DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'])
Out[3]: DatetimeIndex(['2018-01-01', '2018-01-03', '2018-01-05'], dtype='datetime64[ns]', freq=None)

@jschendel
Member Author

Yes, that's more or less what to do. A few small adjustments:

For the DatetimeIndex docstring, I'd leave the freq type line as-is:

freq : string or pandas offset object, optional

And add an additional sentence to the description along the lines of (modify the wording as you see fit):

One of pandas date offset strings or corresponding objects. The string 'infer' 
can be passed in order to...

For the TimedeltaIndex docstring you can just copy and paste the freq portion to be identical to the DatetimeIndex docstring.
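
To make the three accepted forms of freq concrete (offset string, offset object, and 'infer'), here is a rough sketch. It assumes the values are evenly spaced two days apart; the `mismatch_rejected` flag is a hypothetical name used only to record that an explicit frequency which does not fit the values raises a ValueError.

```python
import pandas as pd

dates = ['2018-01-01', '2018-01-03', '2018-01-05']

# Three ways of arriving at the same 2-day frequency.
from_string = pd.DatetimeIndex(dates, freq='2D')
from_offset = pd.DatetimeIndex(dates, freq=pd.offsets.Day(2))
from_infer = pd.DatetimeIndex(dates, freq='infer')

print(from_string.freq == from_offset.freq == from_infer.freq)

# An explicit frequency that does not conform to the values is rejected,
# whereas 'infer' adapts to whatever regular spacing the values have.
mismatch_rejected = False
try:
    pd.DatetimeIndex(dates, freq='D')
except ValueError as err:
    mismatch_rejected = True
    print(f"rejected: {err}")
```

The docstring wording itself is left to the contributor; the sketch only illustrates the behaviour that the wording needs to describe.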

For timedeltas.rst, you can put this in the "TimedeltaIndex" section, immediately after the first example where the TimedeltaIndex constructor is used (so your changes would start around line 365).

Not entirely sure where this should go in timeseries.rst, as it looks like the convention is generally to use pd.to_datetime. You can maybe put it at the end of the "Converting to Timestamps" section. You can say something like:

You can also use the `DatetimeIndex` constructor directly:

and first give an example of the regular usage (freq omitted), followed by an example with freq='infer'. So, a setup similar to what your modified "TimedeltaIndex" section will look like.
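
For context on why the constructor example differs from the pd.to_datetime convention, a short sketch along these lines may help. It assumes a plain list of date strings as input; the variable names are illustrative and not taken from timeseries.rst.

```python
import pandas as pd

dates = ['2018-01-01', '2018-01-03', '2018-01-05']

# The usual conversion route: values are parsed, freq is left unset.
converted = pd.to_datetime(dates)
# Constructor with freq omitted behaves the same way with respect to freq.
constructed = pd.DatetimeIndex(dates)
# Constructor with freq='infer' additionally stores the detected frequency.
with_freq = pd.DatetimeIndex(dates, freq='infer')

print(converted.freq, constructed.freq, with_freq.freq)

# The timestamps themselves are identical; only the freq attribute differs.
print(list(converted) == list(with_freq))
```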

@kirakrishnan

Thank you so much @jschendel. I have made the changes and created a pull request. Please review it:
#21201
