Remove keep_tz kwarg from DatetimeIndex.to_frame #17826

Merged: 1 commit, Oct 10, 2017
40 changes: 0 additions & 40 deletions pandas/core/indexes/datetimes.py
@@ -915,46 +915,6 @@ def to_series(self, keep_tz=False):
                      index=self._shallow_copy(),
                      name=self.name)

    def to_frame(self, index=True, keep_tz=False):
        """
Contributor

is this just a code dupe?

Member Author

@jorisvandenbossche, Oct 9, 2017

If the keep_tz keyword is removed, then yes, this should be duplicate code.

See #17815 (comment) on whether to remove that keyword (you merged the PR before my objection was dismissed or addressed :-))

Contributor

hmm do we need to deprecate?

Member Author

Nope, because you only merged it a few hours ago (@gfyoung was adding a new method).
(We might consider deprecating it for Index.to_series, but that is another issue.)

Member

@gfyoung, Oct 9, 2017

I'm not really sure I agree with the choice to remove this argument. It could confuse people that we support it in one method (to_series) but not the other. Also, your historical argument was not particularly convincing, since you were not 100% sure yourself.

Member

If you think that keep_tz should not be a valid argument, then let's deprecate across the board. I would be okay with that.

Member Author

To illustrate the 'suboptimal behaviour' I mentioned above (this is current master):

In [1]: dtidx = pd.date_range("2017-01-01", periods=3, tz='Europe/Brussels')

In [2]: dtidx
Out[2]: 
DatetimeIndex(['2017-01-01 00:00:00+01:00', '2017-01-02 00:00:00+01:00',
               '2017-01-03 00:00:00+01:00'],
              dtype='datetime64[ns, Europe/Brussels]', freq='D')

In [3]: dtidx.to_frame()
Out[3]: 
                                            0
2017-01-01 00:00:00+01:00 2016-12-31 23:00:00
2017-01-02 00:00:00+01:00 2017-01-01 23:00:00
2017-01-03 00:00:00+01:00 2017-01-02 23:00:00

In [5]: dtidx.to_frame().dtypes
Out[5]: 
0    datetime64[ns]
dtype: object

So it was a timezone-aware index, but it lost its timezone information in the conversion to a DataFrame.
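
The shifted values in Out[3] are consistent with the tz-aware timestamps being converted to UTC and then stripped of their timezone. A minimal standard-library sketch of that arithmetic follows. It assumes a fixed UTC+01:00 offset as a stand-in for Europe/Brussels in January, and the names `cet`, `aware`, and `naive_utc` are illustrative only, not pandas internals:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +01:00 offset stands in for Europe/Brussels in January (CET).
cet = timezone(timedelta(hours=1))

# Midnight local time on each of the three days, mirroring the index above.
aware = [datetime(2017, 1, day, tzinfo=cet) for day in (1, 2, 3)]

# Convert to UTC, then drop the tzinfo: the result is naive and one hour earlier.
naive_utc = [ts.astimezone(timezone.utc).replace(tzinfo=None) for ts in aware]

for local, utc in zip(aware, naive_utc):
    print(local.isoformat(), "->", utc.isoformat())
```

Each naive value lands at 23:00 on the previous day, which matches the column shown in the DataFrame output.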

Contributor

Yeah, I think we can deprecate the existing keep_tz and just always keep the timezone. IIRC I put this in place before we had first-class tz support, and it was for performance (hazy memory though).

@gfyoung do you or @jorisvandenbossche want to add deprecation onto this PR?
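
For readers unfamiliar with the pattern, deprecating a keyword usually means accepting it for a while and warning whenever it is passed. The following is a minimal sketch under stated assumptions: `to_frame_like` is a hypothetical stand-in function and the warning text is invented for illustration. It is not the change that was made in pandas.

```python
import warnings


def to_frame_like(values, keep_tz=None):
    """Hypothetical stand-in for a method whose ``keep_tz`` keyword is deprecated."""
    # None is a sentinel: it distinguishes "not passed" from an explicit True/False.
    if keep_tz is not None:
        warnings.warn(
            "the 'keep_tz' keyword is deprecated; timezone information is always kept",
            FutureWarning,
            stacklevel=2,
        )
    # The keyword no longer affects the result in this sketch.
    return list(values)


# Capture warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = to_frame_like(["2017-01-01"], keep_tz=True)

print(result, [str(w.message) for w in caught])
```

Using `None` as the default lets callers who never passed the keyword avoid the warning entirely.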

Member Author

I probably won't have time today, so I will go ahead and merge this one.
I might have time later this week, but that will depend on when the RC is cut.

Member Author

Opened an issue for the other one: #17832

        Create a DataFrame with a column containing the DatetimeIndex.

        .. versionadded:: 0.21.0

        Parameters
        ----------
        index : boolean, default True
            Set the index of the returned DataFrame
            as the original DatetimeIndex.

        keep_tz : optional, defaults False.
            return the data keeping the timezone.

            If keep_tz is True:

              If the timezone is not set, the resulting
              Series will have a datetime64[ns] dtype.

              Otherwise the DataFrame will have an datetime64[ns, tz] dtype;
              the tz will be preserved.

            If keep_tz is False:

              DataFrame will have a datetime64[ns] dtype. TZ aware
              objects will have the tz removed.

        Returns
        -------
        DataFrame : a DataFrame containing the original DatetimeIndex data.
        """

        from pandas import DataFrame
        result = DataFrame(self._to_embed(keep_tz), columns=[self.name or 0])

        if index:
            result.index = self
        return result

    def _to_embed(self, keep_tz=False):
        """
        return an array repr of this object, potentially casting to object