API/CLN: add in common operations to Series/Index, refactored as a OpsMixin #6380

Merged: 1 commit, Feb 18, 2014

Conversation

jreback
Contributor

jreback commented Feb 17, 2014

closes #4551
closes #4056
closes #5519

Allow a Series to use the index methods of its index type. For example, Series.year is now
defined for a Series with a DatetimeIndex or a PeriodIndex; trying this on a Series with any
other Index type now raises a TypeError. The following properties are affected:
date, time, year, month, day, hour, minute, second, weekofyear,
week, dayofweek, dayofyear, quarter, microsecond, nanosecond, qyear
and the methods min() and max().

In [1]: s = Series(np.random.randn(5),index=tm.makeDateIndex(5))

In [2]: s
Out[2]: 
2000-01-03   -0.301523
2000-01-04    1.104868
2000-01-05    0.321592
2000-01-06   -0.565114
2000-01-07    0.888334
Freq: B, dtype: float64

In [3]: s.year
Out[3]: 
2000-01-03    2000
2000-01-04    2000
2000-01-05    2000
2000-01-06    2000
2000-01-07    2000
Freq: B, dtype: int64

In [4]: s.index.year
Out[4]: Int64Index([2000, 2000, 2000, 2000, 2000], dtype='int64')

In [5]: Series(np.random.randn(5)).year
TypeError: cannot perform an year operations on this type <class 'pandas.core.index.Int64Index'>
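
For anyone who wants to reproduce the session above outside the pandas test suite, here is a minimal self-contained sketch. It assumes `pd.date_range` with business-day frequency as a stand-in for the `tm.makeDateIndex` test helper, and it reads the field through `s.index.year` (as in In [4]), so it does not depend on the Series-level property this PR adds.

```python
import numpy as np
import pandas as pd

# Assumption: pd.date_range stands in for the tm.makeDateIndex(5) test helper;
# the start date and 'B' frequency are chosen to match the output shown above.
idx = pd.date_range("2000-01-03", periods=5, freq="B")
s = pd.Series(np.random.randn(5), index=idx)

# Field access on the index itself, as in In [4].
years = s.index.year

# Re-wrap the field as a Series on the same index, mirroring the shape of Out[3].
year_series = pd.Series(years, index=s.index)

# A default integer index carries no datetime fields (the In [5] case).
plain = pd.Series(np.random.randn(5))
has_year = hasattr(plain.index, "year")
```

The random values will differ from the transcript; only the index and the extracted years are expected to match.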

jreback added this to the 0.14.0 milestone Feb 17, 2014
@cpcloud
Member

cpcloud commented Feb 17, 2014

This looks cool. Slightly OT: would it be useful to have a method to split dates into their parts? Something like split on a datetime index? It would be useful for grouping by days, months, etc., although maybe there's a way to do that now.

@cpcloud
Member

cpcloud commented Feb 17, 2014

Or date_split since split is a bit too general sounding

@jreback
Contributor Author

jreback commented Feb 17, 2014

you can just do index.date and index.time (which are not affected by this....)

I have to put a list of follow up stuff....

@jreback
Contributor Author

jreback commented Feb 17, 2014

With this PR (note that DataFrame support is missing for this, and I am not sure that we should add it,
because you could then easily have name collisions with the method names; this is much less of an issue with Series):

In [4]: df.groupby(df.index.month).sum()
Out[4]: 
     0
1   12
2   14
3   16
4   18
5   20
6   22
7   24
8   26
9   28
10  30
11  32
12  34

[12 rows x 1 columns]

In [5]: s = Series(np.arange(24),index=date_range('20130101',periods=24,freq='MS'))

In [6]: s.groupby(s.month).sum()
Out[6]: 
1     12
2     14
3     16
4     18
5     20
6     22
7     24
8     26
9     28
10    30
11    32
12    34
dtype: int64
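
A self-contained sketch of the two groupby calls above, for comparison. `df` is not defined in the transcript, so the sketch assumes it is a one-column frame built from the same data as `s`. Both calls group on `index.month`, which is available on the DatetimeIndex itself; the last line groups on year and month together, which is one possible answer to the "split dates into parts" question raised earlier.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(24),
              index=pd.date_range("20130101", periods=24, freq="MS"))

# Group on the month field of the index.
by_month = s.groupby(s.index.month).sum()

# Assumption: `df` in In [4] is a one-column frame holding the same values.
df = pd.DataFrame(s)
df_by_month = df.groupby(df.index.month).sum()

# Grouping on several date parts at once: one group per (year, month) pair.
by_year_month = s.groupby([s.index.year, s.index.month]).sum()
```

Each month appears twice in the 24-month range, so the value for month m is (m - 1) + (m - 1 + 12), which gives the 12, 14, ..., 34 sequence shown in the output.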

@jreback
Contributor Author

jreback commented Feb 17, 2014

@jorisvandenbossche I have to push an update because the doc strings are not set...

so I can get

help(Series.month)

to work properly

but ipython help is odd

Series.month?

any idea how to do this?

@jorisvandenbossche
Member

@jreback Do you mean that you get verbose fget/fset output for properties with the IPython help? This is currently also the case for other properties. But in the "Docstring:" part, there should be an explanation of the property.

@jreback
Contributor Author

jreback commented Feb 17, 2014

yes exactly

hmm ok so this is a known issue then
ok

I put these in the API as well
hopefully will build

@jreback
Contributor Author

jreback commented Feb 17, 2014

@cpcloud @jtratner @jorisvandenbossche any more comments on this?

# facilitate the properties on the wrapped ops
def _field_accessor(name, docstring=None):
    op_accessor = '_{0}'.format(name)
    def f(self):

Member

Did you mean to use @wraps here?

Contributor Author

I think I originally did, but sort of did it 'manually' by assigning name/doc string....bad?

Member

Looking at the source for wraps it basically does that ... only diff is that it updates the __module__ attribute and copies the __dict__ attribute. Not "bad" per se.

Member

This is fine ... I was just asking.
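
To make the comparison concrete, here is a small sketch of manual name/docstring assignment next to `functools.wraps`. The function names and the `tag` attribute are hypothetical and chosen only for illustration; they do not come from the PR.

```python
import functools


def original():
    """Original docstring."""


# A custom attribute stored in the function's __dict__.
original.tag = "custom"


def manual():
    pass


# The "manual" approach: copy only the name and the docstring.
manual.__name__ = original.__name__
manual.__doc__ = original.__doc__


# functools.wraps copies __name__, __doc__ and __module__ as well,
# and additionally updates the wrapper's __dict__ from the original.
@functools.wraps(original)
def wrapped():
    pass
```

For plain name and docstring purposes the two are equivalent; the visible difference is that `wrapped` also carries `tag`, while `manual` does not.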

@jreback
Contributor Author

jreback commented Feb 18, 2014

@cpcloud I took wraps out...not necessary as I am creating the function wrappers here from an accessor, rather than wrapping another function
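
A minimal sketch of that pattern, assuming the generated getter simply forwards to a private `_<name>` method on the object. Only the first lines of `_field_accessor` appear in the review snippet above; the body of `f`, the `Demo` class, and its `_year` method are hypothetical and not taken from the PR diff.

```python
def _field_accessor(name, docstring=None):
    # Name of the private method the property forwards to, e.g. '_year'.
    op_accessor = '_{0}'.format(name)

    def f(self):
        # Hypothetical body: look up and call the private accessor.
        return getattr(self, op_accessor)()

    # Set the name and docstring by hand; there is no function to wrap,
    # so functools.wraps has nothing to copy from.
    f.__name__ = name
    f.__doc__ = docstring
    return property(f)


class Demo(object):
    """Hypothetical host class standing in for Series/Index."""

    def __init__(self, years):
        self._years = list(years)

    def _year(self):
        return list(self._years)

    year = _field_accessor('year', "The year of each element")


demo = Demo([2000, 2001])
result = demo.year
```

Because `property` picks up the getter's docstring when no explicit `doc` is passed, `help(Demo.year)` shows the text supplied to `_field_accessor`.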

jreback added a commit that referenced this pull request Feb 18, 2014
API/CLN: add in common operations to Series/Index, refactored as a OpsMixin
jreback merged commit 4ebc5d1 into pandas-dev:master Feb 18, 2014
Labels
API Design; Datetime (Datetime data dtype); Enhancement; Frequency (DateOffsets); Internals (Related to non-user accessible pandas implementation); Period (Period data type)
3 participants