Skip to content

API: define _constructor_expanddim for subclassing Series and DataFrame #9802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2015

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Apr 3, 2015

Closes #9762. Does this needs release note?

@jreback
Copy link
Contributor

jreback commented Apr 3, 2015

as an aside can u put a small doc that shows how to subclass DataFrame now that it is reasonably supportered - maybe put it in the FAQ

@sinhrks
Copy link
Member Author

sinhrks commented Apr 3, 2015

Sure, will do. By the way, is there any difference between FAQ and Gotcha? Otherwise, it may better to merge them to be more searchable.

@jreback
Copy link
Contributor

jreback commented Apr 3, 2015

@jorisvandenbossche what about merging faq and gotchas? I would say yes

@sinhrks
Copy link
Member Author

sinhrks commented Apr 4, 2015

Added subclassing documentation. Pls review.

@jreback jreback added this to the 0.16.1 milestone Apr 4, 2015
@sinhrks sinhrks force-pushed the expanddim branch 2 times, most recently from 878199d to 1444fac Compare April 4, 2015 21:45
@shoyer
Copy link
Member

shoyer commented Apr 5, 2015

@sinhrks This documentation on subclassing is great

I don't have much feedback on the particulars because I have not done much subclassing of dataframe myself.

However, we should clarify two things:

  1. Is this subclassing API officially supported? My impression was that up until now, this has not been the case. We should think carefully about what we can guarantee this and in any case include a warning against using other internal methods. Perhaps we should explicitly label this subclassing API as "experimental" (we reserve the right to break it in the future if necessary).
  2. We should encourage using composition instead of inheritance when possible. This is a good opportunity to point out that the Index class, in particular, has a well defined API. For example of using composition with the pandas.Index to enable label based indexing, you could point users to xray.

Subclassing pandas Data Structures
----------------------------------

This section describes how to subclass ``pandas`` data structures to meet more specific needs. There are 2 points to be cared:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a note here, something like:

In general you should prefer to use composition rather than inheritance with pandas data structures., see here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 points which need attention:

@jorisvandenbossche
Copy link
Member

Nice docs! I agree with the questions raised by @shoyer, although I cannot say much about it myself as I never did some subclassing of dataframes.

But one remark: I don't think this belongs in a FAQ, but that we should start with some 'advanced' section in the docs (together with internals or so)

@jreback
Copy link
Contributor

jreback commented Apr 6, 2015

@sinhrks pls add a mention to try using monkey-patching first if you just want to add a function http://pandas.pydata.org/pandas-docs/stable/faq.html#adding-features-to-your-pandas-installation
maybe put this as the first warning box! (2nd would be the composition/inheritance)

@jreback
Copy link
Contributor

jreback commented Apr 6, 2015

@sinhrks ok as @jorisvandenbossche just mentioned, let's move this to the internals.rst section.

@sinhrks
Copy link
Member Author

sinhrks commented Apr 11, 2015

Fixed @jreback 's points, and moved it to internals.rst. And preparing a composition example in separate PR.

~~~~~~~~~~~~~~~~~~~~~~~~~~

To let original data structures have additional properties, you should let ``pandas`` knows what properties are added. ``pandas`` maps unknown properties to data names overriding ``__getattribute__``. Defining original properties can be done either ways:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in one of 2 ways:

@jreback
Copy link
Contributor

jreback commented Apr 17, 2015

@sinhrks lgtm, some minor comments.

I think you can add a release note in enhancements. Trying not to 'advertise' this too much :)

@sinhrks sinhrks force-pushed the expanddim branch 2 times, most recently from 6600e39 to d0099d1 Compare April 17, 2015 16:54
@sinhrks
Copy link
Member Author

sinhrks commented Apr 17, 2015

@jreback Thanks, fixed your points, and added a release note.

@jreback
Copy link
Contributor

jreback commented Apr 17, 2015

@sinhrks awesome.

I sent you an invite to change the pandas push permissions. As soon as you accept, pls make this your first PR to merged!

@sinhrks
Copy link
Member Author

sinhrks commented Apr 18, 2015

Thanks, let me merge this once after rebased.

sinhrks added a commit that referenced this pull request Apr 18, 2015
API: define _constructor_expanddim for subclassing Series and DataFrame
@sinhrks sinhrks merged commit 997426d into pandas-dev:master Apr 18, 2015
@jreback
Copy link
Contributor

jreback commented Apr 18, 2015

@sinhrks awesome!

@shoyer
Copy link
Member

shoyer commented Apr 19, 2015

Did we ever resolve the question of whether the subclassing API is officially guaranteed or not? I suppose the reality is that people are going to subclass whether we tell them it's OK or not.

@jreback
Copy link
Contributor

jreback commented Apr 19, 2015

@shoyer well enough warnings to not encourage it
but u r right people will do it anyways

it's pretty doable nowadays (though imho composition is much more useful)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API: DataFrame constructor for subclassing Series
4 participants