-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: should the RangeIndex constructor accept start=range_index? #12067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
In general constructors in pandas are "too smart", but we are stuck in a lot of cases. I don't know that Index constructor idempotency is an API guarantee we necessarily need to make. It was part of the original effort to be similar to For internal pandas development, we should probably start biasing toward more precise internal APIs where possible, but when user convenience is concerned (where I've typically leaned toward more "polymorphic" / intuitive constructors at the expense of purity) there's a decent argument. So I know it is a significant improvement and lowers memory usage, but is |
So happy to discuss:
@shoyer made two points: is this user-facing or a performance enhancement. It is both, hence the intertanglement of the two. |
@shoyer |
to satisfy @shoyer concern about magic and my own about idempotency/consistency I would suggest we do this:
if consensus, this is a pretty easy change. |
@jreback the only downside of your new proposal is that the repr now no longer matches the constructor. I suppose we could make it |
hmm, right. could do
bug agreed that is a bit more verbose of course could accept |
xref #11892 (see changes and discussion)
RangeIndex, a new Int64Index subclass that will be used as the new default index (currently) has the following signature:
RangeIndex(start=None, stop=None, step=None, name=None...)
For converting an
xrange
/range
object into a RangeIndex, we have the constructorRangeIndex.from_range
.Although it isn't documented, the current constructor supports accepting a
RangeIndex
instances as thestart
argument, in which case a copy of the original index is produced, e.g.,RangeIndex(RangeIndex(0, 10, 1)) -> RangeIndex(0, 10, 1)
.I think this is a mistake: it is reusing the
start
parameter for a different type of argument, at odds with the documented behavior and predictable function signatures. If a user wants to create another RangeIndex from a RangeIndex, they can simply use the Index construtor, e.g.,Index(RangeIndex(...))
.@jreback argues that not accepting a
RangeIndex
as the first argument toRangeIndex
would be an API break, because it would create an Index constructor that is not idempotent. It would be "a major break with the current Index design".Thoughts from anyone else?
The text was updated successfully, but these errors were encountered: