-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: Add references for different index types + examples for pd.Index #17680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8521ce5
to
4a0f354
Compare
pandas/core/indexes/base.py
Outdated
|
||
See Also | ||
--------- | ||
RangeIndex : Index implementing a monotonic integer range (very |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that most of these are self-describing, so you can leave off the description. I would keep the descriptions for
- RangeIndex
- MultiIndex
- IntervalIndex
- CategoricalIndex (maybe)
The rest can be put on a couple lines likes
DatetimeIndex, PeriodIndex, TimedeltaIndex
Int64Index, UInt64Index, Flaot64Index
since the names tell you pretty much everything you need to know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done.
4a0f354
to
723f696
Compare
Codecov Report
@@ Coverage Diff @@
## master #17680 +/- ##
==========================================
+ Coverage 91.23% 91.24% +<.01%
==========================================
Files 163 163
Lines 49810 49810
==========================================
+ Hits 45444 45447 +3
+ Misses 4366 4363 -3
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17680 +/- ##
==========================================
+ Coverage 91.23% 91.24% +<.01%
==========================================
Files 163 163
Lines 49810 49810
==========================================
+ Hits 45444 45447 +3
+ Misses 4366 4363 -3
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #17680 +/- ##
==========================================
- Coverage 91.24% 91.22% -0.03%
==========================================
Files 163 163
Lines 49967 49967
==========================================
- Hits 45593 45583 -10
- Misses 4374 4384 +10
Continue to review full report at Codecov.
|
pandas/core/indexes/base.py
Outdated
memory-efficient) | ||
CategoricalIndex : Index with :class:`Categorical` s. | ||
MultiIndex : A multi-level, or hierarchical, Index | ||
IntervalIndex : an Index of :class:`Interval` s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you check locally that those links work inside a 'See also' description ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't. I've started building the docs and will to the check when it's built.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeriodIndex
has no link, others are ok. I assume it's missing in API.rst also?
Make sure to use sphinx 1.5.6, (not 1.6.*) else you'll be waiting a while.
…On Tue, Sep 26, 2017 at 9:56 AM, topper-123 ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/core/indexes/base.py
<#17680 (comment)>:
> +
+ Examples
+ --------
+ >>> pd.Index([1, 2, 3])
+ Int64Index([1, 2, 3], dtype='int64')
+
+ >>> pd.Index(list('abc'))
+ Index(['a', 'b', 'c'], dtype='object')
+
+ See Also
+ ---------
+ RangeIndex : Index implementing a monotonic integer range (very
+ memory-efficient)
+ CategoricalIndex : Index with :class:`Categorical` s.
+ MultiIndex : A multi-level, or hierarchical, Index
+ IntervalIndex : an Index of :class:`Interval` s.
I hadn't. I've started building the docs and will to the check.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#17680 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHIk7SdcRbBTmWrAi-UiCpd602c__Sks5smRClgaJpZM4PkEjQ>
.
|
pandas/core/indexes/period.py
Outdated
See Also | ||
--------- | ||
Index : The base pandas Index type | ||
DatetimeIndex : Index with datetime64 data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can u also like to Period
(like TimedeltaIndex
)?
pandas/core/indexes/timedeltas.py
Outdated
Index : The base pandas Index type | ||
TimeDelta : Represents a duration between two dates or times. | ||
DatetimeIndex : Index with datetime64 data | ||
PeriodIndex : Index holding regular time periods such as years, quarters, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer "Index with Period data", detail should be explained in PeriodIndex
.
723f696
to
4f85462
Compare
I've changed according to the various comment. In addition, I've also written a section on EDIT: there were some error for |
One more thing: In the doc string for I assume this is wrong and the phrase should be moved over to EDIT: I've moved that phrase over to |
b8b68f9
to
3055edb
Compare
pandas/core/indexes/base.py
Outdated
See Also | ||
--------- | ||
RangeIndex : Index implementing a monotonic integer range (very | ||
memory-efficient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth mentioning that RangeIndex
also makes some index operations computationally efficient.
pandas/core/indexes/base.py
Outdated
--------- | ||
RangeIndex : Index implementing a monotonic integer range (very | ||
memory-efficient) | ||
CategoricalIndex : Index with :class:`Categorical` s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: The CategoricalIndex
description is structured the same way as IntervalIndex
below, but worded slightly differently, "Index with" vs. "an Index of", which could be made consistent.
pandas/core/indexes/datetimes.py
Outdated
--------- | ||
Index : The base pandas Index type | ||
TimedeltaIndex : Index of timedelta64 data | ||
PeriodIndex : Index with Period data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Slightly different wording between TimedeltaIndex
and PeriodIndex
("Index of" vs. "Index with") could be made consistent.
Index : The base pandas Index type | ||
Period : Represents a period of time | ||
DatetimeIndex : Index with datetime64 data | ||
TimedeltaIndex : Index of timedelta64 data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same ("Index with" vs "Index of")
pandas/core/indexes/range.py
Outdated
|
||
RangeIndex is a memory-saving special case of Int64Index limited to | ||
representing monotonic ranges. This is the default index type used | ||
by DataFrame and Series when no explicit index is provided by the user. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as previous RangeIndex
comment: could potentially mention that it makes some index operations computationally efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any examples, so I can understand it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since RangeIndex
is defined entirely from the start
, stop
, and step
, you can use these values to simply compute some properties in constant time instead of needing to do a pass over the index in linear time, e.g. max
, min
. Similarly, operations between two RangeIndex
can be computed in constant time, e.g. intersection
, equals
. You also get some properties for free (or constant time) that can be beneficial for index operations, e.g. monotonic, non-null, uniqueness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see it can be fasterst index type (also simply by virtue of not having to build a Int64-index everytime a df is created), but I feel more complex advantages should be referenced from the doc string to an explanation elsewere rather than being explained in the doc string itself. I.e. the doc string should only make that statement if there's some further explanation elsewere, e.g. in the user guide, which there isn't ATM (AFAICT).
Maybe you could take a stab at explaining this succintly in the user guide and add a later PR that builds on top of mine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't mean to suggest that you add details about it in depth, just merely mention the fact. Like how you're not actually explaining why RangeIndex
is memory-saving, you just mention that it is. So just something along the lines of "memory-efficient" -> "memory and computationally efficient" was all I was suggesting. Again, not something I'm dead set on including, just a thought.
Added some very small comments. This is fine as-is though; you don't need to go out of your way to address my comments unless you want to. |
Yes that is a left-over from when RangeIndex did not exist. |
3055edb
to
ec32a2c
Compare
doc/source/api.rst
Outdated
.. autosummary:: | ||
:toctree: generated/ | ||
|
||
PeriodIndex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also want to have this as 'class without autosummary' to not include all the attributes (but listing the different ones as you did below is certainly fine!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from adding the "class_without_autosummary" template, you should also add it to the 'hack' in numpydoc, see the diff of #17642
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done (I think, not sure what these actually do, please review it it's as intended).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes that looks good
ec32a2c
to
a53a26c
Compare
doc/source/api.rst
Outdated
PeriodIndex.is_leap_year | ||
PeriodIndex.minute | ||
PeriodIndex.month | ||
PeriodIndex.ordinal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeriodIndex.ordinal
does this exist? (doc build gives problems with it, and locally I also don't see it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not there. I'll remove it.
ac31c44
to
e88bb31
Compare
I've added a comment in Is this ok to accept, or are there reservations? |
Yes, thanks a lot @topper-123 ! |
git diff upstream/master -u -- "*.py" | flake8 --diff
Added 'See also'-references for different index types to make it easier to find the correct index type when in doubt + added simple examples for pd.Index